References
【AI客服deepseek】deepseek接入微信(本地部署deepseek集成微信自动收发消息)
黑马程序员AI大模型开发零基础到项目实战全套视频教程,基于DeepSeek和Qwen大模型,从大模型私有化部署、运行机制到独立搭建聊天机器人一套通关 (bilibili)
Local Deployment
Installing Ollama
Ollama is an open-source framework for running large language models locally; it is designed to make deploying and running LLMs on your own machine straightforward.

Installing a Model
Pick the model you want on Ollama Search and run the corresponding command in a cmd terminal.

Note that Ollama installs to the ollama folder under AppData on the C: drive by default; if disk space is tight, you can set the OLLAMA_MODELS environment variable to move the models to a path of your choice.
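For example, pulling and running the model used later in these notes looks roughly like this in a cmd terminal (the OLLAMA_MODELS path is only an illustration; any directory with enough space works, and Ollama should be restarted after changing it):

ollama pull deepseek-r1:8b
ollama run deepseek-r1:8b

rem Move the model directory off the C: drive (example path)
setx OLLAMA_MODELS "D:\ollama\models"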
Visual Interfaces
Installing Chatbox
Chatbox AI is an AI client application and assistant that supports many mainstream AI models and APIs, and runs on Windows, macOS, Android, iOS, Linux, and the web.

With Chatbox installed, you can chat with the local model from a desktop window instead of running it in a cmd terminal.

Installing Ollama Web UI Lite
Ollama Web UI Lite is a stripped-down version of Ollama Web UI that aims to provide a simplified user interface with minimal features and lower complexity. The project focuses on cleaner code through a full TypeScript migration, a more modular architecture, comprehensive test coverage, and a solid CI/CD pipeline.

After extracting the project locally, open a terminal in its directory, install the dependencies with npm ci, and start it with npm run dev.

Automated Chat
How It Works
A local script watches a WeChat chat window; whenever a message arrives, it calls the locally deployed deepseek model and sends the model's reply back into the chat.
wxauto is used to drive the WeChat desktop client for sending and receiving messages.
I am using Python 3.10.6 and WeChat 3.9.12.51, which differ from the versions shown in the video.
Sending Messages
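A minimal sketch (not the original example) of sending a single message with wxauto, assuming the WeChat desktop client is logged in and that SendMsg accepts the target contact via its who parameter; 文件传输助手 (WeChat's file-transfer helper) is used as a harmless test target:

from wxauto import WeChat

wx = WeChat()                              # attach to the logged-in WeChat window
wx.SendMsg("测试消息", who="文件传输助手")  # send a test message to the given contact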
Receiving All Messages
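A sketch of reading messages (again not the original example), assuming wxauto's GetAllMessage returns the messages visible in the currently open chat window as objects exposing type and content, as the full script later in these notes also assumes:

from wxauto import WeChat

wx = WeChat()
msgs = wx.GetAllMessage()   # messages shown in whichever chat window is currently open
for msg in msgs:
    print(msg.type, msg.content)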
Receiving Messages from a Specific Contact
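A condensed sketch of the listen-based approach used by the full script below (AddListenChat and GetListenMessage are the same calls as in that script; the contact name is the one monitored later in these notes):

import time
from wxauto import WeChat

wx = WeChat()
wx.AddListenChat(who="沙系")                 # register the contact to monitor

while True:
    listen_dict = wx.GetListenMessage()      # {chat_window: [new messages]}
    for chat_win, message_list in listen_dict.items():
        for msg in message_list:
            if msg.type == "friend":         # only react to messages sent by the friend
                print(chat_win.who, msg.content)
    time.sleep(1)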
Full Implementation
Once running, the script successfully picks up messages sent by the WeChat friend and reads their content, but every attempt to reply fails with a send-timeout error. The full script and the traceback are both reproduced in Question 1 below.
Revising the code based on an AI's suggestions did not resolve the problem.
At this point I checked the likely factors: Ollama was running normally and its API was reachable at http://localhost:11434/api/tags, PyCharm was running as administrator, and the model loaded without problems; nothing else looked wrong. The ollama serve terminal output is reproduced in Question 1 below.
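The same reachability check can be scripted (this snippet is mine, not from the original notes), using the requests library that the main program already depends on:

import requests

# /api/tags lists the models installed in the local Ollama instance.
resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
for model in resp.json().get("models", []):
    print(model["name"])    # e.g. deepseek-r1:8b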
Code Revisions
Q&A with the AI
Question 1
import os
import time
import json
import requests
from wxauto import WeChat

# Load the saved conversation history
DB = {}
if os.path.exists("db.json"):
    fp = open("db.json", encoding='utf-8', mode='r')
    DB = json.load(fp)
    fp.close()

# MONITOR_LIST = []
# fp = open("users.txt", encoding='utf-8', mode='r')
# for line in fp:
#     # read users.txt to load the contacts to monitor
#     # print(line)
#     MONITOR_LIST.append(line)
# fp.close()

# Load the list of contacts to monitor
MONITOR_LIST = []
if os.path.exists("users.txt"):
    with open("users.txt", encoding='utf-8', mode='r') as fp:
        for line in fp:
            line = line.strip()  # strip surrounding whitespace and the newline
            if line:  # skip empty lines
                MONITOR_LIST.append(line)
else:
    print("警告:未找到 users.txt 文件,将监听默认用户列表")
    MONITOR_LIST = ["沙系"]  # default contact to monitor (example)

# Attach to the WeChat window
wx = WeChat()

# Contacts to listen to
# MONITOR_LIST = ["沙系"]
for ele in MONITOR_LIST:
    wx.AddListenChat(who=ele)

# Listen for messages
while True:
    listen_dict = wx.GetListenMessage()
    for chat_win, message_list in listen_dict.items():
        # print(chat_win.who)
        chat_user = chat_win.who

        # Collect the latest chat messages
        interval_list = []
        for msg in message_list:
            if msg.type != "friend":
                continue
            interval_list.append({"role": "user", "content": msg.content})
        if not interval_list:
            continue

        # Append them to the conversation history
        print("微信消息:")
        for interval in interval_list:
            print(interval)
        history_list = DB.get(chat_user, [])
        history_list.extend(interval_list)

        # Call the local deepseek model
        res = requests.post(
            url="http://localhost:11434/api/chat",
            json={
                "model": "deepseek-r1:8b",
                "message": history_list,
                "stream": False
            }
        )
        data_dict = res.json()
        res_msg_dict = data_dict['message']

        # Take deepseek's reply and send it back on WeChat
        res_content = res_msg_dict['content']
        chat_win.SendMsg(res_content)
        print("deepseek回复:")
        print(res_content)

        # Save the conversation history
        history_list.append(res_msg_dict)
        DB[chat_user] = history_list
        with open("db.json", encoding='utf-8', mode='w') as fp:
            json.dump(DB, fp, ensure_ascii=False, indent=2)

After this code runs, it detects the current chat contact and receives the messages, but it cannot reply. PyCharm reports the following error:

初始化成功,获取到已登录窗口:清雨
微信消息:
{'role': 'user', 'content': '测试'}
Traceback (most recent call last):
  File "D:\PycharmProject\WebAutoTest\Auto_Chat\Auto_Chat.py", line 82, in <module>
    chat_win.SendMsg(res_content)
  File "D:\PycharmProject\WebAutoTest\venv\lib\site-packages\wxauto\elements.py", line 250, in SendMsg
    raise TimeoutError(f'发送消息超时 --> {self.who} - {msg}')
TimeoutError: 发送消息超时 --> 沙系 -

The terminal shows the following:

C:\Users\Ysam>ollama serve
2025/06/07 12:01:11 routes.go:1233: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\\Users\\Ysam\\.ollama\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES:]" time=2025-06-07T12:01:11.166+08:00 level=INFO source=images.go:463 msg="total blobs: 10" time=2025-06-07T12:01:11.166+08:00 level=INFO source=images.go:470 msg="total unused blobs removed: 0" time=2025-06-07T12:01:11.167+08:00 level=INFO source=routes.go:1300 msg="Listening on 127.0.0.1:11434 (version 
0.6.8)" time=2025-06-07T12:01:11.167+08:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs" time=2025-06-07T12:01:11.167+08:00 level=INFO source=gpu_windows.go:167 msg=packages count=1 time=2025-06-07T12:01:11.167+08:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=12 efficiency=0 threads=24 time=2025-06-07T12:01:11.758+08:00 level=WARN source=amd_windows.go:138 msg="amdgpu is not supported (supported types:[gfx1030 gfx1100 gfx1101 gfx1102 gfx1151 gfx906])" gpu_type=gfx1036 gpu=0 library=C:\Users\Ysam\AppData\Local\Programs\Ollama\lib\ollama\rocm time=2025-06-07T12:01:11.761+08:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-189333b1-98d9-8621-adb5-344761953fc5 library=cuda variant=v12 compute=8.9 driver=12.8 name="NVIDIA GeForce RTX 4070 Laptop GPU" total="8.0 GiB" available="6.9 GiB" time=2025-06-07T12:01:29.732+08:00 level=WARN source=ggml.go:152 msg="key not found" key=general.alignment default=32 time=2025-06-07T12:01:29.754+08:00 level=WARN source=ggml.go:152 msg="key not found" key=general.alignment default=32 time=2025-06-07T12:01:29.767+08:00 level=WARN source=ggml.go:152 msg="key not found" key=general.alignment default=32 time=2025-06-07T12:01:29.768+08:00 level=WARN source=ggml.go:152 msg="key not found" key=llama.vision.block_count default=0 time=2025-06-07T12:01:29.768+08:00 level=WARN source=ggml.go:152 msg="key not found" key=llama.attention.key_length default=128 time=2025-06-07T12:01:29.768+08:00 level=WARN source=ggml.go:152 msg="key not found" key=llama.attention.value_length default=128 time=2025-06-07T12:01:29.768+08:00 level=INFO source=sched.go:754 msg="new model will fit in available VRAM in single GPU, loading" model=C:\Users\Ysam\.ollama\models\blobs\sha256-6340dc3229b0d08ea9cc49b75d4098702983e17b4c096d57afbbf2ffc813f2be gpu=GPU-189333b1-98d9-8621-adb5-344761953fc5 parallel=2 available=7317225472 required="6.5 GiB" time=2025-06-07T12:01:29.777+08:00 level=INFO source=server.go:106 msg="system memory" total="31.3 GiB" free="18.0 GiB" free_swap="22.8 GiB" time=2025-06-07T12:01:29.777+08:00 level=WARN source=ggml.go:152 msg="key not found" key=llama.vision.block_count default=0 time=2025-06-07T12:01:29.777+08:00 level=WARN source=ggml.go:152 msg="key not found" key=llama.attention.key_length default=128 time=2025-06-07T12:01:29.777+08:00 level=WARN source=ggml.go:152 msg="key not found" key=llama.attention.value_length default=128 time=2025-06-07T12:01:29.778+08:00 level=INFO source=server.go:139 msg=offload library=cuda layers.requested=-1 layers.model=33 layers.offload=33 layers.split="" memory.available="[6.8 GiB]" memory.gpu_overhead="0 B" memory.required.full="6.5 GiB" memory.required.partial="6.5 GiB" memory.required.kv="1.0 GiB" memory.required.allocations="[6.5 GiB]" memory.weights.total="4.3 GiB" memory.weights.repeating="3.9 GiB" memory.weights.nonrepeating="411.0 MiB" memory.graph.full="560.0 MiB" memory.graph.partial="677.5 MiB" llama_model_loader: loaded meta data with 28 key-value pairs and 292 tensors from C:\Users\Ysam\.ollama\models\blobs\sha256-6340dc3229b0d08ea9cc49b75d4098702983e17b4c096d57afbbf2ffc813f2be (version GGUF V3 (latest)) llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output. 
llama_model_loader: - kv 0: general.architecture str = llama llama_model_loader: - kv 1: general.type str = model llama_model_loader: - kv 2: general.name str = DeepSeek R1 Distill Llama 8B llama_model_loader: - kv 3: general.basename str = DeepSeek-R1-Distill-Llama llama_model_loader: - kv 4: general.size_label str = 8B llama_model_loader: - kv 5: llama.block_count u32 = 32 llama_model_loader: - kv 6: llama.context_length u32 = 131072 llama_model_loader: - kv 7: llama.embedding_length u32 = 4096 llama_model_loader: - kv 8: llama.feed_forward_length u32 = 14336 llama_model_loader: - kv 9: llama.attention.head_count u32 = 32 llama_model_loader: - kv 10: llama.attention.head_count_kv u32 = 8 llama_model_loader: - kv 11: llama.rope.freq_base f32 = 500000.000000 llama_model_loader: - kv 12: llama.attention.layer_norm_rms_epsilon f32 = 0.000010 llama_model_loader: - kv 13: general.file_type u32 = 15 llama_model_loader: - kv 14: llama.vocab_size u32 = 128256 llama_model_loader: - kv 15: llama.rope.dimension_count u32 = 128 llama_model_loader: - kv 16: tokenizer.ggml.model str = gpt2 llama_model_loader: - kv 17: tokenizer.ggml.pre str = llama-bpe llama_model_loader: - kv 18: tokenizer.ggml.tokens arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ... llama_model_loader: - kv 19: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ... llama_model_loader: - kv 20: tokenizer.ggml.merges arr[str,280147] = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "... llama_model_loader: - kv 21: tokenizer.ggml.bos_token_id u32 = 128000 llama_model_loader: - kv 22: tokenizer.ggml.eos_token_id u32 = 128001 llama_model_loader: - kv 23: tokenizer.ggml.padding_token_id u32 = 128001 llama_model_loader: - kv 24: tokenizer.ggml.add_bos_token bool = true llama_model_loader: - kv 25: tokenizer.ggml.add_eos_token bool = false llama_model_loader: - kv 26: tokenizer.chat_template str = {% if not add_generation_prompt is de... 
llama_model_loader: - kv 27: general.quantization_version u32 = 2 llama_model_loader: - type f32: 66 tensors llama_model_loader: - type q4_K: 193 tensors llama_model_loader: - type q6_K: 33 tensors print_info: file format = GGUF V3 (latest) print_info: file type = Q4_K - Medium print_info: file size = 4.58 GiB (4.89 BPW) load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect load: special tokens cache size = 256 load: token to piece cache size = 0.7999 MB print_info: arch = llama print_info: vocab_only = 1 print_info: model type = ?B print_info: model params = 8.03 B print_info: general.name = DeepSeek R1 Distill Llama 8B print_info: vocab type = BPE print_info: n_vocab = 128256 print_info: n_merges = 280147 print_info: BOS token = 128000 '<|begin▁of▁sentence|>' print_info: EOS token = 128001 '<|end▁of▁sentence|>' print_info: EOT token = 128001 '<|end▁of▁sentence|>' print_info: EOM token = 128008 '<|eom_id|>' print_info: PAD token = 128001 '<|end▁of▁sentence|>' print_info: LF token = 198 'Ċ' print_info: EOG token = 128001 '<|end▁of▁sentence|>' print_info: EOG token = 128008 '<|eom_id|>' print_info: EOG token = 128009 '<|eot_id|>' print_info: max token length = 256 llama_model_load: vocab only - skipping tensors time=2025-06-07T12:01:29.965+08:00 level=INFO source=server.go:410 msg="starting llama server" cmd="C:\\Users\\Ysam\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --model C:\\Users\\Ysam\\.ollama\\models\\blobs\\sha256-6340dc3229b0d08ea9cc49b75d4098702983e17b4c096d57afbbf2ffc813f2be --ctx-size 8192 --batch-size 512 --n-gpu-layers 33 --threads 12 --no-mmap --parallel 2 --port 59718" time=2025-06-07T12:01:29.969+08:00 level=INFO source=sched.go:452 msg="loaded runners" count=1 time=2025-06-07T12:01:29.969+08:00 level=INFO source=server.go:589 msg="waiting for llama runner to start responding" time=2025-06-07T12:01:29.969+08:00 level=INFO source=server.go:623 msg="waiting for server to become available" status="llm server error" time=2025-06-07T12:01:29.988+08:00 level=INFO source=runner.go:853 msg="starting go runner" load_backend: loaded CPU backend from C:\Users\Ysam\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-icelake.dll ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no ggml_cuda_init: found 1 CUDA devices: Device 0: NVIDIA GeForce RTX 4070 Laptop GPU, compute capability 8.9, VMM: yes load_backend: loaded CUDA backend from C:\Users\Ysam\AppData\Local\Programs\Ollama\lib\ollama\cuda_v12\ggml-cuda.dll time=2025-06-07T12:01:30.307+08:00 level=INFO source=ggml.go:103 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.AVX512=1 CPU.0.AVX512_VBMI=1 CPU.0.AVX512_VNNI=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 CUDA.0.ARCHS=500,600,610,700,750,800,860,870,890,900,1200 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 compiler=cgo(clang) time=2025-06-07T12:01:30.308+08:00 level=INFO source=runner.go:913 msg="Server listening on 127.0.0.1:59718" llama_model_load_from_file_impl: using device CUDA0 (NVIDIA GeForce RTX 4070 Laptop GPU) - 7056 MiB free llama_model_loader: loaded meta data with 28 key-value pairs and 292 tensors from C:\Users\Ysam\.ollama\models\blobs\sha256-6340dc3229b0d08ea9cc49b75d4098702983e17b4c096d57afbbf2ffc813f2be (version GGUF V3 (latest)) llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output. 
llama_model_loader: - kv 0: general.architecture str = llama llama_model_loader: - kv 1: general.type str = model llama_model_loader: - kv 2: general.name str = DeepSeek R1 Distill Llama 8B llama_model_loader: - kv 3: general.basename str = DeepSeek-R1-Distill-Llama llama_model_loader: - kv 4: general.size_label str = 8B llama_model_loader: - kv 5: llama.block_count u32 = 32 llama_model_loader: - kv 6: llama.context_length u32 = 131072 llama_model_loader: - kv 7: llama.embedding_length u32 = 4096 llama_model_loader: - kv 8: llama.feed_forward_length u32 = 14336 llama_model_loader: - kv 9: llama.attention.head_count u32 = 32 llama_model_loader: - kv 10: llama.attention.head_count_kv u32 = 8 llama_model_loader: - kv 11: llama.rope.freq_base f32 = 500000.000000 llama_model_loader: - kv 12: llama.attention.layer_norm_rms_epsilon f32 = 0.000010 llama_model_loader: - kv 13: general.file_type u32 = 15 llama_model_loader: - kv 14: llama.vocab_size u32 = 128256 llama_model_loader: - kv 15: llama.rope.dimension_count u32 = 128 llama_model_loader: - kv 16: tokenizer.ggml.model str = gpt2 llama_model_loader: - kv 17: tokenizer.ggml.pre str = llama-bpe llama_model_loader: - kv 18: tokenizer.ggml.tokens arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ... llama_model_loader: - kv 19: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ... llama_model_loader: - kv 20: tokenizer.ggml.merges arr[str,280147] = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "... llama_model_loader: - kv 21: tokenizer.ggml.bos_token_id u32 = 128000 llama_model_loader: - kv 22: tokenizer.ggml.eos_token_id u32 = 128001 llama_model_loader: - kv 23: tokenizer.ggml.padding_token_id u32 = 128001 llama_model_loader: - kv 24: tokenizer.ggml.add_bos_token bool = true llama_model_loader: - kv 25: tokenizer.ggml.add_eos_token bool = false llama_model_loader: - kv 26: tokenizer.chat_template str = {% if not add_generation_prompt is de... 
llama_model_loader: - kv 27: general.quantization_version u32 = 2 llama_model_loader: - type f32: 66 tensors llama_model_loader: - type q4_K: 193 tensors llama_model_loader: - type q6_K: 33 tensors print_info: file format = GGUF V3 (latest) print_info: file type = Q4_K - Medium print_info: file size = 4.58 GiB (4.89 BPW) time=2025-06-07T12:01:30.471+08:00 level=INFO source=server.go:623 msg="waiting for server to become available" status="llm server loading model" load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect load: special tokens cache size = 256 load: token to piece cache size = 0.7999 MB print_info: arch = llama print_info: vocab_only = 0 print_info: n_ctx_train = 131072 print_info: n_embd = 4096 print_info: n_layer = 32 print_info: n_head = 32 print_info: n_head_kv = 8 print_info: n_rot = 128 print_info: n_swa = 0 print_info: n_swa_pattern = 1 print_info: n_embd_head_k = 128 print_info: n_embd_head_v = 128 print_info: n_gqa = 4 print_info: n_embd_k_gqa = 1024 print_info: n_embd_v_gqa = 1024 print_info: f_norm_eps = 0.0e+00 print_info: f_norm_rms_eps = 1.0e-05 print_info: f_clamp_kqv = 0.0e+00 print_info: f_max_alibi_bias = 0.0e+00 print_info: f_logit_scale = 0.0e+00 print_info: f_attn_scale = 0.0e+00 print_info: n_ff = 14336 print_info: n_expert = 0 print_info: n_expert_used = 0 print_info: causal attn = 1 print_info: pooling type = 0 print_info: rope type = 0 print_info: rope scaling = linear print_info: freq_base_train = 500000.0 print_info: freq_scale_train = 1 print_info: n_ctx_orig_yarn = 131072 print_info: rope_finetuned = unknown print_info: ssm_d_conv = 0 print_info: ssm_d_inner = 0 print_info: ssm_d_state = 0 print_info: ssm_dt_rank = 0 print_info: ssm_dt_b_c_rms = 0 print_info: model type = 8B print_info: model params = 8.03 B print_info: general.name = DeepSeek R1 Distill Llama 8B print_info: vocab type = BPE print_info: n_vocab = 128256 print_info: n_merges = 280147 print_info: BOS token = 128000 '<|begin▁of▁sentence|>' print_info: EOS token = 128001 '<|end▁of▁sentence|>' print_info: EOT token = 128001 '<|end▁of▁sentence|>' print_info: EOM token = 128008 '<|eom_id|>' print_info: PAD token = 128001 '<|end▁of▁sentence|>' print_info: LF token = 198 'Ċ' print_info: EOG token = 128001 '<|end▁of▁sentence|>' print_info: EOG token = 128008 '<|eom_id|>' print_info: EOG token = 128009 '<|eot_id|>' print_info: max token length = 256 load_tensors: loading model tensors, this can take a while... 
(mmap = false) load_tensors: offloading 32 repeating layers to GPU load_tensors: offloading output layer to GPU load_tensors: offloaded 33/33 layers to GPU load_tensors: CUDA0 model buffer size = 4403.49 MiB load_tensors: CPU model buffer size = 281.81 MiB llama_context: constructing llama_context llama_context: n_seq_max = 2 llama_context: n_ctx = 8192 llama_context: n_ctx_per_seq = 4096 llama_context: n_batch = 1024 llama_context: n_ubatch = 512 llama_context: causal_attn = 1 llama_context: flash_attn = 0 llama_context: freq_base = 500000.0 llama_context: freq_scale = 1 llama_context: n_ctx_per_seq (4096) < n_ctx_train (131072) -- the full capacity of the model will not be utilized llama_context: CUDA_Host output buffer size = 1.01 MiB init: kv_size = 8192, offload = 1, type_k = 'f16', type_v = 'f16', n_layer = 32, can_shift = 1 init: CUDA0 KV buffer size = 1024.00 MiB llama_context: KV self size = 1024.00 MiB, K (f16): 512.00 MiB, V (f16): 512.00 MiB llama_context: CUDA0 compute buffer size = 560.00 MiB llama_context: CUDA_Host compute buffer size = 24.01 MiB llama_context: graph nodes = 1094 llama_context: graph splits = 2 time=2025-06-07T12:01:37.233+08:00 level=INFO source=server.go:628 msg="llama runner started in 7.26 seconds" [GIN] 2025/06/07 - 12:01:37 | 200 | 7.682587s | 127.0.0.1 | POST "/api/chat" 如何解决该问题,请提供完整代码
Answer 1
Based on your problem description and the error messages, the core issue is that sending the WeChat message times out. Here is the complete solution code:
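The complete code from the answer is not reproduced in these notes. The sketch below only illustrates the two helpers the answer describes, activate_window() and safe_send_msg(); the implementation details are my assumptions rather than the AI's exact code:

import time
import win32con
import win32gui


def activate_window(title):
    """Bring the window with exactly this title to the foreground."""
    hwnd = win32gui.FindWindow(None, title)
    if not hwnd:
        return False
    win32gui.ShowWindow(hwnd, win32con.SW_RESTORE)   # restore it if minimized
    win32gui.SetForegroundWindow(hwnd)               # give it focus
    time.sleep(0.5)                                  # let the UI settle
    return True


def safe_send_msg(chat_win, content, max_retries=3, retry_interval=2):
    """Send a WeChat message with retries instead of giving up on the first timeout."""
    for attempt in range(1, max_retries + 1):
        try:
            activate_window(chat_win.who)            # try to put the chat in front first
            chat_win.SendMsg(content)
            return True
        except Exception as exc:
            print(f"send attempt {attempt} failed: {exc}")
            time.sleep(retry_interval)
    return False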
Main improvements:
- Window activation:
  - Added an activate_window() function that uses win32gui to make sure the WeChat window is in the foreground
  - The target chat window is activated before every send
- Retry mechanism for sending:
  - Added a safe_send_msg() function that sends messages with retries
  - 3 retries by default, 2 seconds apart
- Stronger error handling:
  - Exception handling around the model call
  - Exception catching in the main loop
  - Keyboard-interrupt handling (Ctrl+C)
- Logging improvements:
  - More detailed debug output
  - Real-time feedback on send status
- Performance tweaks:
  - Added appropriate sleep intervals
  - Reduced unnecessary resource usage

Usage:
- Install the dependencies: pip install wxauto requests pywin32
- Create the contact list: put a users.txt file next to the script, one WeChat contact name per line
- Before running:
  - Make sure the WeChat client is logged in and in the foreground
  - Make sure the Ollama service is running (ollama serve)
  - Make sure the model is installed (ollama pull deepseek-r1:8b)
- Run the script: python your_script_name.py

Notes:
- WeChat may show a security prompt on the first run that has to be allowed manually
- Make sure the chat window title has the form "<contact name> - 微信"
- If the problem persists, try:
  - Increasing the retry count (the max_retries parameter)
  - Lengthening the wait after window activation (the time.sleep(0.5) call)

This solution addresses the core send-timeout problem while making the program more robust and pleasant to use.
Question 2
Traceback (most recent call last):
  File "D:\PycharmProject\WebAutoTest\Auto_Chat\AI_Auto.py", line 62, in <module>
    wx = WeChat()
  File "D:\PycharmProject\WebAutoTest\venv\lib\site-packages\wxauto\wxauto.py", line 41, in __init__
    self._show()
  File "D:\PycharmProject\WebAutoTest\venv\lib\site-packages\wxauto\wxauto.py", line 88, in _show
    win32gui.SetWindowPos(self.HWND, -1, 0, 0, 0, 0, 3)
pywintypes.error: (5, 'SetWindowPos', '拒绝访问。')

进程已结束,退出代码
Answer 2
This error occurs while initializing the WeChat object: access is denied when wxauto tries to set the position of the WeChat window. Here is how to deal with it:
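The answer's code block is likewise missing here; the following sketch shows the kind of fallback it describes, catching a failed WeChat() construction and activating the main window by its class name, with the details again being my assumptions:

import time
import win32con
import win32gui
from wxauto import WeChat


def init_wechat(max_retries=3):
    """Construct the wxauto WeChat object, activating the main window manually if needed."""
    for attempt in range(1, max_retries + 1):
        try:
            return WeChat()
        except Exception as exc:
            print(f"WeChat init failed (attempt {attempt}): {exc}")
            # Fall back to locating the WeChat main window by its class name
            hwnd = win32gui.FindWindow("WeChatMainWndForPC", None)
            if hwnd:
                win32gui.ShowWindow(hwnd, win32con.SW_SHOW)
                win32gui.SetForegroundWindow(hwnd)
                time.sleep(1)
    raise RuntimeError("could not attach to the WeChat main window")


wx = init_wechat()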
Main fixes:
- Error handling around WeChat initialization:
  - Added full exception catching and handling
  - When WeChat() fails to initialize, the WeChat window is located and activated manually
  - The lookup uses the WeChat window class name "WeChatMainWndForPC"
- More robustness:
  - Check whether GetListenMessage() returns an empty result
  - Added more error handling and log output
  - The program no longer crashes outright when an error occurs
- Resolving the permission problem:
  - Run the script as administrator (right-click -> Run as administrator)
  - If the permission error persists, turn off the "Run this program as an administrator" option for WeChat
Usage:
- On the first run:
  - Make sure WeChat is started and in the foreground
  - Right-click the script -> Run as administrator
- If the problem persists:
  - Check whether WeChat is set to "Run as administrator":
    - Right-click the WeChat shortcut -> Properties -> Compatibility
    - Untick "Run this program as an administrator"
  - Try updating the wxauto library: pip install --upgrade wxauto
- Alternative: if none of the above works, try a more stable UIAutomation-based initialization:

# Alternative initialization
import time
from wxauto import WeChat
from uiautomation import WindowControl

# Locate the WeChat main window by its class name
wx_win = WindowControl(ClassName="WeChatMainWndForPC")
if wx_win.Exists():
    wx_win.SetActive()
    time.sleep(1)

wx = WeChat()

This should resolve the "access denied" error and keep the WeChat automation working.
Results
Start the Ollama application, or run it from a terminal with ollama serve.

Make sure WeChat is logged in, and that WeChat is not running as administrator.
The PyCharm project contains two data files; users.txt stores the names of the contacts to chat with.
db.json stores each contact and the data from every exchange with them. This is needed because the local model's chat endpoint only responds to the messages in the current request and keeps no record of earlier turns; to support multi-turn conversation, the script saves every exchange to db.json and sends the entire accumulated history along with each new message.
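For illustration only (the example messages are invented), db.json ends up holding one message list per contact, in the same role/content format that the chat endpoint expects:

{
  "沙系": [
    {"role": "user", "content": "测试"},
    {"role": "assistant", "content": "你好!有什么可以帮你的吗?"},
    {"role": "user", "content": "今天天气怎么样?"}
  ]
}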
As you can see, this means a large chat history has to be sent with every request, and the local model sometimes replies to old messages again. The history can be cleared by hand, but the overall experience is poor.

The script does automatically monitor the friend's chat window and reply based on the conversation; since it calls a local model, only basic text replies are supported.
Occasionally the model call fails, probably because the model is too large or local compute is insufficient, making processing take too long.
If WeChat is closed, the program raises an error.