Commit fecb9db

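✨ Agent improvements: attachment path migration, LLM retry mechanism, system prompt optimization

- Attachment storage migrated from conversations/attachments to workspace/uploads; the LLM can access attachments via OPFS paths
- Providers now reference non-image attachments by OPFS path, reducing context usage
- LLM API calls gain a retry mechanism (up to 5 times, with increasing delay); the UI shows a countdown
- System prompt optimization: stronger loop detection, ask-early strategy, tool call budget
- get_tab_content/web_fetch now require a prompt parameter, steering the LLM toward efficient use
- ToolRegistry error messages include the list of available tools, helping the LLM self-correct
- Message editing supports adding, removing, and pasting attachments; tool call status is marked correctly when generation is stopped
- Fixed SenderRuntime null safety, a stopGeneration race condition, and base64 encoding performance
- Added the CAT.agent.model.getSummary API and i18n for the background-mode tooltip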
1 parent 19a231b commit fecb9db

25 files changed

Lines changed: 646 additions & 162 deletions
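
The retry code itself is not among the files shown below, so the following is only a minimal sketch of what "up to 5 times, with increasing delay" plus a UI countdown could look like. All names (`withRetry`, `retryDelayMs`, `onCountdown`), the 1-second base delay, the linear growth, and the reading of "5 times" as five retries after the initial call are assumptions, not the commit's implementation.

```typescript
// Hypothetical sketch: none of these names come from the commit.
const MAX_RETRIES = 5; // "up to 5 times" from the commit message
const BASE_DELAY_MS = 1000; // assumed base delay

// Delay before the given retry (1-based). Linear growth is an assumption;
// the commit message only says the delay increases.
function retryDelayMs(retry: number, baseMs: number = BASE_DELAY_MS): number {
  return baseMs * retry;
}

// Runs `call`, retrying on failure. `onCountdown` fires once per remaining
// second so that a UI could render a countdown before the next attempt.
async function withRetry<T>(
  call: () => Promise<T>,
  onCountdown: (secondsLeft: number, retry: number) => void = () => {},
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  let lastError: unknown;
  for (let retry = 0; retry <= MAX_RETRIES; retry++) {
    try {
      return await call();
    } catch (err) {
      lastError = err;
      if (retry === MAX_RETRIES) break; // retries exhausted
      const seconds = Math.ceil(retryDelayMs(retry + 1) / 1000);
      for (let left = seconds; left > 0; left--) {
        onCountdown(left, retry + 1);
        await sleep(1000);
      }
    }
  }
  throw lastError;
}
```

Injecting `sleep` keeps the sketch testable without real timers; a caller would normally pass only `call` and, optionally, `onCountdown`.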

packages/message/server.ts

Lines changed: 9 additions & 0 deletions
@@ -87,6 +87,15 @@ export class SenderRuntime {

   getExtMessageSender(): ExtMessageSender {
     const sender = this.sender as RuntimeMessageSender;
+    if (!sender) {
+      // postMessage channels (e.g. Offscreen→SW) have no RuntimeMessageSender
+      return {
+        windowId: -1,
+        tabId: -1,
+        frameId: undefined,
+        documentId: undefined,
+      };
+    }
     return {
       windowId: sender.tab?.windowId || -1, // -1 means background script
       tabId: sender.tab?.id || -1, // -1 means background script

src/app/repo/agent_chat.test.ts

Lines changed: 58 additions & 12 deletions
@@ -19,23 +19,18 @@ function createMockOPFS() {
         kind: "file" as const,
         getFile: vi.fn(async () => {
           const content = dir.get(name);
-          if (content instanceof Blob) {
-            return content;
-          }
-          return new Blob([typeof content === "string" ? content : ""], { type: "application/octet-stream" });
+          if (content instanceof Blob) return content;
+          if (content instanceof ArrayBuffer) return new Blob([content]);
+          if (content instanceof Uint8Array) return new Blob([content.buffer as ArrayBuffer]);
+          if (typeof content === "string") return new Blob([content], { type: "application/octet-stream" });
+          return new Blob([""], { type: "application/octet-stream" });
         }),
         createWritable: vi.fn(async () => {
           const writable = createMockWritable();
           const origClose = writable.close;
           writable.close = vi.fn(async () => {
             const written = writable.getData();
-            if (written instanceof Blob) {
-              dir.set(name, written);
-            } else if (typeof written === "string") {
-              dir.set(name, written);
-            } else {
-              dir.set(name, written);
-            }
+            dir.set(name, written);
             await origClose();
           });
           return writable;
@@ -95,11 +90,26 @@ function createMockOPFS() {
   return { rootStore, mockRoot };
 }

+// Navigate to a directory by path in the mock store, creating it as needed
+function navigateDir(rootStore: Map<string, any>, ...path: string[]): Map<string, any> {
+  let current = rootStore;
+  for (const seg of path) {
+    const key = "__dir__" + seg;
+    if (!current.has(key)) {
+      current.set(key, new Map());
+    }
+    current = current.get(key);
+  }
+  return current;
+}
+
 describe("AgentChatRepo attachment storage", () => {
   let repo: AgentChatRepo;
+  let rootStore: Map<string, any>;

   beforeEach(() => {
-    createMockOPFS();
+    const mock = createMockOPFS();
+    rootStore = mock.rootStore;
     repo = new AgentChatRepo();
   });

@@ -118,6 +128,14 @@ describe("AgentChatRepo attachment storage", () => {
     expect(size).toBe(blob.size);
   });

+  it("saveAttachment should store under the workspace/uploads path", async () => {
+    await repo.saveAttachment("att-ws", new Blob(["workspace data"]));
+
+    // Verify the new path exists: agents/workspace/uploads/att-ws
+    const uploadsDir = navigateDir(rootStore, "agents", "workspace", "uploads");
+    expect(uploadsDir.has("att-ws")).toBe(true);
+  });
+
   it("getAttachment should return a saved attachment", async () => {
     const blob = new Blob(["test data"], { type: "text/plain" });
     await repo.saveAttachment("att-3", blob);
@@ -134,6 +152,19 @@ describe("AgentChatRepo attachment storage", () => {
     expect(result).toBeNull();
   });

+  it("getAttachment should fall back to reading attachments from the legacy path", async () => {
+    // Manually write attachment data at the legacy path: agents/conversations/attachments/{id}
+    const attachDir = navigateDir(rootStore, "agents", "conversations", "attachments");
+    attachDir.set("old-att", new Blob(["old path data"]));
+
+    const result = await repo.getAttachment("old-att");
+
+    expect(result).not.toBeNull();
+    expect(result).toBeInstanceOf(Blob);
+    const text = await result!.text();
+    expect(text).toBe("old path data");
+  });
+
   it("deleteAttachment should delete a saved attachment", async () => {
     const blob = new Blob(["data"], { type: "text/plain" });
     await repo.saveAttachment("att-4", blob);
@@ -144,6 +175,21 @@ describe("AgentChatRepo attachment storage", () => {
     expect(result).toBeNull();
   });

+  it("deleteAttachment should clean up both the new and legacy paths", async () => {
+    // Save at the new path
+    await repo.saveAttachment("att-both", new Blob(["new"]));
+    // Also place a copy at the legacy path
+    const attachDir = navigateDir(rootStore, "agents", "conversations", "attachments");
+    attachDir.set("att-both", new Blob(["old"]));
+
+    await repo.deleteAttachment("att-both");
+
+    // Both the new and legacy paths should be cleaned up
+    const uploadsDir = navigateDir(rootStore, "agents", "workspace", "uploads");
+    expect(uploadsDir.has("att-both")).toBe(false);
+    expect(attachDir.has("att-both")).toBe(false);
+  });
+
   it("deleteAttachments should delete attachments in bulk", async () => {
     await repo.saveAttachment("att-a", new Blob(["a"]));
     await repo.saveAttachment("att-b", new Blob(["b"]));

src/app/repo/agent_chat.ts

Lines changed: 36 additions & 43 deletions
@@ -1,6 +1,7 @@
 import type { Conversation, ChatMessage } from "@App/app/service/agent/types";
 import type { Task } from "@App/app/service/agent/tools/task_tools";
 import { OPFSRepo } from "./opfs_repo";
+import { writeWorkspaceFile, getWorkspaceRoot, getDirectory } from "@App/app/service/agent/opfs_helpers";

 const CONVERSATIONS_FILE = "conversations.json";
 const MESSAGES_DIR = "data";
@@ -10,7 +11,8 @@ const TASKS_DIR = "tasks";
 // Directory layout: agents/conversations/
 // agents/conversations/conversations.json - conversation list
 // agents/conversations/data/{id}.json - messages for each conversation
-// agents/conversations/attachments/{id} - attachment binary data
+// agents/workspace/uploads/{id} - attachment binary data (the LLM can access it via opfs_read)
+// agents/conversations/attachments/{id} - legacy path (read for backward compatibility)
 export class AgentChatRepo extends OPFSRepo {
   constructor() {
     super("conversations");
@@ -104,50 +106,57 @@ export class AgentChatRepo extends OPFSRepo {
   }

   // ---- Attachment storage ----
+  // New path: agents/workspace/uploads/{id} (the LLM can access it via opfs_read)
+  // Legacy path: agents/conversations/attachments/{id} (read for backward compatibility)

-  // Save attachment data (accepts a base64/data URL string or a Blob)
+  // Save attachment data to workspace/uploads (accepts a base64/data URL string or a Blob)
   async saveAttachment(id: string, data: string | Blob): Promise<number> {
-    const dir = await this.getChildDir(ATTACHMENTS_DIR);
-    const fileHandle = await dir.getFileHandle(id, { create: true });
-    const writable = await fileHandle.createWritable();
-
-    let size: number;
-    if (data instanceof Blob) {
-      await writable.write(data);
-      size = data.size;
-    } else {
-      // String data (base64/data URL) is stored as raw binary
-      const binary = this.dataUrlToBlob(data);
-      await writable.write(binary);
-      size = binary.size;
-    }
-
-    await writable.close();
-    return size;
+    const result = await writeWorkspaceFile(`uploads/${id}`, data);
+    return result.size;
   }

-  // Read attachment data as a Blob
+  // Read attachment data as a Blob (check the new workspace path first, then fall back to the legacy path)
   async getAttachment(id: string): Promise<Blob | null> {
+    // New path: agents/workspace/uploads/{id}
+    try {
+      const workspace = await getWorkspaceRoot();
+      const dir = await getDirectory(workspace, "uploads");
+      return await (await dir.getFileHandle(id)).getFile();
+    } catch {
+      // Not found at the new path; try the legacy path
+    }
+    // Legacy path fallback: agents/conversations/attachments/{id}
     try {
       const dir = await this.getChildDir(ATTACHMENTS_DIR);
-      const fileHandle = await dir.getFileHandle(id);
-      return await fileHandle.getFile();
+      return await (await dir.getFileHandle(id)).getFile();
     } catch {
       return null;
     }
   }

-  // Delete a single attachment
+  // Delete a single attachment (cleans up both the new and legacy paths)
   async deleteAttachment(id: string): Promise<void> {
-    const dir = await this.getChildDir(ATTACHMENTS_DIR);
-    await this.deleteFile(id, dir);
+    // New path: agents/workspace/uploads/{id}
+    try {
+      const workspace = await getWorkspaceRoot();
+      const dir = await getDirectory(workspace, "uploads");
+      await dir.removeEntry(id);
+    } catch {
+      // Ignore if absent from the new path
+    }
+    // Legacy path: agents/conversations/attachments/{id}
+    try {
+      const dir = await this.getChildDir(ATTACHMENTS_DIR);
+      await dir.removeEntry(id);
+    } catch {
+      // Ignore if absent from the legacy path
+    }
   }

   // Delete all attachments associated with a conversation (requires the list of attachment IDs)
   async deleteAttachments(ids: string[]): Promise<void> {
-    const dir = await this.getChildDir(ATTACHMENTS_DIR);
     for (const id of ids) {
-      await this.deleteFile(id, dir);
+      await this.deleteAttachment(id);
     }
   }

@@ -171,20 +180,4 @@ export class AgentChatRepo extends OPFSRepo {
     await this.deleteFile(`${conversationId}.json`, tasksDir);
   }

-  // Convert a data URL or plain base64 to a Blob
-  private dataUrlToBlob(data: string): Blob {
-    // Match the data URL format
-    const match = data.match(/^data:([^;]+);base64,(.+)$/s);
-    if (match) {
-      const byteString = atob(match[2]);
-      const ab = new ArrayBuffer(byteString.length);
-      const ia = new Uint8Array(ab);
-      for (let i = 0; i < byteString.length; i++) {
-        ia[i] = byteString.charCodeAt(i);
-      }
-      return new Blob([ab], { type: match[1] });
-    }
-    // Store as plain text
-    return new Blob([data], { type: "application/octet-stream" });
-  }
 }
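
The migration pattern in this file (write only to the new location, read from the new location and then the legacy one, delete from both) can be modelled in memory as follows. This is a simplified sketch under stated assumptions: `MigratingStore` and its two `Map` fields are hypothetical stand-ins for the OPFS directories, and strings stand in for Blobs.

```typescript
// Hypothetical in-memory model; the real code works on OPFS directory handles.
class MigratingStore {
  private readonly primary: Map<string, string>; // stands in for agents/workspace/uploads
  private readonly legacy: Map<string, string>; // stands in for agents/conversations/attachments

  constructor(primary: Map<string, string>, legacy: Map<string, string>) {
    this.primary = primary;
    this.legacy = legacy;
  }

  // New data is written to the primary location only.
  save(id: string, data: string): number {
    this.primary.set(id, data);
    return data.length;
  }

  // Reads prefer the primary location and fall back to the legacy one.
  get(id: string): string | null {
    return this.primary.get(id) ?? this.legacy.get(id) ?? null;
  }

  // Deletes clean up both locations so no stale legacy copy can resurface.
  delete(id: string): void {
    this.primary.delete(id);
    this.legacy.delete(id);
  }
}
```

Deleting from both locations matters because of the read fallback: if only the new copy were removed, a later `get` would silently return the legacy copy.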

src/app/service/agent/providers/anthropic.ts

Lines changed: 13 additions & 18 deletions
@@ -25,35 +25,30 @@ function convertContentBlocks(
               source: { type: "base64", media_type: match[1], data: match[2] },
             });
           } else {
-            result.push({ type: "text", text: `[Image: ${block.name || "image"}]` });
-          }
-        } else {
-          result.push({ type: "text", text: `[Image: ${block.name || "image"}]` });
-        }
-        break;
-      }
-      case "file": {
-        const data = attachmentResolver?.(block.attachmentId);
-        if (data) {
-          const match = data.match(/^data:([^;]+);base64,(.+)$/s);
-          if (match) {
             result.push({
-              type: "document",
-              source: { type: "base64", media_type: match[1], data: match[2] },
+              type: "text",
+              text: `[Image: ${block.name || "image"}, OPFS path: uploads/${block.attachmentId}]`,
             });
-          } else {
-            result.push({ type: "text", text: `[File: ${block.name}]` });
           }
         } else {
-          result.push({ type: "text", text: `[File: ${block.name}]` });
+          result.push({
+            type: "text",
+            text: `[Image: ${block.name || "image"}, OPFS path: uploads/${block.attachmentId}]`,
+          });
         }
         break;
       }
+      case "file":
+        result.push({
+          type: "text",
+          text: `[File: ${block.name}${block.size ? ` (${block.size} bytes)` : ""}, OPFS path: uploads/${block.attachmentId}]`,
+        });
+        break;
       case "audio":
         // Anthropic does not support audio yet; fall back to a text description
         result.push({
           type: "text",
-          text: `[Audio: ${block.name || "audio"}${block.durationMs ? ` (${(block.durationMs / 1000).toFixed(1)}s)` : ""}]`,
+          text: `[Audio: ${block.name || "audio"}${block.durationMs ? ` (${(block.durationMs / 1000).toFixed(1)}s)` : ""}, OPFS path: uploads/${block.attachmentId}]`,
         });
         break;
     }

src/app/service/agent/providers/openai.ts

Lines changed: 16 additions & 5 deletions
@@ -19,13 +19,18 @@ function convertContentBlocks(
         if (data) {
           result.push({ type: "image_url", image_url: { url: data } });
         } else {
-          result.push({ type: "text", text: `[Image: ${block.name || "image"}]` });
+          result.push({
+            type: "text",
+            text: `[Image: ${block.name || "image"}, OPFS path: uploads/${block.attachmentId}]`,
+          });
         }
         break;
       }
       case "file":
-        // OpenAI does not support file uploads to chat; fall back to a text reference
-        result.push({ type: "text", text: `[File: ${block.name}]` });
+        result.push({
+          type: "text",
+          text: `[File: ${block.name}${block.size ? ` (${block.size} bytes)` : ""}, OPFS path: uploads/${block.attachmentId}]`,
+        });
         break;
       case "audio": {
         const data = attachmentResolver?.(block.attachmentId);
@@ -36,10 +41,16 @@ function convertContentBlocks(
             const format = block.mimeType.split("/")[1] || "wav";
             result.push({ type: "input_audio", input_audio: { data: match[2], format } });
           } else {
-            result.push({ type: "text", text: `[Audio: ${block.name || "audio"}]` });
+            result.push({
+              type: "text",
+              text: `[Audio: ${block.name || "audio"}, OPFS path: uploads/${block.attachmentId}]`,
+            });
           }
         } else {
-          result.push({ type: "text", text: `[Audio: ${block.name || "audio"}]` });
+          result.push({
+            type: "text",
+            text: `[Audio: ${block.name || "audio"}, OPFS path: uploads/${block.attachmentId}]`,
+          });
         }
         break;
       }
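
Both provider files now repeat the same `[Kind: name, OPFS path: uploads/{id}]` template in several branches. As an illustration only, the text could be produced by one shared helper. `describeAttachment` and `AttachmentRef` are hypothetical names that do not exist in the commit, and the audio duration suffix follows the Anthropic variant (the OpenAI variant omits it).

```typescript
// Hypothetical helper; field names mirror those used on `block` in the diff.
interface AttachmentRef {
  kind: "image" | "file" | "audio";
  attachmentId: string;
  name?: string;
  size?: number; // bytes, shown for files
  durationMs?: number; // shown for audio
}

// Builds the text placeholder that tells the LLM where the attachment lives in OPFS.
function describeAttachment(ref: AttachmentRef): string {
  const label = ref.kind === "image" ? "Image" : ref.kind === "audio" ? "Audio" : "File";
  const name = ref.name || ref.kind; // fall back to "image" / "audio" / "file"
  let detail = "";
  if (ref.kind === "file" && ref.size) {
    detail = ` (${ref.size} bytes)`;
  } else if (ref.kind === "audio" && ref.durationMs) {
    detail = ` (${(ref.durationMs / 1000).toFixed(1)}s)`;
  }
  return `[${label}: ${name}${detail}, OPFS path: uploads/${ref.attachmentId}]`;
}
```

For example, `describeAttachment({ kind: "file", attachmentId: "a1", name: "report.pdf", size: 2048 })` yields `[File: report.pdf (2048 bytes), OPFS path: uploads/a1]`.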

src/app/service/agent/system_prompt.ts

Lines changed: 22 additions & 10 deletions
@@ -6,27 +6,39 @@ const BUILTIN_SYSTEM_PROMPT = `You are ScriptCat Agent, an AI assistant built in

 - Before interacting with a page, verify its current state — never assume a page is as expected.
 - When a step fails, analyze the cause and change your approach. Never retry the exact same action.
+- Prefer asking the user over guessing. One good question saves many wasted tool calls.

 ## Planning

 - **Simple tasks** (single step, clear intent): act directly.
-- **Complex tasks** (multi-step, involves navigation across pages, form submissions, or data processing): first propose a numbered step-by-step plan, then wait for user confirmation before executing. The user may adjust, approve, or reject the plan.
-- During execution, if the situation deviates from the plan (unexpected page state, missing element, new information), pause and inform the user with an updated plan rather than silently improvising.
+- **Complex tasks** (multi-step, involves navigation across pages, form submissions, or data processing):
+  1. **Think first** — Before any tool call, analyze the task and design a clear execution plan. Consider: what information do you need? What could go wrong? What's the most efficient sequence of steps?
+  2. **Propose the plan** — Present a numbered step-by-step plan to the user and wait for confirmation. The user may adjust, approve, or reject.
+  3. **Execute methodically** — Follow the approved plan step by step. Use task tools to track progress.
+- During execution, if the situation deviates from the plan (unexpected page state, missing element, new information), **stop and inform the user** with an updated plan rather than silently improvising.
+- **Avoid speculative chains** — Do not chain multiple uncertain actions hoping they will work. If the first step's outcome is uncertain, verify before proceeding.

 ## Tool Usage

 Your tools come from Skills and MCP servers. Read each tool's description before calling — it defines behavior, parameters, and constraints. When a tool returns an error, read the error message and adapt — do not blindly retry.

-### Loop Detection
-Detect when you are stuck and stop early:
-- **Hard loop**: Same tool + same arguments failing 2+ times → change approach immediately.
+**Tool call budget**: You have a limited number of tool calls per conversation (typically 50). Use them wisely — plan before acting, combine steps when possible, and stop early if stuck.
+
+### Loop Detection — Stop Early, Ask Early
+Continuing to error wastes tokens and never produces good results. Detect when you are stuck and **ask the user before exhausting attempts**:
+- **Hard loop**: Same tool + same arguments failing 2+ times → stop immediately, do NOT retry.
 - **Ping-pong**: Alternating between two actions (A → B → A → B) without progress → stop and rethink.
-- **Persistent failure**: Same error 3+ times despite different approaches → escalate.
+- **Persistent failure**: 2 consecutive errors (even with different approaches) → stop trying and use \`ask_user\` immediately.
+- **Wrong path detection**: If after 3+ tool calls you are not making meaningful progress toward the goal, stop and reassess. Ask yourself: "Am I on the right track?" If unsure, ask the user.
+- **Diminishing returns**: If you're making tiny incremental progress but the goal still seems far, stop and ask the user if the approach is correct.
+
+### Escalation
+When stuck, **prioritize asking the user over repeated attempts**:
+1. **One retry with a different strategy** — try ONE fundamentally different approach.
+2. **Ask the user** — if that also fails, immediately use \`ask_user\` to summarize what you tried and why it failed, then ask for guidance. Do not attempt a third approach without user input.
+3. **Declare blocked** — if the task is clearly impossible given current permissions or page state, say so directly.

-### Escalation (in order of preference)
-1. **Switch strategy** — try a fundamentally different approach.
-2. **Ask the user** — summarize what you tried and why it failed, then ask for guidance.
-3. **Declare blocked** — if the task is impossible given current permissions or page state, say so clearly.
+**Default to asking**: When in doubt between trying another approach and asking the user, always ask. The user's time is less expensive than wasting tool calls on wrong approaches.

 ## Safety

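
The prompt above asks the model to detect these loops itself. For comparison, the "hard loop" and "ping-pong" rules could also be checked mechanically over a tool call history. The sketch below is hypothetical and not part of the commit: the `ToolCallRecord` shape is invented, arguments are assumed to be pre-serialized to a string, and "failing 2+ times" is read as two consecutive failures.

```typescript
// Hypothetical record of one tool call; `args` is assumed to be a stable serialization.
interface ToolCallRecord {
  tool: string;
  args: string;
  failed: boolean;
}

// Hard loop: the last two calls used the same tool and arguments, and both failed.
function isHardLoop(history: ToolCallRecord[]): boolean {
  const a = history[history.length - 2];
  const b = history[history.length - 1];
  if (!a || !b) return false;
  return a.failed && b.failed && a.tool === b.tool && a.args === b.args;
}

// Ping-pong: the last four calls alternate between two distinct actions (A, B, A, B).
function isPingPong(history: ToolCallRecord[]): boolean {
  const keys = history.slice(-4).map((c) => `${c.tool}:${c.args}`);
  if (keys.length < 4) return false;
  return keys[0] === keys[2] && keys[1] === keys[3] && keys[0] !== keys[1];
}
```

Neither check measures progress toward the goal, so the "wrong path" and "diminishing returns" rules still depend on the model's own judgment.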
