Commit debfc57

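✨ Sub-agent type system, compact prompt optimization, and UI improvements
- Add the sub_agent_types module, supporting sub-agent type definitions and prompt generation
- Refactor the compact prompt to use a structured 8-section summary format, improving context-continuation quality
- Optimize the layered system prompt architecture, with support for dynamic tool description injection
- Improve the sub-agent UI (collapse/expand, status indicators, tool call details)
- Strengthen sub-agent management and message flow on the agent service side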
1 parent 2f57342 commit debfc57

14 files changed

Lines changed: 1176 additions & 147 deletions

src/app/service/agent/compact_prompt.test.ts

Lines changed: 18 additions & 7 deletions
@@ -3,14 +3,12 @@ import { extractSummary, buildCompactUserPrompt, COMPACT_SYSTEM_PROMPT } from ".
 
 describe("extractSummary", () => {
   it("extracts content from <summary> tags", () => {
-    const response = `<analysis>Some analysis here</analysis>
-
-<summary>
-1. **Primary Request**: Build a feature
-2. **Key Decisions**: Used React
+    const response = `<summary>
+1. **Task Overview**: Build a feature
+2. **Current State**: Used React
 </summary>`;
     const result = extractSummary(response);
-    expect(result).toBe("1. **Primary Request**: Build a feature\n2. **Key Decisions**: Used React");
+    expect(result).toBe("1. **Task Overview**: Build a feature\n2. **Current State**: Used React");
   });
 
   it("returns full content when no <summary> tag found", () => {
@@ -35,11 +33,24 @@ Line 3
 describe("buildCompactUserPrompt", () => {
   it("builds prompt without custom instruction", () => {
     const prompt = buildCompactUserPrompt();
-    expect(prompt).toContain("Create a detailed summary");
+    expect(prompt).toContain("continuation summary");
     expect(prompt).toContain("<summary>");
+    expect(prompt).toContain("<analysis>");
     expect(prompt).not.toContain("Additional summarization instructions");
   });
 
+  it("includes all 8 summary sections", () => {
+    const prompt = buildCompactUserPrompt();
+    expect(prompt).toContain("**Task Overview**");
+    expect(prompt).toContain("**Current State**");
+    expect(prompt).toContain("**User Messages**");
+    expect(prompt).toContain("**Errors and Fixes**");
+    expect(prompt).toContain("**Important Discoveries**");
+    expect(prompt).toContain("**Current Work**");
+    expect(prompt).toContain("**Next Steps**");
+    expect(prompt).toContain("**Context to Preserve**");
+  });
+
   it("appends custom instruction when provided", () => {
     const prompt = buildCompactUserPrompt("Keep only code-related content");
     expect(prompt).toContain("Additional summarization instructions from the user: Keep only code-related content");
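
The eight section checks in the new test can also be written as a loop over the section titles. The following is a minimal sketch under stated assumptions: `SECTION_TITLES` and `missingSections` are hypothetical names that do not appear in the commit, and `samplePrompt` is a shortened stand-in rather than the real output of `buildCompactUserPrompt`.

```typescript
// Section titles taken from the test in this diff.
const SECTION_TITLES = [
  "Task Overview",
  "Current State",
  "User Messages",
  "Errors and Fixes",
  "Important Discoveries",
  "Current Work",
  "Next Steps",
  "Context to Preserve",
] as const;

// Hypothetical helper: returns the titles whose bold heading is absent from the prompt.
function missingSections(prompt: string): string[] {
  return SECTION_TITLES.filter((title) => !prompt.includes(`**${title}**`));
}

// Shortened stand-in prompt containing only the first two headings.
const samplePrompt = "1. **Task Overview**\n2. **Current State**";
const missing = missingSections(samplePrompt);
console.log(missing.length, missing[0]);
```

A helper of this shape reports which heading is missing in one assertion, at the cost of a less direct failure message than one `expect` per heading.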

src/app/service/agent/compact_prompt.ts

Lines changed: 41 additions & 20 deletions
@@ -1,30 +1,51 @@
-export const COMPACT_SYSTEM_PROMPT = `You are a conversation summarizer. Your task is to create a detailed summary of the conversation, preserving all critical information needed to continue effectively.`;
+export const COMPACT_SYSTEM_PROMPT = `You are a conversation summarizer. Your task is to create a detailed summary of the conversation so far, paying close attention to the user's explicit requests and your previous actions. This summary will replace the conversation history, enabling efficient task resumption in a new context window.`;
 
 export function buildCompactUserPrompt(customInstruction?: string): string {
-  let prompt = `Create a detailed summary of the conversation so far.
+  let prompt = `Write a structured, concise, and actionable continuation summary of the conversation so far. First analyze the conversation in <analysis> tags, then write the summary in <summary> tags.
 
-Before providing your final summary, wrap your analysis in <analysis> tags to organize your thoughts:
+Include the following sections in your <summary>:
 
-1. Chronologically analyze each message. For each section identify:
-- The user's explicit requests and intents
-- Key decisions and outcomes
-- Specific details: file names, code snippets, function signatures
-- Errors encountered and how they were fixed
-- Important user feedback or corrections
+1. **Task Overview**
+- The user's core request and success criteria
+- Any clarifications or constraints they specified
 
-2. Double-check for completeness.
+2. **Current State**
+- What has been completed so far
+- Pages visited, data extracted, or actions performed (with URLs/selectors if relevant)
+- Key outputs or artifacts produced
 
-Your summary should include the following sections in <summary> tags:
+3. **User Messages**
+- List ALL user messages that are not tool results
+- These are critical for understanding the user's feedback and changing intent
+- Include any mid-conversation corrections or preference changes
 
-1. **Primary Request and Intent**: The user's core requests and success criteria
-2. **Key Decisions**: Important decisions made and their rationale
-3. **Current State**: What has been completed, files modified, artifacts produced
-4. **Errors and Fixes**: Problems encountered and their solutions
-5. **Pending Tasks**: Outstanding work items
-6. **Current Work**: What was being worked on immediately before this summary
-7. **Next Steps**: Specific actions needed to continue
+4. **Errors and Fixes**
+- All errors encountered and how they were resolved
+- User feedback on errors (especially "do it differently" instructions)
+- What approaches were tried that didn't work (and why)
 
-Be concise but complete — preserve all information that would prevent duplicate work or repeated mistakes.`;
+5. **Important Discoveries**
+- Technical constraints or site-specific quirks uncovered
+- Decisions made and their rationale
+- Selectors, page structures, or API endpoints discovered that may be needed again
+
+6. **Current Work**
+- Precisely what was being worked on immediately before this summary
+- Include specific details: which page, which step, what was the last action
+- If a sub-agent was running, what was its task and status
+
+7. **Next Steps**
+- Specific actions needed to complete the task
+- Any blockers or open questions to resolve
+- Priority order if multiple steps remain
+- If there is a next step, describe exactly where you left off to prevent task drift
+
+8. **Context to Preserve**
+- User preferences or style requirements
+- Domain-specific details that aren't obvious
+- Any promises or commitments made to the user
+
+Be concise but complete — err on the side of including information that would prevent duplicate work or repeated mistakes.`;
 
   if (customInstruction) {
     prompt += `\n\nAdditional summarization instructions from the user: ${customInstruction}`;
@@ -33,7 +54,7 @@ Be concise but complete — preserve all information that would prevent duplicat
   return prompt;
 }
 
-/** Extract the content of the <summary> tag from the LLM response */
+/** Extract the content of the <summary> tag from the LLM response, skipping the <analysis> part */
 export function extractSummary(content: string): string {
   const match = content.match(/<summary>([\s\S]*?)<\/summary>/);
   return match ? match[1].trim() : content.trim();
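
The updated prompt asks the model to emit an `<analysis>` block before `<summary>`, and `extractSummary` keeps only the latter. As a minimal sketch of that behavior, the block below copies the function body from the diff and runs it on two made-up responses; the sample strings and the variable names `withAnalysis`, `summaryOnly`, and `fallback` are illustrative assumptions, not part of the commit.

```typescript
// Function body as shown in the diff; only the inputs below are invented.
function extractSummary(content: string): string {
  // Non-greedy match: captures everything between the first <summary> and the first </summary>.
  const match = content.match(/<summary>([\s\S]*?)<\/summary>/);
  return match ? match[1].trim() : content.trim();
}

// Hypothetical model response with an analysis block ahead of the summary.
const withAnalysis = `<analysis>scratch notes</analysis>
<summary>
1. **Task Overview**: Build a feature
</summary>`;

// The analysis block is dropped because it lies outside the captured group.
const summaryOnly = extractSummary(withAnalysis);

// With no <summary> tag, the whole response is returned, trimmed.
const fallback = extractSummary("  no tags here  ");

console.log(summaryOnly);
console.log(fallback);
```

Note that the fallback path returns the full response, so an `<analysis>` block would survive if the model omitted the `<summary>` tags.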
Lines changed: 122 additions & 0 deletions
@@ -0,0 +1,122 @@
+import { describe, it, expect } from "vitest";
+import { resolveSubAgentType, getExcludeToolsForType, SUB_AGENT_TYPES } from "./sub_agent_types";
+
+describe("Sub-Agent type system", () => {
+  describe("resolveSubAgentType", () => {
+    it.concurrent("returns the specified built-in type", () => {
+      expect(resolveSubAgentType("researcher")).toBe(SUB_AGENT_TYPES.researcher);
+      expect(resolveSubAgentType("page_operator")).toBe(SUB_AGENT_TYPES.page_operator);
+      expect(resolveSubAgentType("general")).toBe(SUB_AGENT_TYPES.general);
+    });
+
+    it.concurrent("falls back to general for unknown types", () => {
+      expect(resolveSubAgentType("unknown_type")).toBe(SUB_AGENT_TYPES.general);
+      expect(resolveSubAgentType("")).toBe(SUB_AGENT_TYPES.general);
+    });
+
+    it.concurrent("returns general when undefined or no argument is passed", () => {
+      expect(resolveSubAgentType()).toBe(SUB_AGENT_TYPES.general);
+      expect(resolveSubAgentType(undefined)).toBe(SUB_AGENT_TYPES.general);
+    });
+  });
+
+  describe("getExcludeToolsForType", () => {
+    const allTools = [
+      "web_fetch",
+      "web_search",
+      "opfs_read",
+      "opfs_write",
+      "opfs_list",
+      "opfs_delete",
+      "execute_script",
+      "get_tab_content",
+      "list_tabs",
+      "open_tab",
+      "close_tab",
+      "activate_tab",
+      "ask_user",
+      "agent",
+      "create_task",
+      "update_task",
+      "get_task",
+      "list_tasks",
+      "delete_task",
+    ];
+
+    it.concurrent("researcher type excludes tab tools and other tools not on the whitelist", () => {
+      const config = SUB_AGENT_TYPES.researcher;
+      const excluded = getExcludeToolsForType(config, allTools);
+
+      // researcher does not include tab tools, ask_user, or agent
+      expect(excluded).toContain("get_tab_content");
+      expect(excluded).toContain("list_tabs");
+      expect(excluded).toContain("open_tab");
+      expect(excluded).toContain("close_tab");
+      expect(excluded).toContain("activate_tab");
+      expect(excluded).toContain("ask_user");
+      expect(excluded).toContain("agent");
+
+      // task tools are always available (ALWAYS_ALLOWED_TOOLS)
+      expect(excluded).not.toContain("create_task");
+      expect(excluded).not.toContain("update_task");
+      expect(excluded).not.toContain("list_tasks");
+
+      // tools that should be kept are not in the exclusion list
+      expect(excluded).not.toContain("web_fetch");
+      expect(excluded).not.toContain("web_search");
+      expect(excluded).not.toContain("execute_script");
+      expect(excluded).not.toContain("opfs_read");
+    });
+
+    it.concurrent("page_operator type excludes web_search and other tools not on the whitelist", () => {
+      const config = SUB_AGENT_TYPES.page_operator;
+      const excluded = getExcludeToolsForType(config, allTools);
+
+      // page_operator does not include web_search, ask_user, or agent
+      expect(excluded).toContain("web_search");
+      expect(excluded).toContain("ask_user");
+      expect(excluded).toContain("agent");
+
+      // tab tools should be kept
+      expect(excluded).not.toContain("get_tab_content");
+      expect(excluded).not.toContain("list_tabs");
+      expect(excluded).not.toContain("open_tab");
+      expect(excluded).not.toContain("execute_script");
+      expect(excluded).not.toContain("web_fetch");
+
+      // task tools are always available
+      expect(excluded).not.toContain("create_task");
+      expect(excluded).not.toContain("update_task");
+    });
+
+    it.concurrent("general type uses blacklist mode and excludes only ask_user and agent", () => {
+      const config = SUB_AGENT_TYPES.general;
+      const excluded = getExcludeToolsForType(config, allTools);
+
+      expect(excluded).toEqual(["ask_user", "agent"]);
+    });
+
+    it.concurrent("returns an empty array when neither allowedTools nor excludeTools is specified", () => {
+      const config: any = { name: "empty", maxIterations: 10, timeoutMs: 60000, systemPromptAddition: "" };
+      const excluded = getExcludeToolsForType(config, allTools);
+      expect(excluded).toEqual([]);
+    });
+
+    it.concurrent("allowedTools takes precedence over excludeTools", () => {
+      const config: any = {
+        name: "test",
+        allowedTools: ["web_fetch"],
+        excludeTools: ["web_search"],
+        maxIterations: 10,
+        timeoutMs: 60000,
+        systemPromptAddition: "",
+      };
+      const excluded = getExcludeToolsForType(config, ["web_fetch", "web_search", "execute_script"]);
+
+      // whitelist mode: exclude anything not in allowedTools
+      expect(excluded).toContain("web_search");
+      expect(excluded).toContain("execute_script");
+      expect(excluded).not.toContain("web_fetch");
+    });
+  });
+});
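
The fallback tests cover unknown names, the empty string, and `undefined`. One case they do not exercise, offered here only as a hedged observation: a lookup of the form `registry[name] || fallback` on a plain object also resolves inherited keys such as `toString`. The sketch below uses a hypothetical two-entry registry; `TypeConfig`, `REGISTRY`, `GENERAL`, `resolveLoose`, and `resolveStrict` are illustrative names, not identifiers from the commit.

```typescript
// Hypothetical, trimmed-down config type for the example.
interface TypeConfig {
  name: string;
}

const GENERAL: TypeConfig = { name: "general" };

// Stand-in registry with the same shape as a Record<string, config> lookup table.
const REGISTRY: Record<string, TypeConfig> = {
  researcher: { name: "researcher" },
  general: GENERAL,
};

// Mirrors the `registry[name] || fallback` pattern: inherited keys are truthy and slip through.
function resolveLoose(typeName?: string): TypeConfig {
  if (!typeName) return GENERAL;
  return REGISTRY[typeName] || GENERAL;
}

// Variant that accepts only the registry's own keys.
function resolveStrict(typeName?: string): TypeConfig {
  if (typeName !== undefined && Object.prototype.hasOwnProperty.call(REGISTRY, typeName)) {
    return REGISTRY[typeName] ?? GENERAL;
  }
  return GENERAL;
}

// "toString" is inherited from Object.prototype, so the loose lookup returns a function.
const looseResult: unknown = resolveLoose("toString");
const strictResult = resolveStrict("toString");

console.log(typeof looseResult, strictResult.name);
```

Whether this matters depends on where the type name comes from; if it is supplied by an LLM tool call, an own-property check (or a `Map`) is a cheap guard.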
Lines changed: 122 additions & 0 deletions
@@ -0,0 +1,122 @@
+// Sub-agent type definitions and registry
+
+export interface SubAgentTypeConfig {
+  name: string;
+  description: string; // English; written into the agent tool description so the LLM can choose a type
+  allowedTools?: string[]; // Whitelist mode (takes precedence over excludeTools)
+  excludeTools?: string[]; // Blacklist mode
+  maxIterations: number;
+  timeoutMs: number;
+  systemPromptAddition: string; // Role description injected into the sub-agent system prompt
+}
+
+// Tools available to every sub-agent type by default (task tools share task progress with the main agent)
+const ALWAYS_ALLOWED_TOOLS = [
+  "create_task",
+  "update_task",
+  "get_task",
+  "list_tasks",
+  "delete_task",
+];
+
+// Built-in sub-agent types
+export const SUB_AGENT_TYPES: Record<string, SubAgentTypeConfig> = {
+  researcher: {
+    name: "researcher",
+    description: "Web search/fetch, data analysis, no tab interaction",
+    allowedTools: [
+      "web_fetch",
+      "web_search",
+      "opfs_read",
+      "opfs_write",
+      "opfs_list",
+      "opfs_delete",
+      "execute_script",
+    ],
+    maxIterations: 20,
+    timeoutMs: 600_000,
+    systemPromptAddition: `## Role: Researcher
+
+You are a research-focused sub-agent. Your job is to search, fetch, read, and summarize information.
+
+**Capabilities:** Web search, URL fetching, data analysis via execute_script (sandbox mode only).
+**Limitations:** You cannot interact with browser tabs (no navigation, clicking, or form filling). You cannot ask the user questions.
+
+**Guidelines:**
+- Use web_search to find relevant sources, then web_fetch to read them.
+- Synthesize information from multiple sources when possible.
+- Return structured, concise results that the parent agent can act on.
+- If you cannot find the information, say so clearly rather than guessing.`,
+  },
+
+  page_operator: {
+    name: "page_operator",
+    description: "Browser tab interaction, page automation",
+    allowedTools: [
+      "get_tab_content",
+      "list_tabs",
+      "open_tab",
+      "close_tab",
+      "activate_tab",
+      "execute_script",
+      "web_fetch",
+      "opfs_read",
+      "opfs_write",
+      "opfs_list",
+      "opfs_delete",
+    ],
+    maxIterations: 30,
+    timeoutMs: 600_000,
+    systemPromptAddition: `## Role: Page Operator
+
+You are a page interaction sub-agent. Your job is to navigate web pages, interact with elements, and extract data.
+
+**Capabilities:** Tab navigation, page reading, DOM interaction via execute_script, URL fetching.
+**Limitations:** You cannot search the web (use a researcher sub-agent for that). You cannot ask the user questions.
+
+**Guidelines:**
+- Always read the page content (get_tab_content) before interacting to understand the current state.
+- Verify page state after each interaction — never assume an action succeeded.
+- For form filling, check that inputs exist and are visible before attempting to fill them.
+- Return extracted data in a structured format.`,
+  },
+
+  general: {
+    name: "general",
+    description: "All tools, general-purpose",
+    excludeTools: ["ask_user", "agent"],
+    maxIterations: 30,
+    timeoutMs: 600_000,
+    systemPromptAddition: `## Role: General Sub-Agent
+
+You are a general-purpose sub-agent with access to all tools except user interaction and nested sub-agents.
+
+**Limitations:** You cannot ask the user questions and cannot spawn nested sub-agents. If you encounter a situation that requires user input, describe the situation clearly in your response so the parent agent can handle it.`,
+  },
+};
+
+/**
+ * Resolve a sub-agent type name to its config; unknown types fall back to general
+ */
+export function resolveSubAgentType(typeName?: string): SubAgentTypeConfig {
+  if (!typeName) return SUB_AGENT_TYPES.general;
+  return SUB_AGENT_TYPES[typeName] || SUB_AGENT_TYPES.general;
+}
+
+/**
+ * Compute the final list of excluded tools from the type config and all available tool names
+ * - Whitelist mode: exclude tools not in allowedTools
+ * - Blacklist mode: use excludeTools directly
+ * - Neither specified: return an empty array (exclude nothing)
+ */
+export function getExcludeToolsForType(config: SubAgentTypeConfig, allToolNames: string[]): string[] {
+  if (config.allowedTools && config.allowedTools.length > 0) {
+    // Whitelist mode: merge allowedTools + ALWAYS_ALLOWED_TOOLS
+    const allowedSet = new Set([...config.allowedTools, ...ALWAYS_ALLOWED_TOOLS]);
+    return allToolNames.filter((name) => !allowedSet.has(name));
+  }
+  if (config.excludeTools && config.excludeTools.length > 0) {
+    return [...config.excludeTools];
+  }
+  return [];
+}
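
To make the whitelist and blacklist paths concrete, here is a minimal sketch that reproduces the same branching on a small, invented tool list. `ToolFilterConfig`, `ALWAYS_ALLOWED`, `excludedTools`, and the `registered` array are hypothetical stand-ins chosen for brevity; only the branching logic is taken from the diff.

```typescript
// Hypothetical config type: only the two fields the filter reads.
interface ToolFilterConfig {
  allowedTools?: string[];
  excludeTools?: string[];
}

// Shortened stand-in for ALWAYS_ALLOWED_TOOLS.
const ALWAYS_ALLOWED: string[] = ["create_task", "update_task"];

// Same branching as getExcludeToolsForType in the diff.
function excludedTools(config: ToolFilterConfig, allToolNames: string[]): string[] {
  if (config.allowedTools && config.allowedTools.length > 0) {
    // Whitelist mode: everything outside allowedTools + ALWAYS_ALLOWED is excluded.
    const allowed = new Set([...config.allowedTools, ...ALWAYS_ALLOWED]);
    return allToolNames.filter((name) => !allowed.has(name));
  }
  if (config.excludeTools && config.excludeTools.length > 0) {
    // Blacklist mode: the list is copied as-is, without consulting allToolNames.
    return [...config.excludeTools];
  }
  return [];
}

// Invented tool list for the example.
const registered = ["web_fetch", "web_search", "create_task", "ask_user"];

const whitelistResult = excludedTools({ allowedTools: ["web_fetch"] }, registered);
const blacklistResult = excludedTools({ excludeTools: ["ask_user", "agent"] }, registered);
// An empty allowedTools array does not activate whitelist mode; the blacklist applies instead.
const emptyAllowResult = excludedTools({ allowedTools: [], excludeTools: ["ask_user"] }, registered);

console.log(whitelistResult, blacklistResult, emptyAllowResult);
```

Two properties follow from the branching: in blacklist mode the result can name tools that are not registered (here `agent`), and the always-allowed task tools are only merged in on the whitelist path.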
