Commit 2f57342

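✨ Add get_task/delete_task tools and split Agent tool guidance into layers
- task_tools: add get_task (fetch a task's full details) and delete_task (delete a task)
- Tool descriptions trimmed to capability statements; behavioral guidance moved into the system prompt
- System prompt "Task Management" section restructured into When / When NOT / Workflow / Tips
- ask_user description trimmed; the recommended-option convention moved into the system prompt
- Add 6 task_tools test cases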
1 parent f90ba1d commit 2f57342

4 files changed

Lines changed: 162 additions & 22 deletions

src/app/service/agent/system_prompt.ts

Lines changed: 20 additions & 7 deletions
@@ -63,7 +63,7 @@ When stuck, **prioritize asking the user over repeated attempts**:
 - **Interact with page DOM** → \`execute_script(target='page')\` for clicking, filling forms, reading dynamic state. Runs in MAIN world (shares page globals). Use \`get_tab_content\` first to understand page structure.
 - **Compute without DOM** → \`execute_script(target='sandbox')\` for data processing, text parsing, calculations.
 - **Search the web** → \`web_search\` returns titles, URLs, and snippets. Follow up with \`web_fetch\` to read specific results.
-- **Ask user** → \`ask_user\` for questions. Prefer providing \`options\` for structured choices so the user can select quickly; add \`multiple: true\` for multi-select. The user can also type a custom response even when options are provided.
+- **Ask user** → \`ask_user\` to gather preferences, clarify ambiguous instructions, or get decisions on implementation choices. Prefer providing \`options\` for structured choices so the user can select quickly; add \`multiple: true\` for multi-select. If you recommend a specific option, put it first and append "(Recommended)". The user can always type a custom response even when options are provided.

 ## Sub-Agent

@@ -83,13 +83,26 @@ Use the \`agent\` tool to delegate **independent subtasks** that don't require u

 ## Task Management

-For **complex, multi-step tasks**, use task tools to track your progress:
-- \`create_task\` — Break the work into individual steps at the start.
-- \`update_task\` — Mark each step as \`in_progress\` when you begin it, and \`completed\` when done.
-- \`list_tasks\` — Review remaining steps, especially after resuming a conversation.
+Use task tools to create a structured task list that tracks your progress. This helps the user understand what you're doing and how much work remains.

-**When to use:** Tasks that involve 3+ distinct steps (e.g., navigating multiple pages, processing data, multi-stage workflows). Do NOT create tasks for simple, single-step requests.
-**Workflow:** Create all tasks first → work through them one by one → update status as you go.
+**When to use:**
+- Complex tasks requiring 3+ distinct steps (e.g., navigating multiple pages, multi-stage data processing)
+- The user provides multiple things to do at once
+- After receiving new instructions — immediately capture requirements as tasks
+
+**When NOT to use:**
+- Single, straightforward tasks that complete in 1-2 steps
+- Purely conversational or informational requests
+
+**Workflow:**
+1. **Plan** — Call \`list_tasks\` to check for existing tasks, then \`create_task\` for each step with a clear imperative subject and enough description for context.
+2. **Execute** — Before starting each task, call \`update_task\` with \`status: "in_progress"\`. When done, set \`status: "completed"\`.
+3. **Adapt** — If a completed task reveals follow-up work, create new tasks. If a task becomes irrelevant, use \`delete_task\` to clean up. Use \`get_task\` to review a task's full description before starting it.
+
+**Tips:**
+- Write subjects as brief imperatives: "Extract product prices", not "I will extract prices".
+- Include acceptance criteria in the description so progress is unambiguous.
+- Do not create tasks you intend to complete in the same tool call — tasks are for tracking multi-step progress, not logging what you already did.

 ## OPFS Workspace

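
The Plan / Execute / Adapt workflow in the new prompt can be illustrated with a small in-memory task list. This is only a sketch: the `TaskList` class and its method names are hypothetical stand-ins that mirror the tool names, not the project's implementation, and the example subjects are invented.

```typescript
// Hypothetical in-memory task list; method names mirror the tools named in
// the prompt (create_task, update_task, get_task, delete_task, list_tasks).
type TaskStatus = "pending" | "in_progress" | "completed";
type Task = { id: string; subject: string; description?: string; status: TaskStatus };

class TaskList {
  private tasks = new Map<string, Task>();
  private nextId = 1;

  createTask(subject: string, description?: string): Task {
    const task: Task = { id: String(this.nextId++), subject, description, status: "pending" };
    this.tasks.set(task.id, task);
    return task;
  }

  updateTask(id: string, status: TaskStatus): Task {
    const task = this.getTask(id);
    task.status = status;
    return task;
  }

  getTask(id: string): Task {
    const task = this.tasks.get(id);
    if (!task) throw new Error(`Task "${id}" not found`);
    return task;
  }

  deleteTask(id: string): void {
    this.getTask(id); // throws if the task does not exist
    this.tasks.delete(id);
  }

  // Like list_tasks in the diff, the summary omits descriptions.
  listTasks(): Array<Pick<Task, "id" | "subject" | "status">> {
    return Array.from(this.tasks.values()).map(({ id, subject, status }) => ({ id, subject, status }));
  }
}

// 1. Plan: check for existing tasks, then create one task per step.
const taskList = new TaskList();
taskList.listTasks();
const extract = taskList.createTask("Extract product prices", "Done when every product row has a price.");
taskList.createTask("Export prices to CSV");
const obsolete = taskList.createTask("Email the report");

// 2. Execute: mark a task in_progress before starting, completed when done.
taskList.updateTask(extract.id, "in_progress");
taskList.updateTask(extract.id, "completed");

// 3. Adapt: remove a task that turned out to be irrelevant.
taskList.deleteTask(obsolete.id);
const summary = taskList.listTasks();
```

After these calls, `summary` holds the two remaining tasks: the first completed, the second still pending.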

src/app/service/agent/tools/ask_user.ts

Lines changed: 5 additions & 5 deletions
@@ -4,21 +4,21 @@ import type { ToolExecutor } from "@App/app/service/agent/tool_registry";
 export const ASK_USER_DEFINITION: ToolDefinition = {
   name: "ask_user",
   description:
-    "Ask the user a question and wait for their response (text only, no image support). " +
-    "Use options for structured choices (single/multi-select). Times out after 5 minutes.",
+    "Ask the user a question and wait for their response. " +
+    "Text response only (no image support). Times out after 5 minutes. " +
+    "The user can always type a custom response even when options are provided.",
   parameters: {
     type: "object",
     properties: {
       question: { type: "string", description: "The question to ask the user" },
       options: {
         type: "array",
         items: { type: "string" },
-        description:
-          "Optional list of choices for the user. If provided, user selects from these instead of free text input.",
+        description: "List of choices. User selects from these but can also type a custom response.",
       },
       multiple: {
         type: "boolean",
-        description: "Allow selecting multiple options (default: false, single-select).",
+        description: "Allow selecting multiple options (default: false).",
       },
     },
     required: ["question"],
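
The system prompt now asks the agent to put a recommended option first and append "(Recommended)". A caller could assemble such arguments roughly as follows. `buildAskUserArgs` is a hypothetical helper written for illustration; only the `question`, `options`, and `multiple` fields come from the definition above.

```typescript
// Argument shape taken from the ask_user parameter schema in the diff.
type AskUserArgs = { question: string; options?: string[]; multiple?: boolean };

// Hypothetical helper: moves the recommended option to the front and labels it.
// If no recommendation is given (or it is not among the options), the options
// are passed through unchanged.
function buildAskUserArgs(
  question: string,
  options: string[],
  recommended?: string,
  multiple = false,
): Required<AskUserArgs> {
  if (recommended === undefined || options.indexOf(recommended) === -1) {
    return { question, options: options.slice(), multiple };
  }
  const rest = options.filter((option) => option !== recommended);
  return { question, options: [`${recommended} (Recommended)`, ...rest], multiple };
}

const askArgs = buildAskUserArgs("Which export format should I use?", ["JSON", "CSV", "XLSX"], "CSV");
```

Here `askArgs.options` starts with `"CSV (Recommended)"`, followed by the other choices in their original order.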

src/app/service/agent/tools/task_tools.test.ts

Lines changed: 78 additions & 3 deletions
@@ -2,11 +2,11 @@ import { describe, it, expect, vi } from "vitest";
 import { createTaskTools, type Task } from "./task_tools";

 describe("task_tools", () => {
-  it("应创建 3 个工具", () => {
+  it("应创建 5 个工具", () => {
     const { tools } = createTaskTools();
-    expect(tools).toHaveLength(3);
+    expect(tools).toHaveLength(5);
     const names = tools.map((t) => t.definition.name);
-    expect(names).toEqual(["create_task", "update_task", "list_tasks"]);
+    expect(names).toEqual(["create_task", "update_task", "get_task", "delete_task", "list_tasks"]);
   });

   it("create_task 应创建自增 ID 的任务", async () => {
@@ -132,6 +132,81 @@ describe("task_tools", () => {
     expect(sendEvent).not.toHaveBeenCalled();
   });

+  it("get_task 应返回任务完整信息", async () => {
+    const { tools } = createTaskTools();
+    const create = tools.find((t) => t.definition.name === "create_task")!;
+    const get = tools.find((t) => t.definition.name === "get_task")!;
+
+    await create.executor.execute({ subject: "Task", description: "Detailed info" });
+
+    const result = JSON.parse((await get.executor.execute({ task_id: "1" })) as string);
+    expect(result).toEqual({ id: "1", subject: "Task", description: "Detailed info", status: "pending" });
+  });
+
+  it("get_task 应对不存在的任务抛错", async () => {
+    const { tools } = createTaskTools();
+    const get = tools.find((t) => t.definition.name === "get_task")!;
+    await expect(get.executor.execute({ task_id: "999" })).rejects.toThrow('Task "999" not found');
+  });
+
+  it("delete_task 应删除任务", async () => {
+    const { tools } = createTaskTools();
+    const create = tools.find((t) => t.definition.name === "create_task")!;
+    const del = tools.find((t) => t.definition.name === "delete_task")!;
+    const list = tools.find((t) => t.definition.name === "list_tasks")!;
+
+    await create.executor.execute({ subject: "A" });
+    await create.executor.execute({ subject: "B" });
+
+    const result = JSON.parse((await del.executor.execute({ task_id: "1" })) as string);
+    expect(result).toEqual({ deleted: true, task_id: "1" });
+
+    const remaining = JSON.parse((await list.executor.execute({})) as string);
+    expect(remaining).toHaveLength(1);
+    expect(remaining[0].id).toBe("2");
+  });
+
+  it("delete_task 应对不存在的任务抛错", async () => {
+    const { tools } = createTaskTools();
+    const del = tools.find((t) => t.definition.name === "delete_task")!;
+    await expect(del.executor.execute({ task_id: "999" })).rejects.toThrow('Task "999" not found');
+  });
+
+  it("delete_task 应调用 onSave 和 sendEvent", async () => {
+    const onSave = vi.fn().mockResolvedValue(undefined);
+    const sendEvent = vi.fn();
+    const { tools } = createTaskTools({ onSave, sendEvent });
+    const create = tools.find((t) => t.definition.name === "create_task")!;
+    const del = tools.find((t) => t.definition.name === "delete_task")!;
+
+    await create.executor.execute({ subject: "Task" });
+    onSave.mockClear();
+    sendEvent.mockClear();
+
+    await del.executor.execute({ task_id: "1" });
+
+    expect(onSave).toHaveBeenCalledOnce();
+    expect(onSave).toHaveBeenCalledWith([]);
+    expect(sendEvent).toHaveBeenCalledOnce();
+    expect(sendEvent).toHaveBeenCalledWith({
+      type: "task_update",
+      tasks: [],
+    });
+  });
+
+  it("get_task 不应触发 onSave 或 sendEvent", async () => {
+    const onSave = vi.fn().mockResolvedValue(undefined);
+    const sendEvent = vi.fn();
+    const initial: Task[] = [{ id: "1", subject: "Existing", status: "pending", description: "Desc" }];
+    const { tools } = createTaskTools({ initialTasks: initial, onSave, sendEvent });
+    const get = tools.find((t) => t.definition.name === "get_task")!;
+
+    await get.executor.execute({ task_id: "1" });
+
+    expect(onSave).not.toHaveBeenCalled();
+    expect(sendEvent).not.toHaveBeenCalled();
+  });
+
   it("多实例应独立", async () => {
     const instance1 = createTaskTools();
     const instance2 = createTaskTools();
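
The new tests repeat the `tools.find((t) => t.definition.name === ...)!` lookup many times. One possible way to reduce that repetition is a small lookup helper, sketched below. `findTool` and `ToolLike` are hypothetical names and are not part of this commit; the helper throws a descriptive error where the non-null assertion would silently pass `undefined` along.

```typescript
// Minimal structural type: anything with a named definition qualifies.
type ToolLike = { definition: { name: string } };

// Hypothetical test helper: look a tool up by name and fail loudly if it is missing.
function findTool<T extends ToolLike>(tools: T[], toolName: string): T {
  const tool = tools.find((t) => t.definition.name === toolName);
  if (!tool) {
    throw new Error(`Tool "${toolName}" is not registered`);
  }
  return tool;
}

// Stand-in tool list for illustration; real tests would pass createTaskTools().tools.
const sampleTools: ToolLike[] = [
  { definition: { name: "create_task" } },
  { definition: { name: "get_task" } },
];
const getTool = findTool(sampleTools, "get_task");
```

A misspelled tool name then fails at the lookup with a clear message, not later with a less obvious error on `executor`.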

src/app/service/agent/tools/task_tools.ts

Lines changed: 59 additions & 7 deletions
@@ -10,22 +10,26 @@ export type Task = {

 const CREATE_TASK_DEFINITION: ToolDefinition = {
   name: "create_task",
-  description:
-    "Create a new task to track work progress. Use this to break complex, multi-step work into trackable steps. Returns the created task with an auto-assigned ID.",
+  description: "Create a new task. Returns the created task with an auto-assigned ID and status 'pending'.",
   parameters: {
     type: "object",
     properties: {
-      subject: { type: "string", description: "Brief title for the task" },
-      description: { type: "string", description: "Detailed description of what needs to be done" },
+      subject: {
+        type: "string",
+        description: "Brief, actionable title in imperative form (e.g., 'Extract product prices from page')",
+      },
+      description: {
+        type: "string",
+        description: "Detailed description including context and acceptance criteria",
+      },
     },
     required: ["subject"],
   },
 };

 const UPDATE_TASK_DEFINITION: ToolDefinition = {
   name: "update_task",
-  description:
-    'Update a task\'s status or details. Set status to "in_progress" when starting work, "completed" when done.',
+  description: "Update a task's status or details. Can change status, subject, and description.",
   parameters: {
     type: "object",
     properties: {
@@ -42,9 +46,33 @@ const UPDATE_TASK_DEFINITION: ToolDefinition = {
   },
 };

+const GET_TASK_DEFINITION: ToolDefinition = {
+  name: "get_task",
+  description: "Get a task's full details including description. list_tasks only returns id/subject/status.",
+  parameters: {
+    type: "object",
+    properties: {
+      task_id: { type: "string", description: "The task ID" },
+    },
+    required: ["task_id"],
+  },
+};
+
+const DELETE_TASK_DEFINITION: ToolDefinition = {
+  name: "delete_task",
+  description: "Delete a task permanently.",
+  parameters: {
+    type: "object",
+    properties: {
+      task_id: { type: "string", description: "The task ID to delete" },
+    },
+    required: ["task_id"],
+  },
+};
+
 const LIST_TASKS_DEFINITION: ToolDefinition = {
   name: "list_tasks",
-  description: "List all tasks with their IDs, subjects, and statuses. Use to review remaining work.",
+  description: "List all tasks with their IDs, subjects, and statuses (without descriptions).",
   parameters: {
     type: "object",
     properties: {},
@@ -125,6 +153,28 @@ export function createTaskTools(options?: TaskToolsOptions): {
     },
   };

+  const getExecutor: ToolExecutor = {
+    execute: async (args: Record<string, unknown>) => {
+      const task = tasks.get(args.task_id as string);
+      if (!task) {
+        throw new Error(`Task "${args.task_id}" not found`);
+      }
+      return JSON.stringify(task);
+    },
+  };
+
+  const deleteExecutor: ToolExecutor = {
+    execute: async (args: Record<string, unknown>) => {
+      const taskId = args.task_id as string;
+      if (!tasks.has(taskId)) {
+        throw new Error(`Task "${taskId}" not found`);
+      }
+      tasks.delete(taskId);
+      await emitUpdate();
+      return JSON.stringify({ deleted: true, task_id: taskId });
+    },
+  };
+
   const listExecutor: ToolExecutor = {
     execute: async () => {
       const list = Array.from(tasks.values()).map((t) => ({
@@ -140,6 +190,8 @@ export function createTaskTools(options?: TaskToolsOptions): {
     tools: [
       { definition: CREATE_TASK_DEFINITION, executor: createExecutor },
       { definition: UPDATE_TASK_DEFINITION, executor: updateExecutor },
+      { definition: GET_TASK_DEFINITION, executor: getExecutor },
+      { definition: DELETE_TASK_DEFINITION, executor: deleteExecutor },
       { definition: LIST_TASKS_DEFINITION, executor: listExecutor },
     ],
     tasks,
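
The two new executors differ in their side effects: `get_task` is a pure read, while `delete_task` mutates the map and then calls `emitUpdate()`. The sketch below shows that split in a simplified form. It is synchronous, omits `onSave`, and `createTaskStore` is a hypothetical name, so it should be read as an illustration of the pattern and not as the code in `task_tools.ts`. The `task_update` event shape is taken from the tests in this commit.

```typescript
type Task = {
  id: string;
  subject: string;
  description?: string;
  status: "pending" | "in_progress" | "completed";
};
type TaskUpdateEvent = { type: "task_update"; tasks: Task[] };

// Hypothetical, synchronous stand-in for the store inside createTaskTools.
function createTaskStore(initial: Task[], sendEvent: (event: TaskUpdateEvent) => void) {
  const tasks = new Map<string, Task>(initial.map((t): [string, Task] => [t.id, t]));

  // Publish the full task list after every mutation.
  const emitUpdate = (): void => {
    sendEvent({ type: "task_update", tasks: Array.from(tasks.values()) });
  };

  const requireTask = (id: string): Task => {
    const task = tasks.get(id);
    if (!task) throw new Error(`Task "${id}" not found`);
    return task;
  };

  return {
    // Read path: returns the full task and emits nothing.
    getTask: (id: string): Task => requireTask(id),
    // Write path: validates, deletes, then notifies listeners.
    deleteTask: (id: string): { deleted: true; task_id: string } => {
      requireTask(id);
      tasks.delete(id);
      emitUpdate();
      return { deleted: true, task_id: id };
    },
  };
}

const events: TaskUpdateEvent[] = [];
const store = createTaskStore(
  [{ id: "1", subject: "Existing", description: "Desc", status: "pending" }],
  (event) => { events.push(event); },
);

store.getTask("1");
const eventsAfterRead = events.length; // reads do not emit
const deleteResult = store.deleteTask("1"); // emits one task_update with an empty list
```

Keeping reads free of side effects means the UI is notified only when the task list actually changes, which is the behavior the `get_task` and `delete_task` tests check.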
