From 1af1cf9e2d3a03d6ffefc719908c30d31facb3be Mon Sep 17 00:00:00 2001
From: LearningGp
Date: Thu, 2 Apr 2026 15:16:24 +0800
Subject: [PATCH 1/2] fix(docs): align Chinese documentation with actual codebase

Co-Authored-By: Claude Opus 4.6
---
 docs/zh/intro.md                   |  7 +++--
 docs/zh/multi-agent/handoffs.md    |  2 +-
 docs/zh/multi-agent/workflow.md    |  6 ++---
 docs/zh/quickstart/agent.md        | 10 ++++---
 docs/zh/quickstart/installation.md |  5 +++-
 docs/zh/quickstart/key-concepts.md | 32 ++++++++++++++++-------
 docs/zh/task/a2a.md                |  4 +--
 docs/zh/task/agent-config.md       | 22 ++++++++--------
 docs/zh/task/agent-skill.md        |  2 +-
 docs/zh/task/hook.md               |  5 +++-
 docs/zh/task/mcp.md                |  6 ++---
 docs/zh/task/memory.md             |  2 +-
 docs/zh/task/model.md              | 42 +++++++++++++++++++++++++-----
 docs/zh/task/online-training.md    |  2 +-
 docs/zh/task/plan.md               | 24 +++++++++++++++++
 docs/zh/task/rag.md                |  1 -
 docs/zh/task/studio.md             |  2 +-
 docs/zh/task/tool.md               | 30 +++++++++++----------
 docs/zh/task/tts.md                |  9 ++++---
 19 files changed, 147 insertions(+), 66 deletions(-)

diff --git a/docs/zh/intro.md b/docs/zh/intro.md
index be80bd010..ba17ce5fc 100644
--- a/docs/zh/intro.md
+++ b/docs/zh/intro.md
@@ -34,15 +34,14 @@ AgentScope 包含生产就绪的工具,解决智能体开发中的常见挑战
 AgentScope 设计为与现有企业基础设施集成,无需大量修改:

 - **MCP 协议** - 与任何 MCP 兼容服务器集成,即时扩展智能体能力。连接到不断增长的 MCP 工具和服务生态系统——从文件系统和数据库到 Web 浏览器和代码解释器——无需编写自定义集成代码。
-- **A2A 协议** - 通过标准服务发现实现分布式多智能体协作。将智能体能力注册到 Nacos 或类似注册中心,允许智能体像调用微服务一样自然地发现和调用彼此。
+- **A2A 协议** - 通过扩展模块实现分布式多智能体协作。将智能体能力注册到 Nacos 或类似注册中心(通过 `agentscope-extensions-nacos-a2a` 扩展),允许智能体像调用微服务一样自然地发现和调用彼此。

 ### 生产级别

 为企业部署需求而构建:

-- **高性能** - 基于 Project Reactor 的响应式架构确保非阻塞执行。GraalVM 原生镜像编译实现 200ms 冷启动时间,使 AgentScope 适用于 Serverless 和自动扩缩容环境。
-- **安全沙箱** - AgentScope Runtime 为不受信任的工具代码提供隔离的执行环境。包括用于 GUI 自动化、文件系统操作和移动设备交互的预构建沙箱,防止未授权访问系统资源。
-- **可观测性** - 原生集成 OpenTelemetry,实现整个智能体执行管道的分布式追踪。AgentScope Studio 为开发和生产环境提供可视化调试、实时监控和全面的日志记录。
+- **高性能** - 基于 Project Reactor 的响应式架构确保非阻塞执行。支持通过 Micronaut/Quarkus 等框架进行 GraalVM 原生镜像编译,适用于 Serverless 和自动扩缩容环境。
+- **可观测性** - 提供可插拔的 Tracer SPI,通过扩展模块支持 OpenTelemetry 集成,实现整个智能体执行管道的分布式追踪。AgentScope Studio 为开发和生产环境提供可视化调试、实时监控和全面的日志记录。

 ## 系统要求

diff --git a/docs/zh/multi-agent/handoffs.md b/docs/zh/multi-agent/handoffs.md
index 885c1dc37..b298522ba 100644
--- a/docs/zh/multi-agent/handoffs.md
+++ b/docs/zh/multi-agent/handoffs.md
@@ -105,7 +105,7 @@ public String transferToSales(
 创建销售与支持两个 `AgentScopeAgent`,各自使用 ReActAgent、系统提示和包含对应交接工具的 Toolkit。

 ```java
-import com.alibaba.cloud.ai.graph.agent.agentscope.AgentScopeAgent;
+import com.alibaba.cloud.ai.agent.agentscope.AgentScopeAgent;
 import io.agentscope.core.ReActAgent;
 import io.agentscope.core.memory.InMemoryMemory;
 import io.agentscope.core.model.DashScopeChatModel;
diff --git a/docs/zh/multi-agent/workflow.md b/docs/zh/multi-agent/workflow.md
index 8515e4c10..b603eb982 100644
--- a/docs/zh/multi-agent/workflow.md
+++ b/docs/zh/multi-agent/workflow.md
@@ -83,9 +83,9 @@ START → list_tables → call_get_schema → get_schema → generate_query →

 **配置**:

-- `workflow.rag.enabled` – 启用 RAG 工作流 Bean。
-- `workflow.sql.enabled` – 启用 SQL 工作流 Bean。
-- `workflow.runner.enabled` – 为 `true` 时,启动时执行一次演示(需与上述其一搭配)。
+- `workflow.rag.enabled` – 启用 RAG 工作流 Bean(默认 `false`)。
+- `workflow.sql.enabled` – 启用 SQL 工作流 Bean(示例中默认 `true`)。
+- `workflow.runner.enabled` – 为 `true` 时,启动时执行一次演示(示例中默认 `true`,需与上述其一搭配)。
 - **DashScope API Key**:`AI_DASHSCOPE_API_KEY` 或 `spring.ai.dashscope.api-key`(RAG 与 SQL 均需;RAG 还需配置 embedding 模型)。

 ## 与其他模式的关系
diff --git a/docs/zh/quickstart/agent.md b/docs/zh/quickstart/agent.md
index a25ba4bfa..684a14530 100644
--- a/docs/zh/quickstart/agent.md
+++ b/docs/zh/quickstart/agent.md
@@ -26,12 +26,16 @@ AgentScope 提供了开箱即用的 ReAct 智能体 `ReActAgent` 供开发者使
 | 参数 | 进一步阅读 | 描述 |
 |------|-----------|------|
 | `name` (必需) | | 智能体的名称 |
-| `sysPrompt` (必需) | | 智能体的系统提示 |
+| `sysPrompt` | | 智能体的系统提示(建议设置) |
 | `model` (必需) | [模型集成](../task/model.md) | 智能体用于生成响应的模型 |
 | `toolkit` | [工具系统](../task/tool.md) | 用于注册/调用工具函数的工具模块 |
 | `memory` | [记忆管理](../task/memory.md) | 用于存储对话历史的短期记忆 |
-| `longTermMemory` | [长期记忆](../task/long-term-memory.md) | 长期记忆 |
-| `longTermMemoryMode` | [长期记忆](../task/long-term-memory.md) | 长期记忆的管理模式:`AGENT_CONTROL`(智能体自主控制)、`STATIC_CONTROL`(静态管理)、`BOTH`(两者皆有) |
+| `description` | | 智能体的描述信息 |
+| `generateOptions` | | LLM 生成参数(temperature、topP、maxTokens 等) |
+| `toolExecutionContext` | [工具系统](../task/tool.md) | 工具执行上下文,用于向工具注入依赖 |
+| `planNotebook` | [计划](../task/plan.md) | 计划管理器 |
+| `longTermMemory` | [记忆管理](../task/memory.md) | 长期记忆 |
+| `longTermMemoryMode` | [记忆管理](../task/memory.md) | 长期记忆的管理模式:`AGENT_CONTROL`(智能体自主控制)、`STATIC_CONTROL`(静态管理)、`BOTH`(两者皆有) |
 | `maxIters` | | 智能体生成响应的最大迭代次数(默认:10) |
 | `hooks` | [Hook 系统](../task/hook.md) | 用于自定义智能体行为的事件钩子 |
 | `modelExecutionConfig` | | 模型调用的超时/重试配置 |
diff --git a/docs/zh/quickstart/installation.md b/docs/zh/quickstart/installation.md
index 15a0b920e..635dd53ea 100644
--- a/docs/zh/quickstart/installation.md
+++ b/docs/zh/quickstart/installation.md
@@ -38,7 +38,7 @@ implementation 'io.agentscope:agentscope:1.0.11'

 All-in-one 包默认带以下依赖,不用额外配置:

-- DashScope SDK(通义千问系列模型)
+- DashScope 模型支持(通义千问系列模型,通过原生 HTTP 调用,无需额外 SDK)
 - MCP SDK(模型上下文协议)
 - Reactor Core、Jackson、SLF4J(基础框架)

@@ -189,6 +189,7 @@ implementation 'io.agentscope:agentscope-core:1.0.11'
 |-----|------|-----------|
 | [agentscope-extensions-scheduler-common](https://central.sonatype.com/artifact/io.agentscope/agentscope-extensions-scheduler-common) | 调度通用模块 | `io.agentscope:agentscope-extensions-scheduler-common` |
 | [agentscope-extensions-scheduler-xxl-job](https://central.sonatype.com/artifact/io.agentscope/agentscope-extensions-scheduler-xxl-job) | XXL-Job 调度 | `io.agentscope:agentscope-extensions-scheduler-xxl-job` |
+| [agentscope-extensions-scheduler-quartz](https://central.sonatype.com/artifact/io.agentscope/agentscope-extensions-scheduler-quartz) | Quartz 调度 | `io.agentscope:agentscope-extensions-scheduler-quartz` |

 #### 用户界面

@@ -228,6 +229,8 @@ implementation 'io.agentscope:agentscope-core:1.0.11'
 |---------|------|-----------|
 | agentscope-a2a-spring-boot-starter | A2A 集成 | `io.agentscope:agentscope-a2a-spring-boot-starter` |
 | agentscope-agui-spring-boot-starter | AG-UI 集成 | `io.agentscope:agentscope-agui-spring-boot-starter` |
+| agentscope-chat-completions-web-starter | Chat Completions Web 集成 | `io.agentscope:agentscope-chat-completions-web-starter` |
+| agentscope-nacos-spring-boot-starter | Nacos 集成 | `io.agentscope:agentscope-nacos-spring-boot-starter` |

 ### Quarkus

diff --git a/docs/zh/quickstart/key-concepts.md b/docs/zh/quickstart/key-concepts.md
index 3ec7b0208..cf8e2eb06 100644
--- a/docs/zh/quickstart/key-concepts.md
+++ b/docs/zh/quickstart/key-concepts.md
@@ -75,9 +75,11 @@ Message 是 AgentScope 最核心的数据结构,用于:

 | 字段 | 说明 |
 |-----|------|
+| `id` | 消息唯一标识符(自动生成 UUID) |
 | `name` | 发送者名称,多智能体场景用于区分身份 |
 | `role` | 角色:`USER`、`ASSISTANT`、`SYSTEM` 或 `TOOL` |
 | `content` | 内容块列表,支持多种类型 |
+| `timestamp` | 消息时间戳 |
 | `metadata` | 可选的结构化数据 |

 **内容类型**:
@@ -102,6 +104,8 @@ Agent 返回的消息包含额外的元信息,帮助理解执行状态:
 | 值 | 说明 |
 |----|------|
 | `MODEL_STOP` | 任务正常完成 |
+| `TOOL_CALLS` | 模型返回工具调用(内部工具,框架继续执行) |
+| `STRUCTURED_OUTPUT` | 结构化输出完成 |
 | `TOOL_SUSPENDED` | 工具需要外部执行,等待提供结果 |
 | `REASONING_STOP_REQUESTED` | Reasoning 阶段被 Hook 暂停(HITL) |
 | `ACTING_STOP_REQUESTED` | Acting 阶段被 Hook 暂停(HITL) |
@@ -136,10 +140,11 @@ Msg imgMsg = Msg.builder()
 Agent 接口定义了智能体的核心契约:

 ```java
-public interface Agent {
-    Mono call(Msg msg); // 处理消息,返回响应
-    Flux stream(Msg msg); // 流式返回响应
-    void interrupt(); // 中断执行
+public interface Agent extends CallableAgent, StreamableAgent, ObservableAgent {
+    String getAgentId();
+    String getName();
+    void interrupt();
+    void interrupt(Msg msg);
 }
 ```

@@ -242,8 +247,13 @@ Formatter 负责将 AgentScope 的消息转换为特定 LLM API 所需的格式
 - 多智能体场景的身份处理

 **内置实现**:
-- `DashScopeFormatter` - 阿里云百炼(通义千问系列)
-- `OpenAIFormatter` - OpenAI 及兼容 API
+- `DashScopeChatFormatter` - 阿里云百炼(通义千问系列)
+- `OpenAIChatFormatter` - OpenAI 及兼容 API
+- `AnthropicChatFormatter` - Anthropic(Claude 系列)
+- `GeminiChatFormatter` - Google Gemini
+- `OllamaChatFormatter` - Ollama 本地模型
+- `DeepSeekFormatter` - DeepSeek
+- `GLMFormatter` - GLM(智谱)

 > 格式化器根据 Model 类型自动选择,通常无需手动配置。

@@ -265,6 +275,9 @@ Hook 通过事件机制在 ReAct 循环的关键节点提供扩展点:
 | `PreActingEvent` | 执行工具前 | ✓ |
 | `PostActingEvent` | 工具执行后 | ✓ |
 | `ActingChunkEvent` | 工具流式输出时 | - |
+| `PreSummaryEvent` | 摘要生成前 | ✓ |
+| `PostSummaryEvent` | 摘要生成后 | ✓ |
+| `SummaryChunkEvent` | 摘要流式输出时 | - |
 | `ErrorEvent` | 发生错误时 | - |

 **Hook 优先级**:Hook 按优先级执行,数值越小优先级越高,默认 100。
@@ -312,9 +325,10 @@ ReActAgent agent = ReActAgent.builder()

 **解决的问题**:智能体的对话历史、配置等状态需要能够保存和恢复,以支持会话持久化。

-AgentScope 将对象的"初始化"与"状态"分离:
-- `saveState()` - 导出当前状态为可序列化的 Map
-- `loadState()` - 从保存的状态恢复
+AgentScope 将对象的"初始化"与"状态"分离,通过 `StateModule` 接口管理:
+- `saveTo(Session, SessionKey)` - 将当前状态保存到 Session
+- `loadFrom(Session, SessionKey)` - 从 Session 恢复状态
+- `loadIfExists(Session, SessionKey)` - 如果存在则恢复状态

 **Session** 提供跨运行的持久化存储:

diff --git a/docs/zh/task/a2a.md b/docs/zh/task/a2a.md
index c00de4369..cabac5282 100644
--- a/docs/zh/task/a2a.md
+++ b/docs/zh/task/a2a.md
@@ -160,8 +160,8 @@ ConfigurableAgentCard agentCard = new ConfigurableAgentCard.Builder()
         .description("智能助手")
         .version("1.0.0")
         .skills(List.of(
-                new AgentSkill("text-generation", "文本生成"),
-                new AgentSkill("question-answering", "问答")))
+                AgentSkill.builder().name("text-generation").description("文本生成").skillContent("").build(),
+                AgentSkill.builder().name("question-answering").description("问答").skillContent("").build()))
         .build();

 AgentScopeA2aServer.builder(agentBuilder)
diff --git a/docs/zh/task/agent-config.md b/docs/zh/task/agent-config.md
index 0909d2c79..0e19f316a 100644
--- a/docs/zh/task/agent-config.md
+++ b/docs/zh/task/agent-config.md
@@ -254,10 +254,10 @@ ExecutionConfig modelConfig = ExecutionConfig.builder()
 **默认配置**(ExecutionConfig.MODEL_DEFAULTS):
 - 超时:5 分钟
 - 最大尝试:3 次(初始 + 2 次重试)
-- 初始退避:1 秒
-- 最大退避:10 秒
+- 初始退避:2 秒
+- 最大退避:30 秒
 - 退避倍数:2.0(指数退避)
-- 重试条件:所有错误
+- 重试条件:可重试错误(429、5xx、超时、网络 IO 错误)

 **使用场景**:
 - 调整模型 API 的超时时间
@@ -460,7 +460,7 @@ Formatter 负责在 AgentScope 格式和模型 API 格式之间转换。
 提供 Agent 可用的技能集。它通过提供工具函数让 Agent 加载技能,并通过 Hook 机制自动注入技能提示。

 ```java
-SkillBox skillBox = new SkillBox();
+SkillBox skillBox = new SkillBox(new Toolkit());

 .skillBox(skillBox)
 ```
@@ -560,12 +560,12 @@ public class ComprehensiveAgentExample {
         public CityWeather() {}
     }

-    // 5. 定义技能类
-    public static class WeatherSkill extends AgentSkill {
-        public WeatherSkill() {
-            super("weather", "weather", "weather", null);
-        }
-    }
+    // 5. 定义技能
+    AgentSkill weatherSkill = AgentSkill.builder()
+            .name("weather")
+            .description("天气查询技能")
+            .skillContent("# Weather Skill\n查询指定城市的天气信息。")
+            .build();

     public static void main(String[] args) {
         // ============================================================
@@ -636,7 +636,7 @@ public class ComprehensiveAgentExample {
         // 第五步:配置技能
         // ============================================================

-        SkillBox skillBox = new SkillBox();
+        SkillBox skillBox = new SkillBox(new Toolkit());
         skillBox.registerSkill(new WeatherSkill());

         // ============================================================
diff --git a/docs/zh/task/agent-skill.md b/docs/zh/task/agent-skill.md
index 7b96a6161..c7ff76b72 100644
--- a/docs/zh/task/agent-skill.md
+++ b/docs/zh/task/agent-skill.md
@@ -150,7 +150,7 @@ ReActAgent agent = ReActAgent.builder()
 ## 简化的集成方式

 ```java
-SkillBox skillBox = new SkillBox();
+SkillBox skillBox = new SkillBox(new Toolkit());

 skillBox.registerSkill(dataSkill);

diff --git a/docs/zh/task/hook.md b/docs/zh/task/hook.md
index b5903295c..573833bde 100644
--- a/docs/zh/task/hook.md
+++ b/docs/zh/task/hook.md
@@ -15,7 +15,7 @@ AgentScope Java 使用**统一事件模型**,所有 Hook 都需要实现 `onEv

 | 事件类型 | 时机 | 可修改 | 描述 |
 |-----------------------|---------------------------|--------|------------------------------------------|
-| PreCallEvent | 智能体调用前 | ❌ | 智能体开始处理之前(仅通知) |
+| PreCallEvent | 智能体调用前 | ✅ | 智能体开始处理之前(可修改输入消息) |
 | PostCallEvent | 智能体调用后 | ✅ | 智能体完成响应之后(可修改最终消息) |
 | PreReasoningEvent | 推理前 | ✅ | LLM 推理之前(可修改输入消息) |
 | PostReasoningEvent | 推理后 | ✅ | LLM 推理完成之后(可修改推理结果) |
@@ -23,6 +23,9 @@ AgentScope Java 使用**统一事件模型**,所有 Hook 都需要实现 `onEv
 | PreActingEvent | 工具执行前 | ✅ | 工具执行之前(可修改工具参数) |
 | PostActingEvent | 工具执行后 | ✅ | 工具执行之后(可修改工具结果) |
 | ActingChunkEvent | 工具流式期间 | ❌ | 工具执行进度块(仅通知) |
+| PreSummaryEvent | 摘要生成前 | ✅ | 达到最大迭代次数时,摘要生成之前 |
+| PostSummaryEvent | 摘要生成后 | ✅ | 摘要生成完成之后(可修改摘要结果) |
+| SummaryChunkEvent | 摘要流式期间 | ❌ | 摘要流式生成的每个块(仅通知) |
 | ErrorEvent | 发生错误时 | ❌ | 发生错误时(仅通知) |

 ## 创建 Hook
diff --git a/docs/zh/task/mcp.md b/docs/zh/task/mcp.md
index 0e7e01d17..e885da1d9 100644
--- a/docs/zh/task/mcp.md
+++ b/docs/zh/task/mcp.md
@@ -157,9 +157,9 @@ String groupName = "filesystem";
 toolkit.createToolGroup(groupName, "Tools for operating system files", true);

 // 将 MCP 工具注册到组中
-toolkit.registration().mcpClient(mcpClient).group("groupName").apply();
+toolkit.registration().mcpClient(mcpClient).group(groupName).apply();

-// 创建仅使用特定组的智能体
+// 创建使用工具包的智能体(仅 active 组中的工具可用)
 ReActAgent agent = ReActAgent.builder()
         .name("Assistant")
         .model(model)
@@ -222,7 +222,7 @@ McpClientWrapper client = McpClientBuilder.create("mcp")
         .block();
 ```

-> **注意**:Query 参数仅对 HTTP 传输(SSE 和 HTTP)有效,对 StdIO 传输会被忽略。
+> **注意**:Query 参数和 HTTP 头仅对 HTTP 传输(SSE 和 HTTP)有效,对 StdIO 传输会被静默忽略。

 ### 同步 vs 异步客户端

diff --git a/docs/zh/task/memory.md b/docs/zh/task/memory.md
index 646607fcf..e25ac67b2 100644
--- a/docs/zh/task/memory.md
+++ b/docs/zh/task/memory.md
@@ -316,4 +316,4 @@ mvn exec:java -Dexec.mainClass="io.agentscope.examples.advanced.ReMeExample"

 - [AutoContextMemory 详细文档](https://github.com/agentscope-ai/agentscope-java/blob/main/agentscope-extensions/agentscope-extensions-autocontext-memory/README_zh.md)
 - [Session 管理](./session.md)
-- [ReActAgent 使用指南](./react-agent.md)
+- [Agent 配置](./agent-config.md)
diff --git a/docs/zh/task/model.md b/docs/zh/task/model.md
index a273f588a..6c5195653 100644
--- a/docs/zh/task/model.md
+++ b/docs/zh/task/model.md
@@ -7,7 +7,7 @@
 | 提供商 | 类 | 流式 | 工具 | 视觉 | 推理 |
 |------------|-------------------------|-------|-------|-------|-------|
 | DashScope | `DashScopeChatModel` | ✅ | ✅ | ✅ | ✅ |
-| OpenAI | `OpenAIChatModel` | ✅ | ✅ | ✅ | |
+| OpenAI | `OpenAIChatModel` | ✅ | ✅ | ✅ | ✅ |
 | Anthropic | `AnthropicChatModel` | ✅ | ✅ | ✅ | ✅ |
 | Gemini | `GeminiChatModel` | ✅ | ✅ | ✅ | ✅ |
 | Ollama | `OllamaChatModel` | ✅ | ✅ | ✅ | ✅ |
@@ -47,6 +47,31 @@ DashScopeChatModel model = DashScopeChatModel.builder()
 | `stream` | 是否启用流式输出,默认 `true` |
 | `enableThinking` | 启用思考模式,模型会展示推理过程 |
 | `enableSearch` | 启用联网搜索,获取实时信息 |
+| `endpointType` | API 端点类型(默认 `AUTO` 自动识别),可选 `TEXT`(强制文本 API)或 `MULTIMODAL`(强制多模态 API) |
+| `defaultOptions` | 默认生成选项(temperature、maxTokens 等) |
+| `formatter` | 消息格式化器(默认 `DashScopeChatFormatter`) |
+
+### 端点类型(endpointType)
+
+DashScope 模型支持文本和多模态两种 API 端点。默认情况下,框架会根据模型名称自动识别应使用的端点类型(如 `qwen-vl-*` 以及 `qwen3.5` 系列自动使用多模态端点)。
+
+当自动识别不准确时(例如使用自定义模型名称或兼容 API),可以手动指定端点类型:
+
+```java
+// 强制使用多模态 API(适用于包含图片/音频等内容的场景)
+DashScopeChatModel model = DashScopeChatModel.builder()
+        .apiKey(System.getenv("DASHSCOPE_API_KEY"))
+        .modelName("custom-model")
+        .endpointType(EndpointType.MULTIMODAL)
+        .build();
+
+// 强制使用文本 API
+DashScopeChatModel model = DashScopeChatModel.builder()
+        .apiKey(System.getenv("DASHSCOPE_API_KEY"))
+        .modelName("custom-model")
+        .endpointType(EndpointType.TEXT)
+        .build();
+```

 ### 思考模式

@@ -102,6 +127,7 @@ OpenAIChatModel model = OpenAIChatModel.builder()
 | `modelName` | 模型名称,如 `gpt-4o`、`gpt-4o-mini` |
 | `baseUrl` | 自定义 API 端点(可选) |
 | `stream` | 是否启用流式输出,默认 `true` |
+| `generateOptions` | 默认生成选项(注意:OpenAI 使用 `.generateOptions()` 而非 `.defaultOptions()`) |

 ## Anthropic

@@ -271,7 +297,7 @@ GenerateOptions options = GenerateOptions.builder()
         .topK(40) // Top-K 采样
         .maxTokens(2000) // 最大输出 token 数
         .seed(42L) // 随机种子
-        .toolChoice(new ToolChoice.auto()) // 工具选择策略
+        .toolChoice(new ToolChoice.Auto()) // 工具选择策略
         .build();

 DashScopeChatModel model = DashScopeChatModel.builder()
@@ -295,17 +321,21 @@ OllamaChatModel model = OllamaChatModel.builder()
 | `topP` | Double | 核采样阈值,0.0-1.0 |
 | `topK` | Integer | 限制候选 token 数量 |
 | `maxTokens` | Integer | 最大生成 token 数 |
+| `maxCompletionTokens` | Integer | 最大完成 token 数 |
 | `thinkingBudget` | Integer | 思考 token 预算 |
+| `reasoningEffort` | String | 推理强度(如 `low`、`medium`、`high`) |
+| `frequencyPenalty` | Double | 频率惩罚,-2.0-2.0 |
+| `presencePenalty` | Double | 存在惩罚,-2.0-2.0 |
 | `seed` | Long | 随机种子 |
 | `toolChoice` | ToolChoice | 工具选择策略 |

 ### 工具选择策略

 ```java
-ToolChoice.auto() // 模型自行决定(默认)
-ToolChoice.none() // 禁止工具调用
-ToolChoice.required() // 强制调用工具
-ToolChoice.specific("tool_name") // 强制调用指定工具
+new ToolChoice.Auto() // 模型自行决定(默认)
+new ToolChoice.None() // 禁止工具调用
+new ToolChoice.Required() // 强制调用工具
+new ToolChoice.Specific("tool_name") // 强制调用指定工具
 ```

 ### 扩展参数
diff --git a/docs/zh/task/online-training.md b/docs/zh/task/online-training.md
index 4eac61e39..960bb61c3 100644
--- a/docs/zh/task/online-training.md
+++ b/docs/zh/task/online-training.md
@@ -259,7 +259,7 @@ ReActAgent agent = ReActAgent.builder()
         .build();

 // 用户请求正常处理(使用 GPT-4),自动采样10%请求用于训练
-Msg response = agent.call(Msg.userMsg("搜索 Python 教程")).block();
+Msg response = agent.call(Msg.builder().textContent("搜索 Python 教程").build()).block();

 // 3. 训练完成后停止
 runner.stop();
diff --git a/docs/zh/task/plan.md b/docs/zh/task/plan.md
index 1978a5491..8263dc3a0 100644
--- a/docs/zh/task/plan.md
+++ b/docs/zh/task/plan.md
@@ -86,6 +86,30 @@ Msg response = agent.call(task).block();

 ## 配置选项

+### 用户确认(needUserConfirm)
+
+控制智能体在创建计划后是否需要等待用户确认才能开始执行。
+
+**默认值**:`true`(需要用户确认)
+
+当启用时,智能体创建计划后会展示计划内容并询问用户是否同意执行(例如 "Should I proceed with this plan?"),只有在用户明确确认后(如回复 "yes"、"go ahead")才会开始执行子任务。如果用户的消息本身已隐含执行意图(如 "execute the plan"),则跳过确认直接执行。
+
+当禁用时,智能体创建计划后会立即开始执行,无需等待用户确认。
+
+```java
+// 需要用户确认(默认行为)
+PlanNotebook planNotebook = PlanNotebook.builder()
+        .needUserConfirm(true)
+        .build();
+
+// 无需确认,创建计划后立即执行
+PlanNotebook planNotebook = PlanNotebook.builder()
+        .needUserConfirm(false)
+        .build();
+```
+
+> **注意**:当有子任务正在执行时(状态为 `in_progress`),无论 `needUserConfirm` 如何设置,都不会再注入确认规则提示。
+
 ### 限制子任务数量

 ```java
diff --git a/docs/zh/task/rag.md b/docs/zh/task/rag.md
index 9a40f25b1..dbe9a9333 100644
--- a/docs/zh/task/rag.md
+++ b/docs/zh/task/rag.md
@@ -48,7 +48,6 @@ ReActAgent agent = ReActAgent.builder()
                 .limit(3)
                 .scoreThreshold(0.3)
                 .build())
-        .enableOnlyForUserQueries(true) // 仅为用户消息检索
         .build();
 ```

diff --git a/docs/zh/task/studio.md b/docs/zh/task/studio.md
index b5034b677..ac5e8a38f 100644
--- a/docs/zh/task/studio.md
+++ b/docs/zh/task/studio.md
@@ -29,7 +29,7 @@ npm安装
 npm install -g @agentscope/studio # or npm install @agentscope/studio
 as_studio
 ```
-Studio 将运行在 http://localhost:5173
+Studio 将运行在 http://localhost:5173(前端开发服务器)

 ![Studio Server 页面](../../imgs/studioServer.png)

diff --git a/docs/zh/task/tool.md b/docs/zh/task/tool.md
index 7ade449d4..012ac887f 100644
--- a/docs/zh/task/tool.md
+++ b/docs/zh/task/tool.md
@@ -72,7 +72,7 @@ public Mono search(

 ### 流式工具

-使用 `ToolEmitter` 发送中间进度,适合长时间任务:
+使用 `ToolEmitter` 发送中间进度,适合长时间任务(进度仅对 Hook 可见,不会发送给 LLM):

 ```java
 @Tool(description = "生成数据")
@@ -200,6 +200,7 @@ toolkit.registerTool(new WriteFileTool("/safe/workspace"));
 | 工具 | 方法 | 说明 |
 |------|------|------|
 | `ReadFileTool` | `view_text_file` | 按行范围查看文件 |
+| `ReadFileTool` | `list_directory` | 列出目录下的文件和文件夹 |
 | `WriteFileTool` | `write_text_file` | 创建/覆盖/替换文件内容 |
 | `WriteFileTool` | `insert_text_file` | 在指定行插入内容 |

@@ -230,8 +231,8 @@ toolkit.registerTool(new OpenAIMultiModalTool(System.getenv("OPENAI_API_KEY")));

 | 工具 | 能力 |
 |------|------|
-| `DashScopeMultiModalTool` | 文生图、图生文、文生语音、语音转文字 |
-| `OpenAIMultiModalTool` | 文生图、图片编辑、图片变体、图生文、文生语音、语音转文字 |
+| `DashScopeMultiModalTool` | 文生图、图生文、文生语音、语音转文字、文生视频、图生视频、首尾帧图生视频、视频理解 |
+| `OpenAIMultiModalTool` | 文生图、图生文、文生语音、语音转文字 |

 ### 子智能体工具

@@ -282,7 +283,7 @@ Toolkit toolkit = new Toolkit(ToolkitConfig.builder()

 | 配置项 | 说明 | 默认值 |
 |--------|------|--------|
-| `parallel` | 是否并行执行多个工具 | `true` |
+| `parallel` | 是否并行执行多个工具 | `false` |
 | `allowToolDeletion` | 是否允许删除工具 | `true` |
 | `executionConfig.timeout` | 工具执行超时时间 | 5 分钟 |

@@ -292,7 +293,7 @@ Toolkit toolkit = new Toolkit(ToolkitConfig.builder()

 ```java
 toolkit.registerMetaTool();
-// Agent 可调用 "reset_equipped_tools" 激活/停用工具组
+// Agent 可调用 "reset_equipped_tools" 激活指定的工具组(重置为指定的工具组集合)
 ```

 当工具组较多时,可让智能体根据任务需求自主选择激活哪些工具组。
@@ -327,15 +328,16 @@ if (response.getGenerateReason() == GenerateReason.TOOL_SUSPENDED) {
     List pendingTools = response.getContentBlocks(ToolUseBlock.class);

     // 外部执行后,提供结果
-    Msg toolResult = Msg.builder()
-            .role(MsgRole.TOOL)
-            .content(ToolResultBlock.of(toolUse.getId(), toolUse.getName(),
-                    TextBlock.builder().text("外部执行结果").build()))
-            .build();
-
-    // 恢复执行
-    response = agent.call(toolResult).block();
-}
+    for (ToolUseBlock toolUse : pendingTools) {
+        Msg toolResult = Msg.builder()
+                .role(MsgRole.TOOL)
+                .content(ToolResultBlock.of(toolUse.getId(), toolUse.getName(),
+                        TextBlock.builder().text("外部执行结果").build()))
+                .build();
+
+        // 恢复执行
+        response = agent.call(toolResult).block();
+    }
 ```

 ## 仅 Schema 工具(Schema Only Tool)
diff --git a/docs/zh/task/tts.md b/docs/zh/task/tts.md
index b2daf2378..4d1b07379 100644
--- a/docs/zh/task/tts.md
+++ b/docs/zh/task/tts.md
@@ -63,7 +63,7 @@ ReActAgent agent = ReActAgent.builder()
         .build();

 // 4. 与 Agent 对话 - Agent 会边生成回复边朗读
-Msg response = agent.call(Msg.user("你好,今天天气怎么样?")).block();
+Msg response = agent.call(Msg.builder().textContent("你好,今天天气怎么样?").build()).block();
 ```

 ### 服务器模式(Web/SSE)
@@ -152,15 +152,18 @@ Agent 通过工具方式调用 TTS,Agent 自行判断在需要时将文字转
 DashScopeMultiModalTool multiModalTool = new DashScopeMultiModalTool(apiKey);

 // 2. 创建 Agent,注册工具
+Toolkit toolkit = new Toolkit();
+toolkit.registerTool(multiModalTool);
+
 ReActAgent agent = ReActAgent.builder()
         .name("MultiModalAssistant")
         .sysPrompt("你是一个多模态助手。当用户要求朗读时,使用 dashscope_text_to_audio 工具。")
         .model(chatModel)
-        .tools(multiModalTool)
+        .toolkit(toolkit)
         .build();

 // 3. Agent 可以主动调用 TTS 工具
-Msg response = agent.call(Msg.user("请用语音说一句'欢迎光临'")).block();
+Msg response = agent.call(Msg.builder().textContent("请用语音说一句'欢迎光临'").build()).block();
 ```

 ---

From 4aba3285b52e8b869dc9cac827a29679dd9372dd Mon Sep 17 00:00:00 2001
From: LearningGp
Date: Thu, 2 Apr 2026 15:33:48 +0800
Subject: [PATCH 2/2] fix(docs): sync English documentation with Chinese doc fixes

Co-Authored-By: Claude Opus 4.6
---
 docs/en/intro.md                   |  7 +++--
 docs/en/multi-agent/handoffs.md    |  4 ++-
 docs/en/multi-agent/workflow.md    |  6 ++---
 docs/en/quickstart/agent.md        | 10 ++++---
 docs/en/quickstart/installation.md |  5 +++-
 docs/en/quickstart/key-concepts.md | 32 ++++++++++++++++-------
 docs/en/task/a2a.md                |  4 +--
 docs/en/task/agent-config.md       | 16 +++++++-----
 docs/en/task/agent-skill.md        |  2 +-
 docs/en/task/hook.md               |  5 +++-
 docs/en/task/mcp.md                |  6 ++---
 docs/en/task/memory.md             |  2 +-
 docs/en/task/model.md              | 42 +++++++++++++++++++++++++-----
 docs/en/task/online-training.md    |  2 +-
 docs/en/task/plan.md               | 24 +++++++++++++++++
 docs/en/task/rag.md                |  1 -
 docs/en/task/studio.md             |  2 +-
 docs/en/task/tool.md               | 29 ++++++++++++---------
 docs/en/task/tts.md                |  9 ++++---
 19 files changed, 148 insertions(+), 60 deletions(-)

diff --git a/docs/en/intro.md b/docs/en/intro.md
index 9518fc0aa..0be449b6c 100644
--- a/docs/en/intro.md
+++ b/docs/en/intro.md
@@ -34,15 +34,14 @@ AgentScope includes production-ready tools that address common challenges in age
 AgentScope is designed to integrate with existing enterprise infrastructure without requiring extensive modifications:

 - **MCP Protocol** - Integrate with any MCP-compatible server to instantly extend agent capabilities. Connect to the growing ecosystem of MCP tools and services—from file systems and databases to web browsers and code interpreters—without writing custom integration code.
-- **A2A Protocol** - Enable distributed multi-agent collaboration through standard service discovery. Register agent capabilities to Nacos or similar registries, allowing agents to discover and invoke each other as naturally as calling microservices.
+- **A2A Protocol** - Enable distributed multi-agent collaboration through extension modules. Register agent capabilities to Nacos or similar registries (via `agentscope-extensions-nacos-a2a`), allowing agents to discover and invoke each other as naturally as calling microservices.

 ### Production Grade

 Built for enterprise deployment requirements:

-- **High Performance** - Reactive architecture based on Project Reactor ensures non-blocking execution. GraalVM native image compilation achieves 200ms cold start times, making AgentScope suitable for serverless and auto-scaling environments.
-- **Security Sandbox** - AgentScope Runtime provides isolated execution environments for untrusted tool code. Includes pre-built sandboxes for GUI automation, file system operations, and mobile device interaction, preventing unauthorized access to system resources.
-- **Observability** - Native integration with OpenTelemetry for distributed tracing across the entire agent execution pipeline. AgentScope Studio provides visual debugging, real-time monitoring, and comprehensive logging for development and production environments.
+- **High Performance** - Reactive architecture based on Project Reactor ensures non-blocking execution. Supports GraalVM native image compilation through Micronaut/Quarkus, suitable for serverless and auto-scaling environments.
+- **Observability** - Pluggable Tracer SPI with extension module support for OpenTelemetry, enabling distributed tracing across the entire agent execution pipeline. AgentScope Studio provides visual debugging, real-time monitoring, and comprehensive logging for development and production environments.

 ## Requirements

diff --git a/docs/en/multi-agent/handoffs.md b/docs/en/multi-agent/handoffs.md
index 722b803dd..c4d41bbf8 100644
--- a/docs/en/multi-agent/handoffs.md
+++ b/docs/en/multi-agent/handoffs.md
@@ -105,7 +105,7 @@ Register each tool on the corresponding agent’s Toolkit via `toolkit.registerT
 Create a sales and a support agent as `AgentScopeAgent`, each with its own ReActAgent, system prompt, and Toolkit that includes the appropriate handoff tool.

 ```java
-import com.alibaba.cloud.ai.graph.agent.agentscope.AgentScopeAgent;
+import com.alibaba.cloud.ai.agent.agentscope.AgentScopeAgent;
 import io.agentscope.core.ReActAgent;
 import io.agentscope.core.memory.InMemoryMemory;
 import io.agentscope.core.model.DashScopeChatModel;
@@ -129,6 +129,7 @@ ReActAgent.Builder salesReActBuilder = ReActAgent.builder()

 AgentScopeAgent salesAgent = AgentScopeAgent.fromBuilder(salesReActBuilder)
         .name(AgentScopeStateConstants.SALES_AGENT)
+        .description("Sales agent for pricing, product availability, and sales inquiries")
         .instruction("please assist the customer with their sales inquiry: {input}.")
         .includeContents(true)
         .returnReasoningContents(true)
@@ -152,6 +153,7 @@ ReActAgent.Builder supportReActBuilder = ReActAgent.builder()

 AgentScopeAgent supportAgent = AgentScopeAgent.fromBuilder(supportReActBuilder)
         .name(AgentScopeStateConstants.SUPPORT_AGENT)
+        .description("Support agent for technical issues and troubleshooting")
         .instruction("please assist the customer with their product technical inquiry: {input}.")
         .includeContents(true)
         .returnReasoningContents(true)
diff --git a/docs/en/multi-agent/workflow.md b/docs/en/multi-agent/workflow.md
index dfd45d484..b799b70ef 100644
--- a/docs/en/multi-agent/workflow.md
+++ b/docs/en/multi-agent/workflow.md
@@ -83,9 +83,9 @@ START → list_tables → call_get_schema → get_schema → generate_query →

 **Configuration**:

-- `workflow.rag.enabled` – Enable RAG workflow beans.
-- `workflow.sql.enabled` – Enable SQL workflow beans.
-- `workflow.runner.enabled` – When `true`, run a one-shot demo on startup (use with one of the above).
+- `workflow.rag.enabled` – Enable RAG workflow beans (default `false`).
+- `workflow.sql.enabled` – Enable SQL workflow beans (default `true` in example).
+- `workflow.runner.enabled` – When `true`, run a one-shot demo on startup (default `true` in example; use with one of the above).
 - **DashScope API key**: `AI_DASHSCOPE_API_KEY` or `spring.ai.dashscope.api-key` (required for RAG and SQL; RAG also needs an embedding model).
 ## Custom workflow vs other patterns
diff --git a/docs/en/quickstart/agent.md b/docs/en/quickstart/agent.md
index a461869b1..5c7ec32bb 100644
--- a/docs/en/quickstart/agent.md
+++ b/docs/en/quickstart/agent.md
@@ -26,12 +26,16 @@ The `ReActAgent` class exposes the following parameters in its constructor:
 | Parameter | Further Reading | Description |
 |-----------|-----------------|-------------|
 | `name` (required) | | Agent's name |
-| `sysPrompt` (required) | | System prompt |
+| `sysPrompt` | | System prompt (recommended) |
 | `model` (required) | [Model Integration](../task/model.md) | Model for generating responses |
 | `toolkit` | [Tool System](../task/tool.md) | Module for registering/calling tool functions |
 | `memory` | [Memory Management](../task/memory.md) | Short-term memory for conversation history |
-| `longTermMemory` | [Long-term Memory](../task/long-term-memory.md) | Long-term memory |
-| `longTermMemoryMode` | [Long-term Memory](../task/long-term-memory.md) | Long-term memory mode: `AGENT_CONTROL`, `STATIC_CONTROL`, or `BOTH` |
+| `description` | | Agent description |
+| `generateOptions` | | LLM generation parameters (temperature, topP, maxTokens, etc.) |
+| `toolExecutionContext` | [Tool System](../task/tool.md) | Tool execution context for dependency injection into tools |
+| `planNotebook` | [Planning](../task/plan.md) | Plan manager |
+| `longTermMemory` | [Memory Management](../task/memory.md) | Long-term memory |
+| `longTermMemoryMode` | [Memory Management](../task/memory.md) | Long-term memory mode: `AGENT_CONTROL`, `STATIC_CONTROL`, or `BOTH` |
 | `maxIters` | | Max iterations for generating response (default: 10) |
 | `hooks` | [Hook System](../task/hook.md) | Event hooks for customizing agent behavior |
 | `modelExecutionConfig` | | Timeout/retry config for model calls |
diff --git a/docs/en/quickstart/installation.md b/docs/en/quickstart/installation.md
index d7f730322..0c03db7ca 100644
--- a/docs/en/quickstart/installation.md
+++ b/docs/en/quickstart/installation.md
@@ -36,7 +36,7 @@ implementation 'io.agentscope:agentscope:1.0.11'

 The all-in-one package includes these dependencies by default:

-- DashScope SDK (Qwen series models)
+- DashScope model support (Qwen series models, via native HTTP calls, no additional SDK required)
 - MCP SDK (Model Context Protocol)
 - Reactor Core, Jackson, SLF4J (base frameworks)

@@ -185,6 +185,7 @@ implementation 'io.agentscope:agentscope-core:1.0.11'
 |--------|---------|-------------------|
 | [agentscope-extensions-scheduler-common](https://central.sonatype.com/artifact/io.agentscope/agentscope-extensions-scheduler-common) | Scheduler Common | `io.agentscope:agentscope-extensions-scheduler-common` |
 | [agentscope-extensions-scheduler-xxl-job](https://central.sonatype.com/artifact/io.agentscope/agentscope-extensions-scheduler-xxl-job) | XXL-Job Scheduler | `io.agentscope:agentscope-extensions-scheduler-xxl-job` |
+| [agentscope-extensions-scheduler-quartz](https://central.sonatype.com/artifact/io.agentscope/agentscope-extensions-scheduler-quartz) | Quartz Scheduler | `io.agentscope:agentscope-extensions-scheduler-quartz` |

 #### User Interface

@@ -224,6 +225,8 @@ Additional starters:
 |---------|---------|-------------------|
 | agentscope-a2a-spring-boot-starter | A2A Integration | `io.agentscope:agentscope-a2a-spring-boot-starter` |
 | agentscope-agui-spring-boot-starter | AG-UI Integration | `io.agentscope:agentscope-agui-spring-boot-starter` |
+| agentscope-chat-completions-web-starter | Chat Completions Web Integration | `io.agentscope:agentscope-chat-completions-web-starter` |
+| agentscope-nacos-spring-boot-starter | Nacos Integration | `io.agentscope:agentscope-nacos-spring-boot-starter` |

 ### Quarkus

diff --git a/docs/en/quickstart/key-concepts.md b/docs/en/quickstart/key-concepts.md
index 8a8779c5e..c54ec535c 100644
--- a/docs/en/quickstart/key-concepts.md
+++ b/docs/en/quickstart/key-concepts.md
@@ -72,9 +72,11 @@ Message is the most fundamental data structure in AgentScope, used for:

 | Field | Description |
 |-------|-------------|
+| `id` | Unique message identifier (auto-generated UUID) |
 | `name` | Sender's name, used to distinguish identities in multi-agent scenarios |
 | `role` | Role: `USER`, `ASSISTANT`, `SYSTEM`, or `TOOL` |
 | `content` | List of content blocks, supports multiple types |
+| `timestamp` | Message timestamp |
 | `metadata` | Optional structured data |

 **Content types**:
@@ -99,6 +101,8 @@ Messages returned by Agent contain additional metadata to help understand execut
 | Value | Description |
 |-------|-------------|
 | `MODEL_STOP` | Task completed normally |
+| `TOOL_CALLS` | Model returned tool calls (internal tools, framework continues execution) |
+| `STRUCTURED_OUTPUT` | Structured output completed |
 | `TOOL_SUSPENDED` | Tool needs external execution, waiting for result |
 | `REASONING_STOP_REQUESTED` | Paused by Hook during Reasoning phase (HITL) |
 | `ACTING_STOP_REQUESTED` | Paused by Hook during Acting phase (HITL) |
@@ -133,10 +137,11 @@ Msg imgMsg = Msg.builder()
 The Agent interface defines the core contract:

 ```java
-public interface Agent {
-    Mono call(Msg msg); // Process message and return response
-    Flux stream(Msg msg); // Stream response in real-time
-    void interrupt(); // Stop execution
+public interface Agent extends CallableAgent, StreamableAgent, ObservableAgent {
+    String getAgentId();
+    String getName();
+    void interrupt();
+    void interrupt(Msg msg);
 }
 ```

@@ -239,8 +244,13 @@ Formatter is responsible for converting AgentScope messages to the format requir
 - Identity handling in multi-agent scenarios

 **Built-in implementations**:
-- `DashScopeFormatter` - Alibaba Cloud DashScope (Qwen series)
-- `OpenAIFormatter` - OpenAI and compatible APIs
+- `DashScopeChatFormatter` - Alibaba Cloud DashScope (Qwen series)
+- `OpenAIChatFormatter` - OpenAI and compatible APIs
+- `AnthropicChatFormatter` - Anthropic (Claude series)
+- `GeminiChatFormatter` - Google Gemini
+- `OllamaChatFormatter` - Ollama local models
+- `DeepSeekFormatter` - DeepSeek
+- `GLMFormatter` - GLM (Zhipu)

 > Formatter is automatically selected based on Model type; manual configuration is usually not needed.

@@ -262,6 +272,9 @@ Hook provides extension points at key nodes of the ReAct loop through an event m
 | `PreActingEvent` | Before executing tool | ✓ |
 | `PostActingEvent` | After tool execution | ✓ |
 | `ActingChunkEvent` | During tool streaming output | - |
+| `PreSummaryEvent` | Before summary generation | ✓ |
+| `PostSummaryEvent` | After summary generation | ✓ |
+| `SummaryChunkEvent` | During summary streaming output | - |
 | `ErrorEvent` | When error occurs | - |

 **Hook Priority**: Hooks execute in priority order (lower value = higher priority), default is 100.
@@ -309,9 +322,10 @@ ReActAgent agent = ReActAgent.builder()

 **Problem solved**: Agent state such as conversation history and configuration needs to be saved and restored to support session persistence.
-AgentScope separates "initialization" from "state":
-- `saveState()` - Export current state as a serializable Map
-- `loadState()` - Restore from saved state
+AgentScope separates "initialization" from "state" through the `StateModule` interface:
+- `saveTo(Session, SessionKey)` - Save current state to Session
+- `loadFrom(Session, SessionKey)` - Restore state from Session
+- `loadIfExists(Session, SessionKey)` - Restore state from Session if it exists

 **Session** provides persistent storage across runs:

diff --git a/docs/en/task/a2a.md b/docs/en/task/a2a.md
index 7ca21c36d..1c23e2afb 100644
--- a/docs/en/task/a2a.md
+++ b/docs/en/task/a2a.md
@@ -160,8 +160,8 @@ ConfigurableAgentCard agentCard = new ConfigurableAgentCard.Builder()
         .description("Intelligent assistant")
         .version("1.0.0")
         .skills(List.of(
-                new AgentSkill("text-generation", "Text Generation"),
-                new AgentSkill("question-answering", "Q&A")))
+                AgentSkill.builder().name("text-generation").description("Text Generation").skillContent("").build(),
+                AgentSkill.builder().name("question-answering").description("Q&A").skillContent("").build()))
         .build();

 AgentScopeA2aServer.builder(agentBuilder)
diff --git a/docs/en/task/agent-config.md b/docs/en/task/agent-config.md
index ba02d82f8..19b4239bc 100644
--- a/docs/en/task/agent-config.md
+++ b/docs/en/task/agent-config.md
@@ -254,10 +254,10 @@ In the `ReActAgent`'s reasoning phase (ReasoningPipeline), this configuration is
 **Default Configuration** (ExecutionConfig.MODEL_DEFAULTS):
 - Timeout: 5 minutes
 - Max attempts: 3 (initial + 2 retries)
-- Initial backoff: 1 second
-- Max backoff: 10 seconds
+- Initial backoff: 2 seconds
+- Max backoff: 30 seconds
 - Backoff multiplier: 2.0 (exponential)
-- Retry condition: all errors
+- Retry condition: retryable errors (429, 5xx, timeout, network IO errors)

 **Use Cases**:
 - Adjust model API timeout
@@ -460,7 +460,7 @@ Generally, there's no need to explicitly specify; the model will automatically s
 Provides the set of skills available to the Agent. It allows the Agent to load skills through tool functions and automatically injects skill hints via the Hook mechanism.

 ```java
-SkillBox skillBox = new SkillBox();
+SkillBox skillBox = new SkillBox(new Toolkit());

 .skillBox(skillBox)
 ```
@@ -563,7 +563,11 @@ public class ComprehensiveAgentExample {
     // 5. Define skill class
     public static class WeatherSkill extends AgentSkill {
         public WeatherSkill() {
-            super("weather", "weather", "weather", null);
+            super(AgentSkill.builder()
+                    .name("weather")
+                    .description("weather")
+                    .skillContent("weather")
+                    .build());
         }
     }

@@ -635,7 +639,7 @@ public class ComprehensiveAgentExample {
         // Step 5: Configure Skills
         // ============================================================

-        SkillBox skillBox = new SkillBox();
+        SkillBox skillBox = new SkillBox(new Toolkit());
         skillBox.registerSkill(new WeatherSkill());

         // ============================================================
diff --git a/docs/en/task/agent-skill.md b/docs/en/task/agent-skill.md
index 6bf8e7d90..cdce7c4a5 100644
--- a/docs/en/task/agent-skill.md
+++ b/docs/en/task/agent-skill.md
@@ -153,7 +153,7 @@ ReActAgent agent = ReActAgent.builder()
 ## Simplified Integration

 ```java
-SkillBox skillBox = new SkillBox();
+SkillBox skillBox = new SkillBox(new Toolkit());

 skillBox.registerSkill(dataSkill);

diff --git a/docs/en/task/hook.md b/docs/en/task/hook.md
index 45fabbfd8..c1061b094 100644
--- a/docs/en/task/hook.md
+++ b/docs/en/task/hook.md
@@ -15,7 +15,7 @@ AgentScope Java uses a **unified event model** where all hooks implement the `on

 | Event Type | Timing | Modifiable | Description |
 |-----------------------|---------------------------|------------|------------------------------------------|
-| PreCallEvent | Before agent call | ❌ | Before agent starts processing (notification-only) |
+| PreCallEvent | Before agent call | ✅ | Before agent starts processing (modifiable input messages) |
 | PostCallEvent | After agent call | ✅ | After agent completes response (can modify final message) |
 | PreReasoningEvent | Before reasoning | ✅ | Before LLM reasoning (can modify input messages) |
 | PostReasoningEvent | After reasoning | ✅ | After LLM reasoning (can modify reasoning result) |
@@ -23,6 +23,9 @@ AgentScope Java uses a **unified event model** where all hooks implement the `on
 | PreActingEvent | Before tool execution | ✅ | Before tool execution (can modify tool parameters) |
 | PostActingEvent | After tool execution | ✅ | After tool execution (can modify tool result) |
 | ActingChunkEvent | During tool stream | ❌ | Tool execution progress chunks (notification-only) |
+| PreSummaryEvent | Before summary | ✅ | Before summary generation when max iterations reached |
+| PostSummaryEvent | After summary | ✅ | After summary generation (can modify summary result) |
+| SummaryChunkEvent | During summary stream | ❌ | Each chunk of streaming summary (notification-only) |
 | ErrorEvent | On error | ❌ | When errors occur (notification-only) |

 ## Creating Hooks
diff --git a/docs/en/task/mcp.md b/docs/en/task/mcp.md
index 7f2b4bfb9..82c3d8f6c 100644
--- a/docs/en/task/mcp.md
+++ b/docs/en/task/mcp.md
@@ -157,9 +157,9 @@ String groupName = "filesystem";
 toolkit.createToolGroup(groupName, "Tools for operating system files", true);

 // Register MCP tools in a group
-toolkit.registration().mcpClient(mcpClient).group("groupName").apply();
+toolkit.registration().mcpClient(mcpClient).group(groupName).apply();

-// Create agent that only uses specific groups
+// Create agent with toolkit (only active group tools available)
 ReActAgent agent = ReActAgent.builder()
     .name("Assistant")
     .model(model)
@@ -222,7 +222,7 @@ McpClientWrapper client = McpClientBuilder.create("mcp")
     .block();
 ```

-> **Note**: Query parameters only apply to HTTP transports (SSE and HTTP). They are ignored for StdIO transport.
+> **Note**: Query parameters and HTTP headers only apply to HTTP transports (SSE and HTTP). They are silently ignored for StdIO transport.

 ### Synchronous vs Asynchronous Clients

diff --git a/docs/en/task/memory.md b/docs/en/task/memory.md
index 2633e8911..49f1259fa 100644
--- a/docs/en/task/memory.md
+++ b/docs/en/task/memory.md
@@ -316,4 +316,4 @@ mvn exec:java -Dexec.mainClass="io.agentscope.examples.advanced.ReMeExample"

 - [AutoContextMemory Documentation](https://github.com/agentscope-ai/agentscope-java/blob/main/agentscope-extensions/agentscope-extensions-autocontext-memory/README.md)
 - [Session Management](./session.md)
-- [ReActAgent Guide](./react-agent.md)
+- [ReActAgent Guide](./agent-config.md)
diff --git a/docs/en/task/model.md b/docs/en/task/model.md
index 8113c1bb4..0a5c71c04 100644
--- a/docs/en/task/model.md
+++ b/docs/en/task/model.md
@@ -8,7 +8,7 @@ This guide introduces the LLM models supported by AgentScope Java and how to con
 | Provider | Class | Streaming | Tools | Vision | Reasoning |
 |------------|-------------------------|-----------|-------|--------|-----------|
 | DashScope | `DashScopeChatModel` | ✅ | ✅ | ✅ | ✅ |
-| OpenAI | `OpenAIChatModel` | ✅ | ✅ | ✅ | |
+| OpenAI | `OpenAIChatModel` | ✅ | ✅ | ✅ | ✅ |
 | Anthropic | `AnthropicChatModel` | ✅ | ✅ | ✅ | ✅ |
 | Gemini | `GeminiChatModel` | ✅ | ✅ | ✅ | ✅ |
 | Ollama | `OllamaChatModel` | ✅ | ✅ | ✅ | ✅ |
@@ -48,6 +48,31 @@ DashScopeChatModel model = DashScopeChatModel.builder()
 | `stream` | Enable streaming, default `true` |
 | `enableThinking` | Enable thinking mode to show reasoning process |
 | `enableSearch` | Enable web search for real-time information |
+| `endpointType` | API endpoint type (default `AUTO` auto-detect), options: `TEXT` (force text API) or `MULTIMODAL` (force multimodal API) |
+| `defaultOptions` | Default generation options (temperature, maxTokens, etc.) |
+| `formatter` | Message formatter (default `DashScopeChatFormatter`) |
+
+### Endpoint Type (endpointType)
+
+DashScope models support both text and multimodal API endpoints. By default, the framework automatically detects the appropriate endpoint type based on the model name (e.g., `qwen-vl-*` and `qwen3.5` series automatically use the multimodal endpoint).
+
+When auto-detection is inaccurate (e.g., using custom model names or compatible APIs), you can manually specify the endpoint type:
+
+```java
+// Force multimodal API (suitable for scenarios with images, audio, etc.)
+DashScopeChatModel model = DashScopeChatModel.builder()
+        .apiKey(System.getenv("DASHSCOPE_API_KEY"))
+        .modelName("custom-model")
+        .endpointType(EndpointType.MULTIMODAL)
+        .build();
+
+// Force text API
+DashScopeChatModel model = DashScopeChatModel.builder()
+        .apiKey(System.getenv("DASHSCOPE_API_KEY"))
+        .modelName("custom-model")
+        .endpointType(EndpointType.TEXT)
+        .build();
+```

 ### Thinking Mode

@@ -105,6 +130,7 @@ OpenAIChatModel model = OpenAIChatModel.builder()
 | `modelName` | Model name, e.g., `gpt-4o`, `gpt-4o-mini` |
 | `baseUrl` | Custom API endpoint (optional) |
 | `stream` | Enable streaming, default `true` |
+| `generateOptions` | Default generation options (note: OpenAI uses `.generateOptions()` instead of `.defaultOptions()`) |

 ## Anthropic

@@ -275,7 +301,7 @@ GenerateOptions options = GenerateOptions.builder()
     .topK(40) // Top-K sampling
     .maxTokens(2000) // Maximum output tokens
     .seed(42L) // Random seed
-    .toolChoice(new ToolChoice.auto()) // Tool choice strategy
+    .toolChoice(new ToolChoice.Auto()) // Tool choice strategy
     .build();

 DashScopeChatModel model = DashScopeChatModel.builder()
@@ -299,17 +325,21 @@ OllamaChatModel model = OllamaChatModel.builder()
 | `topP` | Double | Nucleus sampling threshold, 0.0-1.0 |
 | `topK` | Integer | Limits candidate tokens |
 | `maxTokens` | Integer | Maximum tokens to generate |
+| `maxCompletionTokens` | Integer | Maximum completion tokens |
 | `thinkingBudget` | Integer | Token budget for thinking |
+| `reasoningEffort` | String | Reasoning effort level (e.g., `low`, `medium`, `high`) |
+| `frequencyPenalty` | Double | Frequency penalty, -2.0-2.0 |
+| `presencePenalty` | Double | Presence penalty, -2.0-2.0 |
 | `seed` | Long | Random seed |
 | `toolChoice` | ToolChoice | Tool choice strategy |

 ### Tool Choice Strategy

 ```java
-ToolChoice.auto() // Model decides (default)
-ToolChoice.none() // Disable tool calling
-ToolChoice.required() // Force tool calling
-ToolChoice.specific("tool_name") // Force specific tool
+new ToolChoice.Auto() // Model decides (default)
+new ToolChoice.None() // Disable tool calling
+new ToolChoice.Required() // Force tool calling
+new ToolChoice.Specific("tool_name") // Force specific tool
 ```

 ### Additional Parameters
diff --git a/docs/en/task/online-training.md b/docs/en/task/online-training.md
index 1294f7700..0a171050f 100644
--- a/docs/en/task/online-training.md
+++ b/docs/en/task/online-training.md
@@ -258,7 +258,7 @@ ReActAgent agent = ReActAgent.builder()
     .build();

 // User requests are processed normally (using GPT-4), 10% automatically sampled for training
-Msg response = agent.call(Msg.userMsg("Search for Python tutorials")).block();
+Msg response = agent.call(Msg.builder().textContent("Search for Python tutorials").build()).block();

 // 3. Stop when training is complete
 runner.stop();
diff --git a/docs/en/task/plan.md b/docs/en/task/plan.md
index 70d22c4e6..7eb934e22 100644
--- a/docs/en/task/plan.md
+++ b/docs/en/task/plan.md
@@ -86,6 +86,30 @@ See `agentscope-examples/quickstart/src/main/java/io/agentscope/examples/quickst

 ## Configuration Options

+### User Confirmation (needUserConfirm)
+
+Controls whether the agent needs to wait for user confirmation before starting execution after creating a plan.
+
+**Default Value**: `true` (user confirmation required)
+
+When enabled, the agent will display the plan content and ask the user whether to proceed (e.g., "Should I proceed with this plan?") after creating a plan. It will only start executing subtasks after the user explicitly confirms (e.g., replying "yes", "go ahead"). If the user's message already implies execution intent (e.g., "execute the plan"), confirmation is skipped and execution begins directly.
+
+When disabled, the agent will immediately start executing after creating the plan, without waiting for user confirmation.
+
+```java
+// Require user confirmation (default behavior)
+PlanNotebook planNotebook = PlanNotebook.builder()
+    .needUserConfirm(true)
+    .build();
+
+// No confirmation needed, execute immediately after creating plan
+PlanNotebook planNotebook = PlanNotebook.builder()
+    .needUserConfirm(false)
+    .build();
+```
+
+> **Note**: When subtasks are already in execution (status is `in_progress`), confirmation rule hints will not be injected regardless of the `needUserConfirm` setting.
+
 ### Limit Subtask Count

 ```java
diff --git a/docs/en/task/rag.md b/docs/en/task/rag.md
index 503960953..e8286ece3 100644
--- a/docs/en/task/rag.md
+++ b/docs/en/task/rag.md
@@ -48,7 +48,6 @@ ReActAgent agent = ReActAgent.builder()
         .limit(3)
         .scoreThreshold(0.3)
         .build())
-    .enableOnlyForUserQueries(true) // Retrieve only for user messages
     .build();
 ```

diff --git a/docs/en/task/studio.md b/docs/en/task/studio.md
index 09664b541..e4127893c 100644
--- a/docs/en/task/studio.md
+++ b/docs/en/task/studio.md
@@ -29,7 +29,7 @@ Install via npm
 npm install -g @agentscope/studio # or npm install @agentscope/studio
 as_studio
 ```
-Studio will run at http://localhost:5173
+Studio will run at http://localhost:5173 (frontend dev server)

 ![Studio Server Page](../../imgs/studioServer.png)

diff --git a/docs/en/task/tool.md b/docs/en/task/tool.md
index 47ae89305..1f17746c1 100644
--- a/docs/en/task/tool.md
+++ b/docs/en/task/tool.md
@@ -72,7 +72,7 @@ public Mono search(

 ### Streaming Tools

-Use `ToolEmitter` to send intermediate progress, suitable for long-running tasks:
+Use `ToolEmitter` to send intermediate progress, suitable for long-running tasks (progress is only visible to Hooks, not sent to LLM):

 ```java
 @Tool(description = "Generate data")
@@ -200,6 +200,7 @@ toolkit.registerTool(new WriteFileTool("/safe/workspace"));
 | Tool | Method | Description |
 |------|--------|-------------|
 | `ReadFileTool` | `view_text_file` | View files by line range |
+| `ReadFileTool` | `list_directory` | List files and folders in a directory |
 | `WriteFileTool` | `write_text_file` | Create/overwrite/replace file content |
 | `WriteFileTool` | `insert_text_file` | Insert content at specified line |

@@ -230,8 +231,8 @@ toolkit.registerTool(new OpenAIMultiModalTool(System.getenv("OPENAI_API_KEY")));

 | Tool | Capabilities |
 |------|--------------|
-| `DashScopeMultiModalTool` | Text-to-image, image-to-text, text-to-speech, speech-to-text |
-| `OpenAIMultiModalTool` | Text-to-image, image editing, image variations, image-to-text, text-to-speech, speech-to-text |
+| `DashScopeMultiModalTool` | Text-to-image, image-to-text, text-to-speech, speech-to-text, text-to-video, image-to-video, first-last-frame-to-video, video understanding |
+| `OpenAIMultiModalTool` | Text-to-image, image-to-text, text-to-speech, speech-to-text |

 ### Sub-agent Tools

@@ -282,7 +283,7 @@ Toolkit toolkit = new Toolkit(ToolkitConfig.builder()

 | Option | Description | Default |
 |--------|-------------|---------|
-| `parallel` | Whether to execute multiple tools in parallel | `true` |
+| `parallel` | Whether to execute multiple tools in parallel | `false` |
 | `allowToolDeletion` | Whether to allow tool deletion | `true` |
 | `executionConfig.timeout` | Tool execution timeout | 5 minutes |

@@ -292,7 +293,7 @@ Allow agents to autonomously manage tool groups:

 ```java
 toolkit.registerMetaTool();
-// Agent can call "reset_equipped_tools" to activate/deactivate tool groups
+// Agent can call "reset_equipped_tools" to activate (reset to specified set) tool groups
 ```

 When there are many tool groups, agents can autonomously choose which groups to activate based on task requirements.
@@ -327,14 +328,16 @@ if (response.getGenerateReason() == GenerateReason.TOOL_SUSPENDED) {
     List pendingTools = response.getContentBlocks(ToolUseBlock.class);

     // After external execution, provide result
-    Msg toolResult = Msg.builder()
-        .role(MsgRole.TOOL)
-        .content(ToolResultBlock.of(toolUse.getId(), toolUse.getName(),
-            TextBlock.builder().text("External execution result").build()))
-        .build();
-
-    // Resume execution
-    response = agent.call(toolResult).block();
+    for (ToolUseBlock toolUse : pendingTools) {
+        Msg toolResult = Msg.builder()
+            .role(MsgRole.TOOL)
+            .content(ToolResultBlock.of(toolUse.getId(), toolUse.getName(),
+                TextBlock.builder().text("External execution result").build()))
+            .build();
+
+        // Resume execution
+        response = agent.call(toolResult).block();
+    }
 }
 ```

diff --git a/docs/en/task/tts.md b/docs/en/task/tts.md
index 8e174248b..e49a02e0c 100644
--- a/docs/en/task/tts.md
+++ b/docs/en/task/tts.md
@@ -61,7 +61,7 @@ ReActAgent agent = ReActAgent.builder()
     .build();

 // 4. Chat with Agent - Agent will speak while generating response
-Msg response = agent.call(Msg.user("你好,今天天气怎么样?")).block();
+Msg response = agent.call(Msg.builder().textContent("你好,今天天气怎么样?").build()).block();
 ```

 ### Server Mode (Web/SSE)
@@ -150,15 +150,18 @@ Agent calls TTS via tool, Agent decides when to convert text to speech:
 DashScopeMultiModalTool multiModalTool = new DashScopeMultiModalTool(apiKey);

 // 2. Create Agent, register tool
+Toolkit toolkit = new Toolkit();
+toolkit.registerTool(multiModalTool);
+
 ReActAgent agent = ReActAgent.builder()
     .name("MultiModalAssistant")
     .sysPrompt("你是一个多模态助手。当用户要求朗读时,使用 dashscope_text_to_audio 工具。")
     .model(chatModel)
-    .tools(multiModalTool)
+    .toolkit(toolkit)
     .build();

 // 3. Agent can actively call TTS tool
-Msg response = agent.call(Msg.user("请用语音说一句'欢迎光临'")).block();
+Msg response = agent.call(Msg.builder().textContent("请用语音说一句'欢迎光临'").build()).block();
 ```

 ---