This example demonstrates an interactive plan→review→execute workflow using the ECS-based LLM Agent framework. It features a robust state machine, review-gated planning, artifact persistence, recovery semantics, and framework-native auto compaction for both the main agent and spawned subagents.
The workflow follows a structured lifecycle:
- **Draft Interview**: The agent interviews the user to build a draft plan (`DRAFT_INTERVIEW`).
- **Draft Reviews**: The draft must be approved by both an Advisor (`DRAFT_ADVISOR_REVIEW`) and a QA subagent (`DRAFT_QA_REVIEW`).
- **Write Plan**: Once QA approves the draft, the system automatically transitions to `WRITE_PLAN` and triggers a dedicated `plan_writer` subagent (equipped with the `writing-plans` skill) to convert the approved draft into a structured `workflow_plan.md`. No manual command is needed.
- **Plan QA Review**: When the plan writer finishes, the system automatically transitions to `PLAN_QA_REVIEW`, where QA reviews the final plan document.
- **Execution**: Once finalized, the plan is decomposed into a task queue and executed.
- **Built-in Tools** — The main agent has `read_file`, `write_file`, `edit_file`, `bash`, and `glob` tools pre-installed via `BuiltinToolsSkill`, workspace-bound to the example directory. `edit_file` uses a hash-anchored interface: supply `op`, `pos`, optional `end`, and `content`, with `pos`/`end` in `"N#HASH"` format obtained from a prior `read_file` call. The main agent has unrestricted access to all these tools.
- **Subagent Tool Permissions** — The `advisor`, `qa`, and `plan_qa` review subagents inherit only `read_file` and `glob` (read-only) via `InheritancePolicy(inherit_tools=["read_file", "glob"], inherit_permissions=True)`. They cannot write files or run shell commands. `plan_writer` inherits `read_file`, `write_file`, `edit_file`, and `glob`, since it must produce the final plan document.
- **Two Distinct QA Subagents** — Draft QA (`qa`) and Plan QA (`plan_qa`) are registered as separate subagents with separate system prompts and separate state machine transition paths: `qa` uses `QA_SYSTEM_PROMPT` (draft review lens) and routes `DelegationCompletedEvent` → `controller.handle_qa_review()` → a `DRAFT_QA_REVIEW` verdict; `plan_qa` uses `PLAN_QA_REVIEW_SYSTEM_PROMPT` (final plan review lens) and routes `DelegationCompletedEvent` → `controller.handle_plan_qa_review()` → a `PLAN_QA_REVIEW` verdict. The planner system prompt calls `subagent(category="qa", ...)` for draft review and `subagent(category="plan_qa", ...)` for plan review.
- **Review Prompts via File Path** — When invoking the `advisor`, `qa`, `plan_qa`, or `plan_writer` subagents for review, the prompt passes the artifact file path (e.g. `scratchbook/<workflow_id>/plan/draft.md`) rather than embedding the file content inline. The subagent reads the file itself using `read_file`, avoiding prompt token bloat.
- **Auto Compaction** — `build_plan_task_world(...)` installs `CompactionConfigComponent(threshold_tokens=300_000, compaction_method="predrop_then_compact")` by default, plus `ConversationArchiveComponent` and `CompactionSystem` at priority `-30`, so compaction runs before workflow prompt rendering and before reasoning. `SystemPromptRenderSystem` then injects the current summary into the effective system prompt as `<chat_history_summary>...</chat_history_summary>` XML.
- **Subagent Compaction Inheritance** — Child worlds created by `SubagentSystem` inherit the parent `CompactionConfigComponent`, receive their own `ConversationArchiveComponent`, and register `CompactionSystem` at the same priority. Long-running review and task subagents therefore compact independently without requiring plan-and-task-specific special cases.
- **Workflow Reset Safety** — `/plan:start`, `/plan:resume`, and `/task:start <workflow_id>` clear stale `CurrentCompactionSummaryComponent` state, reset archived summaries, and invalidate `RenderedSystemPromptComponent` before restoring or switching workflow state. This prevents an old summary from leaking into a newly loaded workflow phase.
- **Log Truncation** — The structured log fields `last_user_prompt` and user-normalization `prompt_text` are truncated to 200 characters to keep logs readable without losing signal. System-prompt render logs still report `prompt_length`, but the rendered prompt text itself is not truncated in this example.
SystemPromptRenderSystem,UserPromptNormalizationSystem,ReasoningSystem, andToolExecutionSystem. - Prompt Configuration: The planner entity declares
SystemPromptConfigSpecwithDRAFT_INTERVIEW_SYSTEM_PROMPT, andSystemPromptRenderSystembridges the rendered value intoLLMComponent.system_promptbefore reasoning. - Workflow DSL: Uses
install_workflowandWorkflowStateSystem(priority -25) to manage the phase graph and automatic prompt-profile selection via${_workflow_state_prompt}. - State Machine: Explicit phase transitions managed by
WorkflowStateMachine. - Artifacts: Durable persistence of plans, state, and execution evidence via
PlanTaskScratchbookAdapter. Main-agent tool results are currently kept inline in ECS conversation/tool-result state rather than being written throughToolResultsSink. - Controller:
PlanControllermanages the high-level workflow logic and review gates. - Subagent Reviews: Advisor, QA, and Plan QA review steps are wired as ECS subagents via
SubagentRegistryComponent. The planner invokes them withsubagent(category="advisor", ...),subagent(category="qa", ...), andsubagent(category="plan_qa", ...)respectively. Verdicts are automatically extracted from subagent results viaDelegationCompletedEventsubscription, routed to the correct controller method based on the subagent name. - Plan Writer Subagent: The
WRITE_PLANphase is executed by a dedicatedplan_writersubagent registered inSubagentRegistryComponent. It is pre-loaded with thewriting-plansskill (discovered from.claude/skills/writing_plans/SKILL.md) and inheritsread_file,write_file,edit_file, andglobtools. When it completes,handle_write_plan_completed()transitions the state toPLAN_QA_REVIEW. - Task Execution:
TaskExechandles plan loading, dependency resolution, and subagent dispatch. - Slash Commands: Dispatched via ECS
TriggerSpecscript handlers onUserPromptConfigComponent. Commands appear as transformed messages in conversation history. - System Execution Order:
UserInputSystemruns at priority -15 (beforeUserPromptNormalizationSystemat -10). This ensures the user's message is already inConversationComponentwhen script handlers fire, so slash commands like/task:startare matched in the same tick they are entered.
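The priority ordering above can be visualized with a tiny sketch. This is not framework code, just the scheduling rule implied by the numbers in this document: systems run each tick in ascending priority order, so lower values fire first.

```python
# Illustrative only: system priorities quoted in this README, sorted the
# way an ECS scheduler would run them (ascending priority).

systems = [
    ("ToolExecutionSystem", 0),              # priority assumed for illustration
    ("UserPromptNormalizationSystem", -10),
    ("UserInputSystem", -15),
    ("WorkflowStateSystem", -25),
    ("CompactionSystem", -30),
]

tick_order = [name for name, priority in sorted(systems, key=lambda s: s[1])]
print(tick_order)
# ['CompactionSystem', 'WorkflowStateSystem', 'UserInputSystem',
#  'UserPromptNormalizationSystem', 'ToolExecutionSystem']
```

This makes the guarantees in the bullets concrete: compaction (`-30`) runs before workflow prompt rendering, and `UserInputSystem` (`-15`) runs before normalization (`-10`), so slash commands are visible in `ConversationComponent` within the same tick.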
The interactive runtime supports eleven slash commands:
- `/plan:start <description>`: Initialize a new workflow with a draft description.
- `/plan:resume <workflow_id>`: Restore a previously started workflow from disk by its workflow ID (e.g. `creative-writing-assistant-with-llm-workflow`). Marks any in-flight subagents as stale and resumes from the persisted phase.
- `/plan:status`: Show the current workflow phase, status, and review verdicts.
- `/plan:finalize`: Finalize the plan and transition to task execution (requires all three approved reviews).
- `/plan:write`: Transition from `DRAFT_QA_REVIEW` to the `WRITE_PLAN` phase to produce `workflow_plan.md`. Optional — this transition now happens automatically when QA approves the draft, but it can still be invoked manually.
- `/plan:qa_review <approved|revise|blocked> [notes]`: Record a QA verdict on the final plan document.
- `/task:start <workflow_id>`: Start execution of a specific task. If no workflow is active in the current session, providing a `workflow_id` auto-loads the persisted state from the scratchbook (equivalent to `/plan:resume <workflow_id>` followed by starting task execution). Accepts the phases `PLAN_FINALIZED`, `TASK_READY`, `TASK_RUNNING`, and `TASK_BLOCKED`.
- `/task:status`: Show the status of the current task and subagent sessions.
- `/task:resume`: Resume a blocked or replanned task.
- `/task:replan <reason>`: Request a replan for the current task.
- `/task:abort`: Abort the current task and transition to a terminal state.
All workflow data is persisted in `scratchbook/<workflow_id>/`:

- `plan/`: Contains `draft.md` (the working draft, included as the `draft_plan` artifact) and `workflow_plan.md` (the single living plan file, edited in place).
- `state/`: Contains `runtime_state.json`, `events.jsonl`, and `task_queue.json`.
- `memory/`: Contains `knowledge.jsonl` for cross-task context.
- `evidence/`: Directory for task execution artifacts.
- `review/`: Contains JSON verdicts from Advisor and QA reviews.
Main-agent tool call results are not currently persisted as separate canonical records by this example. They remain inline in ECS tool-result state and conversation tool messages, while durable workflow artifacts continue to live under `scratchbook/<workflow_id>/`.
Run the entry point to start an interactive session.
```shell
LLM_API_KEY=your-api-key uv run python examples/e2e/plan_and_task/main.py
```

The prompt supports multi-line messages. Press Enter to start a new line; submit with a blank line (press Enter on an empty line):
```text
You> /plan:start I want to build a writing-assistant app
...  that supports long-form novel and screenplay creation,
...  with multiple agents collaborating to generate each chapter.
...
     ↑ blank line submits
```
Single-line commands work as before — just type and press Enter, then Enter again on the empty continuation line:

```text
You> /plan:status
...
```
`exit` or `quit` typed as the first line (followed by Enter + a blank line) terminates the session. Ctrl+D (EOF) also exits cleanly.
Automate interactions by piping commands. In pipe mode, each `\n\n` (double newline) acts as a submit boundary:

```shell
printf '/plan:start Build demo\n\n/plan:status\n\nexit\n\n' | uv run python examples/e2e/plan_and_task/main.py
```

The workflow can be restarted at any time. On startup, no workflow ID is resolved and no scratchbook folder is created. Instead:
- Call `/plan:start <original description>` — the LLM re-derives the same slug from the same description (or uses `slug_from_description()` as a fallback).
- State is restored from `scratchbook/<workflow_id>/state/runtime_state.json`.
- Any in-flight subagents are marked `stale`, and the machine transitions to `TASK_BLOCKED` for safe resumption.
Note: Use the same description text (or the same slug) as the original `/plan:start` call so the derived workflow ID matches the existing scratchbook directory.
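A deterministic fallback slugger in the spirit of `slug_from_description()` might look like the sketch below. The real implementation may differ; the word cap and the `"workflow"` default are assumptions made for illustration.

```python
# Hypothetical fallback slugger: lowercase words, joined by hyphens,
# capped at a few words so the scratchbook directory name stays short.
import re

def slug_from_description(description: str, max_words: int = 6) -> str:
    words = re.findall(r"[a-z0-9]+", description.lower())
    return "-".join(words[:max_words]) or "workflow"  # default is an assumption

print(slug_from_description("Build a Creative Writing Assistant!"))
# build-a-creative-writing-assistant
```

Because the output is a pure function of the description, re-running `/plan:start` with the same text lands in the same `scratchbook/<workflow_id>/` directory even when the LLM-based derivation is unavailable.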
When `/plan:resume <workflow_id>` is called, the system reads the persisted state and automatically reconciles any in-progress phases so the workflow can continue without manual intervention:
| Resumed phase | Condition | Automatic action |
|---|---|---|
| `DRAFT_QA_REVIEW` | `review_verdicts` contains an `approved` verdict for this phase | Transitions to `WRITE_PLAN` and injects a write-plan trigger message to start the `plan_writer` subagent |
| `WRITE_PLAN` | (any — `plan_writer` was mid-flight) | Injects a write-plan trigger message to restart the `plan_writer` subagent |
| `PLAN_QA_REVIEW` | `review_verdicts` contains an `approved` verdict for this phase | Transitions to `PLAN_FINALIZED` |
| All other phases | — | No automatic action; resumes normally |
This means that after a process restart you can call `/plan:resume <workflow_id>` and, if QA had already approved the draft before the restart, the `plan_writer` will be triggered automatically — no need to manually issue `/plan:write`.
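The reconciliation table can be restated as plain Python. The phase names match the table; the `reconcile` helper and its action strings are illustrative, not the controller's actual API:

```python
# The resume-reconciliation rules from the table above, sketched as a
# pure function. Returns the list of automatic actions to perform;
# an empty list means "resume normally".

def reconcile(phase: str, approved_phases: set[str]) -> list[str]:
    if phase == "DRAFT_QA_REVIEW" and "DRAFT_QA_REVIEW" in approved_phases:
        return ["transition:WRITE_PLAN", "inject:write_plan_trigger"]
    if phase == "WRITE_PLAN":
        # plan_writer was mid-flight when the process died; restart it.
        return ["inject:write_plan_trigger"]
    if phase == "PLAN_QA_REVIEW" and "PLAN_QA_REVIEW" in approved_phases:
        return ["transition:PLAN_FINALIZED"]
    return []

print(reconcile("WRITE_PLAN", set()))
# ['inject:write_plan_trigger']
```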
Run the integration suite to verify command parsing, state machine logic, artifact persistence, and credential-gated CLI coverage:
```shell
uv run pytest tests/integration/test_plan_and_task_flow.py -v
```

Targeted runs:

- `uv run pytest tests/integration/test_plan_and_task_flow.py -k "subagent"` — verifies subagent component wiring
- `uv run pytest tests/integration/test_plan_and_task_flow.py -k "compaction or stale_compaction_state" -v` — verifies main-agent auto compaction, subagent inheritance, and stale-summary reset on workflow switch/resume
- `uv run pytest tests/integration/test_plan_and_task_flow.py -k "commands"`
- `uv run pytest tests/integration/test_plan_and_task_flow.py -k "artifacts"`

Requires `LLM_API_KEY`. Verifies the controller and task execution with a real model:
```shell
LLM_API_KEY="$LLM_API_KEY" \
LLM_BASE_URL=https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1 \
LLM_MODEL=qwen3.5-flash \
uv run pytest tests/live/test_plan_and_task_flow_live.py -v
```

To exercise the new compaction path against an Anthropic-compatible endpoint:
```shell
LLM_API_FORMAT=anthropic_messages \
LLM_BASE_URL=https://cc2.caaa.tech \
LLM_API_KEY="$LLM_API_KEY" \
LLM_MODEL=glm-5.1 \
uv run pytest tests/live/test_plan_and_task_flow_live.py::test_anthropic_plan_task_auto_compaction_summarizes_context -v
```

Environment variables:

- `LLM_API_KEY`: API key for the chosen provider.
- `LLM_BASE_URL`: API base URL (defaults to the DashScope Responses API: `https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1`).
- `LLM_MODEL`: Model ID (defaults to `qwen3.5-flash`).
- `LLM_API_FORMAT`: Provider/format selector. Accepted values:
  - `openai_responses` (default) — OpenAI Responses API via `OpenAIModel` (enables `enable_store=True` for prefix caching)
  - `openai_chat_completions` — OpenAI Chat Completions API via `OpenAIModel`
  - `anthropic_messages` — Anthropic Messages API via `ClaudeModel` (also works with Kimi-compatible Anthropic endpoints)
- `PLAN_TASK_LANGFUSE`: Set to `1`, `true`, `yes`, or `on` to install Langfuse observability on the plan-and-task `World` before `Runner.run()` starts.
- `PLAN_TASK_LANGFUSE_ENVIRONMENT`: Optional Langfuse environment label. Defaults to `plan-and-task`.
- `PLAN_TASK_LANGFUSE_RELEASE`: Optional release label sent with plan-and-task traces.
- `PLAN_TASK_LANGFUSE_SESSION_ID`: Optional session ID for grouping plan-and-task traces. The Langfuse SDK v4 adapter sends this as a trace-level session attribute via `propagate_attributes(...)`; metadata-only session IDs do not power the Langfuse Sessions UI.
- `LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECRET_KEY`, and `LANGFUSE_HOST` or `LANGFUSE_BASE_URL`: Langfuse connection settings used when `PLAN_TASK_LANGFUSE` is enabled.
- `PLAN_TASK_INTERACTIVE`: Set to `0` to disable interactive stdin.
- `DEBUG`: Set to `1` to make this example call `configure_logging()` with debug logging. All `plan_task_*` structured log events will then appear on stderr via structlog.
Install the optional extra before enabling Langfuse for this example:

```shell
uv pip install -e ".[langfuse]"
```

When `PLAN_TASK_LANGFUSE` is enabled, `main.py` calls `install_plan_task_langfuse_observability()` after `build_plan_task_world(...)` creates the `World` and before `Runner.run(...)` starts.

In interactive mode, every `UserInputReceivedEvent` starts a `user.turn` trace that covers the complete chain from that user input until the next user input or process exit: prompt normalization, retrieval/compaction, LLM generations, tool calls, subagent spans, retries, errors, context pressure, and completion scores all stay inside that turn trace. One-shot runs without interactive input keep the runner trace for backward compatibility.

Completed LLM, tool, and subagent observations export their ECS-recorded end timestamps through the Langfuse SDK v4 public lifecycle; preserving historical start timestamps requires explicitly validating your SDK version and enabling `LangfuseConfig(enable_private_v4_historical_otel=True)`, because that path uses private SDK hooks. Raw prompts, tool arguments, and outputs are captured by default for backward compatibility; use `LangfuseConfig(capture_input=False, capture_output=False)` if raw content should not leave the process.
Subagents are exported as `subagent.<name>` spans inside the active `user.turn` trace. Their child-world LLM calls are exported as generation observations under that subagent span, and child-world tool/retrieval/API work is exported as child spans/events under the same turn trace rather than creating another top-level Langfuse trace. When a child-world generation requests a tool, that tool observation stays attached to the requesting generation, so the Langfuse hierarchy shows the exact delegation chain.
Use environment variables or a secret manager for `LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECRET_KEY`, and the Langfuse host value. Do not put concrete keys in scripts, docs, or command history. On exit, the CLI calls `flush()` and `shutdown()` on the observability handle so buffered trace events are sent before the process terminates.
Anthropic-compatible Langfuse smoke run:
```shell
PLAN_TASK_LANGFUSE=1 \
PLAN_TASK_LANGFUSE_ENVIRONMENT=dev \
PLAN_TASK_LANGFUSE_RELEASE=local-test \
PLAN_TASK_LANGFUSE_SESSION_ID="plan-task-dev-1" \
LLM_API_FORMAT=anthropic_messages \
LLM_MODEL=deepseek-v4-flash \
uv run python examples/e2e/plan_and_task/main.py
```

Set `LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECRET_KEY`, `LANGFUSE_BASE_URL`, `LLM_BASE_URL`, and `LLM_API_KEY` in your shell or secret manager before running the command.
```shell
DEBUG=1 \
LLM_API_FORMAT=anthropic_messages \
LLM_BASE_URL=https://api.anthropic.com \
LLM_API_KEY=sk-... \
LLM_MODEL=kimi-for-coding \
uv run python examples/e2e/plan_and_task/main.py
```

`ClaudeModel` appends `/v1/messages` to `LLM_BASE_URL`, so the actual endpoint called is `https://api.anthropic.com/v1/messages`. Anthropic cache stats (`cache_read_input_tokens`, `cache_creation_input_tokens`) are normalized and surfaced as `plan_task_llm_cache_stats` events with a `cache_hit_rate`.
This example explicitly enables logging in `main.py` when `DEBUG=1`; the base `ecs-agent` library remains silent until `configure_logging()` is called.
| Event | Level | File | Description |
|---|---|---|---|
| `plan_task_workflow_id_derived` | info | `runtime.py` | Workflow ID derived from LLM or fallback; `method=` (`llm` or `fallback`), `slug=` |
| `plan_task_workflow_id_llm_failed` | warning | `runtime.py` | LLM slug derivation failed; `exception=` |
| `plan_task_draft_written` | info | `scratchbook_adapter.py` | Draft written to disk; `path=` |
| `plan_task_state_loaded` | debug | `scratchbook_adapter.py` | Runtime state read from disk; `phase=` |
| `plan_task_event_appended` | debug | `scratchbook_adapter.py` | Event appended to `events.jsonl`; `event_type=` |
| `plan_task_memory_appended` | debug | `scratchbook_adapter.py` | Memory entry appended; `task_id=` |
| `plan_task_subagents_marked_stale` | info | `scratchbook_adapter.py` | In-flight subagents staled on restart; `stale_count=`, `task_ids=` |
| `plan_task_task_queue_initialized` | info | `task_exec.py` | Task queue built and state updated; `task_count=`, `current_task_id=`, `phase=` |
| `plan_task_subagent_dispatched` | info | `task_exec.py` | Subagent session recorded for a task; `task_id=`, `session_id=` |
| `plan_task_task_completed` | info | `task_exec.py` | Task completed; `next_task_id=`, `workflow_done=` |
| `plan_task_circuit_breaker_triggered` | warning | `task_exec.py` | Task retry budget exhausted; `retry_count=`, `max_retries=` |
| `plan_task_dependency_cycle_detected` | warning | `task_exec.py` | Cyclic dependency found before raise; `cycle_ids=` |
| `plan_task_reviews_not_approved` | warning | `task_exec.py` | Task start blocked by missing reviews; `missing_phases=` |
| `plan_task_finalize_blocked` | warning | `controller.py` | Plan finalization blocked by missing verdicts; `missing_phases=` |
| `plan_task_plan_artifact_missing` | warning | `controller.py` | Plan artifact file not found before raise; `path=` |
| `plan_task_plan_not_finalized` | warning | `plan_schema.py` | Plan status is not finalized before raise; `status=` |
| `plan_task_command_plan_start` | info | `main.py` | `/plan:start` succeeded; `workflow_id=`, `description_len=` |
| `plan_task_command_plan_resume` | info | `main.py` | `/plan:resume` succeeded; `workflow_id=`, `phase=` |
| `plan_task_command_plan_finalize` | info | `main.py` | `/plan:finalize` succeeded; `workflow_id=` |
| `plan_task_system_prompt_switched` | info | `main.py` | System prompt replaced from `PLAN_MAIN_AGENT` to `TASK_MAIN_AGENT`; `entity_id=`, `from_prompt=`, `to_prompt=` |
| `plan_task_command_task_start` | info | `main.py` | `/task:start` succeeded; `task_count=`, `current_task_id=` |
| `plan_task_task_start_auto_loaded_state` | info | `main.py` | `/task:start <workflow_id>` auto-loaded persisted state; `workflow_id=`, `phase=` |
| `plan_task_command_task_resume` | info | `main.py` | `/task:resume` succeeded; `workflow_id=` |
| `plan_task_command_task_replan` | info | `main.py` | `/task:replan` succeeded; `task_id=` |
| `plan_task_command_task_abort` | info | `main.py` | `/task:abort` succeeded; `task_id=` |
| `plan_task_command_plan_status` | debug | `main.py` | `/plan:status` invoked |
| `plan_task_command_task_status` | debug | `main.py` | `/task:status` invoked; `phase=` |
| `plan_task_command_error` | warning | `main.py` | A slash command raised `ValueError`; `command=`, `exception=` |
| `plan_task_llm_usage` | info | `billing.py` | Per-invocation token counts; `prompt_tokens=`, `completion_tokens=`, `total_tokens=`, `cached_input_tokens=`, `cache_creation_tokens=`, `cache_read_tokens=` |
| `plan_task_llm_cache_stats` | info | `billing.py` | Per-invocation cache hit-rate; `cache_read_tokens=`, `total_prompt_tokens=`, `cache_hit_rate=`. Emitted when using `ApiFormat.OPENAI_RESPONSES` with DashScope (returns `input_tokens_details.cached_tokens`), OpenAI (returns `prompt_tokens_details.cached_tokens`), or Anthropic (returns `cache_read_input_tokens`). Not emitted with the DashScope Chat Completions API, which does not expose cached token counts. |
| `plan_task_session_billing_summary` | info | `billing.py` | Cumulative token totals at end of session; `invocation_count=`, `total_prompt_tokens=`, `total_completion_tokens=`, `total_tokens=`, `total_cached_input_tokens=` |
| `accounting_invocation_recorded` | info | `ecs_agent.accounting` | Per-invocation cost and cache hit-rate from `AccountingSubscriber`; `total_cost=`, `cache_hit_rate=` |
| `plan_task_auto_transition_write_plan` | info | `controller.py` | QA review approved — auto-transitioned from `DRAFT_QA_REVIEW` to `WRITE_PLAN`; `workflow_id=` |
| `plan_task_auto_transition_plan_finalized` | info | `controller.py` | Plan QA review approved — auto-transitioned from `PLAN_QA_REVIEW` to `PLAN_FINALIZED`; `workflow_id=` |
| `plan_task_auto_trigger_plan_writer` | info | `main.py` | QA approved — injected write-plan trigger message to start the `plan_writer` subagent automatically; `workflow_id=`, `source=` (omitted when triggered by a live QA event; `source=reconcile_after_resume` when triggered on `/plan:resume`) |
- **Testable World Factory**: `build_plan_task_world(model, base_dir=None, *, compaction_threshold_tokens=..., compaction_method=...)` is a public function that returns `(world, agent_id, adapter_ref, runtime_state)`, enabling direct world setup in tests without running the CLI. `adapter_ref` is a `list[ArtifactAdapter | None]` — it starts as `[None]` and is populated in place by the `/plan:start` handler after the workflow ID is derived.
- **Framework-Native Auto Compaction**: `build_plan_task_world(...)` accepts `compaction_threshold_tokens` and `compaction_method`, installs `CompactionConfigComponent`, initializes `ConversationArchiveComponent`, and registers `CompactionSystem()` at priority `-30`. The example reuses the shared framework compaction pipeline rather than maintaining a bespoke plan-and-task summarizer.
- **workflow_id Auto-Derivation**: `/plan:start <description>` calls `derive_workflow_id_from_llm()` to ask the LLM to generate a short, meaningful English slug from the description (e.g. `"writing-assistant-multi-agent"`). Falls back to `slug_from_description()` on provider error or invalid output. The derived ID controls the scratchbook directory for all subsequent operations in that session.
- **Progressive Draft Editing**: The planning interview fills `draft.md` one section at a time using `read_file` (to get `LINE#HASH`-annotated content) plus hash-anchored `edit_file(op=..., pos=..., end=..., content=...)` calls. The LLM reads the file first to capture `N#HASH` references, then replaces exactly the placeholder line or range. Full-file rewrites via `write_file` are explicitly prohibited by the system prompt.
- **Atomic Writes**: All artifact updates use atomic file operations to prevent corruption.
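The atomic-write note above follows a standard pattern: write to a temporary file in the same directory, fsync, then rename over the target. This sketch uses only the standard library and is one common way to implement it; it is not necessarily the adapter's exact code.

```python
# Standard atomic-write pattern: temp file in the same directory, then
# os.replace (atomic on POSIX and Windows). Readers see either the old
# or the new file, never a half-written one.
import os
import tempfile

def atomic_write_text(path: str, text: str) -> None:
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())   # bytes hit disk before the rename
        os.replace(tmp, path)      # atomic rename over the target
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)         # clean up the orphaned temp file
        raise

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "runtime_state.json")
    atomic_write_text(target, '{"phase": "WRITE_PLAN"}')
    print(open(target).read())  # {"phase": "WRITE_PLAN"}
```

Writing the temp file in the same directory as the target matters: `os.replace` is only atomic within a single filesystem.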
- **Circuit Breaker**: `TaskExec` implements a retry budget to prevent infinite loops on failing tasks.
- **Review Gating**: Finalization is strictly blocked until `DRAFT_ADVISOR_REVIEW`, `DRAFT_QA_REVIEW`, and `PLAN_QA_REVIEW` all have `approved` verdicts.
- **Advisor Retry Loop**: When the advisor returns `revise` or `blocked`, the system prompt instructs the planner LLM to apply the feedback to `draft.md` via `edit_file` and re-call the advisor. Only an `approved` advisor verdict unlocks the QA step. Non-approved verdicts for a phase replace any prior non-approved verdict (upsert semantics). Once a phase reaches `approved`, that verdict is sticky — any subsequent upsert attempt for the same phase is silently ignored. `approved` verdicts always have `notes=None`; non-approved verdicts retain their notes for debugging.
/task:startthe command message stays as the lastrole="user"entry in the conversation (tool results userole="tool"and do not replace it)._handle_task_starttherefore checks whetherSystemPromptConfigSpecalready containsTASK_MAIN_AGENT_SYSTEM_PROMPT; if so, it returnsNoneimmediately, letting the LLM continue task execution without re-initializing the task queue. - Compaction State Reset on Workflow Switch:
_reset_compaction_state(...)removesCurrentCompactionSummaryComponent, clearsConversationArchiveComponent.archived_summaries, and removesRenderedSystemPromptComponentbefore/plan:start,/plan:resume, or auto-loading state from/task:start <workflow_id>. This keeps prompt summaries aligned with the active workflow instead of a previous session. - Plan Template:
templates/workflow_plan_template.mdis an annotated reference showing the exactworkflow_plan.mdformat: YAML frontmatter,## Overviewwith### Dependency Graph,## Taskssection, per-task### Task: <task_id>+```yaml ```blocks, and an optional## AppendixAC cross-reference table. The format spec is also embedded verbatim intoWRITE_PLAN_SYSTEM_PROMPTandbuild_write_plan_prompt()as_WORKFLOW_PLAN_FORMATso theplan_writersubagent has an unambiguous reference without reading a file. - Dependency Resolution: Tasks are executed in topological order based on their
dependencieslist. - Review Verdict Lifecycle: Each phase holds at most one verdict in
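Topological ordering with cycle detection is a classic fit for Kahn's algorithm. The sketch below is an illustrative implementation (not `TaskExec` itself); the cycle check mirrors the `plan_task_dependency_cycle_detected` event, which fires before the raise:

```python
# Kahn's algorithm: execute tasks in dependency order, raising on cycles.
from collections import deque

def topo_order(deps: dict[str, list[str]]) -> list[str]:
    """deps maps task_id -> task_ids it depends on; every task is a key."""
    indegree = {t: len(ds) for t, ds in deps.items()}
    dependents: dict[str, list[str]] = {t: [] for t in deps}
    for task, ds in deps.items():
        for dep in ds:
            dependents[dep].append(task)   # edge dep -> task
    ready = deque(sorted(t for t, n in indegree.items() if n == 0))
    order: list[str] = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for nxt in dependents[t]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(deps):
        # the leftover tasks participate in (or depend on) a cycle
        raise ValueError(f"dependency cycle among: {sorted(set(deps) - set(order))}")
    return order

print(topo_order({"a": [], "b": ["a"], "c": ["a", "b"]}))
# ['a', 'b', 'c']
```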
review_verdicts. Upsert semantics: non-approved verdicts replace earlier non-approved verdicts for the same phase;approvedis terminal and cannot be overwritten.approvedverdicts always havenotes=None. Theplan_versionfield has been removed fromReviewVerdict. - Status Lifecycle: Whenever the state machine transitions to a new phase,
statusis set to"active". Terminal handlers (handle_task_abort→"aborted",handle_plan_finalize→"ready",handle_task_replanwith scope change →"needs_review") overridestatusafter the transition. - Token Prefix Caching: This example uses
ApiFormat.OPENAI_RESPONSESwithenable_store=True(default config). The DashScope Responses API endpoint (/api/v2/apps/protocols/compatible-mode/v1) returnsusage.input_tokens_details.cached_tokens, whichnormalize_openai_usage()maps tocached_input_tokens. On warm calls where the system prompt prefix is cached,cached_input_tokens > 0andplan_task_llm_cache_statswill be emitted with a non-zerocache_hit_rate. The DashScope Chat Completions API (/compatible-mode/v1) does not return cached token counts and does not support the Responses protocol — switching back to it would make cache observability unavailable.