- File:line — `chat_agent.py:401–416`
  - Description — The legacy `run_chat_agent(*, task=..., adapter=..., config=...)` form emits only a `DeprecationWarning` and continues. A caller that passes `adapter=None` along with the keyword-only arguments silently runs with `None` as the adapter, which causes a downstream `AttributeError` inside `_run_chat_agent_impl`.
  - Impact — Potential `AttributeError` at `adapter.complete()` with a confusing traceback.
  - Mitigation — Callers should use the positional form. The deprecation warning is not enforced.
- File:line — `chat_agent.py:272–286`
  - Description — When JSON parsing fails on all retries, the fallback checks whether the task "looks like a simple question". If the heuristics match, the raw LLM text is returned as a `FinalAnswer` without any tool-call validation or schema enforcement.
  - Impact — Could return nonsense or partial LLM output as the answer to a simple factual task. No audit event records the fallback decision.
  - Mitigation — `metadata={'decision_fallback': 'plain_text_final_answer'}` is set, but callers must check it.
- File:line — `chat_agent.py:596–604`
  - Description — `AgentRunner` is not a frozen dataclass, but the pattern `object.__setattr__(runner, '_plan_contract', ...)` bypasses normal attribute assignment. If `AgentRunner.__init__` already sets `_plan_contract`, this works; if `AgentRunner` changes its initialization to be truly immutable, this will break silently or raise `AttributeError`.
  - Impact — Fragile coupling between `chat_agent.py` and `AgentRunner` internals.
- File:line — `chat_agent.py:474–522`
  - Description — `load_skills_with_report()` returns skipped/warning items but does not raise on partial failures. Skipped skills are only recorded in the `skill_load` audit event. The caller has no way to detect that an expected skill was silently dropped.
  - Impact — Silent skill degradation that may change agent behavior without any user-visible error.
- File:line — `chat_agent.py:438–465`
  - Description — When a `registry` is passed in, `_registry_fresh = False` and none of the optional tools (code analysis, git, browser, MCP trust) are registered. This is the intended behavior for subagent calls, but callers who pass a pre-built registry and expect these tools to be registered will find them absent.
  - Impact — Subagent calls or custom-registry callers may silently lack tools.
- File:line — `chat_agent.py:666–686`
  - Description — The deduplication check (`catalog.list(limit=50)`) and the `catalog.add()` call are not atomic. Under concurrent runs sharing the same workspace root, duplicate auto-curated entries can be written.
  - Impact — Memory catalog grows with duplicate run summaries.
- File:line — `chat_agent.py:575–623`
  - Description — The `finally` block stops the heartbeat. If `object.__setattr__` (line 596) raises (e.g., `AttributeError`), the exception occurs before `runner.run()`, possibly after the heartbeat thread has already been started (line 580). The `finally` block still runs in this case, so this path is safe. However, if the heartbeat's `start()` itself raises, `heartbeat` remains non-`None` and `stop()` will be called on a potentially corrupt object.
  - Impact — Low probability; heartbeat thread leak.
- File:line — `chat_agent.py:523–527`
  - Description — A value of `-1` falls through the condition `if config.max_estimated_cost_cents >= 0` to `RunBudget().max_estimated_cost_cents`. This is documented behavior but is easy to misunderstand: `-1` does NOT mean "no limit"; it means "use the `RunBudget` default".
  - Impact — Unexpected cost caps if callers pass `-1` intending unlimited.
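
The `-1` fall-through in the last item can be sketched as follows. This is a minimal illustration, not the code in `chat_agent.py`: `RunBudget` and `Config` are stand-in dataclasses, `resolve_cost_cap` is a hypothetical helper name, and the default of 500 cents is an assumed placeholder because the real `RunBudget` default is not stated here. Only the `>= 0` condition is taken from the description above.

```python
from dataclasses import dataclass


@dataclass
class RunBudget:
    # Assumed placeholder value; the real RunBudget default is not given in this report.
    max_estimated_cost_cents: int = 500


@dataclass
class Config:
    # Stand-in for the real config object; only the relevant field is modelled.
    max_estimated_cost_cents: int = -1


def resolve_cost_cap(config: Config) -> int:
    """Hypothetical helper mirroring the condition described for lines 523-527."""
    if config.max_estimated_cost_cents >= 0:
        return config.max_estimated_cost_cents
    # Any negative value, including -1, falls back to the RunBudget default.
    return RunBudget().max_estimated_cost_cents


print(resolve_cost_cap(Config(max_estimated_cost_cents=-1)))    # stand-in default
print(resolve_cost_cap(Config(max_estimated_cost_cents=0)))     # explicit cap of zero
print(resolve_cost_cap(Config(max_estimated_cost_cents=1200)))  # explicit cap
```

Under these assumptions, `-1` resolves to the stand-in default rather than to "unlimited", while `0` is kept as an explicit cap of zero.
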

| Scenario | Behavior |
|---|---|
| `task` is an empty string | `ModelDecisionEngine.decide()` builds a prompt with an empty task; the LLM receives a near-empty user message |
| `selected_skills=frozenset()` with `skill_prompt_mode='eager'` | `load_skills_with_report()` returns an empty skills list; the model receives no skill prompts |
| `initial_observations` passed to a fresh (non-resumed) run | Observations are appended to the runner context before the first LLM call |
| `chat_messages` is a non-empty list with no prior context | Used as-is; the model sees historical messages before the current task |
| `skill_source_profile='custom'` with empty `skill_search_dirs=[]` | Raises `ValueError` at line 472 (checked: `not config.skill_search_dirs`) |
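
The last row of the table above can be illustrated with a small sketch. It assumes a stand-in `SkillConfig` dataclass and a hypothetical `validate_skill_dirs` helper; the `"default"` profile name and the error message are also assumptions. Only the `not config.skill_search_dirs` check and the `ValueError` come from the table.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SkillConfig:
    # Stand-in for the real config; "default" is an assumed profile name.
    skill_source_profile: str = "default"
    skill_search_dirs: List[str] = field(default_factory=list)


def validate_skill_dirs(config: SkillConfig) -> None:
    """Hypothetical check: reject the 'custom' profile when no search dirs are given."""
    if config.skill_source_profile == "custom" and not config.skill_search_dirs:
        # The message text is illustrative, not the real one.
        raise ValueError("skill_source_profile='custom' requires skill_search_dirs")


try:
    validate_skill_dirs(SkillConfig(skill_source_profile="custom", skill_search_dirs=[]))
except ValueError as exc:
    print(f"rejected: {exc}")

# A custom profile with at least one directory passes the check.
validate_skill_dirs(SkillConfig(skill_source_profile="custom", skill_search_dirs=["./skills"]))
```

Because the check is a truthiness test, an empty list and `None` would both be rejected in this sketch.
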

| ID | Description | Location |
|---|---|---|
| CG-03 (partial) | Cost on decision retries accumulates correctly in `context['_cost_cents']`, but subagent calls create their own `_run_chat_agent_impl` frames with separate context dicts, so session-level cost aggregation depends on the caller | `chat_agent.py:238–245` |
| Decision fallback has no audit event | `FinalAnswer` with `decision_fallback` metadata is returned but not recorded in the audit log as a distinct event | `chat_agent.py:263–269` |
| `_subagent_error` is defined but never called | The function at line 655 builds an error dict, but no call site in this file uses it; it may be dead code or intended for future use | `chat_agent.py:655–663` |
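
For the CG-03 row, one way a caller might aggregate cost across a parent run and its subagent runs is sketched below. It assumes the caller has already collected each run's context dict, which this report does not describe; `aggregate_session_cost` is a hypothetical helper, and only the `'_cost_cents'` key comes from the table.

```python
from typing import Any, Dict, Iterable


def aggregate_session_cost(contexts: Iterable[Dict[str, Any]]) -> float:
    """Sum '_cost_cents' across per-run context dicts; a missing key counts as 0."""
    return sum(ctx.get("_cost_cents", 0) for ctx in contexts)


# Illustrative values only; real context dicts carry many other keys.
parent_ctx = {"_cost_cents": 12}
subagent_ctxs = [{"_cost_cents": 3}, {"_cost_cents": 5}, {}]

total = aggregate_session_cost([parent_ctx, *subagent_ctxs])
print(total)  # 20
```

Treating a missing key as zero keeps the sum robust to runs that never recorded a cost, at the price of hiding runs whose accounting was skipped.
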