Fix #283: token budget + graceful degradation for /deep-research by ericleepi314 · Pull Request #316 · agentforce314/clawcodex

ericleepi314 · 2026-06-12T03:10:46Z

Closes #283

Stacked on #315 (deep stack down to #304). Merge in order; GitHub retargets automatically.

Summary

/deep-research had no token ceiling — a verbose model (deepseek) burned ~888k tokens in Search+Verify before the user saw anything.

Engine

- meta.default_budget: a workflow may declare its own ceiling, applied only when the caller set no budget (explicit budget_total and inherited parent Budgets win; nested workflow() children share the parent's budget and are unaffected).
- Per-workflow env override CLAWCODEX_<NAME>_TOKEN_BUDGET: deep-research reads exactly the env var the issue names, CLAWCODEX_DEEP_RESEARCH_TOKEN_BUDGET (0 disables; malformed values are ignored).
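The resolution order described above can be sketched roughly as follows. This is a minimal illustration, not the engine's code: the function names (`env_var_name`, `resolve_default_budget`, `effective_budget`) are hypothetical, and placing the env override between an inherited budget and the meta default is an assumption, as is treating negative values as malformed.

```python
import os
from typing import Mapping, Optional


def env_var_name(workflow_name: str) -> str:
    """Derive the per-workflow variable, e.g. deep-research -> CLAWCODEX_DEEP_RESEARCH_TOKEN_BUDGET."""
    return f"CLAWCODEX_{workflow_name.upper().replace('-', '_')}_TOKEN_BUDGET"


def resolve_default_budget(
    workflow_name: str,
    meta_default: object,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[int]:
    """Return the default ceiling, or None for 'no ceiling'."""
    env = os.environ if env is None else env
    raw = env.get(env_var_name(workflow_name))
    if raw is not None:
        try:
            value = int(raw.strip())
        except ValueError:
            value = None  # malformed: ignored, fall through to the meta default
        if value == 0:
            return None  # 0 disables the ceiling
        if value is not None and value > 0:
            return value
        # Negative values are treated as malformed here (assumption).
    # bool is a subclass of int, so reject it explicitly; strings are rejected too.
    if isinstance(meta_default, bool) or not isinstance(meta_default, int):
        return None
    return meta_default if meta_default > 0 else None


def effective_budget(
    workflow_name: str,
    explicit: Optional[int],
    inherited: Optional[int],
    meta_default: object,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[int]:
    """An explicit budget_total wins, then an inherited parent budget, then the default."""
    if explicit is not None:
        return explicit
    if inherited is not None:
        return inherited
    return resolve_default_budget(workflow_name, meta_default, env)
```

Passing `env` as a plain mapping keeps the precedence rules testable without touching the process environment.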

deep-research script (`default_budget: 400000`)

- Verify gating: launches only as many verifiers as the remaining budget affords (per-verifier cost estimated from the observed Search spend; a 40k Synthesize reserve held back). Unaffordable claims pass through unverified; this is logged, never silent (an unrun check contradicts nothing under the "supported unless contradicted" verdict contract).
- None verdicts keep their claims (crashed verifier or ceiling trip) instead of silently dropping them. This was a pre-existing bug, fixed along the way.
- Overshoot fallback (critic-reproduced failure: spend within an already-launched wave is uncapped, and a ceiling trip at the final Synthesize call previously failed the whole run with no report after full spend): the script re-checks the budget before Synthesize and falls back to returning the raw surviving claims. Expensive something instead of expensive nothing.
- Per-stage spend surfaced via log() lines (the progress-UI narrator).
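The gating arithmetic and the two fallbacks can be sketched as below. This is a hedged illustration rather than the bundled script: all function names are hypothetical, the per-verifier estimate (average spend per Search agent) is an assumed heuristic, and verdicts are assumed to be `True` (supported), `False` (contradicted), or `None` (crashed, tripped, or never run).

```python
from typing import List, Optional, Sequence

SYNTHESIZE_RESERVE = 40_000  # tokens held back for the final Synthesize call


def estimate_verifier_cost(search_spend: int, search_agents: int) -> int:
    """Assumed heuristic: one verifier costs about what one Search agent did."""
    return search_spend // max(search_agents, 1)


def affordable_verifiers(
    budget_total: int,
    spent: int,
    per_verifier: int,
    n_claims: int,
    reserve: int = SYNTHESIZE_RESERVE,
) -> int:
    """How many verifiers fit in the remaining budget after the reserve."""
    if per_verifier <= 0:
        return n_claims  # no usable estimate: do not gate
    headroom = budget_total - spent - reserve
    if headroom <= 0:
        return 0
    return min(n_claims, headroom // per_verifier)


def surviving_claims(
    claims: Sequence[str], verdicts: Sequence[Optional[bool]]
) -> List[str]:
    """Drop only contradicted claims; None or missing verdicts keep the claim."""
    padded = list(verdicts) + [None] * (len(claims) - len(verdicts))
    return [claim for claim, verdict in zip(claims, padded) if verdict is not False]


def can_synthesize(budget_total: int, spent: int) -> bool:
    """Re-check before Synthesize; if False, return the raw surviving claims."""
    return spent < budget_total
```

With a 400k budget, 200k already spent, and a 70k estimate, the headroom after the reserve is 160k, so two of four claims would be verified and the remaining two would pass through unverified.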

Test plan

- 14 tests:
  - default-budget resolution units (env precedence/zero/malformed, bool/string rejection, bundled meta declares 400k)
  - meta-default-reaches-script + explicit-budget-wins
  - degradation through the real engine + real bundled script with deterministic runners:
    - tight budget → 0 verifiers but synthesize runs
    - partial → exactly 2 of 4 verified
    - no budget → meta default, all verified
    - crashed verifier → claim retained
    - incident-profile overshoot (4×120k vs 400k) → raw-claims fallback, error is None, exactly 4 agent calls
- Workflow suite: 186 passed
- Full suite on the stack: 7879 passed, 0 failed, 5 skipped
- Critic review loop: APPROVE after 1 revision round (the Synthesize-trip failure was found and reproduced there, on both verbose-search and cheap-search/verbose-verify profiles)

Follow-ups noted in review (non-gating): chunked verify waves with budget recomputation, exposing workflow error classes in the sandbox builtins, a user-facing budget parameter on the Workflow tool.

🤖 Generated with Claude Code

A verbose model burned ~888k tokens in Search+Verify with no ceiling and no warning.

- engine: a workflow's meta may declare default_budget, applied when the caller set no budget (explicit budget_total and inherited parent Budgets win; nested workflow() children unaffected). Per-workflow env override CLAWCODEX_<NAME>_TOKEN_BUDGET: deep-research reads exactly CLAWCODEX_DEEP_RESEARCH_TOKEN_BUDGET (0 disables)
- deep-research declares default_budget=400000 and degrades instead of dying: the Verify fan-out only launches as many verifiers as the remaining budget affords (estimated from the observed Search spend, Synthesize reserve held back); unaffordable claims pass through UNVERIFIED with a log line; a None verdict (crashed verifier or ceiling trip) keeps its claim instead of silently dropping it; and if the already-launched waves overshot the whole budget, Synthesize falls back to returning the raw surviving claims rather than failing after full spend
- per-stage spend surfaced via log lines (the progress UI narrator)

Closes #283

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

ericleepi314 mentioned this pull request Jun 12, 2026

Fix #285: make 1h prompt caching reachable via config #317

ericleepi314 wants to merge 1 commit into fix/issue-282-structured-output-coercion from feature/issue-283-research-budget
