You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Inventory row #5 closure per user authorization 2026-04-27. The
iter-23/24/25 orphan-results gap is structurally closed:
``Campaign.run()`` now auto-invokes the deep-analysis pipeline
(``analysis/campaign.py:generate_campaign_analysis``) after the round
loop completes, persisting markdown + JSON to
``<campaign_dir>/analysis/campaign-analysis-<ts>.{md,json}``.
Implementation:
* New ``CampaignConfig.auto_emit_campaign_analysis: bool = True``
field. Defaults to True so new campaigns gain the behavior with no
config change. Backward-compatible: legacy ``campaign_state.yaml``
files round-trip correctly (the field defaults to True when absent
via ``cfg_data.get("auto_emit_campaign_analysis", True)``).
* New ``Campaign._emit_campaign_analysis(log)`` method invoked after
``_run_analysis_phase`` (PRD-INFRA-062 FR05's transcript-LLM
analysis) and before final_report.yaml emission.
* Fail-open by design: analyzer crashes are logged
(``campaign_analysis_emit_generate_failed`` /
``..._discover_failed`` / ``..._write_failed`` /
``..._import_failed``) but never fail the campaign — primary
deliverables (per-cell score.json + final_report.yaml) unaffected.
* Skips on shutdown (partial state shouldn't be analyzed under
the assumption of clean completion) and on empty campaign_dir.
Tests: 8 new in trw-eval/tests/test_campaign_auto_analyzer.py covering
happy path (md+json written), opt-out, no-runs skip, shutdown skip,
analyzer-crash fail-open, default-True semantics, state round-trip,
and legacy-checkpoint backward-compat. All 45 campaign-suite tests
pass.
Operating-directive compliance: docs/eval/META-TUNE-LOG.md appended
with a 2026-04-27 entry documenting the change since this fix touches
trw-eval campaign behavior. The note covers what changed, why
(iter-23/24/25 gap), zero impact on prior campaigns (auto-emit only
applies to NEW campaigns from this commit forward), backward-compat
strategy, test coverage, and the substrate-touchpoint rationale.
NFR-04 of PRD-INFRA-018 ("zero modifications to trw-eval source") was
explicitly relaxed for this scope per user authorization. Inventory
row #5 was the targeted scope; no other trw-eval/src changes
accompanied it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
0 commit comments