Commit 46a288f
fix(meta-findings): fully resolve AI-native-Systems-Research#242 — ledger floor + eager retry_log + nous reports (AI-native-Systems-Research#243)
* fix(meta-findings): add ledger.json failure floor + missing-artifact detector (AI-native-Systems-Research#242)
The existing nous_asks heuristics all key off retry_log.jsonl or
llm_metrics.jsonl. When a campaign's dispatcher dies before writing
those artifacts (e.g., a single-iteration campaign that fails at the
SDK call), every heuristic short-circuits and meta_findings.json
reports 0/0/0 across all three named streams — even though the
failure is recorded plainly in ledger.json.
Acceptance fixture: paper-burst.post-204-rerun.1779882732/. Its
ledger.json has iter-1 status="FAILED" error="SDK returned error
after 1 attempt(s): None", but retry_log.jsonl, llm_metrics.jsonl,
and runs/iter-1/findings.json are all absent. Pre-fix, the campaign's
emitted meta_findings.json had 0 entries across all named streams;
post-fix, it surfaces ≥1 nous_asks entry.
Two new pure-Python detectors in orchestrator/meta_findings.py:
- _detect_nous_asks_from_ledger_failures(ledger): one nous_ask per
  iterations[*].status == "FAILED" row, kind="dispatch", citing
  iter-N and a 120-char-truncated error string.
- _detect_nous_asks_from_missing_artifacts(work_dir, state, ledger):
  one nous_ask when state.iteration >= 1 and last_entered_phase
  != "IDLE" but retry_log.jsonl and runs/iter-N/findings.json are
  absent. kind="observability".
ledger.json is the right substrate for the failure floor — it's
written by the orchestrator itself, not by the dispatcher subprocess,
so it survives dispatcher-side crashes by construction. Both
detectors degrade silently on missing or malformed input, mirroring
the existing detectors.
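
A minimal sketch of the ledger-failure floor described above, not the code in orchestrator/meta_findings.py. The function name `sketch_ledger_failure_asks`, the `iteration` key on each row, and the `evidence` output field are assumptions made for illustration; only `iterations[*].status`, the `error` string, kind="dispatch", and the 120-char truncation come from the description.

```python
from typing import Any

MAX_ERROR_CHARS = 120  # truncation length named in the commit message


def sketch_ledger_failure_asks(ledger: Any) -> list[dict[str, str]]:
    """Return one ask per FAILED iteration row; return [] on malformed input."""
    if not isinstance(ledger, dict):
        return []
    rows = ledger.get("iterations")
    if not isinstance(rows, list):
        return []
    asks = []
    for index, row in enumerate(rows, start=1):
        if not isinstance(row, dict) or row.get("status") != "FAILED":
            continue
        # Assumption: rows carry an "iteration" number; fall back to position.
        number = row.get("iteration", index)
        error = str(row.get("error", ""))[:MAX_ERROR_CHARS]
        asks.append({"kind": "dispatch", "evidence": f"iter-{number}: {error}"})
    return asks
```

Returning an empty list on any unexpected shape is one way to get the "degrade silently" behaviour: a broken ledger never raises, it just contributes no asks.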
Schema unchanged. Both kinds ("dispatch", "observability") are
already in the meta_findings.schema.json nous_ask enum.
Two adjacent items raised in the issue body are explicitly out of
scope here and will be tracked as separate follow-ups: eager
initialisation of retry_log.jsonl at iteration start, and an
on-demand `nous reports <run_id>` subcommand.
Tests: 13 new (TestLedgerFailureDetection, TestMissingArtifactDetection,
TestPost204RerunAcceptanceFixture). Full suite: 1229 passed, 1 skipped.
Closes AI-native-Systems-Research#242
* fix(meta-findings): add eager retry_log init + nous reports subcommand (AI-native-Systems-Research#242)
Expands the original ledger-floor PR to fully resolve AI-native-Systems-Research#242 by adding
the two follow-ups the issue body called out as adjacent items.
## Eager retry_log.jsonl init at iteration start
orchestrator/iteration.py:setup_work_dir now creates an empty
retry_log.jsonl alongside state.json/ledger.json/principles.json.
Before this fix, retry_log.jsonl was created lazily by
orchestrator.metrics on first dispatch failure, so a dispatcher-side
crash before any retry left no parseable artifact at all, blinding
every retry-log-keyed heuristic in meta_findings.py to the failure.
The eager touch guarantees downstream tooling always sees a parseable
artifact.
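
The eager, non-clobbering touch can be sketched as follows. The helper name `touch_retry_log` is hypothetical; the real change lives inside setup_work_dir and may be written differently.

```python
from pathlib import Path


def touch_retry_log(work_dir: Path) -> Path:
    """Ensure retry_log.jsonl exists without altering any existing content."""
    path = Path(work_dir) / "retry_log.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Path.touch creates an empty file if none exists; if the file is already
    # there it only updates the modification time, so entries are preserved.
    path.touch(exist_ok=True)
    return path
```

Because touch never truncates, repeated calls are idempotent with respect to file content, which is the property the idempotency test pins.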
The missing-artifact detector in meta_findings.py is updated to
recognize an empty retry_log.jsonl as semantically equivalent to "no
dispatch retries logged" (the original signal it cared about). The
canonical post-AI-native-Systems-Research#242 catastrophic-failure shape is now: ledger row
FAILED + empty retry_log + missing findings.json — and the detector
fires on it.
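
A sketch of the updated missing-artifact check, under stated assumptions: `last_entered_phase` is read from the state dict, the phase name `EXECUTING` in the usage below is made up, and the `evidence` field is illustrative. The real detector also receives the ledger, which this sketch omits.

```python
from pathlib import Path
from typing import Any


def retry_log_has_entries(path: Path) -> bool:
    """True only if the file exists and holds at least one non-blank line."""
    try:
        text = path.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError):
        return False
    return any(line.strip() for line in text.splitlines())


def sketch_missing_artifact_asks(work_dir: Path, state: Any) -> list[dict[str, str]]:
    """Return one observability ask when an entered iteration left no artifacts."""
    if not isinstance(state, dict):
        return []
    iteration = state.get("iteration")
    if not isinstance(iteration, int) or iteration < 1:
        return []
    if state.get("last_entered_phase") == "IDLE":
        return []
    root = Path(work_dir)
    findings = root / "runs" / f"iter-{iteration}" / "findings.json"
    # An empty retry_log.jsonl is treated the same as a missing one.
    if retry_log_has_entries(root / "retry_log.jsonl") or findings.exists():
        return []
    return [{
        "kind": "observability",
        "evidence": f"iter-{iteration}: no retry entries and no findings.json",
    }]
```

Checking for non-blank lines rather than file existence is what keeps the detector firing after the eager touch: an empty file and a missing file both mean "no dispatch retries logged".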
## `nous reports` subcommand
A new `nous reports <target>` CLI subcommand re-emits meta_findings.json
on demand for any work_dir, regardless of whether the campaign reached
a clean terminal transition. Pure-Python; zero LLM tokens. Useful for:
- Legacy campaigns that pre-date the in-line emitter wired into
  campaign.py.
- Aborted campaigns that never reached the four call sites that
  invoke the emitter automatically.
- Re-emission after this PR's heuristics changes: the post-204-rerun
  campaign goes from 0 nous_asks to 2 (one dispatch ask citing the
  ledger FAILED row, one observability ask citing missing artifacts).
When the target work_dir is not at phase=DONE/STOPPED, the emitted
meta_findings.json is annotated with a `notes` field flagging
non-terminal state, so triage tooling doesn't conflate on-demand
emission with a clean terminal record.
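
The annotation rule above could look roughly like this. The helper name, the `notes` value being a single string, and its wording are all assumptions; the source only says a `notes` field flags non-terminal state when phase is not DONE/STOPPED.

```python
TERMINAL_PHASES = {"DONE", "STOPPED"}  # terminal phases named in the commit message


def annotate_if_non_terminal(meta_findings: dict, phase: str) -> dict:
    """Return a copy of meta_findings, adding `notes` for non-terminal phases."""
    annotated = dict(meta_findings)  # shallow copy; the caller's dict is untouched
    if phase not in TERMINAL_PHASES:
        # Assumption: `notes` is a plain string; the real schema may differ.
        annotated["notes"] = (
            f"emitted on demand while phase={phase}; not a terminal record"
        )
    return annotated
```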
Target accepts a campaign.yaml (preferred — supplies target_system
context for instrumentation/documentation heuristics) or a work_dir /
run_id resolvable via NOUS_CAMPAIGN_PARENT.
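
One plausible way to classify the target argument, sketched under loud assumptions: the helper name `classify_target` is hypothetical, and resolving a run_id as a directory directly under NOUS_CAMPAIGN_PARENT is a guess at the lookup rule, not something the commit message states.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def classify_target(target: str, env: Optional[Mapping[str, str]] = None) -> tuple[str, Path]:
    """Classify a target as a campaign yaml, a work_dir, or a run_id."""
    env = os.environ if env is None else env
    path = Path(target)
    if path.suffix in {".yaml", ".yml"} and path.is_file():
        return ("campaign_yaml", path)
    if path.is_dir():
        return ("work_dir", path)
    parent = env.get("NOUS_CAMPAIGN_PARENT")
    if parent:
        # Assumption: a run_id names a directory directly under the parent.
        candidate = Path(parent) / target
        if candidate.is_dir():
            return ("run_id", candidate)
    raise FileNotFoundError(f"cannot resolve target: {target}")
```

Trying the yaml form first mirrors the stated preference, since only the campaign.yaml supplies target_system context.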
## Tests added (+7)
- TestMissingArtifactDetection.test_empty_retry_log_still_triggers:
  pins the post-AI-native-Systems-Research#242 catastrophic-failure shape.
- TestSetupWorkDirLegacyDefault.test_creates_empty_retry_log_jsonl:
  asserts setup_work_dir touches the file empty.
- TestSetupWorkDirLegacyDefault.test_retry_log_existing_content_not_clobbered:
  idempotency under repeated setup_work_dir calls.
- TestCmdReports (4 tests): work_dir target, yaml target, partial
  state annotation, terminal state non-annotation.
End-to-end: running `nous reports` against the actual
paper-burst.post-204-rerun.1779882732/ campaign now emits 2 nous_asks
(was 0 before this PR): one citing iter-1 status=FAILED and one citing
the missing per-iteration artifacts.
Full suite: 1236 passed (+7), 1 skipped, 0 failures.
1 parent 18303a4 commit 46a288f
6 files changed
Lines changed: 765 additions & 2 deletions
File tree
- orchestrator
- tests