[F17] Auto-capture reproducibility metadata (repo commit, hardware_config sha, deps) at INIT #262

Description

@sriumcp

Problem

nous captures per-experiment data well: bundle.yaml, patches/*.patch, inputs/*.yaml, results/*.json, findings.json, principles.json, gate summaries, agent logs. But it does NOT capture environment metadata that's required for external reproduction:

  • Target repo commit hash at experiment time.
  • go.sum / go.mod (or the language equivalent — requirements.txt, package-lock.json, Cargo.lock) snapshot.
  • hardware_config.json (or the equivalent latency-config file) content snapshot. This is critical: the MFU values directly affect π/δ in the latency model. If anyone edits the file between runs, the results can no longer be reproduced from the work_dir alone.
  • Model latency-config files (α/β coefficients live in target-repo files, not in the work_dir).
  • Python version / Go version / Node version / etc.
  • gpu_memory_utilization (vLLM-style; affects K derivation).
  • The experiment branch (e.g., nous-exp-iter-1-<id>) — these get cleaned up at completion; only the patch survives in work_dir.

For the campaign author with a local repo, this isn't a blocker — they have the target-repo clone in place. But for an external reviewer or paper artifact-evaluator trying to reproduce the work, the patch may not apply cleanly to a future main, and the same code at a different hardware_config.json could produce different numbers.

Desired behavior

A reproducibility_metadata block in the campaign.yaml schema, automatically populated by nous at INIT (before DESIGN even sees the campaign) and preserved across iterations:

reproducibility_metadata:
  repo_commit: "<auto-filled at INIT, e.g., abc1234>"
  repo_dirty: false               # true if working tree had uncommitted changes
  hardware_config_sha256: "<sha256 of hardware_config.json>"
  latency_config_files:           # snapshot of any α/β-defining files
    - "model-configs/llama-3.1-8b/latency.json"
  go_sum_sha256: "<sha256 of go.sum>"
  language_versions:
    go: "1.22.0"
    python: "3.11.x"
  gpu_memory_utilization: 0.9
  captured_at: "2026-05-31T..."
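
As a rough illustration of how the hash and timestamp fields in this block could be filled, here is a minimal Python sketch. It is not nous's implementation: the function names (sha256_of_file, build_metadata_block) and the assumption that hardware_config.json and go.sum sit at the repo root are hypothetical, and the commit, dirty flag, versions, and gpu_memory_utilization are passed in rather than discovered.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of_file(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_metadata_block(
    repo_root: Path,
    repo_commit: str,
    repo_dirty: bool,
    latency_config_files: list[str],
    language_versions: dict[str, str],
    gpu_memory_utilization: float,
) -> dict:
    """Assemble a dict mirroring the proposed reproducibility_metadata block.

    Assumption: hardware_config.json and go.sum live directly under repo_root.
    """
    return {
        "repo_commit": repo_commit,
        "repo_dirty": repo_dirty,
        "hardware_config_sha256": sha256_of_file(repo_root / "hardware_config.json"),
        "latency_config_files": list(latency_config_files),
        "go_sum_sha256": sha256_of_file(repo_root / "go.sum"),
        "language_versions": dict(language_versions),
        "gpu_memory_utilization": gpu_memory_utilization,
        # UTC ISO-8601 timestamp taken when the block is built.
        "captured_at": datetime.now(timezone.utc).isoformat(),
    }
```

Hashing the file bytes rather than parsing the JSON keeps the check independent of the file's schema; the trade-off is that a whitespace-only edit also changes the digest.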

Behavior:

  1. Auto-populated by nous run at INIT, before DESIGN.
  2. Checks whether the repo is dirty and warns if it is; proceeding with uncommitted changes is allowed but flagged (repo_dirty: true).
  3. Snapshots the hardware-config + latency-config files into the work_dir at runs/iter-N/snapshots/.
  4. Preserved through iterations: every iteration inherits this block. If any tracked file changes during the campaign, the change is logged as a separate bundle_amendments.jsonl-style record.
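
The snapshot and change-detection behavior (items 3 and 4) could look roughly like the Python sketch below. The helper names (snapshot_files, changed_files) are hypothetical, and the sketch assumes tracked files are given as paths relative to the target repo; how a detected change is then written to an amendment record is left open.

```python
import hashlib
import shutil
from pathlib import Path


def snapshot_files(
    repo_root: Path, work_dir: Path, iteration: int, relative_paths: list[str]
) -> dict[str, str]:
    """Copy tracked files into <work_dir>/runs/iter-N/snapshots/.

    Returns a mapping of relative path -> SHA-256 of the copied bytes.
    """
    snapshot_dir = work_dir / "runs" / f"iter-{iteration}" / "snapshots"
    hashes: dict[str, str] = {}
    for rel in relative_paths:
        destination = snapshot_dir / rel
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(repo_root / rel, destination)
        hashes[rel] = hashlib.sha256(destination.read_bytes()).hexdigest()
    return hashes


def changed_files(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """List tracked paths whose hash differs, or that exist on only one side."""
    all_paths = previous.keys() | current.keys()
    return sorted(p for p in all_paths if previous.get(p) != current.get(p))
```

Comparing the hash maps from two consecutive iterations gives the list of files that would need an amendment record; an empty list means the iteration inherited an unchanged environment.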

Suggested implementation sketch

  1. Add reproducibility_metadata to the campaign.yaml schema (auto-populated, not user-set).
  2. In nous run's INIT phase, populate the block: git rev-parse HEAD, hash key files, capture language versions.
  3. Snapshot the hardware-config and latency-config files into <work_dir>/runs/iter-N/snapshots/ at iteration start.
  4. Surface the metadata in nous status and nous report.
  5. Document the convention in docs/reproducibility.md.
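
Step 2 might be sketched in Python as follows. This is an assumption-laden illustration, not nous code: run_command, capture_repo_state, capture_language_versions, and parse_go_version are hypothetical names, only Go and Python are covered, and the Python version reported is that of the interpreter running the sketch. The git calls used (git rev-parse HEAD, git status --porcelain) and go version are standard commands; the command runner is injectable so the logic can be exercised without either tool installed.

```python
import platform
import re
import subprocess
import warnings
from pathlib import Path
from typing import Callable, Optional


def run_command(*argv: str, cwd: Optional[Path] = None) -> str:
    """Run a command and return its stripped stdout; raises on non-zero exit."""
    result = subprocess.run(
        list(argv), cwd=cwd, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def capture_repo_state(repo_root: Path, run: Callable[..., str] = run_command) -> dict:
    """Record the HEAD commit and whether the working tree has uncommitted changes."""
    commit = run("git", "rev-parse", "HEAD", cwd=repo_root)
    # `git status --porcelain` prints one line per changed path, nothing if clean.
    dirty = bool(run("git", "status", "--porcelain", cwd=repo_root))
    if dirty:
        # Warning, not failure: the run proceeds but is flagged.
        warnings.warn(f"{repo_root} has uncommitted changes; recording repo_dirty: true")
    return {"repo_commit": commit, "repo_dirty": dirty}


def parse_go_version(output: str) -> str:
    """Extract the version from output like 'go version go1.22.0 linux/amd64'."""
    match = re.search(r"\bgo(\d+(?:\.\d+)*\S*)", output)
    if match is None:
        raise ValueError(f"unrecognized `go version` output: {output!r}")
    return match.group(1)


def capture_language_versions(run: Callable[..., str] = run_command) -> dict:
    """Collect toolchain versions (only Go and Python in this sketch)."""
    return {
        "go": parse_go_version(run("go", "version")),
        "python": platform.python_version(),
    }
```

Emitting a warning instead of raising matches the acceptance criterion that a dirty repo is flagged rather than treated as a failure.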

Acceptance criteria

  • reproducibility_metadata block is auto-populated on every new run.
  • Hardware-config and latency-config files are snapshotted per-iteration.
  • Repo-dirty state is flagged (warning, not fail).
  • nous status surfaces the metadata.
  • Friction report F17 row in the tracking issue checks off.

Severity

HIGH for paper-grade reproducibility — the 20% of reproducibility nous doesn't give for free.

Source

friction-report.md F17, paper-memorytime-mirage campaign (2026-05).


Part of friction-report tracking issue #245.

Labels: enhancement (New feature or request), friction-report (From external campaign-author friction reports)