[codex] strengthen dogfood evidence validation#34

Merged
baskduf merged 1 commit into main from codex/dogfood-evidence-validation
Jun 6, 2026

Conversation

baskduf (Owner) commented Jun 6, 2026

What changed

  • Added a dogfood evidence adoption checklist and linked it from evaluation, validation, and the component map.
  • Extended effectiveness-plan validation to reject stale aggregate completion language, countable templates, placeholder task outcomes, and inconsistent inclusion flags.
  • Synced the generic template checker and added unit coverage plus dogfood failure/decision memory.
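As a rough illustration of the rejection rules listed above, the sketch below checks a plan against the four conditions. The schema is entirely hypothetical: the field names (`outcome`, `included`, `is_template`, `counted_in_total`), the placeholder vocabulary, and the completion-phrase pattern are assumptions made for this example and are not taken from `scripts/check_effectiveness_plan.py`.

```python
import re

# Hypothetical markers; the real checker's rules and plan format are not shown in this PR.
PLACEHOLDER_OUTCOMES = {"", "tbd", "todo", "pending", "n/a"}
STALE_COMPLETION = re.compile(
    r"\ball tasks (are )?(complete|completed|done)\b", re.IGNORECASE
)


def is_placeholder(entry: dict) -> bool:
    """True when the entry's outcome is empty or a known placeholder word."""
    return str(entry.get("outcome", "")).strip().lower() in PLACEHOLDER_OUTCOMES


def validate_entry(entry: dict) -> list[str]:
    """Return the problems found in one evidence entry (assumed schema)."""
    problems = []
    included = bool(entry.get("included", False))

    if included and entry.get("is_template", False):
        problems.append("template entry is marked as countable evidence")
    if included and is_placeholder(entry):
        problems.append("included entry has a placeholder outcome")
    if not included and entry.get("counted_in_total", False):
        problems.append("entry is excluded but still counted in the total")
    return problems


def validate_plan(summary: str, entries: list[dict]) -> list[str]:
    """Return every problem in the plan; an empty list means it passes."""
    problems = []
    # An aggregate completion claim is stale if any outcome is still a placeholder.
    if STALE_COMPLETION.search(summary) and any(is_placeholder(e) for e in entries):
        problems.append("summary claims completion while some outcomes are placeholders")
    for index, entry in enumerate(entries):
        problems.extend(f"entry {index}: {p}" for p in validate_entry(entry))
    return problems
```

Returning a list of problems rather than raising on the first one lets a checker report every violation in a single run, which suits a script that gates a commit.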

Checks

  • python3 -m unittest discover -s tests
  • python3 -m py_compile scripts/apply_harness.py scripts/check_docs_drift.py scripts/check_structure.py scripts/check_encoding_hygiene.py scripts/check_effectiveness_plan.py scripts/check_failure_memory.py scripts/check_decision_memory.py scripts/harness_doctor.py
  • python3 scripts/check_docs_drift.py
  • python3 scripts/check_structure.py
  • python3 scripts/check_encoding_hygiene.py
  • python3 scripts/check_effectiveness_plan.py
  • python3 scripts/check_failure_memory.py
  • python3 scripts/check_decision_memory.py
  • python3 scripts/harness_doctor.py --target .
  • cmp -s scripts/check_effectiveness_plan.py templates/generic/scripts/check_effectiveness_plan.py
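The checks above could be driven from one small script. This is a minimal sketch, not part of the repository: `files_identical`, `run_checks`, and `CHECKS` are hypothetical names, and the command list is only a subset copied from the list above. `filecmp.cmp` with `shallow=False` compares file contents byte by byte, which mirrors what `cmp -s` verifies for the synced template checker.

```python
import filecmp
import subprocess

# Subset of the commands listed in this PR; run from the repository root.
CHECKS = [
    ["python3", "-m", "unittest", "discover", "-s", "tests"],
    ["python3", "scripts/check_docs_drift.py"],
    ["python3", "scripts/check_structure.py"],
    ["python3", "scripts/check_effectiveness_plan.py"],
    ["python3", "scripts/harness_doctor.py", "--target", "."],
]


def files_identical(path_a: str, path_b: str) -> bool:
    """Byte-for-byte comparison, equivalent in intent to `cmp -s a b`."""
    return filecmp.cmp(path_a, path_b, shallow=False)


def run_checks(commands: list[list[str]]) -> list[list[str]]:
    """Run each command and return the ones that exited with a non-zero status."""
    failures = []
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            failures.append(command)
    return failures
```

Calling `run_checks(CHECKS)` from the repository root returns the commands that failed, and `files_identical("scripts/check_effectiveness_plan.py", "templates/generic/scripts/check_effectiveness_plan.py")` reports whether the template copy has drifted.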

Review

Direct review and sub-agent review completed before commit; latest sub-agent review returned no findings.

Remaining risk

Harnessed-only dogfood evidence still does not prove effectiveness improvement without a comparable baseline or later comparison window.

baskduf merged commit 387dbfa into main on Jun 6, 2026
2 checks passed
baskduf deleted the codex/dogfood-evidence-validation branch on June 6, 2026 07:31