Commit 068410f

Validate make and just command references (#37)
1 parent e98b6f4 commit 068410f

9 files changed

Lines changed: 1075 additions & 17 deletions

ROADMAP.md

Lines changed: 3 additions & 3 deletions
@@ -106,9 +106,9 @@ guidance, Next.js App Router notes, failure-memory verification, decision-memory
 warnings, deterministic behavior gate placement, and trigger-based task outcome
 evidence. Useful next additions include:
 
-- command existence validation for `make`, `just`, Maven, Gradle, Go, and other
-  profile-relevant task runners referenced by failure-memory or effectiveness
-  records
+- command existence validation beyond package scripts, root `make` targets, and
+  root `just` recipes for Maven, Gradle, Go, and other profile-relevant task
+  runners referenced by failure-memory or effectiveness records
 - more fixture-backed examples for provider-specific request shape, response
   envelopes, redaction, zero-result behavior, and provider errors
 - clearer ADR and failure-record boundary examples for small changes so

docs/decisions/0004-link-failure-memory-to-regression-checks.md

Lines changed: 10 additions & 6 deletions
@@ -78,17 +78,21 @@ post-push, or unknown.
   JavaScript package scripts and avoids passing a root command only because a
   nested workspace package has the same script, but it still does not prove that
   the script asserts the specific failure axis.
+- Root `make target` and `just recipe` checks are verified against checked-in
+  root Makefile variants and justfile variants. This closes the same
+  fake-command gap for common task-runner commands, but it still does not parse
+  included makefiles, option-heavy invocations such as `make -C app check`, or
+  prove that the target asserts the specific failure axis.
 - Other command-shaped checks are still recognized mostly by shape. The checker
-  does not yet verify that `make`, `just`, Python module commands, Gradle,
-  Maven, Go, Rust, .NET, or other task-runner commands exist in the target
-  configuration.
+  does not yet verify that Python module commands, Gradle, Maven, Go, Rust,
+  .NET, or other task-runner commands exist in the target configuration.
 - Monorepo and workspace-specific commands need explicit target adaptation when
   the intended command is not runnable from the repository root.
 - Detection-link validation is regex-based. It blocks known non-committal
   phrases, but future wording may require additional test cases.
-- Generic command coverage is still biased toward common JavaScript and Python
-  commands. Add explicit coverage before relying on this gate for Go, Rust,
-  Java, .NET, or Gradle-heavy targets.
+- Generic command coverage is still biased toward common JavaScript, Make,
+  Just, and Python-shaped commands. Add explicit coverage before relying on
+  this gate for Go, Rust, Java, .NET, or Gradle-heavy targets.
 - Target repositories with pre-existing non-kit `docs/failures/*.md` schemas
   may need adoption-specific adaptation instead of blindly applying the generic
   checker.
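
The scope described in the bullets above can be illustrated with a minimal sketch that applies the `make` pattern from this commit to a few strings. The regular expression is copied from the diff of `scripts/check_effectiveness_plan.py`; the helper name `referenced_make_targets` is hypothetical and is not part of the commit.

```python
import re

# Pattern copied from MAKE_COMMAND_RE in this commit's diff.
MAKE_COMMAND_RE = re.compile(
    r"\bmake(?:\s+[\w.-]+=[^\s,;)`\]}]+)*\s+(?!-)(?P<target>[\w:./-]+)"
    r"(?=$|[\s,.;)`\]}])"
)


def referenced_make_targets(text: str) -> list[str]:
    """Hypothetical helper: list the make targets a piece of prose refers to."""
    return sorted(
        {
            # Trailing punctuation is trimmed the same way the diff's
            # normalize_command_target does.
            match.group("target").rstrip(".,;)]}")
            for match in MAKE_COMMAND_RE.finditer(text)
        }
    )


print(referenced_make_targets("Run `make check` before pushing."))  # ['check']
print(referenced_make_targets("make VERSION=1.2.3 test."))          # ['test']
print(referenced_make_targets("make -C app check"))                 # []
```

The last call returns nothing because the `(?!-)` lookahead rejects a token that starts with `-`. That matches the limitation stated above: an option-heavy invocation is not parsed, so it is neither confirmed nor reported as missing.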
docs/examples/task-outcomes/005-make-just-command-validation.yaml

Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
+schema_version: 1
+
+target:
+  repository: baskduf/harness-starter-kit
+  repository_ref: codex/make-just-command-validation working tree based on e98b6f40f134bcacbab0bb9cb12178f1e071bb63
+  stack_or_framework: Python and Markdown harness kit
+  date: 2026-06-06
+  agent_or_model: Codex
+  reviewer: Codex primary agent plus read-only subagent reviewer
+
+task:
+  id: make-just-command-validation
+  run_id: harness-starter-kit-005
+  prompt_summary: Extend command existence validation for root make targets and just recipes.
+  prompt_ref: current Codex thread request to proceed with option 1 and review findings until resolved
+  prompt_hash: not recorded
+  comparable_task_group: harness-maintenance
+  condition: harnessed-only
+  expected_boundary:
+    - ROADMAP.md
+    - docs/decisions/0004-link-failure-memory-to-regression-checks.md
+    - docs/examples/task-outcomes/**
+    - scripts/check_effectiveness_plan.py
+    - scripts/check_failure_memory.py
+    - templates/generic/scripts/check_effectiveness_plan.py
+    - templates/generic/scripts/check_failure_memory.py
+    - tests/test_check_effectiveness_plan.py
+    - tests/test_check_failure_memory.py
+    - tests/test_repository_hygiene.py
+  known_failure_mode: Failure memory or task outcome evidence can cite fake make or just commands that look concrete but are not declared in the target repository.
+
+harness_context:
+  harness_doctor_score: previously 98/100, not treated as effectiveness proof
+  harness_source:
+    kit_url: https://github.com/baskduf/harness-starter-kit
+    kit_commit: e98b6f40f134bcacbab0bb9cb12178f1e071bb63
+    source_tracking_ref: none; this repository is the kit source
+  relevant_instructions:
+    - AGENTS.md
+    - docs/decisions/0004-link-failure-memory-to-regression-checks.md
+    - ROADMAP.md
+  relevant_constraints:
+    - python3 -m unittest tests.test_check_failure_memory tests.test_check_effectiveness_plan tests.test_repository_hygiene
+    - python3 -m py_compile scripts/check_failure_memory.py scripts/check_effectiveness_plan.py templates/generic/scripts/check_failure_memory.py templates/generic/scripts/check_effectiveness_plan.py
+    - python3 scripts/check_effectiveness_plan.py
+    - python3 scripts/check_failure_memory.py
+  relevant_memory_records:
+    - docs/decisions/0004-link-failure-memory-to-regression-checks.md
+    - docs/decisions/0006-trigger-task-outcome-evidence-for-substantial-harness-work.md
+    - docs/failures/0005-failure-memory-was-not-linked-to-regression-checks.md
+    - docs/failures/0007-dogfood-first-pass-failures-lacked-memory-decision.md
+
+outcome:
+  files_changed:
+    - ROADMAP.md
+    - docs/decisions/0004-link-failure-memory-to-regression-checks.md
+    - docs/examples/task-outcomes/005-make-just-command-validation.yaml
+    - scripts/check_effectiveness_plan.py
+    - scripts/check_failure_memory.py
+    - templates/generic/scripts/check_effectiveness_plan.py
+    - templates/generic/scripts/check_failure_memory.py
+    - tests/test_check_effectiveness_plan.py
+    - tests/test_check_failure_memory.py
+  wrong_file_edits: 0
+  repeated_known_mistake: false
+  verification_command: python3 -m unittest tests.test_check_failure_memory tests.test_check_effectiveness_plan tests.test_repository_hygiene && python3 -m py_compile scripts/check_failure_memory.py scripts/check_effectiveness_plan.py templates/generic/scripts/check_failure_memory.py templates/generic/scripts/check_effectiveness_plan.py
+  first_pass_verification:
+    result: failed_then_passed
+  drift_violations_detected: []
+  human_rework_minutes: 0
+  reverted_files: []
+  notes: The validation is intentionally scoped to root Makefile and justfile variants; option-heavy or included-file task runners remain follow-up scope. Primary-agent review fixed a test assertion indentation issue; subagent review found and the final loop fixed task-outcome false-inclusion validation, GNU makefile precedence, make variable assignments including dotted values, just default-parameter parsing, and an inline-code backtick regex miss before final validation.
+
+follow_up:
+  harness_change_needed: false
+  decision_or_failure_record: docs/decisions/0004-link-failure-memory-to-regression-checks.md; no failure record because this closed a known validation gap rather than fixing a recurring failed check from this run.
+  include_in_effectiveness_report: false
+  include_in_comparable_product_task_count: false
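
The `notes` field above mentions a fix for just default-parameter parsing. As a rough illustration of what that involves, the sketch below extracts recipe names from justfile text. It is a simplified approximation of the approach taken by `root_just_recipes` in this commit (attribute handling such as `[private]` prefixes is left out), and both the function name `just_recipes_from_text` and the sample justfile are hypothetical.

```python
import re

ALIAS_RE = re.compile(r"alias\s+(?P<name>[\w.-]+)\s*:=")


def just_recipes_from_text(justfile_text: str) -> set[str]:
    """Hypothetical helper: collect recipe and alias names from justfile text."""
    recipes: set[str] = set()
    for raw_line in justfile_text.splitlines():
        # Indented lines are recipe bodies, not declarations.
        if not raw_line or raw_line[:1].isspace():
            continue
        line = raw_line.split("#", 1)[0].rstrip()
        alias = ALIAS_RE.match(line)
        if alias is not None:
            recipes.add(alias.group("name"))
            continue
        if ":" not in line:
            continue
        head, rest = line.split(":", 1)
        # "name := value" is a variable assignment, not a recipe.
        if not head.strip() or rest.lstrip().startswith("="):
            continue
        # Only the first token is the recipe name; anything after it
        # (for example target="release") is a parameter.
        name = head.split()[0].lstrip("@")
        if name and not name.startswith("["):
            recipes.add(name)
    return recipes


SAMPLE = '''\
version := "1.0"
alias b := build
build target="release": deps
    cargo build
@deps:
    echo ok
'''

print(sorted(just_recipes_from_text(SAMPLE)))  # ['b', 'build', 'deps']
```

Taking only the first whitespace-separated token of the header is what keeps a default parameter such as `target="release"` from being mistaken for part of the recipe name or for an assignment.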

scripts/check_effectiveness_plan.py

Lines changed: 153 additions & 3 deletions
@@ -97,7 +97,11 @@
     re.compile(r"\b(?:tests?|specs?|fixtures?|scripts?)/[^\s,.;)]+"),
     re.compile(r"`?\.github/workflows/[^\s,.;)`]+`?"),
     re.compile(r"\b(?:npm|pnpm|yarn|bun)\s+run\s+[\w:./-]+"),
-    re.compile(r"\b(?:make|just)\s+[\w:./-]+"),
+    re.compile(
+        r"\bmake(?:\s+[\w.-]+=[^\s,;)`\]}]+)*\s+(?!-)[\w:./-]+"
+        r"(?=$|[\s,.;)`\]}])"
+    ),
+    re.compile(r"\bjust\s+(?!-)[\w:./-]+"),
     re.compile(r"\bpython3?\s+(?:-m\s+[\w.:-]+|scripts?/[^\s,.;)]+)"),
     re.compile(r"\bpytest\s+(?:-[\w-]+|tests?/[^\s,.;)]+|[\w/.-]+)"),
     re.compile(r"\b(?:vitest|jest|ruff|mypy|eslint)\s+[\w/.:@-]+"),
@@ -116,6 +120,13 @@
 PACKAGE_SCRIPT_COMMAND_RE = re.compile(
     r"\b(?P<manager>npm|pnpm|yarn|bun)\s+run\s+(?P<script>[\w:./-]+)"
 )
+MAKE_COMMAND_RE = re.compile(
+    r"\bmake(?:\s+[\w.-]+=[^\s,;)`\]}]+)*\s+(?!-)(?P<target>[\w:./-]+)"
+    r"(?=$|[\s,.;)`\]}])"
+)
+JUST_COMMAND_RE = re.compile(r"\bjust\s+(?!-)(?P<recipe>[\w:./-]+)")
+MAKEFILE_NAMES = ("GNUmakefile", "makefile", "Makefile")
+JUSTFILE_NAMES = ("justfile", "Justfile", ".justfile")
 
 FAILURE_RECORD_RE = re.compile(
     r"`?(docs/failures/[^\s,;)`]+)`?",
@@ -429,6 +440,10 @@ def normalize_package_script(value: str) -> str:
     return value.rstrip(".,;)]}")
 
 
+def normalize_command_target(value: str) -> str:
+    return value.rstrip(".,;)]}")
+
+
 def root_package_scripts(root: Path) -> set[str]:
     package_json = root / "package.json"
     if not package_json.exists():
@@ -443,6 +458,71 @@ def root_package_scripts(root: Path) -> set[str]:
     return {str(name) for name in package_scripts}
 
 
+def root_make_targets(root: Path) -> set[str]:
+    targets: set[str] = set()
+    path = next(
+        (root / name for name in MAKEFILE_NAMES if (root / name).exists()),
+        None,
+    )
+    if path is None:
+        return targets
+    try:
+        lines = path.read_text(encoding="utf-8").splitlines()
+    except (OSError, UnicodeDecodeError):
+        return targets
+    for raw_line in lines:
+        if not raw_line or raw_line[:1].isspace():
+            continue
+        line = raw_line.split("#", 1)[0].rstrip()
+        if ":" not in line:
+            continue
+        target_part, rule_part = line.split(":", 1)
+        if not target_part.strip() or "=" in target_part:
+            continue
+        if rule_part.lstrip().startswith("="):
+            continue
+        for target in target_part.split():
+            if target and "%" not in target and not target.startswith("."):
+                targets.add(target)
+    return targets
+
+
+def root_just_recipes(root: Path) -> set[str]:
+    recipes: set[str] = set()
+    for name in JUSTFILE_NAMES:
+        path = root / name
+        if not path.exists():
+            continue
+        try:
+            lines = path.read_text(encoding="utf-8").splitlines()
+        except (OSError, UnicodeDecodeError):
+            continue
+        for raw_line in lines:
+            if not raw_line or raw_line[:1].isspace():
+                continue
+            line = raw_line.split("#", 1)[0].rstrip()
+            alias_match = re.match(r"alias\s+(?P<name>[\w.-]+)\s*:=", line)
+            if alias_match is not None:
+                recipes.add(alias_match.group("name"))
+                continue
+            if ":" not in line:
+                continue
+            recipe_part, rule_part = line.split(":", 1)
+            if not recipe_part.strip():
+                continue
+            if rule_part.lstrip().startswith("="):
+                continue
+            recipe_part = recipe_part.strip()
+            while recipe_part.startswith("[") and "]" in recipe_part:
+                recipe_part = recipe_part.split("]", 1)[1].strip()
+            if not recipe_part:
+                continue
+            recipe = recipe_part.split()[0].lstrip("@")
+            if recipe and not recipe.startswith("["):
+                recipes.add(recipe)
+    return recipes
+
+
 def missing_package_script_commands(root: Path, value: str | None) -> list[str]:
     if value is None:
         return []
@@ -463,6 +543,38 @@ def missing_package_script_commands(root: Path, value: str | None) -> list[str]:
     ]
 
 
+def missing_make_commands(root: Path, value: str | None) -> list[str]:
+    if value is None:
+        return []
+    commands = sorted(
+        {
+            normalize_command_target(match.group("target"))
+            for match in MAKE_COMMAND_RE.finditer(value)
+        }
+    )
+    if not commands:
+        return []
+
+    targets = root_make_targets(root)
+    return [f"make {target}" for target in commands if target not in targets]
+
+
+def missing_just_commands(root: Path, value: str | None) -> list[str]:
+    if value is None:
+        return []
+    commands = sorted(
+        {
+            normalize_command_target(match.group("recipe"))
+            for match in JUST_COMMAND_RE.finditer(value)
+        }
+    )
+    if not commands:
+        return []
+
+    recipes = root_just_recipes(root)
+    return [f"just {recipe}" for recipe in commands if recipe not in recipes]
+
+
 def says_no_failure_record(value: str | None) -> bool:
     if value is None:
         return False
@@ -626,6 +738,26 @@ def validate_adoption_report(root: Path, path: Path, text: str) -> list[Finding]
                 ),
             )
         )
+    for command in missing_make_commands(root, detection_value):
+        findings.append(
+            Finding(
+                path,
+                (
+                    "failure-memory detection references missing "
+                    f"Makefile target: {command}"
+                ),
+            )
+        )
+    for command in missing_just_commands(root, detection_value):
+        findings.append(
+            Finding(
+                path,
+                (
+                    "failure-memory detection references missing "
+                    f"justfile recipe: {command}"
+                ),
+            )
+        )
 
     return findings
 
@@ -646,7 +778,7 @@ def validate_effectiveness_report(path: Path, text: str) -> list[Finding]:
     return findings
 
 
-def validate_task_outcome(path: Path, text: str) -> list[Finding]:
+def validate_task_outcome(root: Path, path: Path, text: str) -> list[Finding]:
     report_include_value = yaml_field_value(text, "include_in_effectiveness_report")
     comparable_count_value = yaml_field_value(
         text, "include_in_comparable_product_task_count"
@@ -679,6 +811,22 @@ def validate_task_outcome(path: Path, text: str) -> list[Finding]:
             )
         )
 
+    verification_command = yaml_field_value(text, "verification_command")
+    for command in missing_make_commands(root, verification_command):
+        findings.append(
+            Finding(
+                path,
+                f"task outcome verification references missing Makefile target: {command}",
+            )
+        )
+    for command in missing_just_commands(root, verification_command):
+        findings.append(
+            Finding(
+                path,
+                f"task outcome verification references missing justfile recipe: {command}",
+            )
+        )
+
     truthy_include_fields = [
         field
         for field in TASK_OUTCOME_INCLUDE_FIELDS
@@ -790,7 +938,9 @@ def check_effectiveness_plan(root: Path, require_report: bool) -> int:
         findings.extend(validate_effectiveness_report(path, text))
 
     for path in iter_task_outcomes(root):
-        findings.extend(validate_task_outcome(path, path.read_text(encoding="utf-8")))
+        findings.extend(
+            validate_task_outcome(root, path, path.read_text(encoding="utf-8"))
+        )
 
     for finding in findings:
         print(f"{finding.path.relative_to(root)}: {finding.message}")
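
To make the target-extraction rules in `root_make_targets` easier to follow, here is a minimal sketch that applies the same line-level rules to Makefile text held in a string instead of a file on disk. The function name `make_targets_from_text` and the sample Makefile are hypothetical; the filtering steps are meant to mirror the ones in the diff above, not to replace them.

```python
def make_targets_from_text(makefile_text: str) -> set[str]:
    """Hypothetical helper: collect explicit target names from Makefile text."""
    targets: set[str] = set()
    for raw_line in makefile_text.splitlines():
        # Recipe lines start with a tab; blank lines carry no rule.
        if not raw_line or raw_line[:1].isspace():
            continue
        line = raw_line.split("#", 1)[0].rstrip()
        if ":" not in line:
            continue
        target_part, rule_part = line.split(":", 1)
        # Skip "VAR = a:b" style assignments and "VAR := value".
        if not target_part.strip() or "=" in target_part:
            continue
        if rule_part.lstrip().startswith("="):
            continue
        for target in target_part.split():
            # Pattern rules (%.o) and special targets (.PHONY) are not
            # commands a user would invoke by name.
            if "%" not in target and not target.startswith("."):
                targets.add(target)
    return targets


SAMPLE = """\
VERSION := 1.2.3
.PHONY: test lint
test lint: deps
\tpytest
%.o: %.c
\tcc -c $<
"""

print(sorted(make_targets_from_text(SAMPLE)))  # ['lint', 'test']
```

Note that `deps` is not reported: it appears only as a prerequisite, and the rules above collect names from the left-hand side of a rule. A reference to `make deps` would therefore be flagged as missing unless the Makefile also declares `deps` as a target.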
