fix(ci): improve Agentic CI daily audit reliability #632
…ntic-ci-docs-audit-turn-limit
Code Review: PR #632 (fix(ci): improve Agentic CI daily audit reliability)
Summary
This PR is a focused reliability fix for the Agentic CI daily audit workflow. It does five things:
Scope is correct: only
Findings
Code correctness
Project conventions
Risks / things to watch
Test coverage
Performance / security
Suggestions (non-blocking)
Verdict
Approve. Targeted reliability fixes with a clear paper trail (the "Recent failure scan" section maps each change to a specific historical failure). The risks are well-contained: scope is limited to CI configuration, no product code is touched, and the failure modes added by batching are explicitly bounded (3-file cap, atomic abandon). The advisory-only nature of the in-recipe turn budgets is the weakest part, but pairing them with the early-partial-report pattern means even a non-compliant agent run produces a usable artifact, which is the actual reliability goal.
Greptile Summary
This PR improves reliability of the Agentic CI daily audit workflows by fixing a recipe/workflow
| Filename | Overview |
|---|---|
| .github/workflows/agentic-ci-daily.yml | Adds per-recipe max_turns parsing via awk + grep pipeline with inline-comment and quoted-value safety; fix phase correctly retains hardcoded 50; curl pre-flight raised to 30s. |
| .github/workflows/agentic-ci-issue-triage.yml | Single-line change: curl pre-flight timeout raised from 10s to 30s, consistent with daily and PR-review workflows. |
| .github/workflows/agentic-ci-pr-review.yml | Single-line change: curl pre-flight timeout raised from 10s to 30s, aligned with other workflow files. |
| .agents/recipes/_fix-policy.md | Adds batching support: step 4.1 collects siblings, step 4.2 re-verifies and removes stale primary/siblings, crash recovery now parses multiple markers; all internal references are consistent. |
| .agents/recipes/_phase-fix.md | Step-number reference to _fix-policy.md removed (making it resilient to future renumbering); batch PR recording guidance added, consistent with _fix-policy.md. |
| .agents/recipes/_runner.md | Added 'No subagents' rule to prevent CI failures from delegated agent model-access errors. |
| .agents/recipes/structure/recipe.md | max_turns raised to 50; missing-future category opted into batching with 3-file cap, same-test-target grouping, and one marker+entry per file — consistent with _fix-policy.md. |
| .agents/recipes/code-quality/recipe.md | max_turns raised from 30 to 50 based on recent successful run history (31 turns used). |
| .agents/recipes/docs-and-references/recipe.md | New turn-budget section added: writes partial report immediately, stops after 20 tool calls or 2 new findings per section, ensuring a usable artifact even if interrupted. |
| .agents/recipes/test-health/recipe.md | Same turn-budget section added as docs-and-references: early partial report, bounded sampling, and explicit stop conditions. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Daily workflow triggered] --> B[Select recipe suite]
B --> C[Parse max_turns from recipe frontmatter\nawk + grep pipeline]
C --> D{max_turns found?}
D -- No --> E[Default to 50]
D -- Yes --> F[Use recipe value]
E --> G[Run audit phase\nclaude --max-turns MAX_TURNS]
F --> G
G --> H{Audit success?}
H -- No --> Z[End]
H -- Yes --> I[Check fix_backlog size]
I --> J{Backlog > 0\nand suite eligible?}
J -- No --> Z
J -- Yes --> K[Snapshot attempted_fixes]
K --> L[Run fix phase\nclaude --max-turns 50]
L --> M[Select primary candidate]
M --> N{Category batchable?}
N -- Yes --> O[Collect siblings with same\ntest_target, batch <= 3]
N -- No --> P[Single finding]
O --> Q[Re-verify all findings still apply]
P --> Q
Q --> R{All valid?}
R -- Primary stale --> S[Remove primary from fix_backlog\nnext candidate]
R -- Sibling stale --> T[Remove sibling from fix_backlog\ncontinue with smaller batch]
T --> R
S --> M
R -- All valid --> U[Apply fix / batch]
U --> V[Run package tests]
V --> W[Push branch, open PR\none hidden marker per finding]
W --> X[Record one attempted_fixes entry\nper fixed finding]
X --> Y[Validate fix scope gate]
Y --> Z[End]
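The batching branch of the flowchart above (collect siblings with the same test_target, batch <= 3) can be sketched in shell. This is only an illustration under stated assumptions: the `select_batch` name, the `<finding_id> <test_target>` input format, and the sample IDs and paths are hypothetical, and the re-verification step that drops stale findings is not modeled.

```shell
# Hypothetical sketch: choose a batch for a batchable category.
# stdin: one finding per line as "<finding_id> <test_target>", primary first.
# $1:    batch cap (the structure recipe described here caps batches at 3 files).
select_batch() {
  awk -v cap="${1:-3}" '
    NR == 1 { target = $2 }   # the primary finding decides the test target
    $2 == target && picked < cap { print $1; picked++ }
  '
}

# Illustrative backlog; IDs and package paths are made up.
printf '%s\n' \
  'F-101 packages/core' \
  'F-102 packages/cli' \
  'F-103 packages/core' \
  'F-104 packages/core' \
  'F-105 packages/core' | select_batch 3
# prints F-101, F-103 and F-104, one per line
```

F-102 is skipped because it belongs to a different test target, and F-105 is left in the backlog because the cap is already reached.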
Reviews (5): Last reviewed commit: "fix(ci): harden agentic max turns parsin..."
Thanks for putting this together, @andreatgretel!
Summary
This tightens Agentic CI budgets and run instructions, makes daily audit
Findings
Suggestions (take it or leave it)
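# Read max_turns from the recipe's YAML frontmatter (the block between the first two --- lines).
MAX_TURNS=$(awk -F': *' '
  /^---$/ { section++; next }                          # count frontmatter delimiters
  section == 1 && $1 == "max_turns" { print $2; exit } # raw value, possibly quoted or commented
  section == 2 { exit }                                # stop once the frontmatter has closed
' "${RECIPE_DIR}/recipe.md" | grep -oE '[0-9]+' | head -n1 || true)  # keep only the first integer
# Fall back to 50 when the key is missing or carries no digits.
MAX_TURNS=${MAX_TURNS:-50}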
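To see how the suggested pipeline treats the edge cases it is meant to survive, here is a small self-contained check. It is a sketch, not part of the workflow: the `parse_max_turns` wrapper and the temporary recipe files are hypothetical and exist only for this demonstration.

```shell
# Hypothetical wrapper around the suggested pipeline.
# $1: path to a recipe file with YAML frontmatter between two --- lines.
parse_max_turns() {
  turns=$(awk -F': *' '
    /^---$/ { section++; next }
    section == 1 && $1 == "max_turns" { print $2; exit }
    section == 2 { exit }
  ' "$1" | grep -oE '[0-9]+' | head -n1 || true)
  echo "${turns:-50}"
}

# Throwaway sample recipes covering the three interesting cases.
demo_dir=$(mktemp -d)
printf '%s\n' '---' 'name: demo' 'max_turns: 30 # inline comment' '---' 'body' > "$demo_dir/commented.md"
printf '%s\n' '---' 'max_turns: "40"' '---' 'max_turns: 99' > "$demo_dir/quoted.md"
printf '%s\n' '---' 'name: no-budget' '---' > "$demo_dir/missing.md"

parse_max_turns "$demo_dir/commented.md"   # prints 30 (inline comment ignored)
parse_max_turns "$demo_dir/quoted.md"      # prints 40 (quotes stripped, body line ignored)
parse_max_turns "$demo_dir/missing.md"     # prints 50 (default)
```

The second file also shows why the `section == 2 { exit }` rule matters: a `max_turns:` line in the recipe body is never read.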
What Looks Good
Verdict
Ship it (with nits). No blocking issues. The suggestions above are small hardening/clarity improvements. This review was generated by an AI assistant.
Implemented the two review suggestions and pushed them in
Summary
Improves Agentic CI reliability in focused places:
- … max_turns.
- … they run out of turns.
- … successful run history.
- … inaccessible model.
- … staying under the existing scope gates.
- … workflows still used the old shorter probe.
Changes
- Changed .github/workflows/agentic-ci-daily.yml to read max_turns from recipe frontmatter instead of always passing 50 to Claude.
- Hardened max_turns parsing so inline comments or quoted YAML values do not break claude --max-turns.
- Raised curl --max-time from 10s to 30s in daily, repository triage, and PR review workflows.
- … .agents/recipes/docs-and-references/recipe.md so it writes a partial report early and samples bounded docs/source sets.
- … .agents/recipes/test-health/recipe.md.
- Raised structure and code-quality recipe budgets to 50 after recent successful runs used 34 and 31 turns respectively.
- … .agents/recipes/_runner.md to keep CI recipes in the main agent session instead of delegated/local agents.
- … .agents/recipes/_fix-policy.md and .agents/recipes/_phase-fix.md to allow suite-declared batchable mechanical fixes.
- Opted structure / missing-future into batching in .agents/recipes/structure/recipe.md, capped batches at 3 files, and documented batch grouping by package test target.
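For readers unfamiliar with the pre-flight pattern, the `curl --max-time` change amounts to giving the probe a longer overall deadline before the workflow gives up. A minimal sketch, assuming a single health endpoint: the `preflight` name and the URL in the usage note are hypothetical, and the real workflows may pass different flags.

```shell
# Hypothetical pre-flight probe; not copied from the workflows.
# $1: endpoint to probe
# $2: overall time limit in seconds (defaults to 30, the value this PR moves to)
preflight() {
  curl --silent --show-error --fail --output /dev/null \
    --max-time "${2:-30}" "$1"
}
```

A workflow step could then call something like `preflight "https://api.example.invalid/health" 30 || exit 1`, so a slow but healthy endpoint that answers in 10 to 30 seconds no longer fails the run.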
Why
- … daily audit execution.
- … exploring and leave no useful artifact.
- … effective budget below recent successful runs.
- … CI key cannot access, then the parent agent keeps running until max turns.
- … the combined diff still satisfies the localized-fix bar.
- … responds slower than 10 seconds.
Recent failure scan
- … passed on retry. Covered by 30s daily pre-flight.
- … task model, and the main recipe later hit 50 turns with no report. Covered by no-subagent runner guidance plus docs turn-budget changes.
- … error_max_turns after 50 turns with no report. Covered by docs turn-budget changes and recipe max_turns enforcement.
- … Reached max turns (30) with no report. Covered by test-health turn-budget changes.
- … during the old model/config period. Current health-probe covers CLI/model compatibility; this PR also aligns triage timeout with the health probe.
Claude review
Claude review found no blocking issues. Follow-ups addressed in this PR:
- … max_turns parsing
Validation
- make install-dev
- .venv/bin/ruff check --fix .
- .venv/bin/ruff format .
- … max_turns values for all recipes.
- git diff --check
- … conflict, mixed line ending. Ruff hooks skipped on the latest commits because no Python files changed.