Commit 05dccb3

Expound on full stage retry rationale in HISTORY.rst
1 parent 15f02e1 commit 05dccb3

2 files changed

Lines changed: 19 additions & 6 deletions

.github/workflows/pr.yml

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.10", "3.11", "3.12"]
+        python-version: ["3.10", "3.11", "3.12", "3.13"]
 
     steps:
       - uses: actions/checkout@v4

HISTORY.rst

Lines changed: 18 additions & 5 deletions
@@ -28,11 +28,24 @@ QA checklist hardening
 Full stage retry on QA exhaustion
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 * **Full stage retry when QA remediation fails** -- when QA remediation
-  exhausts all attempts for a stage, the build retries the entire stage
-  from scratch instead of stopping. Previous QA findings are injected
-  into the new generation prompt so the model avoids the same classes of
-  mistakes. Controlled by ``_MAX_FULL_STAGE_ATTEMPTS`` (default 2:
-  1 initial + 1 retry).
+  exhausts all attempts for a stage, the build now retries the entire
+  stage from scratch (clean artifacts, regenerate, QA) instead of
+  stopping the build immediately. Previous QA findings are injected
+  into the new generation prompt -- framed as guidance rather than
+  file-specific instructions -- so the model avoids the same classes
+  of mistakes on the fresh attempt.
+
+  In practice, the same generation prompt produces passing code ~90%
+  of the time. The remaining ~10% failure rate is stochastic -- not a
+  systematic prompt deficiency -- meaning a fresh generation with
+  knowledge of what went wrong almost always succeeds. Without this
+  retry, that 10% forces the user to manually re-run the entire build,
+  losing the progress of all previously generated stages. The retry
+  doubles the token cost of one stage in the worst case, but saves
+  the full cost of restarting a 16-stage build from scratch.
+
+  Controlled by ``_MAX_FULL_STAGE_ATTEMPTS`` (default 2: 1 initial
+  + 1 fresh retry).
 
 Generation prompt improvements
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
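
The retry behaviour described in the HISTORY.rst entry can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: only the name `_MAX_FULL_STAGE_ATTEMPTS` and its default of 2 come from the changelog. `StageResult`, `run_stage_with_full_retry`, and the `generate`, `run_qa`, and `clean` callables are hypothetical, and `run_qa` is assumed to already include the QA remediation attempts.

```python
from dataclasses import dataclass, field
from typing import Callable

# Name and default taken from HISTORY.rst; everything else is hypothetical.
_MAX_FULL_STAGE_ATTEMPTS = 2  # 1 initial + 1 fresh retry


@dataclass
class StageResult:
    """Outcome of QA (including remediation) for one generated stage."""
    passed: bool
    findings: list[str] = field(default_factory=list)


def run_stage_with_full_retry(
    generate: Callable[[list[str]], str],
    run_qa: Callable[[str], StageResult],
    clean: Callable[[], None],
    max_attempts: int = _MAX_FULL_STAGE_ATTEMPTS,
) -> tuple[bool, int, list[str]]:
    """Return (passed, attempts_used, findings_carried_forward)."""
    prior_findings: list[str] = []
    for attempt in range(1, max_attempts + 1):
        clean()  # discard artifacts from any earlier attempt
        # Earlier findings are passed in as guidance for the new prompt.
        artifact = generate(list(prior_findings))
        result = run_qa(artifact)
        if result.passed:
            return True, attempt, prior_findings
        prior_findings.extend(result.findings)
    # Every full-stage attempt failed; the caller decides how to stop.
    return False, max_attempts, prior_findings
```

Passing the findings as a plain list keeps the sketch neutral about how the guidance is worded in the prompt, which the changelog describes only as "guidance rather than file-specific instructions".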

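
The cost argument in the changelog can be made concrete with a small back-of-the-envelope calculation. The sketch below rests on assumptions the changelog does not state: that the ~10% failure rate applies per stage, that stages fail independently, and that a fresh retry fails at the same rate as the first attempt (the changelog suggests it does better, so this is pessimistic). The function name `build_failure_probability` is hypothetical.

```python
def build_failure_probability(
    stage_failure_rate: float, stages: int, attempts_per_stage: int
) -> float:
    """Chance that at least one stage fails every one of its attempts.

    Assumes independent stages and independent attempts at the same rate.
    """
    if not 0.0 <= stage_failure_rate <= 1.0:
        raise ValueError("stage_failure_rate must be within [0, 1]")
    if stages < 1 or attempts_per_stage < 1:
        raise ValueError("stages and attempts_per_stage must be >= 1")
    stage_gives_up = stage_failure_rate ** attempts_per_stage
    return 1.0 - (1.0 - stage_gives_up) ** stages


if __name__ == "__main__":
    # 16 stages as in the changelog; 1 attempt vs. 2 full-stage attempts.
    for attempts in (1, 2):
        p = build_failure_probability(0.10, 16, attempts)
        print(f"attempts per stage = {attempts}: build stops with p = {p:.3f}")
```

Under these assumptions a single extra attempt per stage sharply reduces how often a 16-stage build has to be restarted, at the price of at most one extra stage generation per failing stage.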