Bump skill turn caps (review 30→200, gen 500→1000) by rdimitrov · Pull Request #789 · stacklok/docs-website

rdimitrov · 2026-04-22T15:29:53Z

Context

PR #788 (Update stacklok/toolhive to v0.24.0) failed the augmentation run when `skill_review` hit the 30-turn cap mid-edit.

From run 24786157448:

| Session | Turns | Cost (USD) | Outcome |
| --- | --- | --- | --- |
| `skill_gen` | 67 | $4.72 | ✅ Success (under 500 cap), committed `42267da Document local vMCP CLI mode` |
| `skill_review` | 31 | $1.86 | ❌ Exit `error_max_turns` at turn 31 / 30 |

Review's failure cascaded the workflow to `failure`, skipping autofix and PR body augmentation — so PR #788 is stuck with an incomplete rendered body and no run-cost table, even though the skill's content work actually landed.
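To make the cascade concrete, here is a minimal sketch of how one capped session can fail the whole run. The `Session` type, both function names, and the `success` label are hypothetical and not taken from the workflow; only the `error_max_turns` subtype and the turn and cap numbers come from the run above.

```python
from dataclasses import dataclass


@dataclass
class Session:
    name: str
    turns: int  # turns reported for the session
    cap: int    # value passed as --max-turns


def session_outcome(session: Session) -> str:
    # Assumption: a session that reports more turns than its cap
    # ended with the error_max_turns subtype.
    return "error_max_turns" if session.turns > session.cap else "success"


def workflow_outcome(sessions: list[Session]) -> str:
    # Assumption: any failed session fails the whole workflow, which is
    # what skipped autofix and PR body augmentation in the run above.
    if all(session_outcome(s) == "success" for s in sessions):
        return "success"
    return "failure"


run = [
    Session("skill_gen", turns=67, cap=500),
    Session("skill_review", turns=31, cap=30),
]

for s in run:
    print(s.name, session_outcome(s))
print("workflow:", workflow_outcome(run))
```

The sketch says nothing about how many turns the review would have needed to finish, since the session was cut off mid-edit.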

Why the cap was too tight

My original review cap of 30 was sized against silent / no-changes release runs (baseline 4-6 turns). For a real multi-file content release like v0.24.0, the editorial pass walks each edited file and makes tightening edits, so ~30-100 turns is a legitimate working range.

Changes

| Session | Old | New | Headroom rationale |
| --- | --- | --- | --- |
| `skill_review` | 30 | 200 | 2x-6x over observed 30-100 working range |
| `skill_gen` | 500 | 1000 | Defensive doubling; 500 never hit in production, but 1000 keeps us well clear of the 397-turn v3-test anomaly |

Caps still clip genuine runaways loudly; we raise them deliberately if a release ever genuinely needs more.
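The headroom figures can be checked with a quick calculation. This is a minimal sketch: `headroom` is a hypothetical helper, and the inputs are the caps and turn counts quoted in this PR.

```python
def headroom(cap: int, observed_turns: int) -> float:
    """Return how many times the observed turn count fits under the cap."""
    return cap / observed_turns


# skill_review: new cap of 200 against the 30-100 working range
review_vs_upper = headroom(200, 100)
review_vs_lower = headroom(200, 30)

# skill_gen: new cap of 1000 against the 397-turn v3-test anomaly
gen_vs_anomaly = headroom(1000, 397)

print(f"review headroom: {review_vs_upper:.2f}x to {review_vs_lower:.2f}x")
print(f"gen headroom over anomaly: {gen_vs_anomaly:.2f}x")
```

Against the lower end of the working range the ratio is closer to 6.7x, so the "2x-6x" in the table rounds the upper figure down.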

Follow-up for PR #788

Once this merges, PR #788 needs a retry via `gh workflow run upstream-release-docs.yml -f pr_number=788` to complete the augmentation step. The existing skill content commit (`42267da`) stays; retry re-runs everything but should now finish review + autofix + body augmentation cleanly.

PR #788 (Update stacklok/toolhive to v0.24.0) failed the run when skill_review hit the 30-turn cap mid-edit.

Details from run 24786157448:

- skill_gen: 67 turns / $4.72 (under 500 cap) -- succeeded, committed `42267da Document local vMCP CLI mode`
- skill_review: 31 turns / $1.86 -- exited with subtype `error_max_turns`, cascading workflow to `failure`, which skipped autofix and PR body augmentation

My initial cap of 30 for review was sized against silent/no-changes releases (4-6 turns baseline). For a real multi-file content review the editorial pass walks each edited file and makes tightening edits; ~30-100 turns is legitimate working range.

Changes:

- skill_review: 30 -> 200 (2x-6x headroom over working range)
- skill_gen: 500 -> 1000 (defensive doubling; 500 was never hit in production, but 1000 keeps us well clear of the 397-turn v3-test anomaly)

Hitting either cap still produces a loud failure; raise deliberately if a release genuinely needs more.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel · 2026-04-22T15:29:59Z

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs-website | Ready | Preview, Comment | Apr 22, 2026 3:30pm |

Copilot

Pull request overview

Adjusts the upstream-release-docs GitHub Actions workflow to reduce failures from hitting Claude Code session turn caps during larger, multi-file release doc updates (notably the review pass that previously capped at 30 turns).

Changes:

- Increase `skill_gen` (generation) `--max-turns` from 500 to 1000.
- Increase `skill_review` (editorial review) `--max-turns` from 30 to 200.
- Update inline workflow comments documenting the rationale for the new caps.

Copilot · 2026-04-22T15:32:51Z

+          # --max-turns 1000: observed gen baselines are 20 turns
+          # (silent) to 152 (full content rebuild). 500 was the
+          # initial cap; bumped to 1000 for extra headroom on
+          # multi-feature releases and to stay well above the
+          # suspected-looping 397-turn v3-test run (still clips
+          # genuine runaways). Hitting the cap produces a loud
+          # failure -- raise deliberately if a release needs more.


The updated rationale mixes two different “observed baseline” ranges (20–152) while also referencing a 397-turn run. This is internally inconsistent and makes it hard to interpret the headroom goal. Consider rephrasing to separate “typical/observed” from “anomalous” runs (or update the baseline range to include the 397 outlier if it’s being treated as legitimate).

Suggested change

-          # --max-turns 1000: observed gen baselines are 20 turns
-          # (silent) to 152 (full content rebuild). 500 was the
-          # initial cap; bumped to 1000 for extra headroom on
-          # multi-feature releases and to stay well above the
-          # suspected-looping 397-turn v3-test run (still clips
-          # genuine runaways). Hitting the cap produces a loud
-          # failure -- raise deliberately if a release needs more.
+          # --max-turns 1000: typical observed gen runs span 20
+          # turns (silent) to 152 (full content rebuild). 500 was
+          # the initial cap; bumped to 1000 for extra headroom on
+          # multi-feature releases. Separately, a suspected-looping
+          # v3-test run reached 397 turns, so 1000 stays well above
+          # that anomalous case while still clipping genuine
+          # runaways. Hitting the cap produces a loud failure --
+          # raise deliberately if a release needs more.

Copilot · 2026-04-22T15:32:51Z

+          # --max-turns 200: initial cap of 30 was sized against
+          # silent-release baselines (4-6 turns) and was too tight
+          # for real content reviews. v0.24.0 (PR #788) hit it at
+          # turn 31 mid-review and failed the run; the editorial
+          # pass genuinely needs ~30-100 turns to walk a multi-
+          # file content PR. 200 gives 2x-6x headroom over that
+          # working range while still clipping a runaway.


The new --max-turns 200 rationale says the editorial pass can legitimately take ~30–100 turns, but an earlier nearby comment still states review is “4–5 turns in every run so far” and refers to “10 / 30” buffers. This creates conflicting documentation in the same step; please update the earlier comment (or remove/replace it) so the timeout/turn-cap guidance is consistent.

Copilot AI review requested due to automatic review settings April 22, 2026 15:29

rdimitrov enabled auto-merge (squash) April 22, 2026 15:30

vercel Bot deployed to Preview April 22, 2026 15:30 View deployment

ChrisJBurns approved these changes Apr 22, 2026


rdimitrov merged commit 41520bc into main Apr 22, 2026
5 checks passed

rdimitrov deleted the bump-skill-turn-caps branch April 22, 2026 15:30

Copilot AI reviewed Apr 22, 2026


rdimitrov mentioned this pull request Apr 22, 2026

Gitignore SUMMARY.md signal file #790

Merged
