From 4e9fbc7662e6c80b5c9aabb0857a4bc48bd4e940 Mon Sep 17 00:00:00 2001
From: Radoslav Dimitrov
Date: Wed, 22 Apr 2026 18:29:29 +0300
Subject: [PATCH] Bump skill turn caps after hitting review limit on v0.24.0

PR #788 (Update stacklok/toolhive to v0.24.0) failed the run when
skill_review hit the 30-turn cap mid-edit. Details from run
24786157448:

- skill_gen: 67 turns / $4.72 (under 500 cap) -- succeeded, committed
  `42267da Document local vMCP CLI mode`
- skill_review: 31 turns / $1.86 -- exited with subtype
  `error_max_turns`, cascading workflow to `failure`, which skipped
  autofix and PR body augmentation

My initial cap of 30 for review was sized against silent/no-changes
releases (4-6 turns baseline). For a real multi-file content review
the editorial pass walks each edited file and makes tightening edits;
~30-100 turns is legitimate working range.

Changes:
- skill_review: 30 -> 200 (2x-6x headroom over working range)
- skill_gen: 500 -> 1000 (defensive doubling; 500 was never hit in
  production, but 1000 keeps us well clear of the 397-turn v3-test
  anomaly)

Hitting either cap still produces a loud failure; raise deliberately
if a release genuinely needs more.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .github/workflows/upstream-release-docs.yml | 28 +++++++++++++--------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/.github/workflows/upstream-release-docs.yml b/.github/workflows/upstream-release-docs.yml
index 95299582..0baf5097 100644
--- a/.github/workflows/upstream-release-docs.yml
+++ b/.github/workflows/upstream-release-docs.yml
@@ -548,14 +548,16 @@ jobs:
           # authors write at open time. GH_TOKEN for auth is already
           # in the job env at the top of this workflow.
           #
-          # --max-turns 500: observed gen baselines are 89 turns
-          # (silent) to 397 (full content rebuild). 500 gives headroom
-          # over the worst legitimate run, while clipping a genuine
-          # runaway before it spirals. Hitting the cap produces a
-          # loud failure -- raise deliberately if a release needs more.
+          # --max-turns 1000: observed gen baselines are 20 turns
+          # (silent) to 152 (full content rebuild). 500 was the
+          # initial cap; bumped to 1000 for extra headroom on
+          # multi-feature releases and to stay well above the
+          # suspected-looping 397-turn v3-test run (still clips
+          # genuine runaways). Hitting the cap produces a loud
+          # failure -- raise deliberately if a release needs more.
           claude_args: |
             --model claude-opus-4-7
-            --max-turns 500
+            --max-turns 1000
             --allowed-tools "Bash(gh:*)"
           prompt: |
             You are running in GitHub Actions with no interactive user. Follow
@@ -738,12 +740,18 @@ jobs:
           display_report: true
           # gh access parallels skill_gen so the review pass can
           # re-verify claims against PR descriptions and linked
-          # issues if needed. --max-turns 30 is 6x the 4-5-turn
-          # baseline; if review ever needs more, the cap fails
-          # loudly and we raise it.
+          # issues if needed.
+          #
+          # --max-turns 200: initial cap of 30 was sized against
+          # silent-release baselines (4-6 turns) and was too tight
+          # for real content reviews. v0.24.0 (PR #788) hit it at
+          # turn 31 mid-review and failed the run; the editorial
+          # pass genuinely needs ~30-100 turns to walk a multi-
+          # file content PR. 200 gives 2x-6x headroom over that
+          # working range while still clipping a runaway.
           claude_args: |
             --model claude-opus-4-7
-            --max-turns 30
+            --max-turns 200
             --allowed-tools "Bash(gh:*)"
           prompt: |
             You are running in GitHub Actions with no interactive user. Follow