Commit dc643db

volinskey and claude committed
ci(publish): idempotent + transient-tolerant npm publish steps
A lockstep publish (v2.25.0) stranded mid-flight: run402-mcp published, then `run402` hit a transient sigstore rekor TLOG timeout (TLOG_CREATE_ENTRY_ERROR), failing the run before run402/@run402/sdk published and before the version-bump commit/tag. A naive re-run would 409 on the already-published run402-mcp.

Each publish step now: retries transient provenance/network errors (rekor TLOG, ECONNRESET, socket hang up) up to 3x, and treats "already published at this version" as success, so a re-run after a partial publish completes the remaining packages at the same lockstep version instead of failing on the one that already landed. Any other error still fails loudly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent a463b49 commit dc643db
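
The three outcomes described in the commit message (already published, transient, anything else) can be sketched as a standalone classifier. This is a minimal illustration, not code from the commit: the function name `classify_publish_output` and the variables `ALREADY` and `TRANSIENT` are hypothetical, and the sample inputs are illustrative strings rather than captured logs. The two patterns are the ones passed to `grep -qiE` in the workflow.

```shell
#!/bin/sh
# Sketch: classify captured `npm publish` output into one of three outcomes.
# classify_publish_output, ALREADY and TRANSIENT are hypothetical names; the
# patterns are copied from the grep calls in this commit's workflow.
ALREADY='cannot publish over|previously published|EPUBLISHCONFLICT'
TRANSIENT='TLOG_CREATE_ENTRY_ERROR|rekor\.sigstore\.dev|tlog entry|ETIMEDOUT|ECONNRESET|socket hang up|aborted'

classify_publish_output() {
  # Order matters: the "already published" check runs first, as in the workflow.
  if printf '%s\n' "$1" | grep -qiE "$ALREADY"; then
    echo already-published   # treated as success
  elif printf '%s\n' "$1" | grep -qiE "$TRANSIENT"; then
    echo transient           # retried, up to 3 attempts in the workflow
  else
    echo fatal               # fails the step immediately
  fi
}

# Illustrative inputs (not captured logs):
classify_publish_output "TLOG_CREATE_ENTRY_ERROR"
classify_publish_output "You cannot publish over the previously published versions"
classify_publish_output "some unrelated failure"
```

Run as written, the three calls print `transient`, `already-published` and `fatal`. Keeping the classification separate from the retry loop would make the patterns checkable without touching the registry.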

1 file changed

Lines changed: 71 additions & 9 deletions

.github/workflows/publish.yml

@@ -187,23 +187,85 @@ jobs:
           (cd "$SMOKE/sdk/package" && node -e "import('./dist/node/index.js').then(m => { if (typeof m.run402 !== 'function') { console.error('node run402 export is not a function'); process.exit(1); } console.log('SDK /node smoke OK'); }).catch(e => { console.error('Import failed:', e.message); process.exit(1); })")
           (cd "$SMOKE/sdk/package" && node -e "import('./dist/index.js').then(m => { if (typeof m.Run402 !== 'function') { console.error('iso Run402 export is not a function'); process.exit(1); } console.log('SDK iso smoke OK'); }).catch(e => { console.error('Import failed:', e.message); process.exit(1); })")

+      # Idempotent + transient-tolerant publish. --provenance is optional under
+      # OIDC (provenance is generated implicitly) but makes intent explicit and
+      # fails loudly if OIDC isn't wired. The helper:
+      # - retries transient provenance/network errors (sigstore rekor TLOG
+      # timeouts, ECONNRESET) up to 3x — these stranded a lockstep publish
+      # mid-flight once (run402-mcp published, run402 hit a rekor abort);
+      # - treats "already published at this version" as success, so a re-run
+      # after a partial publish completes the remaining packages and reaches
+      # a consistent lockstep version instead of 409-ing on the one that
+      # already landed;
+      # - fails loudly on any other error.
       - name: Publish run402-mcp to npm
         if: ${{ !inputs.dry_run }}
-        # --provenance is optional under OIDC (provenance is generated
-        # implicitly), but the flag makes the intent explicit and would
-        # cause the publish to fail loudly if OIDC isn't actually wired
-        # — which is exactly what we want during the first few CI runs.
-        # --access public is redundant for these public packages but
-        # harmless and self-documenting.
-        run: npm publish --access public --provenance
+        run: |
+          publish_idempotent() {
+            local label="$1"; local attempt=0 out
+            while [ "$attempt" -lt 3 ]; do
+              attempt=$((attempt + 1))
+              if out=$(npm publish --access public --provenance 2>&1); then
+                echo "$out"; echo "[$label] published (attempt $attempt)"; return 0
+              fi
+              echo "$out"
+              if echo "$out" | grep -qiE "cannot publish over|previously published|EPUBLISHCONFLICT"; then
+                echo "[$label] already published at this version — idempotent skip."; return 0
+              fi
+              if echo "$out" | grep -qiE "TLOG_CREATE_ENTRY_ERROR|rekor\.sigstore\.dev|tlog entry|ETIMEDOUT|ECONNRESET|socket hang up|aborted"; then
+                echo "[$label] transient provenance/network error — retry $attempt/3"; sleep 10; continue
+              fi
+              echo "[$label] non-retryable publish error."; return 1
+            done
+            echo "[$label] exhausted retries."; return 1
+          }
+          publish_idempotent run402-mcp

       - name: Publish run402 to npm
         if: ${{ !inputs.dry_run }}
-        run: cd cli && npm publish --access public --provenance
+        run: |
+          publish_idempotent() {
+            local label="$1"; local attempt=0 out
+            while [ "$attempt" -lt 3 ]; do
+              attempt=$((attempt + 1))
+              if out=$( (cd cli && npm publish --access public --provenance) 2>&1); then
+                echo "$out"; echo "[$label] published (attempt $attempt)"; return 0
+              fi
+              echo "$out"
+              if echo "$out" | grep -qiE "cannot publish over|previously published|EPUBLISHCONFLICT"; then
+                echo "[$label] already published at this version — idempotent skip."; return 0
+              fi
+              if echo "$out" | grep -qiE "TLOG_CREATE_ENTRY_ERROR|rekor\.sigstore\.dev|tlog entry|ETIMEDOUT|ECONNRESET|socket hang up|aborted"; then
+                echo "[$label] transient provenance/network error — retry $attempt/3"; sleep 10; continue
+              fi
+              echo "[$label] non-retryable publish error."; return 1
+            done
+            echo "[$label] exhausted retries."; return 1
+          }
+          publish_idempotent run402

       - name: Publish @run402/sdk to npm
         if: ${{ !inputs.dry_run }}
-        run: cd sdk && npm publish --access public --provenance
+        run: |
+          publish_idempotent() {
+            local label="$1"; local attempt=0 out
+            while [ "$attempt" -lt 3 ]; do
+              attempt=$((attempt + 1))
+              if out=$( (cd sdk && npm publish --access public --provenance) 2>&1); then
+                echo "$out"; echo "[$label] published (attempt $attempt)"; return 0
+              fi
+              echo "$out"
+              if echo "$out" | grep -qiE "cannot publish over|previously published|EPUBLISHCONFLICT"; then
+                echo "[$label] already published at this version — idempotent skip."; return 0
+              fi
+              if echo "$out" | grep -qiE "TLOG_CREATE_ENTRY_ERROR|rekor\.sigstore\.dev|tlog entry|ETIMEDOUT|ECONNRESET|socket hang up|aborted"; then
+                echo "[$label] transient provenance/network error — retry $attempt/3"; sleep 10; continue
+              fi
+              echo "[$label] non-retryable publish error."; return 1
+            done
+            echo "[$label] exhausted retries."; return 1
+          }
+          publish_idempotent "@run402/sdk"

       - name: Commit version bump
         if: ${{ !inputs.dry_run }}
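
The helper is defined three times in this diff, and the copies differ only in the directory they publish from. One possible consolidation, shown here as a sketch and not as part of the commit, is a single script that takes the directory as a second argument. The directory argument, the `PUBLISH_RETRY_DELAY` variable and the idea of keeping this in a checked-in script are all assumptions; the patterns and the limit of 3 attempts are taken from the diff.

```shell
#!/bin/sh
# Sketch of a single shared helper; not part of this commit.
# Hypothetical additions: the second (directory) argument and the
# PUBLISH_RETRY_DELAY variable. Patterns and attempt limit match the diff.
publish_idempotent() {
  label="$1"; dir="${2:-.}"; attempt=0
  while [ "$attempt" -lt 3 ]; do
    attempt=$((attempt + 1))
    # Publish from a subshell so the cd does not leak to the caller.
    if out=$( (cd "$dir" && npm publish --access public --provenance) 2>&1 ); then
      echo "$out"; echo "[$label] published (attempt $attempt)"; return 0
    fi
    echo "$out"
    # Already on the registry at this version: treat as success.
    if echo "$out" | grep -qiE "cannot publish over|previously published|EPUBLISHCONFLICT"; then
      echo "[$label] already published at this version, skipping."; return 0
    fi
    # Transient provenance/network failure: wait, then try again.
    if echo "$out" | grep -qiE "TLOG_CREATE_ENTRY_ERROR|rekor\.sigstore\.dev|tlog entry|ETIMEDOUT|ECONNRESET|socket hang up|aborted"; then
      echo "[$label] transient provenance/network error (attempt $attempt/3)"
      sleep "${PUBLISH_RETRY_DELAY:-10}"; continue
    fi
    echo "[$label] non-retryable publish error."; return 1
  done
  echo "[$label] exhausted retries."; return 1
}

# When invoked as a script with arguments, publish once with them.
if [ "$#" -gt 0 ]; then publish_idempotent "$@"; fi
```

Under these assumptions, each step's `run:` would shrink to one invocation of the script with a label and a directory, for example `run402 cli` or `@run402/sdk sdk`. The trade-off is that the retry logic would then live outside `publish.yml`, so the workflow file would no longer be self-contained.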
