Commit 06063e0

Gvieve and Genevieve Nuebel authored
Fix publish and release (#79)
Co-authored-by: Genevieve Nuebel <genevieve.nuebel@mx.com>
1 parent 84f96c0 commit 06063e0

4 files changed

Lines changed: 105 additions & 89 deletions

.github/workflows/on-push-master.yml

Lines changed: 6 additions & 10 deletions
@@ -65,20 +65,16 @@ jobs:
       version_directory: v20111101
     secrets: inherit
 
-  # Gate job to handle serial ordering when v20111101 is modified
-  # Depends on release-v20111101 to enforce serial ordering (waits for v20111101 to complete)
-  # Uses always() to continue even when release-v20111101 is skipped (when v20111101 wasn't modified)
-  # This ensures: v20111101 publishes first (serially) → then v20250224 publishes (serially)
-  gate-v20111101-complete:
+  delay-for-v20250224:
     runs-on: ubuntu-latest
-    needs: [check-skip-publish, detect-changes, release-v20111101]
-    if: always() && needs.check-skip-publish.outputs.skip_publish == 'false'
+    needs: [check-skip-publish, detect-changes]
+    if: needs.check-skip-publish.outputs.skip_publish == 'false'
     steps:
-      - name: Gate reached - v20111101 release complete (or skipped)
-        run: echo "Ready to proceed with v20250224 publication"
+      - name: Brief delay to stagger v20250224 publish
+        run: sleep 2
 
   publish-v20250224:
-    needs: [check-skip-publish, detect-changes, gate-v20111101-complete]
+    needs: [check-skip-publish, detect-changes, delay-for-v20250224]
     if: needs.check-skip-publish.outputs.skip_publish == 'false' && needs.detect-changes.outputs.v20250224 == 'true'
     uses: ./.github/workflows/publish.yml
     with:
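
The dependency change in this hunk can be modeled as a small graph to see when each job becomes eligible to start. The sketch below is illustrative only: the durations are made-up values, `check-skip-publish` and `detect-changes` are assumed to have no prerequisites of their own, and runner start-up time is ignored.

```python
# Job graph after this commit: job name -> (needs, assumed duration in seconds).
# All durations are hypothetical; only the `needs` lists for the delay and
# v20250224 publish jobs come from the hunk above.
JOBS = {
    "check-skip-publish": ([], 5),
    "detect-changes": ([], 5),
    "publish-v20111101": (["check-skip-publish", "detect-changes"], 60),
    "delay-for-v20250224": (["check-skip-publish", "detect-changes"], 2),
    "publish-v20250224": (
        ["check-skip-publish", "detect-changes", "delay-for-v20250224"],
        60,
    ),
}


def earliest_start(job: str, jobs: dict = JOBS) -> int:
    """Earliest second at which `job` can start: when its last prerequisite ends."""
    needs, _duration = jobs[job]
    return max((earliest_start(n, jobs) + jobs[n][1] for n in needs), default=0)


if __name__ == "__main__":
    for name in ("publish-v20111101", "delay-for-v20250224", "publish-v20250224"):
        print(f"{name}: eligible at t={earliest_start(name)}s")
```

With these assumed numbers, `publish-v20250224` becomes eligible 2 seconds after `publish-v20111101`, and its start time is set by `delay-for-v20250224` rather than by any v20111101 job.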

docs/Adding-a-New-API-Version.md

Lines changed: 25 additions & 23 deletions
@@ -2,7 +2,7 @@
 
 **Document Purpose**: Step-by-step guide for adding support for a new API version (e.g., `v20300101`) to the mx-platform-node repository.
 
-**Last Updated**: January 28, 2026
+**Last Updated**: January 29, 2026
 **Time to Complete**: 30-45 minutes
 **Prerequisites**: Familiarity with the multi-version architecture (see [Multi-Version-SDK-Flow.md](Multi-Version-SDK-Flow.md))
 
@@ -238,15 +238,15 @@ Add a new publish job for your version (copy and modify the existing v20250224 j
 
 ```yaml
 publish-v20300101:
-  needs: [check-skip-publish, detect-changes, gate-v20250224-complete]
+  needs: [check-skip-publish, detect-changes, delay-for-v20300101]
   if: needs.check-skip-publish.outputs.skip_publish == 'false' && needs.detect-changes.outputs.v20300101 == 'true'
   uses: ./.github/workflows/publish.yml
   with:
     version_directory: v20300101
   secrets: inherit
 ```
 
-**Important**: The `needs` array must include the **previous version's gate job** to enforce serial ordering. This ensures v20250224 finishes before v20300101 starts publishing.
+**Important**: The `needs` array must include the **delay job for this version** to enforce staggered publishing. This creates a small delay before your version starts publishing, ensuring previous versions get first chance at npm registry.
 
 **Location 4: Add release job for new version**
 
@@ -262,38 +262,40 @@ release-v20300101:
   secrets: inherit
 ```
 
-**Location 5: Add gate job for previous version**
+**Location 5: Add delay job for new version**
 
-Add a new gate job after the previous version's release to handle serial ordering:
+Add a new delay job before the publish job to create staggered publishing:
 
 ```yaml
-gate-v20250224-complete:
+delay-for-v20300101:
   runs-on: ubuntu-latest
-  needs: [check-skip-publish, detect-changes, release-v20250224]
-  if: always() && needs.check-skip-publish.outputs.skip_publish == 'false'
+  needs: [check-skip-publish, detect-changes]
+  if: needs.check-skip-publish.outputs.skip_publish == 'false'
   steps:
-    - name: Gate reached - v20250224 release complete (or skipped)
-      run: echo "Ready to proceed with v20300101 publication"
+    - name: Brief delay to stagger v20300101 publish
+      run: sleep 2
 ```
 
 **Critical implementation details**:
 
-1. **Each publish job** depends on the **previous version's gate job** (not the previous release directly)
-   - This prevents race conditions when multiple versions are modified
-   - Ensures strict serial ordering at the npm registry level
+1. **Each delay job** is independent and depends only on safety checks
+   - Does NOT depend on the previous version
+   - Always runs (assuming `[skip-publish]` flag not set)
+   - Provides a 2-second window for previous versions to start publishing
 
-2. **Each release job** depends on its corresponding publish job
-   - Ensures publication completes before creating release
+2. **Each publish job** depends on its corresponding delay job
+   - This naturally staggers version publishes without complex dependencies
+   - When only one version is modified, its delay still runs (no blocking)
+   - When multiple versions are modified, they publish sequentially with 2-second gaps
 
-3. **Each gate job** uses `needs: [check-skip-publish, detect-changes, release-v<VERSION>]`
-   - Waits for the previous version's release to complete
-   - The `if: always()` condition ensures the gate continues running even when the release job is **skipped**
-   - This is crucial: when the previous version isn't modified, its release is skipped, but the gate still runs and unblocks the next version
+3. **Each release job** depends on its corresponding publish job
+   - Ensures publication completes before creating release
 
-4. **Each publish/release if condition** uses `needs.detect-changes.outputs.v<VERSION> == 'true'`
-   - This is more reliable than the older `contains()` pattern
-   - Uses the path-filter outputs to determine which versions changed
-   - Prevents false publishes when only docs change
+4. **Simple, non-blocking design**:
+   - No `always()` conditions needed
+   - No dependencies on other versions' jobs
+   - Delay job always runs independently
+   - Prevents race conditions through simple timing, not complex job logic
 
 ### 2.5 Verify Workflow Syntax
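
The delay and publish jobs described under Locations 3 and 5 differ between versions only in the version string, so they can be rendered from one template. This is a minimal sketch, not part of the repository: `render_version_jobs` is a hypothetical helper, and the release job is left out because its full definition is not shown in this diff.

```python
from string import Template

# Template mirrors the delay and publish jobs shown in this commit.
# Two-space indentation assumes the jobs sit directly under `jobs:`.
_JOBS_TEMPLATE = Template("""\
  delay-for-${v}:
    runs-on: ubuntu-latest
    needs: [check-skip-publish, detect-changes]
    if: needs.check-skip-publish.outputs.skip_publish == 'false'
    steps:
      - name: Brief delay to stagger ${v} publish
        run: sleep 2

  publish-${v}:
    needs: [check-skip-publish, detect-changes, delay-for-${v}]
    if: needs.check-skip-publish.outputs.skip_publish == 'false' && needs.detect-changes.outputs.${v} == 'true'
    uses: ./.github/workflows/publish.yml
    with:
      version_directory: ${v}
    secrets: inherit
""")


def render_version_jobs(version: str) -> str:
    """Return YAML text for the delay and publish jobs of `version` (hypothetical helper)."""
    return _JOBS_TEMPLATE.substitute(v=version)


if __name__ == "__main__":
    print(render_version_jobs("v20300101"))
```

The output is meant to be pasted under `jobs:` in `on-push-master.yml` and reviewed by hand; the `detect-changes` output for the new version still has to be added separately.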

docs/Troubleshooting-Guide.md

Lines changed: 18 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22

33
**Document Purpose**: Quick reference for diagnosing and fixing issues in the multi-version SDK generation, publishing, and release workflows.
44

5-
**Last Updated**: January 28, 2026
5+
**Last Updated**: January 29, 2026
66
**Audience**: Developers debugging workflow failures
77

88
---
@@ -234,32 +234,37 @@ fatal: A release with this tag already exists
 
 **Expected Behavior**: `publish-v20250224` should run when only v20250224 is modified
 
-**Root Cause**: Previous versions of the workflow had a dependency chain that broke when intermediate jobs were skipped. This has been fixed with the gate job pattern.
+**Root Cause**: Previous versions of the workflow had dependencies that broke when intermediate jobs were skipped. This has been fixed with the delay job pattern.
 
-**Current Implementation** (uses gate job pattern):
-- `gate-v20111101-complete` uses GitHub Actions `always()` condition
-- This job runs even when v20111101 jobs are skipped
-- It unblocks downstream v20250224 jobs
+**Current Implementation** (uses delay job pattern):
+- `delay-for-v20250224` runs independently of other versions
+- This delay job always runs (depends only on safety checks, not other versions)
+- It provides a 2-second window for previous versions to start publishing first
+- v20250224 publish depends on this delay (not on v20111101's release)
 - Result: Publishing works correctly whether one or both versions are modified
 
 **If You're Still Seeing This Issue**:
 1. Verify you have the latest `on-push-master.yml`:
    ```bash
-   grep -A 3 "gate-v20111101-complete" .github/workflows/on-push-master.yml
+   grep -A 5 "delay-for-v20250224" .github/workflows/on-push-master.yml
    ```
-2. Confirm the gate job uses `always()` condition:
+2. Confirm the delay job is independent:
    ```yaml
-   gate-v20111101-complete:
-     if: always() && needs.check-skip-publish.outputs.skip_publish == 'false'
+   delay-for-v20250224:
+     needs: [check-skip-publish, detect-changes]
+     if: needs.check-skip-publish.outputs.skip_publish == 'false'
+     steps:
+       - name: Brief delay to stagger v20250224 publish
+         run: sleep 2
    ```
-3. Ensure `publish-v20250224` depends on the gate job:
+3. Ensure `publish-v20250224` depends on the delay job:
    ```yaml
    publish-v20250224:
-     needs: [check-skip-publish, gate-v20111101-complete]
+     needs: [check-skip-publish, detect-changes, delay-for-v20250224]
    ```
 4. If not present, update workflow from latest template
 
-**Technical Details**: See [Workflow-and-Configuration-Reference.md](Workflow-and-Configuration-Reference.md#step-3-gate-job---unblock-v20250224-publishing) in the "Publishing via on-push-master.yml" section for full gate job implementation details.
+**Technical Details**: See [Workflow-and-Configuration-Reference.md](Workflow-and-Configuration-Reference.md#step-3-delay-job---stagger-v20250224-publishing) in the "Publishing via on-push-master.yml" section for full delay job implementation details.
 
 ---
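
Checks 2 and 3 in the troubleshooting steps above can also be scripted instead of read off `grep` output. The sketch below is one possible approach under stated assumptions: `job_needs` and `check_delay_wiring` are hypothetical helpers, and the parser only understands the inline `needs: [a, b]` form used in this workflow, not block-style lists.

```python
import re


def job_needs(workflow_text: str, job: str) -> list[str]:
    """Return the inline `needs: [...]` entries of `job`, or [] if none are found."""
    lines = workflow_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip() != f"{job}:":
            continue
        indent = len(line) - len(line.lstrip())
        for body in lines[i + 1:]:
            # A non-blank line at the same or shallower indent starts the next job.
            if body.strip() and len(body) - len(body.lstrip()) <= indent:
                break
            match = re.match(r"\s*needs:\s*\[(.*?)\]", body)
            if match:
                return [n.strip() for n in match.group(1).split(",") if n.strip()]
        return []
    return []


def check_delay_wiring(workflow_text: str, version: str) -> list[str]:
    """Return a list of problems with the delay/publish wiring for `version`."""
    problems = []
    delay, publish = f"delay-for-{version}", f"publish-{version}"
    # The delay job should not wait on any other version's publish or release job.
    blocking = [n for n in job_needs(workflow_text, delay)
                if n.startswith(("publish-", "release-"))]
    if blocking:
        problems.append(f"{delay} depends on other versions' jobs: {blocking}")
    if delay not in job_needs(workflow_text, publish):
        problems.append(f"{publish} does not list {delay} in needs")
    return problems


if __name__ == "__main__":
    with open(".github/workflows/on-push-master.yml", encoding="utf-8") as fh:
        for problem in check_delay_wiring(fh.read(), "v20250224"):
            print(problem)
```

An empty result means both checks passed for that version; any printed line points at the job whose `needs` list should be compared against the latest template.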

docs/Workflow-and-Configuration-Reference.md

Lines changed: 56 additions & 43 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22

33
**Document Purpose**: Detailed technical reference for the multi-version SDK generation, publishing, and release workflows. Covers implementation details, configuration files, and system architecture.
44

5-
**Last Updated**: January 28, 2026
5+
**Last Updated**: January 29, 2026
66
**Audience**: Developers who need to understand or modify the implementation
77

88
---
@@ -145,16 +145,17 @@ strategy:
 4. Path-based filtering ensures only modified versions are published, never in parallel
 
 **Serialization Chain** (for race condition prevention):
-- v20111101 publish runs first (depends on check-skip-publish)
+- v20111101 publish runs immediately (depends on check-skip-publish)
 - v20111101 release runs second (depends on publish) - waits for npm registry confirmation
-- **gate-v20111101-complete** runs (uses `always()`, runs even if v20111101 jobs are skipped) ⭐ **Critical: Enables single-version publishing**
-- v20250224 publish runs third (depends on gate job) ← **Serial ordering enforced**
+- **delay-for-v20250224** waits 2 seconds (prevents race condition by staggering v20250224) ⭐ **Critical: Enables single-version publishing**
+- v20250224 publish runs third (depends on delay) ← **Serial ordering enforced**
 - v20250224 release runs fourth (depends on v20250224 publish) - waits for npm registry confirmation
 
 **Why This Order Matters**:
 - Each version publishes to npm sequentially, never in parallel
 - npm registry expects sequential API calls; parallel publishes can cause conflicts
-- Gate job ensures this ordering works correctly whether 1 or 2 versions are modified
+- Delay job ensures v20250224 doesn't start immediately, giving v20111101 time to publish
+- This ordering works correctly whether 1 or 2 versions are modified
 - Release jobs complete before the next version starts publishing
 
 ---
@@ -289,42 +290,52 @@ publish:
 - **Harder to understand**: New developers see one job with matrix logic; harder to reason about sequence
 - **Less flexible**: Adding safety checks per version becomes complicated with matrix expansion
 
-#### Why Serial Conditionals (Our Choice)
+#### Why Serial Conditionals with Delay (Our Choice)
 
 **Serial Approach** (Explicit, safe, maintainable):
 ```yaml
 publish-v20111101:
   needs: [check-skip-publish, detect-changes]
   if: needs.check-skip-publish.outputs.skip_publish == 'false' && needs.detect-changes.outputs.v20111101 == 'true'
 
+delay-for-v20250224:
+  needs: [check-skip-publish, detect-changes]
+  if: needs.check-skip-publish.outputs.skip_publish == 'false'
+  steps:
+    - run: sleep 2
+
 publish-v20250224:
-  needs: [check-skip-publish, detect-changes, gate-v20111101-complete] # Must wait for gate
+  needs: [check-skip-publish, detect-changes, delay-for-v20250224] # Wait for delay
   if: needs.check-skip-publish.outputs.skip_publish == 'false' && needs.detect-changes.outputs.v20250224 == 'true'
 ```
 
 **Advantages**:
-- ✅ **Safe**: v20250224 cannot start publishing until v20111101 finishes
-  - Gate job ensures serial ordering at job level, not just workflow level
+- ✅ **Safe**: v20250224 cannot start publishing until delay completes
+  - 2-second delay ensures v20111101 has time to publish first
   - npm registry sees sequential requests, no conflicts
-  - Clear happens-before relationship in GitHub Actions UI
+  - Works whether 1 or 2 versions are modified
+- ✅ **Simple**: No complex gate job logic with `always()` conditions
+  - Just a straightforward 2-second delay between version publishes
+  - Delay job always runs (depends only on safety checks)
+  - Less mental overhead for future developers
 - ✅ **Visible**: Each version has individual jobs that are easy to identify
   - GitHub Actions shows separate rows for each version
-  - Failures are obvious: "publish-v20250224 failed" vs "publish[v20250224] in matrix"
+  - Failures are obvious
   - Each job can have version-specific comments and documentation
 - ✅ **Debuggable**: Clear dependencies make it obvious what blocks what
-  - When only v20250224 is modified, you see: `publish-v20111101 (skipped)` → `gate (runs)` → `publish-v20250224 (runs)`
-  - Matrix approach would be harder to understand why certain jobs run/skip
-- ✅ **Maintainable**: Adding a new version requires adding 3 explicit jobs (publish, release, gate)
-  - More code, but each job is self-documenting
-  - No complex matrix expansion logic to understand
-  - Future developers can see the pattern easily: "oh, each version gets 3 jobs"
+  - v20250224 waits for delay, which doesn't depend on v20111101
+  - When only v20250224 is modified, you see: `delay (runs)` → `publish-v20250224 (runs)`
+  - When both are modified, you see: `publish-v20111101` and `delay` run in parallel, then `publish-v20250224` waits
+- ✅ **Maintainable**: Minimal code addition (one simple delay job)
+  - More explicit than matrix approach
+  - Future developers immediately understand: "oh, there's a delay between version publishes"
 - ✅ **Future-proof**: When you lock master, this structure stays the same
-  - Matrix would need version list hardcoded; serial jobs just live alongside each other
+  - Simple delay job that can be extended if needed
 
 **Tradeoff we accepted**:
-- We have more code (repetition): `publish-v20111101`, `publish-v20250224`, etc.
-- BUT: The repetition is worth it for safety, clarity, and debuggability
-- This is a conscious choice: **explicitness over DRY** for critical infrastructure
+- Slight overhead: 2-second delay added to every publish flow (negligible)
+- BUT: Worth it for simplicity, clarity, and the ability to publish single versions without gates
+- This is a conscious choice: **simplicity over clever infrastructure** for critical workflows
 
 
 
@@ -340,7 +351,7 @@ Include `[skip-publish]` in commit message to prevent publish/release for this p
 
 **Workflow**: `.github/workflows/on-push-master.yml`
 
-**Architectural Approach**: Serial job chaining with gate job pattern ensures single-version and multi-version publishing both work correctly while preventing npm race conditions.
+**Architectural Approach**: Serial job chaining with delay job ensures single-version and multi-version publishing both work correctly while preventing npm race conditions.
 
 #### Step 1: Check Skip-Publish Flag
 
@@ -377,47 +388,49 @@ Include `[skip-publish]` in commit message to prevent publish/release for this p
 1. Publish job calls `publish.yml` with `version_directory: v20111101`
 2. Release job calls `release.yml` after publish completes
 
-#### Step 3: Gate Job - Unblock v20250224 Publishing
+#### Step 3: Delay Job - Stagger v20250224 Publishing
 
-**Job**: `gate-v20111101-complete`
+**Job**: `delay-for-v20250224`
 
 ```yaml
-gate-v20111101-complete:
+delay-for-v20250224:
   runs-on: ubuntu-latest
-  needs: [check-skip-publish, detect-changes, release-v20111101]
-  if: always() && needs.check-skip-publish.outputs.skip_publish == 'false'
+  needs: [check-skip-publish, detect-changes]
+  if: needs.check-skip-publish.outputs.skip_publish == 'false'
   steps:
-    - name: Gate complete - ready for v20250224
-      run: echo "v20111101 release workflow complete (or skipped)"
+    - name: Brief delay to stagger v20250224 publish
+      run: sleep 2
 ```
 
-**Key Feature**: Uses `always()` condition - runs even when `release-v20111101` is skipped
+**Key Feature**: Simple 2-second delay between version publishes
 
 **Why This Pattern Exists**:
 
-The gate job solves a critical dependency problem in serial publishing:
+The delay job solves the dependency problem while keeping things simple:
 
 1. **The Problem**:
-   - If v20250224 publish job depends on `release-v20111101`, it fails when v20111101 is skipped (not modified)
+   - If v20250224 publish depends directly on `publish-v20111101`, it fails when v20111101 is skipped (not modified)
    - When only v20250224 is modified, we want it to publish, but it's blocked by skipped v20111101 job
    - This would cause the workflow to hang/fail when only one version is modified
+   - A gate job with `always()` is complex and hard to understand
 
 2. **The Solution**:
-   - Gate job uses `always()` so it runs whether v20111101 succeeds, fails, or is skipped
-   - v20250224 jobs depend on the gate job (which always runs), not on v20111101 (which might be skipped)
-   - This unblocks v20250224 while maintaining serial ordering when both versions are modified
+   - Simple delay job that doesn't depend on v20111101
+   - v20250224 publish depends on the delay (which always runs)
+   - 2-second delay gives v20111101 time to start publishing before v20250224 does
+   - This staggering prevents npm registry race conditions without complex job logic
 
 3. **The Behavior**:
-   - **Both versions modified**: publish v20111101 → release v20111101 → gate (runs) → publish v20250224 → release v20250224
-   - **Only v20250224 modified**: (v20111101 jobs skipped) → gate (always runs, unblocks) → publish v20250224 → release v20250224
-   - **Only v20111101 modified**: publish v20111101 → release v20111101 → gate (always runs) → publish v20250224 (skipped) → release v20250224 (skipped)
+   - **Both versions modified**: publish v20111101 starts immediately, delay job starts immediately → after 2s, publish v20250224 runs
+   - **Only v20250224 modified**: delay job runs → after 2s, publish v20250224 runs (v20111101 jobs skipped, don't block)
+   - **Only v20111101 modified**: publish v20111101 runs, delay runs but is unused (no harm)
 
 **Why Not Use Direct Dependencies?**
-If v20250224 jobs depended directly on v20111101's release job, the workflow would fail whenever v20111101 was skipped (not modified). The gate job pattern enables:
+If v20250224 jobs depended directly on v20111101's publish job, the workflow would fail whenever v20111101 was skipped (not modified). The delay job pattern enables:
 - ✅ Correct behavior in single-version and multi-version scenarios
-- ✅ Maintains serial ordering when both versions change
+- ✅ Maintains serial ordering by staggering version publishes
 - ✅ Prevents race conditions at npm registry level
-- Clear, explicit dependency chain in GitHub Actions UI
+- Simple, easy-to-understand logic
 
 #### Step 4: Publish and Release v20250224 (Second in Serial Chain)
 
@@ -426,7 +439,7 @@ If v20250224 jobs depended directly on v20111101's release job, the workflow wou
 **publish-v20250224 executes when**:
 - No `[skip-publish]` flag
 - Files in `v20250224/**` were changed
-- **AND** `gate-v20111101-complete` completes (ensures serial ordering)
+- **AND** `delay-for-v20250224` completes (ensures staggered publishing)
 
 **release-v20250224 executes when**:
 - No `[skip-publish]` flag
@@ -437,7 +450,7 @@ If v20250224 jobs depended directly on v20111101's release job, the workflow wou
 1. Publish job calls `publish.yml` with `version_directory: v20250224`
 2. Release job calls `release.yml` after publish completes
 
-**Serial Chain Benefit**: The gate job ensures v20250224 waits for v20111101 release, preventing npm registry race conditions when both versions are modified.
+**Serial Chain Benefit**: The 2-second delay before v20250224 starts publishing ensures v20111101 gets first chance at npm registry, preventing race conditions when both versions are modified.
 
 ---
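
The three scenarios listed under "The Behavior" in Step 3 follow directly from the `if` conditions on each job. A minimal sketch of that decision logic, assuming the two-version layout shown in this commit: `jobs_that_run` is a hypothetical helper, release jobs are omitted because their conditions are only partly visible in this diff, and the returned list is in declaration order, not execution order.

```python
VERSIONS = ("v20111101", "v20250224")
# Only the second version has a delay job in this commit.
DELAY_JOBS = {"v20250224": "delay-for-v20250224"}


def jobs_that_run(changed: set[str], skip_publish: bool = False) -> list[str]:
    """List delay and publish jobs whose `if` condition is true for this push."""
    if skip_publish:
        # Every job here is gated on skip_publish == 'false'.
        return []
    ran = []
    for version in VERSIONS:
        delay = DELAY_JOBS.get(version)
        if delay:
            ran.append(delay)  # runs regardless of which versions changed
        if version in changed:
            ran.append(f"publish-{version}")  # gated on the detect-changes output
    return ran


if __name__ == "__main__":
    for changed in ({"v20111101", "v20250224"}, {"v20250224"}, {"v20111101"}):
        print(sorted(changed), "->", jobs_that_run(changed))
```

The `{"v20111101"}` case shows the "delay runs but is unused" behavior: `delay-for-v20250224` appears in the result even though `publish-v20250224` does not.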
