Commit 2b14c58

fix(cicd): allow skipped Deploy-to-Morpheus-P-Node to satisfy C-Node deploy needs (#715)
## Summary

Hotfix for the first v7 `test`-branch deployment attempt ([Actions run #1447 / run id 24796856145](https://github.com/MorpheusAIs/Morpheus-Lumerin-Node/actions/runs/24796856145)).

The previous PR (#714, which carried #713) introduced a new job sequence:

```
Drain-Morpheus-C-Node → Deploy-to-Morpheus-P-Node → Deploy-to-Morpheus-C-Node
```

On push to `test`, `Deploy-to-Morpheus-P-Node` is correctly **skipped** (its own `if` restricts it to `main`; no dev P-Node exists). That skip then propagated onto `Deploy-to-Morpheus-C-Node` through the implicit `success()` guard that every job has when no explicit `if` tolerates a skipped dependency.

Net effect:

- `Drain-Morpheus-C-Node` ✅ removed the running dev C-Node task from both NLB target groups.
- `Deploy-to-Morpheus-C-Node` ❌ silently skipped: no new task was registered, and the existing task was not re-registered.
- Dev C-Node endpoints (`router.dev.mor.org:8082`/`:8545`) stopped serving traffic until we manually re-registered the old task to the target groups.

## What this PR changes

`.github/workflows/build.yml`, `Deploy-to-Morpheus-C-Node` job only:

### Explicit `if` guard that tolerates the P-Node skip

```yaml
if: |
  !cancelled() &&
  github.repository == 'MorpheusAIs/Morpheus-Lumerin-Node' &&
  needs.GHCR-Build-and-Push.result == 'success' &&
  needs.Drain-Morpheus-C-Node.result == 'success' &&
  (needs.Deploy-to-Morpheus-P-Node.result == 'success' || needs.Deploy-to-Morpheus-P-Node.result == 'skipped') &&
  (
    (github.event_name == 'push' && (github.ref == 'refs/heads/main' || github.ref == 'refs/heads/test')) ||
    (github.event_name == 'workflow_dispatch' && github.event.inputs.build_all_os == 'true' && github.event.inputs.create_deployment == 'true')
  )
```

- Still requires real success from the drain + GHCR jobs.
- Explicitly accepts `success` OR `skipped` for the P-Node dependency.
- `!cancelled()` short-circuits the job if someone cancels the run, so we don't try to redeploy a half-cancelled sequence.
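The skip propagation and the tolerant guard can be reproduced in a small standalone workflow. The sketch below is illustrative only: the workflow name, the job ids (`drain`, `deploy-p`, `deploy-c`), and the `echo` steps are hypothetical stand-ins, not the jobs in `build.yml`. It also leans on a documented detail the bullets above do not spell out: GitHub Actions applies the default `success()` check unless the `if` expression contains a status check function, so `!cancelled()` is also what switches that default off.

```yaml
# Hypothetical minimal workflow; job ids and steps are placeholders.
name: skip-propagation-demo

on:
  push:
    branches: [main, test]

jobs:
  drain:
    runs-on: ubuntu-latest
    steps:
      - run: echo "drain target groups"

  deploy-p:
    # Main-only job: on a push to `test` its result is 'skipped'.
    if: github.ref == 'refs/heads/main'
    needs: [drain]
    runs-on: ubuntu-latest
    steps:
      - run: echo "deploy provider"

  deploy-c:
    needs: [drain, deploy-p]
    # Without this `if`, the implicit success() check would skip this job
    # whenever deploy-p is skipped. Including a status check function
    # (here !cancelled()) replaces that default, and the explicit result
    # comparisons restore the gating that is actually wanted.
    if: |
      !cancelled() &&
      needs.drain.result == 'success' &&
      (needs.deploy-p.result == 'success' || needs.deploy-p.result == 'skipped')
    runs-on: ubuntu-latest
    steps:
      - run: echo "deploy consumer"
```

Under these assumptions, a push to `test` should run `drain`, skip `deploy-p`, and still run `deploy-c`; deleting the `if` on `deploy-c` should reproduce the silent skip described in the summary.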
### Comment rewrite

The previous inline comment claimed skipped jobs are treated as successful for dependency resolution. That's the opposite of how GitHub Actions actually behaves. The note is replaced with an accurate explanation and a reference to this incident so future-us (or other maintainers) won't step on the same rake.

## What this PR does NOT change

- No change to the drain job or the controlled-traffic C-Node deploy sequence (register→dereg→hold→rereg→wait).
- No change to the P-Node job or its `if` gating.
- No change to any other workflow or deploy script.

This is a pure guard-correctness fix.

## Test plan

- [x] YAML syntax validated (`yaml.safe_load`).
- [x] Manually walked through `needs` outcomes:
  - On push to `test`: P-Node `skipped`, drain + GHCR `success` → C-Node **runs**. ✅
  - On push to `main`: all three `success` → C-Node **runs**. ✅
  - Drain fails → C-Node **skipped** (won't redeploy while the target-group state is ambiguous). ✅
  - GHCR fails → C-Node **skipped**. ✅
  - P-Node fails on main (not skipped) → C-Node **skipped** (we want to bail rather than leave prd with a version mismatch between providers and the consumer). ✅
- [ ] Merge to `dev`, promote to `test`, confirm `Deploy-to-Morpheus-C-Node` actually runs this time and completes the drain → hold → re-register → /healthcheck cycle end-to-end.
- [ ] Once green on test, cut `test` → `main` for first prd exercise.

## Related

- Previous PR: #714 (dev → test) carrying #713 (initial drain/sequence/hold).
- Companion IAM + planning doc already applied in `Morpheus-Infra`.

Made with [Cursor](https://cursor.com)
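One way to check the walked-through `needs` outcomes against a live run, rather than by hand, is a reporting job that always runs and prints each upstream result. This is a sketch under assumptions: the workflow name and the job ids (`always-ok`, `never-runs`, `report`) are hypothetical and not part of `build.yml`; only `always()`, `toJSON()`, and the `needs` context are standard GitHub Actions features.

```yaml
# Hypothetical diagnostic workflow; job ids are placeholders.
name: needs-result-report

on:
  workflow_dispatch:

jobs:
  always-ok:
    runs-on: ubuntu-latest
    steps:
      - run: echo "ok"

  never-runs:
    # Forces a 'skipped' result so the report has one to show.
    if: ${{ false }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "unreachable"

  report:
    needs: [always-ok, never-runs]
    # always() replaces the implicit success() check, so this job runs
    # even when a dependency failed or was skipped.
    if: ${{ always() }}
    runs-on: ubuntu-latest
    steps:
      - name: Print upstream results
        env:
          NEEDS_JSON: ${{ toJSON(needs) }}
        run: echo "$NEEDS_JSON"
```

The printed JSON should carry a `result` field per upstream job, which makes it straightforward to compare a real run against the table in the test plan.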
2 parents 68e39db + c8a0e07 commit 2b14c58

1 file changed

Lines changed: 17 additions & 3 deletions

.github/workflows/build.yml

@@ -1297,18 +1297,32 @@ jobs:
 
   Deploy-to-Morpheus-C-Node:
     name: Deploy to Morpheus Consumer via GitHub
+    # NOTE on `needs` + skipped dependencies:
+    # GitHub Actions' default job condition is implicit `success()`, which returns
+    # false when ANY needed job was skipped. P-Node is main-only (see its own `if`),
+    # so on a `test` deploy `Deploy-to-Morpheus-P-Node` is always skipped. Without
+    # an explicit `if` that tolerates that skip, the C-Node deploy gets skipped
+    # too — which is exactly what bit us on the first v7 test run: the drain job
+    # ran and deregistered the C-Node, then the actual deploy silently skipped,
+    # leaving dev without a replacement task.
+    #
+    # This `if` explicitly allows the skipped P-Node outcome, while still gating
+    # on drain + GHCR success and requiring we're on a deploy-eligible branch.
+    # `!cancelled()` keeps the guard working if a user cancels mid-run.
     if: |
+      !cancelled() &&
       github.repository == 'MorpheusAIs/Morpheus-Lumerin-Node' &&
+      needs.GHCR-Build-and-Push.result == 'success' &&
+      needs.Drain-Morpheus-C-Node.result == 'success' &&
+      (needs.Deploy-to-Morpheus-P-Node.result == 'success' || needs.Deploy-to-Morpheus-P-Node.result == 'skipped') &&
       (
-        (github.event_name == 'push' && (github.ref == 'refs/heads/main' || github.ref == 'refs/heads/test'))||
+        (github.event_name == 'push' && (github.ref == 'refs/heads/main' || github.ref == 'refs/heads/test')) ||
         (github.event_name == 'workflow_dispatch' && github.event.inputs.build_all_os == 'true' && github.event.inputs.create_deployment == 'true')
       )
     needs:
       - Generate-Tag
       - GHCR-Build-and-Push
       - Drain-Morpheus-C-Node
-      # P-Node is main-only; on test this job is skipped and the `needs` is still
-      # satisfied (skipped jobs are treated as successful for dependency resolution).
       - Deploy-to-Morpheus-P-Node
     runs-on: ubuntu-latest
     environment: ${{ github.ref_name }}
