# Real-backend (LIVE) E2E against PRODUCTION (api.instanode.dev) using an
# ephemeral, cohort-scoped account minted on the fly. This is the prod sibling
# of e2e-live.yml (which targets STAGING) — DO NOT delete that one.
#
# Plan: docs/sessions/2026-06-04 TEST-ACCOUNTS-AND-NR-SYNTHETICS-PLAN.md.
#
# WHY this is safe to run against prod (and e2e-live.yml is not):
# - The api guards a mint endpoint (PR #260) that creates an account with
# is_test_cohort=true. The live worker skip-guards neuter
# billing/churn/email/quota for that team, so a LIVE run can never charge a
# card, burn a real quota budget, send a "we miss you" email, or churn a
# real customer.
# - The account + every resource it creates is reaped: this job DELETEs the
# minted account (DELETE /internal/e2e/account/{team_id}) AND runs the
# per-run ledger reaper (npm run reap:live) in an `if: always()` teardown.
# The reaper exits non-zero on any leak, failing the job loudly (rule 24).
# - cohort.ts assertSafeApiTarget() ALLOWS a prod E2E_API_URL only when a mint
# token / minted session is present (a sanctioned run); an un-tokened prod
# target is still refused, so a stray invocation can never hammer prod.
#
# HOW it mints/runs/reaps:
# 1. MINT — POST https://api.instanode.dev/internal/e2e/account with header
# X-E2E-Token: $E2E_ACCOUNT_TOKEN and body {"tier":"pro"} →
# {team_id, user_id, email, tier, session_jwt, expires_at}. The
# session_jwt + team_id are masked and exported to later steps.
# 2. RUN — E2E_LIVE=1 E2E_API_URL=https://api.instanode.dev
# E2E_SESSION_JWT=<minted> npx playwright test
# --config=playwright.live.config.ts. The authed legs use the
# minted account (cohort.ts mintedSession()); anon legs run as-is.
# 3. REAP — (always) DELETE the minted account, then npm run reap:live to
# sweep any spec-created resources from the on-disk ledger.
#
# Triggers:
# - workflow_dispatch (operator on demand).
# - schedule every 30 min (continuous prod integration signal).
# - repository_dispatch type `e2e-prod-from-deploy` (on-demand hook an
# operator or the api repo can fire deliberately; no longer wired to every
# api deploy, see the schedule rationale under `on:`).
#
# Guard: if secrets.E2E_ACCOUNT_TOKEN is empty (not yet configured) the job
# no-ops cleanly with a ::notice:: — it NEVER reds when unconfigured. The
# workflow ships before the secret exists and goes green only once the operator
# sets E2E_ACCOUNT_TOKEN and the api mint endpoint is deployed.
name: E2E LIVE (prod, minted account)
on:
  workflow_dispatch: {}
  schedule:
    # Every 30 minutes, as a continuous prod integration signal. This is the
    # controlled cadence for the FULL live suite: it provisions real customer
    # DBs on shared prod infra, so it must NOT run on every build. (We tried
    # push:[main] + an api per-deploy dispatch; on a busy day that fired the
    # full provision suite dozens of times, accumulating customer DBs faster
    # than reaps cleaned them and degrading prod provisioning, a self-DoS. The
    # cheap per-build coverage lives elsewhere: e2e-pr-smoke.yml runs the
    # contract-only leg on every web PR, and the api's webhook-injection unit
    # test + auth-contract dispatch cover the money path per api build.)
    - cron: '*/30 * * * *'
  repository_dispatch:
    # Kept as an ON-DEMAND hook (operators can fire it deliberately). NOT wired
    # to fire on every api deploy anymore; see the schedule rationale above.
    types: [e2e-prod-from-deploy]
concurrency:
  # One prod LIVE run at a time: they mint a real cohort account + create real
  # resources; overlapping runs could interleave ledger writes / dedup state.
  group: e2e-prod-${{ github.workflow }}
  cancel-in-progress: false
permissions:
  contents: read
jobs:
  e2e-prod:
    name: LIVE against prod via minted account + reap
    runs-on: ubuntu-latest
    timeout-minutes: 15
    env:
      # Fixed prod target: this workflow is prod-only by design.
      E2E_API_URL: https://api.instanode.dev
      E2E_LIVE_RUN_ID: ${{ github.run_id }}
      # The mint-endpoint guard token. Empty until the operator configures it →
      # the gate step below no-ops the job cleanly.
      E2E_ACCOUNT_TOKEN: ${{ secrets.E2E_ACCOUNT_TOKEN }}
    steps:
      - name: Gate on configured mint token
        # No token configured → no-op cleanly (never a false red). Sets RUN=0
        # so every later step is skipped.
        run: |
          set -euo pipefail
          if [ -z "${E2E_ACCOUNT_TOKEN:-}" ]; then
            echo "::notice::secrets.E2E_ACCOUNT_TOKEN not configured; skipping prod LIVE E2E (no-op)."
            echo "RUN=0" >> "$GITHUB_ENV"
          else
            echo "RUN=1" >> "$GITHUB_ENV"
          fi
      - uses: actions/checkout@v6
        if: env.RUN == '1'
      - uses: actions/setup-node@v6
        if: env.RUN == '1'
        with:
          node-version: '22'
          cache: 'npm'
      - name: Install deps
        if: env.RUN == '1'
        run: npm ci
      - name: Install Chromium
        if: env.RUN == '1'
        run: npx playwright install --with-deps chromium
      - name: Mint ephemeral cohort account
        id: mint
        if: env.RUN == '1'
        # POST the guarded mint endpoint → capture session_jwt + team_id, mask
        # them, and export to later steps. Fails the job on any non-200
        # response so a broken mint endpoint surfaces immediately rather than
        # running un-authed.
        run: |
          set -euo pipefail
          # Append the HTTP status on its own trailing line so it can be split
          # from the response body below.
          resp="$(curl -sS -w '\n%{http_code}' \
            -X POST "${E2E_API_URL}/internal/e2e/account" \
            -H "X-E2E-Token: ${E2E_ACCOUNT_TOKEN}" \
            -H 'Content-Type: application/json' \
            -d '{"tier":"pro"}')"
          code="$(printf '%s' "$resp" | tail -n1)"
          body="$(printf '%s' "$resp" | sed '$d')"
          if [ "$code" != "200" ]; then
            echo "::error::mint endpoint returned HTTP $code (expected 200). Body: $body"
            exit 1
          fi
          jwt="$(printf '%s' "$body" | jq -r '.session_jwt // empty')"
          team="$(printf '%s' "$body" | jq -r '.team_id // empty')"
          email="$(printf '%s' "$body" | jq -r '.email // empty')"
          tier="$(printf '%s' "$body" | jq -r '.tier // empty')"
          # Mask the secrets as soon as they are known, before anything below
          # can echo the body, so they never appear in logs.
          if [ -n "$jwt" ]; then echo "::add-mask::$jwt"; fi
          if [ -n "$team" ]; then echo "::add-mask::$team"; fi
          if [ -z "$jwt" ] || [ -z "$team" ]; then
            echo "::error::mint response missing session_jwt or team_id. Body: $body"
            exit 1
          fi
          # session_jwt + team_id are secret-ish → env only (not step outputs).
          # The only step output is the non-secret `minted=1` flag that gates
          # the run step; the reap step's `if` reads MINTED_TEAM_ID from env.
          {
            echo "MINTED_SESSION_JWT=$jwt"
            echo "MINTED_TEAM_ID=$team"
            echo "MINTED_EMAIL=$email"
            echo "MINTED_TIER=$tier"
          } >> "$GITHUB_ENV"
          echo "minted=1" >> "$GITHUB_OUTPUT"
          echo "Minted cohort account (tier=$tier); session + team_id masked."
      - name: Run LIVE E2E against prod (minted account)
        if: env.RUN == '1' && steps.mint.outputs.minted == '1'
        env:
          E2E_LIVE: '1'
          # The minted account drives the authed legs (cohort.ts mintedSession);
          # anon legs run as-is. assertSafeApiTarget() permits the prod target
          # because E2E_SESSION_JWT is present (a sanctioned run).
          E2E_SESSION_JWT: ${{ env.MINTED_SESSION_JWT }}
          E2E_TEAM_ID: ${{ env.MINTED_TEAM_ID }}
          E2E_ACCOUNT_EMAIL: ${{ env.MINTED_EMAIL }}
          E2E_ACCOUNT_TIER: ${{ env.MINTED_TIER }}
          # Fingerprint bypass for the ANON provision legs: prod does NOT trust
          # X-Forwarded-For, so the runner's anon provisions share one real
          # fingerprint and trip the free-tier recycle gate (402). The api's
          # X-E2E-Test-Token header skips the per-fingerprint cap when it matches
          # this secret (api internal/middleware/fingerprint.go).
          E2E_TEST_TOKEN: ${{ secrets.E2E_TEST_TOKEN }}
          # Wave 4b: arm the FULL Razorpay TEST-card payment leg in
          # live-ui-payment.spec.ts. Sourced from a repo VARIABLE so it is
          # INERT until the operator wires rzp_test_* keys on the api AND flips
          # vars.E2E_RAZORPAY_TEST_MODE=1. Until then the card-entry test
          # skips clean; the @pr-smoke contract-only leg always runs. See
          # docs/ci/01-CI-INTEGRATION-DESIGN.md §"Razorpay test-card payment".
          E2E_RAZORPAY_TEST_MODE: ${{ vars.E2E_RAZORPAY_TEST_MODE }}
        run: npm run test:e2e:live
      - name: Reap minted account (teardown)
        # ALWAYS runs (even on test failure/cancel) so the minted account + its
        # resources are deleted out-of-band. Idempotent: 404 == already gone.
        if: always() && env.RUN == '1' && env.MINTED_TEAM_ID != ''
        run: |
          set -euo pipefail
          code="$(curl -sS -o /dev/null -w '%{http_code}' \
            -X DELETE "${E2E_API_URL}/internal/e2e/account/${MINTED_TEAM_ID}" \
            -H "X-E2E-Token: ${E2E_ACCOUNT_TOKEN}")"
          case "$code" in
            200|202|204|404|410)
              echo "Reaped minted account (HTTP $code)." ;;
            *)
              echo "::error::DELETE minted account returned HTTP $code; possible leak."
              exit 1 ;;
          esac
      - name: Reap cohort resources from ledger (teardown)
        # The per-run ledger reaper sweeps any resource a spec created. Exits
        # non-zero on any leak, failing the job loudly (rule 24).
        if: always() && env.RUN == '1'
        run: npm run reap:live
      - name: Upload LIVE trace + ledger on failure
        if: failure() && env.RUN == '1'
        uses: actions/upload-artifact@v7
        with:
          name: e2e-prod-trace-${{ github.run_id }}
          path: |
            test-results/
            playwright-report-live/
            e2e/.cleanup-ledger.json
          if-no-files-found: ignore
          retention-days: 14
      # Wave 5: record the live-prod E2E outcome in NR (suite=e2e-prod). This
      # is the user-facing-journeys-against-prod signal; the
      # e2e-prod-suite-failing NR alert (P1) reads it. Gated on RUN == '1' so a
      # secret-less no-op run does NOT report a misleading pass. `if: always()`
      # lets this step report both a journey failure and a leak-reaper failure.
      # No-ops without the NR secret.
      - name: Emit e2e-prod result to New Relic
        if: always() && env.RUN == '1'
        uses: ./.github/actions/nr-ci-event
        with:
          license-key: ${{ secrets.NEW_RELIC_LICENSE_KEY }}
          account-id: ${{ secrets.NEW_RELIC_ACCOUNT_ID }}
          result: ${{ job.status == 'success' && 'pass' || 'fail' }}
          suite: e2e-prod
          failed-step: ${{ job.status != 'success' && 'live-prod E2E journeys / mint / reap' || '' }}
          repo: ${{ github.repository }}
          workflow: ${{ github.workflow }}
          branch: ${{ github.ref_name }}
          commit-sha: ${{ github.sha }}
          log-url: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
          event-name: ${{ github.event_name }}
          actor: ${{ github.actor }}