Commit 4369a86

ci(nr): emit CI test-run results to New Relic (Wave 5) (#266)

* ci(nr): emit CI test-run results to New Relic (Wave 5)

Wave 5 of the CI integration plan (docs/ci/01-CI-INTEGRATION-DESIGN.md §NR observability): push CI test/gate/deploy results to New Relic so ANY red run is studyable from an NR dashboard, not just the GitHub Actions log.

Adds a reusable composite action .github/actions/nr-ci-event that POSTs an InstantCITestRun custom event on every gated job (always) plus an InstantCITestFailure event on failure, via the NR Event API (insights-collector .../v1/accounts/<acct>/events) authenticated with the ingest license key. Wired into build-and-test (ci.yml), coverage (coverage.yml), and the deploy gate (deploy.yml) as an `if: always()` step.

No-op contract: when NEW_RELIC_LICENSE_KEY or NEW_RELIC_ACCOUNT_ID is absent (fork PRs, unprovisioned repo) the action prints the payload it WOULD send and exits 0 — observability never reds a PR. Free-form values flow through env, not shell interpolation (injection-safe). Additive only; gates unchanged.

Schema InstantCITestRun{repo,workflow,branch,commit_sha,pr_number,result,duration_ms,suite,...}; InstantCITestFailure{...,failed_step,log_url}. NR dashboard + alerts land in the infra repo (instanode-ci-health).

Operator action: provision NEW_RELIC_LICENSE_KEY + NEW_RELIC_ACCOUNT_ID as GitHub Actions secrets on the api repo (license key = same k8s instant-secrets value; account id per infra/newrelic/README.md).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ci): composite action cannot read github/secrets/job — caller passes them as inputs

The nr-ci-event composite action referenced github.*/secrets.*/job.status in its own env: block; GitHub rejects those contexts inside a composite action (TemplateValidationException 'Unrecognized named-value'). Move all resolution to the caller's with: block (which CAN read those contexts) and have the action read only inputs.*. Adds event-name + actor inputs. Callers now pass repo/workflow/branch/commit-sha/log-url/event-name/actor from the github context.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ci): strip ${{ }} from composite-action input descriptions

GitHub evaluates ${{ }} even inside input description: strings; the example '${{ secrets... }}' text triggered Unrecognized-named-value. Plain text now; only runs: keeps inputs.* expressions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ci): point nr-ci-event at ./api/... in coverage+deploy (nested checkout)

coverage.yml and deploy.yml check the api repo out into ./api (path: api), so a local-action 'uses: ./.github/actions/nr-ci-event' resolves to the workspace root (empty) and 404s ('Can't find action.yml'). Reference the action at its real nested path ./api/.github/actions/nr-ci-event in those two workflows. ci.yml checks out at the root, so it keeps ./.github/actions/nr-ci-event.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
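
The injection-safety note in the first commit message (free-form values flow through env rather than being spliced into the script text) can be illustrated with a small standalone sketch. The variable name `EV_BRANCH` mirrors the action's env block; the hostile branch name and the `describe_branch` helper are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
# Sketch: a value that arrives through an environment variable is treated as
# plain data by bash, even when it contains command-substitution syntax.
# The branch name below is a hypothetical hostile input, not taken from the repo.
set -uo pipefail

EV_BRANCH='feature/$(echo INJECTED)`echo INJECTED`'

# Hypothetical helper: the expansion is quoted, and bash does not re-parse the
# result of a parameter expansion, so nothing inside the value is executed.
describe_branch() {
  printf 'branch=%s\n' "${EV_BRANCH}"
}

describe_branch
```

The printed line contains the `$(...)` and backtick text literally. Writing a workflow expression directly inside a `run:` body would instead paste the text into the script before bash parses it, which is the pattern the commit avoids.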
1 parent 61afc5b commit 4369a86

4 files changed

Lines changed: 274 additions & 0 deletions

Lines changed: 207 additions & 0 deletions
@@ -0,0 +1,207 @@
+name: nr-ci-event
+description: >-
+  Emit a CI test-run result to New Relic so a red CI run (test, e2e, smoke,
+  deploy gate) is studyable from NR dashboards, not just GitHub logs. Posts an
+  InstantCITestRun custom event on every gated job, plus an InstantCITestFailure
+  event when result=fail. No-ops cleanly (logs the payload it WOULD send) when
+  the NR secret/account is absent — never fails the calling job because NR is
+  unreachable (fork PRs, secret not yet provisioned).
+
+# Mechanism (CLAUDE.md design ref docs/ci/01-CI-INTEGRATION-DESIGN.md §NR
+# observability): the NR Event API is a single HTTP POST to
+# https://insights-collector.newrelic.com/v1/accounts/<acct>/events
+# authenticated with the ingest license key (the SAME NEW_RELIC_LICENSE_KEY the
+# Go agents use at runtime — it is a valid Insert Key for the Event API). The
+# account id is the numeric NEW_RELIC_ACCOUNT_ID. Both are passed as inputs from
+# repo secrets by the caller. When EITHER is empty the action prints the payload
+# and exits 0 (the no-op-without-secret contract — rule: never red a PR because
+# NR is down). EU-region accounts override the collector host via nr-region.
+
+inputs:
+  # --- NR credentials (caller passes from secrets; empty => no-op) ---
+  license-key:
+    description: >-
+      NR ingest license key (Insert Key for the Event API). Pass
+      `secrets.NEW_RELIC_LICENSE_KEY`. Empty => action no-ops (dry-run log).
+    required: false
+    default: ''
+  account-id:
+    description: >-
+      Numeric NR account id. Pass `secrets.NEW_RELIC_ACCOUNT_ID`.
+      Empty => action no-ops (dry-run log).
+    required: false
+    default: ''
+  nr-region:
+    description: 'US (default) or EU — selects the insights-collector host.'
+    required: false
+    default: 'US'
+
+  # --- event payload (caller fills from the GitHub context + job result) ---
+  result:
+    description: 'pass | fail — usually `job.status == ''success'' && ''pass'' || ''fail''`.'
+    required: true
+  suite:
+    description: >-
+      Logical suite name, e.g. build-and-test, coverage, playwright, pr-smoke,
+      e2e-prod, deploy-gate. The dashboard FACETs on this.
+    required: true
+  # NOTE: composite actions cannot read the `github` context in their own
+  # expressions, so the caller MUST pass these from its `with:` block (e.g.
+  # repo: `github.repository`). The defaults below only apply when a caller
+  # omits them entirely.
+  repo:
+    description: 'Repository (owner/name). Caller passes `github.repository`.'
+    required: false
+    default: ''
+  workflow:
+    description: 'Workflow name. Caller passes `github.workflow`.'
+    required: false
+    default: ''
+  branch:
+    description: 'Branch ref name. Caller passes `github.ref_name`.'
+    required: false
+    default: ''
+  commit-sha:
+    description: 'Commit SHA under test. Caller passes `github.sha`.'
+    required: false
+    default: ''
+  pr-number:
+    description: 'PR number (empty on push). Caller passes `github.event.pull_request.number`.'
+    required: false
+    default: ''
+  duration-ms:
+    description: 'Suite duration in milliseconds (0 if not measured).'
+    required: false
+    default: '0'
+  failed-step:
+    description: 'On failure: the step/phase that failed (free text, no PII). Empty on pass.'
+    required: false
+    default: ''
+  log-url:
+    description: 'URL to the run logs for triage. Caller passes the run URL.'
+    required: false
+    default: ''
+  event-name:
+    description: 'GitHub event name. Caller passes `github.event_name`.'
+    required: false
+    default: ''
+  actor:
+    description: 'GitHub actor. Caller passes `github.actor`.'
+    required: false
+    default: ''
+
+runs:
+  using: composite
+  steps:
+    - name: Emit CI result to New Relic (no-op without secret)
+      shell: bash
+      env:
+        # Composite actions may reference ONLY `inputs` (+ env/runner/steps) in
+        # their expressions — `secrets`, `job`, and `github` are NOT available
+        # here, so the CALLER resolves those and passes them as inputs. All
+        # untrusted/free-form values flow through env, never interpolated into
+        # the shell body (injection-safe — same posture as ci.yml's
+        # dispatch-auth-contract-e2e job).
+        NR_LICENSE_KEY: ${{ inputs.license-key }}
+        NR_ACCOUNT_ID: ${{ inputs.account-id }}
+        NR_REGION: ${{ inputs.nr-region }}
+        EV_RESULT: ${{ inputs.result }}
+        EV_SUITE: ${{ inputs.suite }}
+        EV_REPO: ${{ inputs.repo }}
+        EV_WORKFLOW: ${{ inputs.workflow }}
+        EV_BRANCH: ${{ inputs.branch }}
+        EV_COMMIT: ${{ inputs.commit-sha }}
+        EV_PR: ${{ inputs.pr-number }}
+        EV_DURATION_MS: ${{ inputs.duration-ms }}
+        EV_FAILED_STEP: ${{ inputs.failed-step }}
+        EV_LOG_URL: ${{ inputs.log-url }}
+        EV_EVENT_NAME: ${{ inputs.event-name }}
+        EV_ACTOR: ${{ inputs.actor }}
+      run: |
+        set -uo pipefail
+
+        # Normalise the result to the pass|fail enum the dashboard FACETs on.
+        # Anything that isn't exactly "pass" is treated as "fail" so a typo or
+        # a cancelled job reads as a non-pass (conservative — never a false green).
+        case "${EV_RESULT}" in
+          pass) RESULT="pass" ;;
+          *) RESULT="fail" ;;
+        esac
+
+        # Build the InstantCITestRun event (always) and, on fail, the
+        # InstantCITestFailure event. jq composes the JSON so every value is
+        # passed as an argument (no shell concatenation of free-form text).
+        DURATION="${EV_DURATION_MS}"
+        case "${DURATION}" in ''|*[!0-9]*) DURATION=0 ;; esac
+
+        RUN_EVENT=$(jq -n -c \
+          --arg eventType "InstantCITestRun" \
+          --arg repo "${EV_REPO}" \
+          --arg workflow "${EV_WORKFLOW}" \
+          --arg branch "${EV_BRANCH}" \
+          --arg commit_sha "${EV_COMMIT}" \
+          --arg pr_number "${EV_PR}" \
+          --arg result "${RESULT}" \
+          --arg suite "${EV_SUITE}" \
+          --arg event_name "${EV_EVENT_NAME}" \
+          --arg actor "${EV_ACTOR}" \
+          --arg log_url "${EV_LOG_URL}" \
+          --argjson duration_ms "${DURATION}" \
+          '{eventType:$eventType, repo:$repo, workflow:$workflow, branch:$branch,
+            commit_sha:$commit_sha, pr_number:$pr_number, result:$result,
+            suite:$suite, event_name:$event_name, actor:$actor, log_url:$log_url,
+            duration_ms:$duration_ms}')
+
+        PAYLOAD="[${RUN_EVENT}]"
+        if [ "${RESULT}" = "fail" ]; then
+          FAIL_EVENT=$(jq -n -c \
+            --arg eventType "InstantCITestFailure" \
+            --arg repo "${EV_REPO}" \
+            --arg workflow "${EV_WORKFLOW}" \
+            --arg branch "${EV_BRANCH}" \
+            --arg commit_sha "${EV_COMMIT}" \
+            --arg pr_number "${EV_PR}" \
+            --arg suite "${EV_SUITE}" \
+            --arg failed_step "${EV_FAILED_STEP}" \
+            --arg log_url "${EV_LOG_URL}" \
+            --arg event_name "${EV_EVENT_NAME}" \
+            '{eventType:$eventType, repo:$repo, workflow:$workflow, branch:$branch,
+              commit_sha:$commit_sha, pr_number:$pr_number, suite:$suite,
+              failed_step:$failed_step, log_url:$log_url, event_name:$event_name}')
+          PAYLOAD="[${RUN_EVENT},${FAIL_EVENT}]"
+        fi
+
+        # No-op-without-secret contract: print what WOULD be sent and exit 0 so
+        # a fork PR (no secret) or an unprovisioned repo never reds because NR
+        # is unreachable.
+        if [ -z "${NR_LICENSE_KEY}" ] || [ -z "${NR_ACCOUNT_ID}" ]; then
+          echo "::notice title=nr-ci-event::NEW_RELIC_LICENSE_KEY or NEW_RELIC_ACCOUNT_ID absent — dry-run only (no event sent)."
+          echo "would POST to NR Event API the following payload:"
+          echo "${PAYLOAD}" | jq .
+          exit 0
+        fi
+
+        case "$(echo "${NR_REGION}" | tr '[:lower:]' '[:upper:]')" in
+          EU) HOST="insights-collector.eu01.nr-data.net" ;;
+          *) HOST="insights-collector.newrelic.com" ;;
+        esac
+        URL="https://${HOST}/v1/accounts/${NR_ACCOUNT_ID}/events"
+
+        echo "POSTing ${RESULT} result for suite='${EV_SUITE}' to NR account ${NR_ACCOUNT_ID} (${HOST})"
+        HTTP_CODE=$(curl -sS -o /tmp/nr_ci_event.out -w '%{http_code}' \
+          -X POST "${URL}" \
+          -H "Content-Type: application/json" \
+          -H "Api-Key: ${NR_LICENSE_KEY}" \
+          --data-binary "${PAYLOAD}" || echo "000")
+
+        echo "NR Event API responded HTTP ${HTTP_CODE}"
+        cat /tmp/nr_ci_event.out 2>/dev/null || true
+        echo
+
+        # NR returns 200 on accept. Any other code (incl. network failure 000)
+        # is logged as a warning but NEVER fails the job — observability must
+        # not gate the pipeline.
+        if [ "${HTTP_CODE}" != "200" ]; then
+          echo "::warning title=nr-ci-event::NR Event API returned ${HTTP_CODE} (expected 200). CI result not recorded in NR; not failing the job."
+        fi
+        exit 0
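
The three pure-shell decisions in the script above (result normalisation, duration sanitisation, and collector-host selection) can be exercised without New Relic credentials. The following is a minimal sketch, assuming the logic is lifted into functions; the function names `normalise_result`, `sanitise_duration`, and `collector_host` are hypothetical and do not appear in the commit.

```shell
#!/usr/bin/env bash
# Sketch of the action's input handling as standalone functions.
# Function names are hypothetical; the case patterns mirror the diff above.
set -uo pipefail

# Anything other than the exact string "pass" is reported as "fail".
normalise_result() {
  case "${1:-}" in
    pass) echo "pass" ;;
    *)    echo "fail" ;;
  esac
}

# Empty or non-numeric durations collapse to 0.
sanitise_duration() {
  case "${1:-}" in
    ''|*[!0-9]*) echo 0 ;;
    *)           echo "$1" ;;
  esac
}

# EU (in any letter case) selects the EU collector; everything else is US.
collector_host() {
  case "$(echo "${1:-US}" | tr '[:lower:]' '[:upper:]')" in
    EU) echo "insights-collector.eu01.nr-data.net" ;;
    *)  echo "insights-collector.newrelic.com" ;;
  esac
}

normalise_result "cancelled"   # prints: fail
sanitise_duration "12ms"       # prints: 0
collector_host "eu"            # prints: insights-collector.eu01.nr-data.net
```

Keeping these branches free of network and `jq` calls means a typo in a pattern shows up in a local run rather than as a silently mislabelled event.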

.github/workflows/ci.yml

Lines changed: 23 additions & 0 deletions
@@ -179,6 +179,29 @@ jobs:
       # the BillingHandler.ensureRazorpayFns data race.
       - run: go test ./... -short -race -count=1 -p 1
 
+      # Wave 5 — push the gated-test result to New Relic so a red run is
+      # studyable from an NR dashboard, not just the GitHub Actions log.
+      # if: always() so a FAILED `go test` step still records the failure
+      # (InstantCITestRun result=fail + InstantCITestFailure). No-ops cleanly
+      # when the NR secret/account is absent (fork PRs) — never reds the PR.
+      - name: Emit CI result to New Relic
+        if: always()
+        uses: ./.github/actions/nr-ci-event
+        with:
+          license-key: ${{ secrets.NEW_RELIC_LICENSE_KEY }}
+          account-id: ${{ secrets.NEW_RELIC_ACCOUNT_ID }}
+          result: ${{ job.status == 'success' && 'pass' || 'fail' }}
+          suite: build-and-test
+          pr-number: ${{ github.event.pull_request.number }}
+          failed-step: ${{ job.status != 'success' && 'go build / vet / test (-short -race -p 1)' || '' }}
+          repo: ${{ github.repository }}
+          workflow: ${{ github.workflow }}
+          branch: ${{ github.ref_name }}
+          commit-sha: ${{ github.sha }}
+          log-url: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
+          event-name: ${{ github.event_name }}
+          actor: ${{ github.actor }}
+
   # E2E requires a live Kubernetes stack (see repo CLAUDE.md). This job does not
   # run on push/PR — only on schedule or manual dispatch — so default CI stays fast.
   e2e:
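
None of the callers in this commit pass `duration-ms`, so the events carry the default of 0. A minimal sketch of how a caller might measure a value to pass follows, assuming second-resolution timing is acceptable; the `now_ms` helper and the `sleep` stand-in are hypothetical and not part of the commit.

```shell
#!/usr/bin/env bash
# Sketch: measure how long a gated command takes, in milliseconds.
# Uses whole seconds from `date +%s` for portability, so the result is a
# multiple of 1000. now_ms is a hypothetical helper, not part of the commit.
set -uo pipefail

now_ms() {
  echo $(( $(date +%s) * 1000 ))
}

START_MS="$(now_ms)"
sleep 1   # stand-in for the gated step (for example the go test invocation)
END_MS="$(now_ms)"

DURATION_MS=$(( END_MS - START_MS ))
echo "duration_ms=${DURATION_MS}"
```

In a workflow, a line of this form could be appended to the file named by `GITHUB_OUTPUT` and then handed to the action through its `duration-ms` input; the action's own sanitisation would still reset anything non-numeric to 0.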

.github/workflows/coverage.yml

Lines changed: 22 additions & 0 deletions
@@ -206,3 +206,25 @@ jobs:
           echo "Total project coverage: ${total}%"
           awk -v t="$total" 'BEGIN { exit (t+0 >= 95) ? 0 : 1 }' \
             || { echo "::error::Production coverage ${total}% is below the 95% floor"; exit 1; }
+
+      # Wave 5 — record the coverage-gate outcome in New Relic (suite=coverage)
+      # so a coverage red is visible alongside the test red on the CI-health
+      # dashboard. if: always() captures both the patch-gate and floor-gate
+      # failures above. No-ops without the NR secret.
+      - name: Emit coverage result to New Relic
+        if: always()
+        uses: ./api/.github/actions/nr-ci-event
+        with:
+          license-key: ${{ secrets.NEW_RELIC_LICENSE_KEY }}
+          account-id: ${{ secrets.NEW_RELIC_ACCOUNT_ID }}
+          result: ${{ job.status == 'success' && 'pass' || 'fail' }}
+          suite: coverage
+          pr-number: ${{ github.event.pull_request.number }}
+          failed-step: ${{ job.status != 'success' && 'coverage gate (100% patch / 95% floor)' || '' }}
+          repo: ${{ github.repository }}
+          workflow: ${{ github.workflow }}
+          branch: ${{ github.ref_name }}
+          commit-sha: ${{ github.sha }}
+          log-url: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
+          event-name: ${{ github.event_name }}
+          actor: ${{ github.actor }}
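
The `./api/...` path in the `uses:` line above follows from the nested checkout described in the last fix commit: a local action path is resolved relative to the workspace root, not the repository root. The sketch below simulates that layout in a temporary directory, assuming the action file is named `action.yml`; the `local_action_exists` helper is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: check whether a local `uses:` path resolves to an action file inside
# a workspace. local_action_exists is a hypothetical helper for illustration.
set -uo pipefail

local_action_exists() {
  local workspace="$1" uses_path="$2"
  local dir="${workspace}/${uses_path#./}"
  [ -f "${dir}/action.yml" ] || [ -f "${dir}/action.yaml" ]
}

# Simulate a workspace where the repository was checked out into ./api.
WORKSPACE="$(mktemp -d)"
mkdir -p "${WORKSPACE}/api/.github/actions/nr-ci-event"
: > "${WORKSPACE}/api/.github/actions/nr-ci-event/action.yml"

for candidate in ./.github/actions/nr-ci-event ./api/.github/actions/nr-ci-event; do
  if local_action_exists "${WORKSPACE}" "${candidate}"; then
    echo "found:   ${candidate}"
  else
    echo "missing: ${candidate}"
  fi
done
```

With this layout the root-relative path is reported as missing and the `./api/` path as found, which matches the behaviour the fix commit describes for coverage.yml and deploy.yml.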

.github/workflows/deploy.yml

Lines changed: 22 additions & 0 deletions
@@ -323,3 +323,25 @@ jobs:
           done
           echo "::error::live /healthz never reported commit_id=${SHORT_SHA}"
           exit 1
+
+      # Wave 5 — record the deploy outcome in New Relic (suite=deploy) so a
+      # failed deploy (test gate, image build, rollout, or the /healthz
+      # build-SHA gate above) is studyable from the CI-health dashboard, not
+      # only the Actions log. if: always() captures the failure path.
+      # No-ops without the NR secret.
+      - name: Emit deploy result to New Relic
+        if: always()
+        uses: ./api/.github/actions/nr-ci-event
+        with:
+          license-key: ${{ secrets.NEW_RELIC_LICENSE_KEY }}
+          account-id: ${{ secrets.NEW_RELIC_ACCOUNT_ID }}
+          result: ${{ job.status == 'success' && 'pass' || 'fail' }}
+          suite: deploy
+          failed-step: ${{ job.status != 'success' && 'deploy (test gate / build / rollout / healthz SHA gate)' || '' }}
+          repo: ${{ github.repository }}
+          workflow: ${{ github.workflow }}
+          branch: ${{ github.ref_name }}
+          commit-sha: ${{ github.sha }}
+          log-url: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
+          event-name: ${{ github.event_name }}
+          actor: ${{ github.actor }}
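
All three callers depend on the action never failing the job it reports on. A minimal sketch of that contract follows, assuming it is reduced to two decisions: whether to POST at all, and how to treat the HTTP status. The helper names `send_mode` and `report_http_code` and the sample account id are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of the "observability never gates the pipeline" contract.
# send_mode and report_http_code are hypothetical helpers; the conditions
# mirror the action's script.
set -uo pipefail

# Dry run when either credential is empty, real POST otherwise.
send_mode() {
  local key="${1:-}" account="${2:-}"
  if [ -z "${key}" ] || [ -z "${account}" ]; then
    echo "dry-run"
  else
    echo "post"
  fi
}

# Turn the HTTP status into a log line. Always returns 0, whatever the code.
report_http_code() {
  local code="${1:-000}"
  if [ "${code}" != "200" ]; then
    echo "::warning title=nr-ci-event::NR Event API returned ${code} (expected 200); not failing the job."
  else
    echo "NR accepted the event (HTTP 200)."
  fi
  return 0
}

send_mode "" "1234567"    # prints: dry-run
report_http_code "000"    # prints the ::warning line and still returns 0
```

Because the reporting step runs under `if: always()`, a non-zero exit here would be the only thing that could turn a green deploy red, so the sketch treats every status as informational.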
