
fix: fail-fast config validation and post-deploy smoke test#77

Merged
chrisjwalk-bot merged 3 commits into main from fix/startup-validation-smoke-test-75 on Mar 17, 2026

@chrisjwalk-bot
Collaborator

Closes #75

Changes

Program.cs — fail-fast startup validation

Checks Jwt:Key, Jwt:Issuer, and Jwt:Audience at startup and throws InvalidOperationException with a clear message listing which values are missing. The app refuses to start rather than booting successfully and returning 500s at runtime.
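
The Program.cs change itself is not shown in this thread. As a rough, language-neutral illustration of the same collect-then-fail logic, the sketch below expresses the check as a POSIX shell function. The function name `check_jwt_config` and the use of shell are assumptions made for illustration; the config keys and environment variable names are the ones quoted in this PR.

```shell
# Hypothetical sketch: mirrors the startup check described above, not the actual C# code.
# Collects every missing key first so the error lists all of them at once.
# Note: uses a global variable `missing` to stay POSIX-compatible (no `local`).
check_jwt_config() {
  missing=""
  # Config key on the right, backing environment variable on the left.
  [ -n "${JWT_KEY:-}" ]       || missing="${missing:+$missing, }Jwt:Key"
  [ -n "${Jwt__Issuer:-}" ]   || missing="${missing:+$missing, }Jwt:Issuer"
  [ -n "${Jwt__Audience:-}" ] || missing="${missing:+$missing, }Jwt:Audience"

  if [ -n "$missing" ]; then
    echo "Required configuration values are missing: ${missing}." >&2
    return 1
  fi
  return 0
}
```

A caller would run `check_jwt_config || exit 1` before starting the process, so a missing value stops startup instead of surfacing later as a runtime 500.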

deploy.yml — post-deploy smoke test

After every deploy, curls /api/weatherforecasts up to 5 times (15s apart). The workflow fails if HTTP 200 is not returned, so a broken deploy is caught in CI before any user sees it.

Root Cause Analysis

See issue #76 for the full incident RCA.

Add startup validation to Program.cs that throws immediately if any
required JWT config values are absent. This surfaces missing env vars
as a clear error in Azure logs on boot rather than as silent 500s at
runtime.

Add a post-deploy smoke test to deploy.yml that retries /api/weatherforecasts
up to 5 times (15s apart) after deployment. The workflow fails if the
endpoint does not return HTTP 200, preventing a broken deploy from
going undetected.

Root cause documented in issue #76.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread .github/workflows/deploy.yml Outdated
@github-actions

Azure Static Web Apps: Your stage site is ready! Visit it here: https://green-water-08792290f-77.eastus2.2.azurestaticapps.net

@chrisjwalk
Owner

This PR addresses the production incident from 2026-03-16 where PR #72 deployed successfully but caused ~3 hours of 500s on all auth endpoints. Full RCA in issue #76.

What went wrong: Three env vars (JWT_KEY, Jwt__Issuer, Jwt__Audience) were introduced by PR #72 but never provisioned in Azure App Service. The app started fine and the deploy pipeline reported success — the problem only surfaced when a user hit an auth endpoint.

Two layers of defence added:

1. Fail-fast startup validation (Program.cs)
The app now checks all three required JWT config values at boot and throws immediately with a clear message if any are absent:

Required configuration values are missing: Jwt:Key.
Set them as environment variables (e.g. JWT_KEY, Jwt__Issuer, Jwt__Audience) before starting the application.

Azure surfaces this as a startup crash in the logs — no more silent 500s.

2. Post-deploy smoke test (deploy.yml)
After every deployment, the workflow retries GET /api/weatherforecasts up to 5 times (15s apart). If HTTP 200 isn't returned the workflow fails, meaning a misconfigured deploy is caught in CI before any user sees it. With this in place, the PR #72 incident would have failed the pipeline within ~75s of deploying.
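
The deploy.yml step is not reproduced in this thread, so the following is only a minimal sketch of the retry loop described above, written as a shell function that a workflow `run:` step could call. The function name `smoke_test`, the 10-second `--max-time`, and the choice to sleep only between attempts are assumptions; the 5 attempts and 15-second spacing come from the PR text.

```shell
# Hypothetical sketch of the post-deploy smoke test; not the actual deploy.yml step.
# Usage: smoke_test URL [ATTEMPTS] [DELAY_SECONDS]
smoke_test() {
  url=$1
  attempts=${2:-5}   # default from the PR description
  delay=${3:-15}     # seconds between attempts, also from the PR description
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -o /dev/null: discard body, -w: print only the HTTP status code.
    # If curl itself fails (e.g. connection refused), record the status as 000.
    status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url") || status=000
    echo "Attempt $i/$attempts: HTTP $status"
    if [ "$status" = "200" ]; then
      return 0
    fi
    # Sleep only between attempts, not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "Smoke test failed: $url did not return HTTP 200" >&2
  return 1
}
```

With the defaults, a deploy that never returns 200 fails after four 15-second waits plus request time, which is in the same range as the ~75s figure quoted above. A non-zero return from the function is what would fail the workflow step.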

Add /health/live (liveness, no DB check) and /health/ready (readiness,
includes SQL Server check) using ASP.NET Core built-in health checks +
AspNetCore.HealthChecks.SqlServer.

The deploy smoke test hits /health/live so a sleeping serverless Azure
SQL DB does not cause false failures. /health/ready is available for
uptime monitoring and alerting.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
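
To make the liveness/readiness split in this commit concrete, here is a small sketch of how a deploy check might gate only on /health/live while still reporting /health/ready. The function name `check_health`, the base-URL parameter, and the warning text are assumptions for illustration; only the two endpoint paths come from the commit message.

```shell
# Hypothetical sketch: gate the deploy on liveness only, report readiness as information.
# Usage: check_health BASE_URL
check_health() {
  base=$1
  live=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$base/health/live") || live=000
  ready=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$base/health/ready") || ready=000
  echo "live=$live ready=$ready"
  if [ "$ready" != "200" ]; then
    # Readiness includes the SQL Server check, so a sleeping database can fail it
    # even when the app itself is up. Warn, but do not fail the deploy on this.
    echo "warning: /health/ready is not returning 200 (database may still be resuming)" >&2
  fi
  # The function's exit status reflects liveness only.
  [ "$live" = "200" ]
}
```

Keeping the database out of the gating check is the design choice the commit describes: the pipeline confirms the app booted, and uptime monitoring can watch /health/ready separately.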

Comment thread apps/api/Api/Program.cs Outdated
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@chrisjwalk-bot chrisjwalk-bot merged commit ce57668 into main Mar 17, 2026
7 checks passed
@chrisjwalk-bot chrisjwalk-bot deleted the fix/startup-validation-smoke-test-75 branch March 17, 2026 01:33
