This repository was archived by the owner on Mar 30, 2026. It is now read-only.

Commit f7046de

feat: Enhance documentation with architecture overview, design decisions, and security considerations
1 parent 9cd2923 commit f7046de

8 files changed: +297 -0 lines changed

.gitignore

Lines changed: 7 additions & 0 deletions
@@ -35,6 +35,13 @@ outputs/*
 *.tif
 *.svg

+# Documentation assets: allow images that are intentionally part of the docs.
+!docs/assets/**/*.png
+!docs/assets/**/*.jpg
+!docs/assets/**/*.jpeg
+!docs/assets/**/*.tif
+!docs/assets/**/*.svg
+
 # The following was created by https://www.toptal.com/developers/gitignore/api/macos,windows,r,python
 # Edit at https://www.toptal.com/developers/gitignore?templates=macos,windows,r,python

README.md

Lines changed: 2 additions & 0 deletions
@@ -7,6 +7,8 @@ A GitHub App backend for secure webhook processing and automation, deployed on G

 If you want to replicate the full setup (GCP remote state, Terraform apply phases, GitHub App creation, deploy + verification), follow the end-to-end tutorial in `docs/tutorial/README.md`.

+For architecture and security background (intended for technical architects, security colleagues, and interested engineers), see `docs/architecture/README.md`.
+
 ## Features

 - FastAPI-based webhook handler (`/webhooks/github`)

docs/architecture/README.md

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
# Architecture overview (ons-github-app)

This folder documents how `ons-github-app` is put together, why key design decisions were made, and what the security posture looks like.

Audience:

- Technical architects (cloud topology, IaC, operational model)
- Security colleagues (threat model, key controls, “what could go wrong”)
- Interested engineers (how the pieces fit)

If you want to *deploy* the system end-to-end, start with the setup tutorial at `docs/tutorial/README.md`.

## What the system does (today)

`ons-github-app` is a small FastAPI service that receives GitHub webhooks at `POST /webhooks/github`.

Current demo behavior:

- Verify webhook authenticity using `X-Hub-Signature-256` (HMAC-SHA256)
- For `pull_request` events with `action=opened`, post a comment back to the PR as the GitHub App
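
As an illustration of the second bullet, the decision to act can be reduced to a small predicate. This is a minimal sketch, not the code in `src/app.py`: the function name `should_comment_on_pr` is hypothetical, and it assumes the event name is taken from GitHub's `X-GitHub-Event` request header and the payload has already been parsed from JSON after signature verification.

```python
from typing import Any, Mapping


def should_comment_on_pr(event_name: str, payload: Mapping[str, Any]) -> bool:
    """Return True only for pull_request deliveries whose action is 'opened'.

    event_name is assumed to come from the X-GitHub-Event header;
    payload is the already-verified, already-parsed JSON body.
    """
    return event_name == "pull_request" and payload.get("action") == "opened"
```

Any other event or action (for example `push`, or `pull_request` with `action=closed`) would fall through without triggering a comment.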

## High-level topology

Inbound webhook traffic flows through API Gateway to Cloud Run.

```mermaid
flowchart LR
    GH[GitHub Webhooks] -->|HTTPS POST /webhooks/github| GW[GCP API Gateway]
    GW -->|HTTPS| CR[Cloud Run service]
    CR --> APP[FastAPI app]

    APP -->|Reads secrets as files| SM[GCP Secret Manager]
    APP -->|JWT -> installation token| GHA[GitHub REST API]
```

Key components:

- **FastAPI app** (runtime): `src/app.py`, `src/webhook.py`, `src/github_app.py`, `src/config.py`
- **Cloud Run** (compute): runs the container and scales based on request volume
- **API Gateway** (ingress): routes `POST /webhooks/github` to Cloud Run
- **Secret Manager** (secrets): stores the GitHub App private key and webhook secret
- **Artifact Registry** (images): stores container images deployed to Cloud Run
- **Terraform** (IaC): declares and provisions all of the above

## Endpoints

- `GET /healthz`: liveness/health endpoint
- `POST /webhooks/github`: GitHub webhook receiver

## Data flow: webhook verification and response

1. GitHub sends a webhook payload and includes `X-Hub-Signature-256: sha256=<digest>`
2. The app reads the raw request body bytes
3. The app recomputes the expected digest using the shared webhook secret
4. The app compares digests using a constant-time comparison
5. If valid and the event is accepted, the app executes the handler (e.g. post PR comment)
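
Steps 2 to 4 can be sketched with the Python standard library alone. This is an illustrative sketch under stated assumptions, not the implementation in `src/webhook.py`: the function name `verify_signature` and its parameters are hypothetical.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, raw_body: bytes, header_value: str) -> bool:
    """Check an X-Hub-Signature-256 header value against the raw request body."""
    prefix = "sha256="
    if not header_value.startswith(prefix):
        # Missing or malformed header: treat as invalid.
        return False
    # Step 3: recompute the digest over the raw bytes, not re-serialized JSON.
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    received = header_value[len(prefix):]
    # Step 4: constant-time comparison (bytes on both sides).
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

The digest is computed over the exact bytes received, so the body should be read before any JSON parsing or normalization takes place.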

## Secrets and configuration model

This repo documents a single recommended approach:

- Secrets are **stored as files**
- The app receives **non-secret** environment variables that point at those files:
  - `GITHUB_PRIVATE_KEY_FILE`
  - `GITHUB_WEBHOOK_SECRET_FILE`

Local development:

- Keep secret files under `./local-secrets/` (ignored by git)

Cloud Run:

- Terraform provisions the Secret Manager *secret containers*
- Secret *values* are added outside Terraform using `gcloud secrets versions add ...`
- Terraform mounts the secrets into the container filesystem, and sets the `*_FILE` env vars
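
The `*_FILE` pattern above can be illustrated with a short loader. This is a hedged sketch rather than the contents of `src/config.py`: the helper name `read_secret_from_env` is hypothetical, and stripping surrounding whitespace is an assumption (convenient when a secret file ends with a trailing newline).

```python
import os
from pathlib import Path


def read_secret_from_env(var_name: str) -> str:
    """Read a secret from the file whose path is stored in the given env var."""
    path = os.environ.get(var_name)
    if not path:
        raise RuntimeError(f"{var_name} is not set")
    # Assumption: surrounding whitespace (e.g. a trailing newline) is not part of the secret.
    return Path(path).read_text(encoding="utf-8").strip()
```

For example, `read_secret_from_env("GITHUB_WEBHOOK_SECRET_FILE")` would return the webhook secret both locally (a file under `./local-secrets/`) and on Cloud Run (a mounted Secret Manager version), since only the path differs between environments.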

## Infrastructure (Terraform) shape

This repo is designed for a two-phase apply, described in `docs/tutorial/README.md`:

1. **Bootstrap/shared infra** (no image)
   - APIs enabled
   - service account
   - Artifact Registry
   - Secret Manager secrets (containers only)

2. **Deploy** (image provided)
   - Cloud Run service
   - API Gateway
   - IAM binding granting API Gateway permission to invoke Cloud Run

## Operational notes

- Logging: the app logs to stdout/stderr; on Cloud Run this lands in Cloud Logging.
- Scaling: Cloud Run scales horizontally with incoming requests. (No explicit autoscaling tuning is configured in Terraform today.)
- CI: GitHub Actions runs Checkov and Trivy filesystem scans; locally, `pre-commit` includes detect-secrets and terraform checks.

## Design decisions (where to read more)

- See `docs/architecture/design-decisions.md` for the short “why we chose X” list.
- See `docs/architecture/security.md` for a security-focused view (threats, controls, and open gaps).

docs/architecture/design-decisions.md

Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
# Design decisions

This document captures the key architectural decisions in `ons-github-app` and the rationale behind them.

## Cloud Run for compute

Decision:

- Run the webhook handler as a container on **Google Cloud Run**.

Why:

- Stateless request/response workload suits serverless containers.
- Scales down when idle and scales up when webhook volume increases.
- Avoids VM management and reduces operational overhead.

Trade-offs:

- Cold starts can add latency for the first request after idle.
- Long-running background work is not a good fit; heavy jobs should be offloaded.

## API Gateway in front of Cloud Run

Decision:

- Put **GCP API Gateway** in front of the Cloud Run service.

Why:

- Central place to route and evolve HTTP ingress without exposing Cloud Run URLs directly.
- Keeps the public interface stable while Cloud Run revisions change.

Trade-offs / notes:

- API Gateway is currently configured as a simple reverse proxy using an OpenAPI spec.
- Rate limiting / auth policies are not configured here today; webhook authenticity is enforced at the application layer.

## GitHub webhook authenticity via HMAC signature verification

Decision:

- Treat the request body as untrusted and verify `X-Hub-Signature-256` using HMAC-SHA256.

Why:

- Prevents spoofed/fabricated webhook events from triggering automation.
- Verification occurs before parsing JSON (avoids trusting payload content prematurely).

Trade-offs:

- Does not stop replay by itself (see security doc for mitigations/opportunities).

## GitHub App authentication: JWT + installation tokens

Decision:

- Authenticate to GitHub as a GitHub App:
  - sign a short-lived JWT using the app private key
  - exchange it for an installation access token scoped to the installation

Why:

- Avoids long-lived personal access tokens.
- Installation token scope is bounded to installed repos and expires.
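
The two-step flow above can be sketched as follows. This is an illustrative outline, not the code in `src/github_app.py`: the function names are hypothetical, the 60-second backdating and 9-minute lifetime are assumptions chosen to stay within GitHub's documented 10-minute maximum for app JWTs, and the signing step is shown only as a comment because it needs a third-party library (for example PyJWT, which may not be what this repo uses).

```python
import time
from typing import Optional


def build_app_jwt_claims(app_id: str, now: Optional[int] = None) -> dict:
    """Build the claims for a short-lived GitHub App JWT."""
    issued_at = int(time.time()) if now is None else now
    return {
        "iat": issued_at - 60,   # backdated to tolerate clock drift
        "exp": issued_at + 540,  # expires well inside the 10-minute limit
        "iss": app_id,           # the GitHub App's identifier
    }


def installation_token_url(installation_id: int) -> str:
    """REST endpoint that exchanges the app JWT for an installation token (POST)."""
    return f"https://api.github.com/app/installations/{installation_id}/access_tokens"


# Signing (assumption: PyJWT is installed and private_key_pem holds the PEM text):
#   import jwt
#   token = jwt.encode(build_app_jwt_claims("12345"), private_key_pem, algorithm="RS256")
# The signed token is then sent as "Authorization: Bearer <token>" in the POST above.
```

Keeping the claim construction separate from signing makes the time-window logic easy to unit test without a private key.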

## Secrets strategy: files + Secret Manager

Decision:

- Secrets are not stored in Terraform variables.
- Secret *values* live in **Secret Manager**, mounted as **files** into Cloud Run.
- Locally, secrets are also files under `./local-secrets/`.

Why:

- Keeps secret values out of:
  - git history
  - container images
  - Terraform state
- Using files matches Cloud Run’s secret mount model.

## Two-phase Terraform apply

Decision:

- Use a two-phase workflow:
  1) apply shared infra with `image = ""`
  2) push image + add secret versions + apply again

Why:

- Allows provisioning to succeed before an image exists.
- Ensures secret versions can be added outside Terraform.

## CI security scanning

Decision:

- Run IaC and repo scanning in CI:
  - Checkov (IaC policy scanning)
  - Trivy filesystem scan (dependency and config scanning)

Why:

- Prevents obvious misconfigurations and vulnerable dependencies from landing unreviewed.

docs/architecture/security.md

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
# Security considerations

This document is a security-focused view of `ons-github-app`.

## Threat model (what we assume)

- The webhook endpoint is internet reachable (via API Gateway).
- Attackers may discover the endpoint URL.
- Attackers may attempt:
  - spoofing webhook events
  - tampering with payloads in transit
  - replaying valid payloads
  - causing excessive request volume (resource/cost pressure)
  - stealing secrets from source control, Terraform state, build logs, or runtime environment

## Primary controls implemented

### Webhook signature verification

- The app verifies `X-Hub-Signature-256` using an HMAC-SHA256 digest of the raw request body.
- Uses constant-time comparison.
- Rejects invalid signatures with HTTP 401.

Outcome:
- Rejects forged payloads from senders that do not hold the shared webhook secret.

### Short-lived GitHub App auth

- The app signs a JWT (RS256) using the GitHub App private key.
- Exchanges it for an installation access token.
- Uses the installation token to call GitHub’s API.

Outcome:
- Avoids long-lived PATs and bounds token blast radius.

### Secrets kept out of Terraform state and git

- Terraform creates Secret Manager *secret containers* only.
- Secret versions are added with `gcloud secrets versions add ...` outside Terraform.
- Cloud Run mounts secrets as files; the app reads them via `*_FILE` env vars.
- `./local-secrets/` is ignored via `.gitignore`.

Outcome:
- Reduces likelihood of accidental credential leakage.

### Minimal exposed endpoints

- The public API surface is intentionally small:
  - `GET /healthz`
  - `POST /webhooks/github`

## Operational security notes

- Logging: avoid logging secret material or full request bodies. This repo logs the GitHub event type/action only.
- IAM: Cloud Run uses a dedicated service account. That service account is granted Secret Manager access only for the required secrets.

## Known gaps / future hardening ideas

These are not necessarily required for a demo, but are common asks from security review.

- Replay resistance: GitHub webhooks include delivery IDs (e.g. `X-GitHub-Delivery`). Consider storing recent delivery IDs and rejecting duplicates.
- Rate limiting / abuse controls: API Gateway is currently acting as a router. Consider adding quotas/rate limits and alerting on abnormal volume.
- Structured audit logging: capture request metadata (delivery ID, event type, signature verification result) in a structured form.
- Dependency management: requirements are pinned; consider adding dependency update automation and/or vulnerability gates on container images.
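
As one way to picture the replay-resistance idea, the sketch below keeps recently seen keys in memory and rejects repeats within a time window. It is a hypothetical illustration, not something this repo implements: the class name, TTL, and size cap are all assumptions, and an in-memory store is only per-instance, so a multi-instance Cloud Run deployment would need a shared store instead.

```python
import time
from collections import OrderedDict
from typing import Optional


class DeliveryDeduplicator:
    """Remember recently seen delivery keys and flag repeats within a TTL."""

    def __init__(self, ttl_seconds: float = 3600.0, max_entries: int = 10_000) -> None:
        self._ttl = ttl_seconds
        self._max = max_entries
        self._seen: "OrderedDict[str, float]" = OrderedDict()

    def is_duplicate(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Evict expired entries, oldest first.
        while self._seen:
            _, seen_at = next(iter(self._seen.items()))
            if now - seen_at < self._ttl:
                break
            self._seen.popitem(last=False)
        if key in self._seen:
            return True
        self._seen[key] = now
        # Bound memory use by dropping the oldest entry when over capacity.
        if len(self._seen) > self._max:
            self._seen.popitem(last=False)
        return False
```

Note that the signature covers only the request body, so a header value such as `X-GitHub-Delivery` is not itself authenticated; using a digest of the verified body as the key is one alternative worth weighing.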

## Where to look in code

- Signature verification: `src/webhook.py`
- Secret loading: `src/config.py`
- GitHub App auth + API calls: `src/github_app.py`
- Webhook handler routing: `src/app.py`

docs/tutorial/04-deploy-and-verify.md

Lines changed: 5 additions & 0 deletions
@@ -210,6 +210,11 @@ Set your GitHub App webhook URL to that value.

 1) Open a pull request in the repo where the app is installed.
 2) In GitHub → your PR, you should see a comment posted by the app.
+
+Example output:
+
+![Example PR comment created by the GitHub App](../assets/screenshot-bot-pr-comment.png)
+
 3) If it doesn’t appear, check:

 - Cloud Run logs (look for `event=pull_request action=opened`)

docs/tutorial/README.md

Lines changed: 6 additions & 0 deletions
@@ -30,3 +30,9 @@ This repo is designed for a **two-phase Terraform apply**:
 - 02: Remote state bootstrap (see 02-remote-state.md)
 - 03: GitHub App setup (see 03-github-app.md)
 - 04: Deploy & verify (local + cloud) (see 04-deploy-and-verify.md)
+
+## Architecture and security
+
+- Architecture overview (see ../architecture/README.md)
+- Design decisions (see ../architecture/design-decisions.md)
+- Security considerations (see ../architecture/security.md)
