Commit c8b901c

constk and claude committed
docs: HARNESS, INVARIANTS, BOUNDARIES, DEVELOPMENT, EVAL_HARNESS, SECURITY, ARCHITECTURE + README, CONTRIBUTING, CLAUDE.md, CHANGELOG, TASKS.md (#25, #26)
Closes #25 + #26 in one PR (the README references docs/* paths and TASKS.md references everything else; landing them separately would mean a half-broken docs tree in between).

docs/* (all written for the template, not Teller-flavoured):
- HARNESS.md — umbrella table mapping every layer to its config file and to the meta-gate(s) that catch drift in it.
- INVARIANTS.md — five portable rules with numbered slots 6+ for project additions.
- BOUNDARIES.md — ASCII layer diagram + the import-linter contract spec + how to add a layer cleanly.
- DEVELOPMENT.md — prereqs, first-time setup, dev stack, justfile recipes table, branching diagram, commit-prefix table, CI workflow inventory, agent-hook setup, branch-protection token setup.
- EVAL_HARNESS.md — runner architecture, three tolerance modes, wiring your agent / LLM client, adding a case, opt-in for nightly schedule.
- SECURITY.md — threat model table + defence-in-depth ASCII map + container hardening notes + explicit out-of-scope list (auth, WAF, rate-limit, secret manager).
- ARCHITECTURE.md — scaffold component diagram, request lifecycle, frontend lifecycle, slots that fill in as the project grows.

Top-level docs:
- README.md — what ships / quickstart / why-a-harness / docs index / versions table / license.
- CONTRIBUTING.md — branching diagram, commit-prefix table, PR template callouts, "adding a check" recipe.
- CLAUDE.md — agent project instructions: read-first list, workflow, code conventions, what-not-to-do, skills inventory.
- CHANGELOG.md — release-drafter seed; first Unreleased entry summarises the harness extraction.
- docs/TASKS.md — full ticket table with phase grouping + status emoji, matches the GitHub Project board.

Closes #25
Closes #26

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 00880e5 commit c8b901c

12 files changed

Lines changed: 876 additions & 2 deletions

CHANGELOG.md

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
# Changelog
2+
3+
All notable changes to this project will be documented in this file.
4+
5+
The format is loosely based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
6+
7+
Released versions are drafted automatically by [release-drafter](https://github.com/release-drafter/release-drafter); see `.github/release-drafter.yml` and `.github/workflows/release-drafter.yml`. Each entry on the GitHub Releases page corresponds to a tag of the form `vX.Y.Z`.
8+
9+
## Unreleased
10+
11+
### Added
12+
13+
- Initial harness scaffold (Python 3.14 + FastAPI + Pydantic v2 + OpenTelemetry; React 19.2 + Vite + TypeScript strict).
14+
- 15 required CI status checks (lint, typecheck, tests, coverage ≥ 75 %, import-linter, pre-commit, frontend build/quality, security suite, two meta-gates, PR-title lint).
15+
- Release pipeline: tag-triggered build, push to GHCR, CycloneDX SBOM, GitHub Release publish.
16+
- Eval harness scaffold (provider-agnostic runner + LLM-judge Protocol + 1 example golden case + workflow_dispatch nightly).
17+
- `.claude/` agent integration (3 hooks, 6 auto-activating skills, settings example).
18+
19+
### Notes
20+
21+
- This template was extracted from a financial-agent take-home (Teller) and generalised. The harness is the product; the scaffold exists so every gate has something to operate on.

CLAUDE.md

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
1+
# CLAUDE.md — agent project instructions
2+
3+
You are working in `harness-python-react`, a template repo whose harness IS the product. Code quality here is enforced mechanically — every gate fails CI, not just tests. Keep that bar as you work.
4+
5+
## What this repo is
6+
7+
A production-quality LLM-driven coding harness over a minimal FastAPI + React scaffold. The point isn't the features (one `/health`, one `/echo`, one hello page); the point is that every layer of the pipeline — lint, types, architecture, security, eval, agent hooks — catches a different failure class without anyone remembering to run it.
8+
9+
## Read first
10+
11+
- [`docs/HARNESS.md`](docs/HARNESS.md) — umbrella; the controls and where they live.
12+
- [`docs/INVARIANTS.md`](docs/INVARIANTS.md) — the load-bearing rules. Every PR is checked against them.
13+
- [`docs/BOUNDARIES.md`](docs/BOUNDARIES.md) — layered import-linter contract; reverse imports fail CI.
14+
- [`docs/DEVELOPMENT.md`](docs/DEVELOPMENT.md) — branching, commit format, justfile, CI overview.
15+
16+
## Workflow
17+
18+
- One issue per change. Branch name: `feat|fix|chore|docs|test|refactor/<issue-number>-<kebab-title>`.
19+
- One PR per branch, base `develop`. PR title = the conventional-commit subject.
20+
- `develop → main` happens via a `release:` PR.
21+
- The pre-push gate is `just check` (lint + typecheck + architecture + tests). Run it before pushing.
22+
- For frontend changes, also run `just frontend-check`.
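
The branch-naming rule above can be checked mechanically. The following is a minimal sketch, not the template's own tooling: the regex and the helper name `parse_branch` are assumptions introduced for illustration, and only the six branch types come from the rule itself.

```python
from __future__ import annotations

import re

# Branch types come from the naming rule above; the regex itself is an assumption.
BRANCH_RE = re.compile(
    r"^(feat|fix|chore|docs|test|refactor)/(\d+)-([a-z0-9]+(?:-[a-z0-9]+)*)$"
)


def parse_branch(name: str) -> tuple[str, int, str] | None:
    """Return (type, issue_number, kebab_title), or None if the name does not match."""
    match = BRANCH_RE.match(name)
    if match is None:
        return None
    branch_type, issue, title = match.groups()
    return branch_type, int(issue), title
```

For example, `parse_branch("feat/123-short-name")` yields the type, the issue number as an integer, and the kebab-case title, while a name with uppercase letters or a missing issue number yields `None`.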
23+
24+
## Code conventions
25+
26+
- **Python:** 3.14, `uv run --frozen` everywhere, mypy `--strict`, ruff with the wide select set (`E W F I N UP B SIM TCH S RUF`).
27+
- **Type hints:** every public function. `from __future__ import annotations` at module top.
28+
- **Models:** anything crossing a module / process seam inherits from `StrictModel` (`src/models/_base.py`). `extra="forbid"`. Add `strict=True` to the class when you want strict type validation with no coercion (e.g. rejecting the string `"3.14"` for a `float` field).
29+
- **API:** every route under `/api/v1/`. Typed Pydantic responses, not raw dicts.
30+
- **Layer flow:** one-way. Reverse imports are a CI failure. See `docs/BOUNDARIES.md`.
31+
- **Observability:** OTel `agent_span(...)` for any operation in the request path; semconv-defined attribute keys only (constants at the top of `src/observability/spans.py`).
32+
- **Frontend:** React 19 + TS strict; functional components + hooks; never `dangerouslySetInnerHTML` on backend output; SSE consumers use the typed primitive at `frontend/src/lib/api/client.ts`.
33+
34+
## What NOT to do
35+
36+
- Don't bypass gates. `--no-verify` / `--no-hooks` / `--no-gpg-sign` are blocked by `pretooluse_bash.py` for a reason. If a hook is wrong, fix the hook.
37+
- Don't introduce a new commit-type prefix without updating both `pyproject.toml`'s commitizen schema AND `pr-title.yml` (the `Commit-type sync` meta-gate will fail otherwise).
38+
- Don't add a CI job without listing it in `.github/branch-protection/{develop,main}.json` (the `Branch-protection contexts sync` meta-gate will fail).
39+
- Don't skip the architecture contract by accident — `lint-imports` runs in CI and locally via `just architecture`.
40+
- Don't write code without tests. Coverage gate is 75% on `src/`.
41+
- Don't hand-roll secrets into config. Use env / `.env` (gitignored) → `Settings` from `src/models/config.py`.
42+
- Don't create files unless they're necessary. The scaffold has no dead modules.
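
To illustrate the forbidden-flag rule in the list above, here is a rough sketch of how a command could be screened before it runs. It is not the contents of `pretooluse_bash.py`: the function name `forbidden_flags_in` and the token-based matching are assumptions, and only the three flag names come from this document.

```python
from __future__ import annotations

import shlex

# Flags named in the list above; how the real hook parses commands is an assumption.
FORBIDDEN_FLAGS = frozenset({"--no-verify", "--no-hooks", "--no-gpg-sign"})


def forbidden_flags_in(command: str) -> list[str]:
    """Return forbidden flags found in a shell command, in order of appearance."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: fall back to whitespace splitting rather than crash.
        tokens = command.split()
    # Compare the part before any "=" so "--flag=value" forms are caught too.
    return [tok for tok in tokens if tok.split("=", 1)[0] in FORBIDDEN_FLAGS]
```

Tokenising with `shlex` means a flag that only appears inside a quoted commit message is not reported. The sketch does not cover short aliases such as `-n`, which a real hook would probably need to handle.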
43+
44+
## Use the skills
45+
46+
The agent-side skills in `.claude/skills/` auto-activate based on context:
47+
48+
- `architect` — when designing module boundaries, API contracts, layer-flow decisions.
49+
- `code-reviewer` — after writing/editing code; runs the 10-point review checklist.
50+
- `devops` — when touching Docker, CI, pyproject.toml, observability config.
51+
- `frontend` — when working in `frontend/` (React 19 + TS + Vite).
52+
- `qa-engineer` — when writing tests or extending the eval harness.
53+
- `technical-writer` — when updating docs / READMEs.
54+
55+
Trust their guidance — they encode this project's conventions.
56+
57+
## When in doubt
58+
59+
- If the change touches a gate, update the meta-gate inputs (`branch-protection/*.json`, `pr-title.yml`, `check_required_contexts.py`'s exemption list).
60+
- If the change touches an invariant, decide whether the invariant is wrong (update `docs/INVARIANTS.md` in the same PR) or the change is wrong (rework).
61+
- If a CI job is failing for a reason that doesn't match the change, dig — don't reroll. Recent fix patterns: tag-vs-commit SHA in pinned action references, `if: hashFiles(...)` startup failures (see project-memory), pytest exit-5 on empty test suites.
62+
63+
The harness exists to make sloppy work hard. Lean into it — when a gate trips, it's protecting the next person reading this codebase.

CONTRIBUTING.md

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
1+
# Contributing
2+
3+
Thanks for taking a look. This template's harness is the product, so the contribution flow is opinionated — every change goes through the same gates as a feature.
4+
5+
## Branching
6+
7+
```
main ◄── release PR ◄── develop ◄── feat/123-short-name
                                ◄── fix/124-bug-name
                                ◄── chore/125-config-change
```
12+
13+
- `main` is the release line. Protected: 15 required status checks, code-owner approval, no force pushes.
14+
- `develop` is the integration branch. Same gates, with looser merge rules (PRs don't need rebasing).
15+
- Feature branches are short-lived and named `<type>/<issue-number>-<kebab-title>`. Open one issue per branch so the project board stays usable.
16+
17+
## Commit messages
18+
19+
Seven prefixes (enforced in three places — `[tool.commitizen]` in `pyproject.toml`, `pr-title.yml`, `check_commit_types.py`):
20+
21+
| Prefix | When |
|---|---|
| `feat:` | New capability |
| `fix:` | Bug fix |
| `docs:` | Documentation only |
| `test:` | Tests / eval harness |
| `refactor:` | Internal change with no behaviour delta |
| `chore:` | Tooling, deps, infra |
| `release:` | `develop → main` release PRs only |
30+
31+
The subject is **lowercase** after the colon. A capitalised first word (`Add the thing`) is rejected; all-caps initialisms (`CI failure`, `SDK upgrade`) are fine.
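
The prefix and casing rules can be expressed as a small check. This is a sketch under stated assumptions rather than the logic in `pr-title.yml` or `check_commit_types.py`: the function name `title_problem`, the regex, and the error strings are hypothetical, and only the seven prefixes and the lowercase/initialism rule come from the text above.

```python
from __future__ import annotations

import re

# The seven prefixes from the table above.
PREFIXES = ("feat", "fix", "docs", "test", "refactor", "chore", "release")
TITLE_RE = re.compile(rf"^({'|'.join(PREFIXES)}): (\S.*)$")


def title_problem(title: str) -> str | None:
    """Return a reason the title is rejected, or None when it passes."""
    match = TITLE_RE.match(title)
    if match is None:
        return "expected '<prefix>: <subject>' with a known prefix"
    first_word = match.group(2).split()[0]
    # A capitalised first word is rejected unless it is an all-caps initialism.
    if first_word[0].isupper() and not first_word.isupper():
        return "subject must start lowercase"
    return None
```

Under this sketch, `feat: add the thing` and `fix: CI failure on empty suites` pass, while `feat: Add the thing` and any title with an unknown prefix are rejected.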
32+
33+
## Pull requests
34+
35+
1. Open the issue first. Use a feature/bug template; fill every section.
2. Branch off `develop` with the matching name.
3. Land one logical change per PR. Stack PRs if the work is naturally split.
4. The PR template asks five things — answer each (`None` is valid where applicable):
   - **What & why** (1–3 lines)
   - **Test plan** (checkbox list; CI covers most of it)
   - **Invariants affected** — cite numbered rules from `docs/INVARIANTS.md`
   - **New deps / actions / external surface** (anchor for supply-chain review)
   - **Screenshots** (UI changes only)
5. Wait for green CI + a code-owner review before merging.
45+
46+
## Local pre-push gate
47+
48+
```sh
just check  # ruff + mypy + import-linter + pytest
# Subshell so the following command still runs from the repo root
(cd frontend && npm run lint && npm run format:check && npm run check && npm run test && npm run build)
uv run pre-commit run --all-files
```
53+
54+
A green pre-push run is a high-confidence predictor of a green CI run. The `just check` gate is intentionally a subset of CI: it trades completeness for fast feedback.
55+
56+
## Adding a check
57+
58+
When the harness grows a new gate:
59+
60+
1. Add the workflow job in `.github/workflows/`.
61+
2. If it's a required gate, add the job's display name to the `contexts` arrays in `.github/branch-protection/{develop,main}.json`.
62+
3. If it's NOT required (scheduled / dispatch-only / push-to-main-only), add the workflow filename to `EXEMPT_WORKFLOWS` in `.github/scripts/check_required_contexts.py`.
63+
4. Update `docs/HARNESS.md` and `docs/SECURITY.md` (if security-relevant).
64+
5. Land in one PR — the meta-gate `Branch-protection contexts sync` will fail if you skip step 2 or 3.
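
The steps above imply a set comparison between the jobs that workflows declare and the contexts that branch protection requires. The sketch below shows one way such a comparison could look. It is not `check_required_contexts.py`: the function name `contexts_drift`, the pre-parsed `workflow_jobs` mapping, and the assumption that the JSON has a top-level `contexts` key are all hypothetical.

```python
from __future__ import annotations

import json


def contexts_drift(
    workflow_jobs: dict[str, list[str]],
    protection_json: str,
    exempt_workflows: set[str],
) -> tuple[set[str], set[str]]:
    """Return (jobs missing from branch protection, contexts with no matching job)."""
    # Assumption: the protection file exposes a top-level "contexts" array.
    required = set(json.loads(protection_json)["contexts"])
    # Collect job display names from every workflow that is not exempt.
    declared = {
        job
        for workflow, jobs in workflow_jobs.items()
        if workflow not in exempt_workflows
        for job in jobs
    }
    return declared - required, required - declared
```

A gate built this way would fail when either set is non-empty: the first set corresponds to skipping step 2 or 3, and the second catches a required context whose job was renamed or deleted.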
65+
66+
## Code of conduct
67+
68+
Be kind. Disagree on substance, not on people. If review feedback gets sharp, take it offline and come back when both sides are ready.
69+
70+
## Reporting security issues
71+
72+
If you find a vulnerability that affects users of the template, **do not open a public issue**. Email the maintainer (see commit history for contact). Include:
73+
74+
- Repro steps
75+
- Affected version / commit SHA
76+
- Severity assessment (informational / low / medium / high / critical)
77+
- Suggested fix if you have one
78+
79+
We'll acknowledge within 72 hours.

README.md

Lines changed: 69 additions & 2 deletions
@@ -1,5 +1,72 @@
11
# harness-python-react
22

3-
Production-quality coding harness for Python (FastAPI) backends and Vite + React + TypeScript frontends. Designed for LLM-driven development: every gate — lint, types, architecture, security, eval — is enforced mechanically so code quality stays consistent across many human and AI contributors.
3+
> A production-quality coding harness for Python (FastAPI) + Vite/React/TypeScript projects. Designed for LLM-driven development: every gate — lint, types, architecture, security, eval — is enforced mechanically so code quality stays consistent across many human and AI contributors.
44
5-
> **Status:** bootstrap. Full documentation, scaffolding, and the harness itself land across [issues #1–#28](https://github.com/constk/harness-python-react/issues). Track progress on the [project board](https://github.com/users/constk/projects/3).
5+
## What ships
6+
7+
- **Backend:** Python 3.14, FastAPI, Pydantic v2 (`StrictModel` base), `uv` deps, OpenTelemetry SDK + OTLP exporter, structured JSON logs, generic tool-registry pattern.
8+
- **Frontend:** Node 24 LTS, React 19.2, Vite 8, TypeScript strict, ESLint 10 flat config, Prettier, Vitest + jsdom + Testing Library.
9+
- **Eval harness:** provider-agnostic runner + LLM-judge `Protocol`, three tolerance modes (exact / numeric / semantic), one example golden case, nightly workflow (disabled by default).
10+
- **CI:** 15 required status checks across `ci.yml` (lint/format, mypy strict, unit tests, coverage ≥75%, import-linter architecture, pre-commit, frontend build, frontend quality, branch-protection sync, commit-type sync) + `security.yml` (gitleaks, pip-audit, npm audit, trivy) + PR-title lint.
11+
- **Release:** tag-triggered workflow that builds the image, pushes to `ghcr.io`, generates a CycloneDX SBOM, and publishes the GitHub Release.
12+
- **Agent integration:** `.claude/hooks/` (forbidden-flag blocker, secret scan, formatter dispatch, SessionStart context) + six auto-activating skills (architect / code-reviewer / devops / frontend / qa-engineer / technical-writer).
13+
- **Docker:** multi-stage Dockerfile (non-root, healthcheck), `docker compose up` boots app + frontend + Jaeger.
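
The three tolerance modes named in the eval-harness bullet (exact / numeric / semantic) can be pictured with a small comparison function. This is only a sketch of the idea, not the template's runner: the `Judge` protocol shape, the `matches` function, the `rel_tol` default, and the offline `CaseInsensitiveJudge` stand-in are all assumptions made for illustration.

```python
from __future__ import annotations

import math
from typing import Protocol


class Judge(Protocol):
    """Hypothetical judge interface; the template's real Protocol may differ."""

    def equivalent(self, expected: str, actual: str) -> bool: ...


def matches(
    expected: str,
    actual: str,
    mode: str,
    judge: Judge | None = None,
    rel_tol: float = 1e-6,
) -> bool:
    """Compare an answer to its golden value under one tolerance mode."""
    if mode == "exact":
        return expected.strip() == actual.strip()
    if mode == "numeric":
        try:
            return math.isclose(float(expected), float(actual), rel_tol=rel_tol)
        except ValueError:
            return False  # a non-numeric answer fails a numeric case
    if mode == "semantic":
        if judge is None:
            raise ValueError("semantic mode needs a judge")
        return judge.equivalent(expected, actual)
    raise ValueError(f"unknown mode: {mode}")


class CaseInsensitiveJudge:
    """Stand-in for an LLM judge so the sketch runs offline."""

    def equivalent(self, expected: str, actual: str) -> bool:
        return expected.casefold() == actual.casefold()
```

Keeping the judge behind a `Protocol` is what would let the runner stay provider-agnostic: any object with a compatible `equivalent` method can be passed in, whether it calls a model or, as here, does a trivial comparison.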
14+
15+
## Quickstart
16+
17+
```sh
git clone https://github.com/constk/harness-python-react.git
cd harness-python-react

uv sync --extra dev
uv run pre-commit install --hook-type pre-commit --hook-type commit-msg
(cd frontend && npm ci)

docker compose up  # backend :8000, frontend :5173, Jaeger :16686
```
27+
28+
The pre-push gate is `just check` (= ruff + mypy + import-linter + pytest). For frontend changes add `just frontend-check`.
29+
30+
## Why a harness
31+
32+
The differentiator isn't the scaffold — it's that every layer of the pipeline catches a different failure class **without relying on the human or LLM coder remembering to run anything**. The same posture protects code regardless of who wrote it.
33+
34+
See [`docs/HARNESS.md`](docs/HARNESS.md) for the full umbrella. Highlights:
35+
36+
- **Pydantic `StrictModel` everywhere a contract crosses a seam** (rejects unknown keys at construction).
37+
- **`import-linter` enforces one-way layer flow** (`api | eval → agent → tools → data → observability → models`).
38+
- **Three independent secret scans** (PreToolUse hook → pre-commit gitleaks → CI gitleaks).
39+
- **Two meta-gates** that catch *drift in the gates themselves*: `Branch-protection contexts sync` (workflow jobs vs branch-protection JSON) and `Commit-type sync` (commitizen schema vs PR-title allowlist).
40+
- **CycloneDX SBOM attached to every release** for supply-chain attestation.
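
The one-way layer flow in the highlights above can be modelled as a ranking, where a layer may only import layers ranked below it. The sketch that follows is a simplified picture of that rule, not the import-linter contract itself: the `LAYER_RANK` table and `is_allowed_import` are hypothetical names, and treating `api | eval` as independent siblings that may not import each other is an assumption about how the contract is configured.

```python
from __future__ import annotations

# Layer order as listed in the highlights above; rank 0 is the top of the stack.
LAYER_RANK = {
    "api": 0,
    "eval": 0,
    "agent": 1,
    "tools": 2,
    "data": 3,
    "observability": 4,
    "models": 5,
}


def is_allowed_import(importer: str, imported: str) -> bool:
    """True when `importer` may import `imported` under a strict top-down flow."""
    if importer == imported:
        return True  # imports within one layer are fine
    # Only downward imports pass; same-rank siblings (api / eval) stay independent.
    return LAYER_RANK[importer] < LAYER_RANK[imported]
```

With this model, `api` importing `agent` is allowed and `models` importing `api` is a reverse import, which is the class of change the architecture gate is described as rejecting.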
41+
42+
## Documentation
43+
44+
| File | Purpose |
|---|---|
| [`docs/HARNESS.md`](docs/HARNESS.md) | Umbrella: every control + where it lives |
| [`docs/INVARIANTS.md`](docs/INVARIANTS.md) | The numbered load-bearing rules |
| [`docs/BOUNDARIES.md`](docs/BOUNDARIES.md) | Module layering + the import-linter contracts |
| [`docs/DEVELOPMENT.md`](docs/DEVELOPMENT.md) | Local setup, branching, justfile, CI |
| [`docs/EVAL_HARNESS.md`](docs/EVAL_HARNESS.md) | Eval flywheel + opt-in for the nightly workflow |
| [`docs/SECURITY.md`](docs/SECURITY.md) | Threat model + defence-in-depth map |
| [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md) | Scaffold-level component view |
| [`CONTRIBUTING.md`](CONTRIBUTING.md) | Branching, commit format, PR flow |
| [`CLAUDE.md`](CLAUDE.md) | Agent-facing project instructions |
55+
56+
## Versions
57+
58+
Verified April 2026 (`endoflife.date`):
59+
60+
| Layer | Version | Support status |
|---|---|---|
| Python | 3.14.4 | active feature release |
| Node LTS | 24.15.0 | through 2028-04-30 |
| React | 19.2.5 | current stable |
| Vite | 8.x | current stable |
| TypeScript | 6.x | current stable |
67+
68+
Bump together (Python in `pyproject.toml`, Node in `frontend/package.json`, both in `Dockerfile` + the CI matrix). Document the bump in `docs/DEVELOPMENT.md`.
69+
70+
## License
71+
72+
[MIT](LICENSE).
