Merge pull request #535 from nanotaboada/docs/architecture-decision-records

nanotaboada · web-flow · commit 0810c52ddf8b · 2026-03-22T13:20:57.000-03:00
docs(adr): initialize Architecture Decision Records
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -42,6 +42,13 @@ This project uses famous football coaches as release codenames, following an A-Z
 
 ## [Unreleased]
 
+### Added
+
+- Architecture Decision Records (ADRs) in `docs/adr/` documenting key
+  decisions: SQLite, no Alembic, UUID primary key, squad number as
+  mutation key, full replace PUT, in-memory caching, integration-only
+  tests (#482)
+
 ---
 
 ## [2.0.0 - Capello] - 2026-03-17
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -37,21 +37,41 @@ We value **incremental, detail‑first contributions** over big rewrites or abst
   - Run `pytest` before pushing.
   - Ensure coverage isn’t regressing.
 
-## 3. Pull Request Workflow
+## 3. Architecture Decision Records
+
+Significant architectural decisions are documented as ADRs in
+`docs/adr/`. Before changing or replacing a decision captured in an
+ADR, write a new ADR that supersedes it — do not edit the original.
+
+**When to write an ADR:** apply the three-part test:
+1. A real fork existed — a genuine alternative was considered and
+   rejected.
+2. The code doesn't explain the why — a new contributor reading the
+   source cannot infer the reasoning.
+3. Revisiting it would be costly — changing it requires significant
+   rework.
+
+All three must be true. Process conventions (commit format, branch
+naming) and project principles (audience, tone) do not belong in ADRs.
+
+ADRs are append-only. To change a decision: write a new ADR with
+status `SUPERSEDED by ADR-NNNN` on the old one.
+
+## 4. Pull Request Workflow
 
 - **One logical change per PR.**
 - **Rebase or squash** before opening to keep history clean.
 - **Title & Description**
   - Use Conventional Commit format.
   - Explain _what_ and _why_ concisely in the PR body.
 
-## 4. Issue Reporting
+## 5. Issue Reporting
 
 - Search open issues before creating a new one.
 - Include clear steps to reproduce and environment details.
 - Prefer **focused** issues—don’t bundle multiple topics.
 
-## 5. Automation & Checks
+## 6. Automation & Checks
 
 All PRs and pushes go through CI:
 
@@ -62,7 +82,7 @@ All PRs and pushes go through CI:
 
 PRs must pass all checks to be reviewed.
 
-## 6. Code of Conduct & Support
+## 7. Code of Conduct & Support
 
 - See `CODE_OF_CONDUCT.md` for guidelines and reporting.
 - For questions or planning, open an issue and use the `discussion` label, or mention a maintainer.
diff --git a/README.md b/README.md
@@ -29,6 +29,7 @@ Proof of Concept for a RESTful API built with [Python 3](https://www.python.org/
 - [Releases](#releases)
 - [Environment Variables](#environment-variables)
 - [Command Summary](#command-summary)
+- [Architecture Decisions](#architecture-decisions)
 - [Contributing](#contributing)
 - [Legal](#legal)
 
@@ -404,6 +405,11 @@ PYTHONUNBUFFERED=1
 | `docker compose down` | Stop Docker container |
 | `docker compose down -v` | Stop and remove Docker volume |
 
+## Architecture Decisions
+
+Key architectural decisions are documented as ADRs in
+[`docs/adr/`](docs/adr/README.md).
+
 ## Contributing
 
 Contributions are welcome! Please read [CONTRIBUTING.md](CONTRIBUTING.md) for details on the code of conduct and the process for submitting pull requests.
diff --git a/docs/adr/0001-sqlite-as-database-engine.md b/docs/adr/0001-sqlite-as-database-engine.md
@@ -0,0 +1,58 @@
+# ADR-0001: SQLite as Database Engine
+
+Date: 2026-03-21
+
+## Status
+
+Accepted
+
+## Context
+
+This project is a proof-of-concept REST API. A database engine was
+required to persist player data. The realistic alternatives were:
+
+- **PostgreSQL** — the standard choice for production REST APIs; requires
+  a running server process, a connection string, and either a local
+  installation or a Docker service alongside the application.
+- **MySQL / MariaDB** — similar trade-offs to PostgreSQL.
+- **SQLite** — an embedded, file-based engine; no server process, no
+  installation beyond a driver, and the database file can be committed
+  to the repository pre-seeded.
+
+The project includes a pre-seeded database file (`storage/players-sqlite3.db`)
+that contains all 26 players from the Argentina 2022 World Cup squad.
+SQLAlchemy 2.0 abstracts most engine-specific SQL, and aiosqlite provides
+an async driver compatible with the rest of the async stack.
+
+## Decision
+
+We will use SQLite as the database engine via `aiosqlite` and
+SQLAlchemy 2.0 (async).
+
+The deciding factor is operational simplicity for a PoC: zero server
+infrastructure, a self-contained database file that can be seeded once
+and committed, and a one-command startup (`uv run uvicorn main:app`).
+The async driver (`aiosqlite`) keeps the stack consistent with the
+rest of the async I/O patterns, and SQLAlchemy abstracts enough of the
+engine differences that migrating to PostgreSQL later would require
+minimal changes to the application code.
+
+## Consequences
+
+**Positive:**
+- No external service dependency — the application runs with a single
+  command and no Docker Compose required for development.
+- The pre-seeded database file ships with the repository, making the
+  project immediately runnable with real data.
+- SQLAlchemy 2.0 abstracts most engine-specific behavior; a future
+  migration to PostgreSQL is feasible with driver and connection string
+  changes only.
+
+**Negative:**
+- SQLite does not support concurrent writes. This is acceptable for a
+  single-instance PoC but rules out horizontal scaling without a
+  database change.
+- Some PostgreSQL features (e.g. `RETURNING`, advisory locks, full-text
+  search) are unavailable or behave differently.
+- The committed database file can accumulate stale data if seed scripts
+  are updated but the file is not regenerated manually.
diff --git a/docs/adr/0002-no-alembic-manual-seed-scripts.md b/docs/adr/0002-no-alembic-manual-seed-scripts.md
@@ -0,0 +1,55 @@
+# ADR-0002: No Alembic — Manual Seed Scripts
+
+Date: 2026-03-21
+
+## Status
+
+Accepted. Migration to Alembic is under consideration — tracked in
+issue #2.
+
+## Context
+
+SQLAlchemy projects typically use Alembic to manage schema migrations:
+versioned migration scripts, upgrade/downgrade paths, and a migration
+history table. The alternatives considered were:
+
+- **Alembic** — the standard SQLAlchemy migration tool; provides
+  versioned, reversible migrations and is well-supported in production.
+- **Manual seed scripts** — standalone Python scripts in `tools/` that
+  create the schema and populate data directly using SQLAlchemy models;
+  no migration history, no upgrade/downgrade concept.
+- **No seeding** — start from an empty database and rely on the API to
+  create all data; unsuitable since the project ships a fixed dataset
+  (Argentina 2022 squad).
+
+The project uses a fixed, pre-seeded SQLite database file
+(`storage/players-sqlite3.db`) committed to the repository. Schema
+evolution has been infrequent and handled by regenerating the file
+manually from updated seed scripts.
+
+## Decision
+
+We will use standalone seed scripts in `tools/` instead of Alembic.
+
+For a PoC with a stable schema and a single committed database file,
+Alembic adds tooling overhead (initialization, migration directory,
+`env.py` configuration) without meaningful benefit. The seed scripts
+(`tools/seed_001_starting_eleven.py`, `tools/seed_002_substitutes.py`)
+are self-contained and produce a deterministic output using UUID v5
+for stable, reproducible primary keys.
+
+## Consequences
+
+**Positive:**
+- No Alembic dependency or configuration; simpler project setup.
+- Seed scripts are plain Python, easy to read and modify.
+- Deterministic UUID v5 keys mean the seeded database is identical
+  across environments and safe to reference in tests.
+
+**Negative:**
+- No migration history. Schema changes require manually updating the
+  database file and rerunning seed scripts; there is no upgrade path.
+- Not suitable for a production deployment where data must be preserved
+  across schema changes.
+- As schema complexity grows, the absence of migration tooling becomes
+  increasingly costly. Issue #2 tracks the future adoption of Alembic.
diff --git a/docs/adr/0003-uuid-surrogate-primary-key.md b/docs/adr/0003-uuid-surrogate-primary-key.md
@@ -0,0 +1,69 @@
+# ADR-0003: UUID Surrogate Primary Key with v4/v5 Split
+
+Date: 2026-03-21
+
+## Status
+
+Accepted
+
+## Context
+
+A primary key strategy was required for the `players` table. The
+candidates considered were:
+
+- **Auto-increment integer** — simple, compact, sequential; but leaks
+  row count and insertion order to clients, and integer IDs are
+  predictable, which can be a security concern for resource enumeration.
+- **UUID v4** — randomly generated, opaque, non-sequential; no
+  information leaked; standard for API-created resources.
+- **UUID v7 / ULID** — time-ordered UUIDs; better index locality than
+  v4 but add a dependency and are less universally supported.
+- **Natural key (`squad_number`)** — human-readable, but not stable
+  across squads or seasons; unsuitable as a surrogate.
+
+An additional constraint was seeded data: the project ships 26
+pre-seeded players that must have stable, reproducible primary keys
+across environments so that tests can reference them by ID.
+
+UUID v4 (random) cannot satisfy this: regenerating the seed would
+produce different IDs each time. UUID v5 (deterministic, derived from
+a namespace and a name) produces the same UUID for the same input,
+making seeded records reproducible. The project stores pre-computed
+UUID v5 constants directly in the seed scripts rather than deriving
+them at runtime from the squad number.
+
+## Decision
+
+We will use a UUID surrogate primary key stored as a hyphenated string
+(`HyphenatedUUID` custom SQLAlchemy type). API-created records receive
+a UUID v4 (random); migration-seeded records receive a UUID v5
+(deterministic, pre-computed and stored as constants in the seed
+scripts).
+
+The v4/v5 split preserves the benefits of each approach in its context:
+randomness for API-created records (opaque, non-predictable), and
+determinism for seeded records (stable across environments, safe for
+test fixtures).
+
+The UUID is intentionally opaque to external clients. It is exposed
+only via `GET /players/{player_id}` and `POST /players/` responses.
+All mutation endpoints (PUT, DELETE) use `squad_number` as the
+identifier — see ADR-0004.
+
+## Consequences
+
+**Positive:**
+- UUIDs are opaque; no information about row count or insertion order
+  is exposed to clients.
+- The v5/v4 split means seeded records have stable IDs across
+  environments, safe to hard-code in tests.
+- `HyphenatedUUID` stores IDs as standard hyphenated strings in SQLite,
+  readable without special tooling.
+
+**Negative:**
+- UUIDs are larger than integers (36 chars as strings), which has a
+  minor storage and index performance cost — acceptable at PoC scale.
+- The v4/v5 distinction is non-obvious; developers unfamiliar with the
+  codebase may not know why seeded records have deterministic IDs.
+- Clients cannot use the UUID to infer insertion order or paginate
+  reliably by ID (UUID v4 is random).
diff --git a/docs/adr/0004-squad-number-as-mutation-key.md b/docs/adr/0004-squad-number-as-mutation-key.md
@@ -0,0 +1,66 @@
+# ADR-0004: Squad Number as the Mutation Key
+
+Date: 2026-03-21
+
+## Status
+
+Accepted
+
+## Context
+
+The API exposes two identifiers for a player:
+
+- **UUID** (`id`) — surrogate key; opaque, server-generated, stable.
+  See ADR-0003.
+- **Squad number** (`squad_number`) — natural key; human-readable,
+  domain-meaningful (e.g. number 10 = Messi), unique within a squad.
+
+A design choice was required for which identifier mutation endpoints
+(PUT, DELETE) and secondary GET endpoints should use as their path
+parameter. The candidates were:
+
+- **UUID for all endpoints** — consistent use of the surrogate key;
+  clients must store UUIDs after POST or GET to perform mutations.
+- **Squad number for mutations** — uses the natural, human-meaningful
+  key for the operations that domain users care about; UUID remains
+  available for internal/service-to-service use via
+  `GET /players/{player_id}`.
+
+Up to v1.x, the API used UUID for PUT and DELETE. Version 2.0.0
+(Capello) changed mutation endpoints to use squad number, introduced
+`GET /players/squadnumber/{squad_number}` as a lookup alternative, and
+retained `GET /players/{player_id}` for UUID-based lookup.
+
+## Decision
+
+We will use `squad_number` as the path parameter for all mutation
+endpoints (`PUT /players/squadnumber/{squad_number}`,
+`DELETE /players/squadnumber/{squad_number}`) and for the secondary
+GET endpoint (`GET /players/squadnumber/{squad_number}`).
+
+Squad number is the identifier that domain users reason about. Requiring
+clients to store and re-use a UUID to update or delete a player they
+identified by squad number adds unnecessary indirection. Keeping UUID
+lookup available (`GET /players/{player_id}`) preserves
+service-to-service use cases where a stable opaque key is preferred.
+
+As a consequence of using squad number on PUT, the request body's
+`squad_number` field must match the path parameter. A mismatch returns
+HTTP 400; the path parameter is always authoritative.
+
+## Consequences
+
+**Positive:**
+- Mutation endpoints are intuitive for domain users:
+  `PUT /players/squadnumber/10` clearly targets Messi's record.
+- Clients do not need to perform a GET to obtain a UUID before mutating.
+- UUID remains available for stable, opaque internal references.
+
+**Negative:**
+- Squad numbers can change (a player re-assigned to a different number
+  requires a careful update sequence). The API has no special handling
+  for squad number changes — it is treated as any other field update.
+- Two lookup endpoints (`/players/{id}` and
+  `/players/squadnumber/{squad_number}`) increase surface area slightly.
+- The mismatch guard on PUT (body squad number must equal path
+  parameter) is an additional contract constraint clients must respect.
diff --git a/docs/adr/0005-full-replace-put-no-patch.md b/docs/adr/0005-full-replace-put-no-patch.md
@@ -0,0 +1,58 @@
+# ADR-0005: Full Replace PUT, No PATCH
+
+Date: 2026-03-21
+
+## Status
+
+Accepted. Partial update support via PATCH is under consideration —
+tracked in issue #461.
+
+## Context
+
+HTTP defines two semantics for updating a resource:
+
+- **PUT** — full replacement; the request body represents the complete
+  new state of the resource. Fields absent from the body are
+  overwritten with their zero/null values.
+- **PATCH** — partial update; the request body contains only the fields
+  to change. Requires a patch format (JSON Merge Patch, JSON Patch) or
+  a convention for interpreting absent fields.
+
+Both are common in REST APIs. The choice affects the Pydantic model
+design (all fields required vs optional), the service layer logic
+(overwrite all vs merge), and the client contract (must send all fields
+vs only changed fields).
+
+The current implementation uses a single `PlayerRequestModel` shared by
+both POST and PUT. Four fields are required (`first_name`, `last_name`,
+`squad_number`, `position`); the remaining six are optional (`middle_name`,
+`date_of_birth`, `abbr_position`, `team`, `league`, `starting11`). The
+service's `update_by_squad_number_async` overwrites every field on the
+existing record using the request body values.
+
+## Decision
+
+We will implement PUT as a full replacement operation only, with no
+PATCH endpoint.
+
+Full replace is simpler to implement correctly: no merge logic, no
+ambiguity about what an absent field means, and a single request model
+covers both POST and PUT. For a PoC with a small, flat domain model
+(10 fields), requiring the full resource on update is not a meaningful
+burden for clients.
+
+## Consequences
+
+**Positive:**
+- Simple, unambiguous semantics: the request body is the new state.
+- No additional Pydantic model or merge logic required.
+- No need to decide on a patch format (JSON Merge Patch vs JSON Patch).
+
+**Negative:**
+- Clients must send the full resource even to change a single field,
+  which is verbose and risks accidental data loss if a field is omitted.
+- Not suitable for resources with many fields, binary data, or
+  frequently partial updates.
+- Issue #461 tracks the addition of PATCH for partial updates, which
+  would require an optional-fields model and merge logic in the service
+  layer.
diff --git a/docs/adr/0006-in-memory-caching-with-aiocache.md b/docs/adr/0006-in-memory-caching-with-aiocache.md
diff --git a/docs/adr/0007-integration-only-test-strategy.md b/docs/adr/0007-integration-only-test-strategy.md
diff --git a/docs/adr/README.md b/docs/adr/README.md
diff --git a/docs/adr/template.md b/docs/adr/template.md