feat: v1 API overhaul (/v1 namespace, 4 inference verbs, recipe discovery) by marevol · Pull Request #103 · codelibs/recotem

marevol · 2026-05-21T06:24:15Z

Summary

Replaces the alpha-era POST /predict/{name} surface with a versioned v1 HTTP API mounted under /v1, exposing four inference verbs (single/batch × user/related), recipe discovery, and lifted health/metrics endpoints.

- 9 endpoints, AIP-136 colon-verb pattern (Vertex AI-style)
- Algolia-style batch body ({requests: [...]}) with per-element status for partial failures
- Artifact / signing / hot-swap / X-API-Key auth unchanged
- Pre-existing alpha POST /predict/{name} and GET /models removed (alpha → v1 migration table in docs/migration-v1.md)

Endpoints

| Method | Path | Purpose |
| --- | --- | --- |
| POST | `/v1/recipes/{name}:recommend` | user → items (single) |
| POST | `/v1/recipes/{name}:recommend-related` | seed items → items (single) |
| POST | `/v1/recipes/{name}:batch-recommend` | user → items (batch) |
| POST | `/v1/recipes/{name}:batch-recommend-related` | seed items → items (batch) |
| GET | `/v1/recipes` | list loaded recipes |
| GET | `/v1/recipes/{name}` | recipe detail (advertises capabilities) |
| GET | `/v1/health` | unauthenticated liveness |
| GET | `/v1/health/details` | authenticated diagnostics |
| GET | `/v1/metrics` | Prometheus (opt-in) |
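
As a rough illustration of the colon-verb paths in the table, the sketch below builds (but does not send) a `:recommend` request with the standard library. The `user_id` and `limit` body fields, the `X-API-Key` header, and the recipe-name pattern are mentioned elsewhere in this PR; the host, port, recipe name, and key value are placeholders, and the helper name `build_recommend_request` is hypothetical.

```python
import json
import re
import urllib.request

# Recipe-name pattern quoted later in this PR for the /v1/recipes/{name} path.
_RECIPE_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def build_recommend_request(base_url, recipe, user_id, limit, api_key):
    """Build a POST to /v1/recipes/{name}:recommend without sending it."""
    if not _RECIPE_NAME_RE.fullmatch(recipe):
        raise ValueError(f"invalid recipe name: {recipe!r}")
    # The verb is attached to the resource with a colon (AIP-136 style).
    url = f"{base_url.rstrip('/')}/v1/recipes/{recipe}:recommend"
    body = json.dumps({"user_id": user_id, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
    )


# Placeholder host, recipe, and key; substitute real values before sending.
req = build_recommend_request("http://localhost:8080", "quickstart", "u42", 10, "dev-key")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would perform the call against a running server.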

Design background

Industry survey (AWS Personalize, Vertex AI Search for Retail, Azure Personalizer, Algolia Recommend, Spotify, Recombee) drove the path-prefix + custom-verb + Algolia-batch hybrid. Full design in docs/specs/2026-05-21-v1-api-overhaul-design.md and 15-task TDD plan in docs/plans/2026-05-21-v1-api-overhaul.md.

Notable decisions

- Schemas: RecommendRequest/Response, batch variants, RecipeSummary/Detail in new src/recotem/serving/schemas.py. RecommendItem allows extra (metadata passthrough); requests are strict.
- Error codes: UNKNOWN_USER, UNKNOWN_SEED_ITEMS, RECIPE_UNAVAILABLE, RECIPE_NOT_FOUND, VALIDATION_ERROR.
- Partial failures: HTTP 200 with per-element status: ok|error; HTTP 503 only when the recipe itself is unavailable.
- model_version: sha256:<hex> derived from _loaded_marker[1]; surfaced on every recommend response plus X-Recotem-Model-Version header.
- Batch endpoints intentionally do not join per-item metadata (single endpoints still do). Documented in docs/api-reference.md and CHANGELOG.md.
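
To make the partial-failure contract concrete, here is a minimal client-side sketch that separates successful and failed elements of a batch response. Only the per-element `status: ok|error` field and the error codes come from this PR; the `results`, `items`, and `code` key names are assumptions about the response envelope, and `split_batch_results` is a hypothetical helper.

```python
def split_batch_results(response_body):
    """Separate ok and error elements, keeping each element's position."""
    ok, failed = [], []
    # "results" is an assumed key name for the per-element list.
    for index, entry in enumerate(response_body["results"]):
        if entry["status"] == "ok":
            ok.append((index, entry))
        else:
            failed.append((index, entry))
    return ok, failed


# An HTTP 200 batch response can still contain failed elements.
sample = {
    "results": [
        {"status": "ok", "items": [{"item_id": "i1", "score": 0.9}]},
        {"status": "error", "code": "UNKNOWN_USER"},
    ]
}
ok, failed = split_batch_results(sample)
print(len(ok), "ok;", [entry["code"] for _, entry in failed])
```

Keeping the index lets a caller map each failure back to the sub-request that produced it.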

Test plan

- uv run pytest -q tests/unit tests/integration → 1614 passed, 4 deselected
- uv run ruff check src tests clean; uv run ruff format --check clean
- Live smoke test with examples/quickstart/ artifact: /v1/health, :recommend, :recommend-related, :batch-recommend, /v1/recipes all return 200 with the expected envelope
- Final review caught a bug where _try_load_artifact (startup-scan path) didn't populate loaded_at_unix/config_digest/algorithms; fixed in d29b1aa with a regression test
- Reviewer to gut-check error code naming and the partial-failure contract before merge

Docs updated

- README.md quickstart
- docs/getting-started.md
- docs/operations.md (SLO/metrics table reflects v1 verbs)
- docs/api-reference.md (new; authoritative endpoint reference)
- docs/migration-v1.md (new; alpha → v1 mapping)
- CLAUDE.md (legacy references purged)
- CHANGELOG.md (new; Unreleased section)

Known follow-ups (not blocking merge)

- schemas.py:loaded_at: str could become AwareDatetime for OpenAPI-side validation.
- X-Recotem-Metadata-Degraded header is documented but not currently emitted by v1 endpoints (server-side metric still recorded).
- Consider renaming serving/routes.py to serving/_metadata_join.py now that only _lookup_metadata lives there.

- POC docstring + plan Task 1 file list: Task 13 -> Task 12 (the file is removed alongside the legacy router retirement, not the e2e conversion task).
- Spec: drop unresolvable recotem-playground cross-reference and the bare 'see separate survey report' sentence (replaced with inline citation list of the vendor docs that informed the design).
- Spec §9 acceptance criterion: OpenAPI is mounted at /openapi.json, not /v1/openapi.json.

ModelEntry now carries a v1-shaped artifact identifier (sha256:<hex>), load timestamp (ISO UTC), inference kind, and the list of supported verbs so the upcoming /v1/recipes endpoint can publish them without re-reading artifact files. The watcher passes loaded_at_unix / config_digest / algorithms on every successful (re-)load.

Introduces recotem_v1_requests_total{recipe,verb,status}, recotem_v1_request_latency_seconds{recipe,verb}, and recotem_v1_batch_size{recipe,verb} histograms. Legacy record_predict() remains untouched and will be removed in Task 12.

Adds the first inference endpoint to the v1 router. Returns the new RecommendResponse envelope (request_id / recipe / model_version / items) with structured error detail bodies:

- 503 RECIPE_UNAVAILABLE when entry missing or not loaded
- 404 UNKNOWN_USER when the recommender raises KeyError
- 422 from Pydantic validation (e.g. empty user_id)

The metadata join reuses the legacy _lookup_metadata helper from routes.py (kept until Task 12 retires make_router) and the metrics hook routes through _metrics.record_v1_request(name, "recommend", ...). Also updates the Task 5 skeleton test to assert against a verb the router does not define, so it remains valid as inference routes land.

- app.py now wires make_v1_router(...) at prefix=/v1
- routes.py reduced to the _lookup_metadata helper (still imported by v1_router)
- legacy tests/unit/test_serving_routes.py removed
- POC test (test_v1_colon_path_poc.py) removed
- routes-dependency-introspection regression test removed (its invariants apply only to the deleted v0 router; v1_router uses Annotated[] which is compatible with from __future__ import annotations)
- test_serving_app.py and test_cli.py probe paths migrated to /v1/* (e.g. /health -> /v1/health, /predict/{x} -> /v1/recipes/{x}:recommend, /models -> /v1/recipes); make_router monkey-patch references switched to make_v1_router
- test_v1_router_basics.py: stale 404-on-undefined-verb assertion rewritten to probe a verb that does not exist as a route
- metadata/loader.py docstring xref updated to point at make_v1_router

_try_load_artifact constructed ModelEntry without loaded_at_unix / config_digest / algorithms, so GET /v1/recipes reported loaded_at='1970-01-01T00:00:00Z' for every recipe at startup until a hot-swap occurred. Mirrors the watcher._build_entry fix from Task 3. Also clarifies the X-Recotem-Metadata-Degraded doc bullet to drop the 'legacy code paths' wording (legacy paths were retired in Task 12).

Health probe and recommend call were still hitting /health and /predict/{name}; v1 mounts them under /v1/health and /v1/recipes/{name}:recommend with a `limit` field and a flat response schema. CI e2e job was timing out at the health-wait loop.

- RequestIDMiddleware: echo X-Request-ID on every response (including HTTPException and unhandled-error paths) and bind structlog contextvars so downstream log lines carry request_id without each handler having to pass it. Client-supplied IDs are validated to a short charset and replaced with a server-generated one when missing or malformed.
- Split RECIPE_NOT_FOUND (404) vs RECIPE_UNAVAILABLE (503) in recipe_detail and the four inference handlers; previously a stub registry entry (loaded=False) was indistinguishable from an unknown recipe, breaking the retry contract documented in api-reference.md.
- Drop the orphan recotem_predict_total / recotem_predict_latency_seconds metrics that lingered after /predict/{name} was retired. Inventory docstring now lists the v1 metrics, and CHANGELOG + migration-v1 call out the removed metric names so dashboards/alerts can be retargeted at recotem_v1_requests_total / recotem_v1_request_latency_seconds.
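
The request-ID rule described above (accept a well-formed client ID, otherwise generate one) might look roughly like the sketch below. The exact charset and the use of `uuid4` are assumptions, since the commit only says "a short charset"; `resolve_request_id` is a hypothetical name, not the middleware's actual API.

```python
import re
import uuid

# Assumed charset and length bound; the commit does not spell out the pattern.
_REQUEST_ID_RE = re.compile(r"[A-Za-z0-9._-]{1,64}")


def resolve_request_id(client_value):
    """Return the client's ID if well-formed, else a server-generated one."""
    if client_value is not None and _REQUEST_ID_RE.fullmatch(client_value):
        return client_value
    # Missing or malformed: never echo untrusted input back in a header.
    return uuid.uuid4().hex


print(resolve_request_id("trace-123"))
```

Replacing malformed values rather than rejecting the request keeps every response correlatable without letting arbitrary bytes reach logs or headers.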

Drop the legacy ``v1_router.py`` / ``routes.py`` split. ``routes.py`` is now the single router module, ``make_v1_router`` is renamed to ``make_router``, and the ``_lookup_metadata`` helper is inlined. The ``/v1`` URL prefix is still applied at mount time via ``app.include_router(..., prefix="/v1")``. Also folds in expanded test coverage that was in flight: ``X-Recotem-Model-Version`` response-header round-trip, recipe-name path regex enforcement, schema field round-trips, dict / DataFrame metadata enrichment paths, and additional e2e scenarios.

The scheduled nightly suite re-downloaded MovieLens100K from files.grouplens.org on every run via irspack's MovieLens100KDataManager, making green CI dependent on an external server that intermittently times out (run 26212149650 errored at fixture setup with Errno 110). Drop the workflow entirely; the MovieLens-backed slow tests remain available via `pytest -m slow` for local runs.

Address review of PR #103 v1 API overhaul: restore observability parity with the legacy /predict handler and rationalise the error body shape across the v1 surface.

Code:
- routes.py: read request_id from request.state (set by RequestIDMiddleware) instead of re-parsing the header per handler, resolving a body/header split-brain for 65-128 char IDs. Bind recipe/kid via structlog.contextvars for the duration of each inference handler and unbind in finally. Emit recipe_unavailable WARN on every 404 RECIPE_NOT_FOUND / 503 RECIPE_UNAVAILABLE. Add (MemoryError, RecursionError) fast-path before the generic Exception branch so OOM does not run through logger.exception. Capture exc and include error_class on unexpected-error logs.
- app.py: new HTTPException handler flattens dict-shaped details to {detail, code} top-level so the body is no longer double-nested. Defensive setdefault on detail using _DEFAULT_DETAIL_FOR. New RequestValidationError handler returns {request_id, detail, code: VALIDATION_ERROR, errors} and records recotem_v1_requests_total{status=validation_error} when the path matches a v1 inference verb. _unhandled_exception_handler now attaches X-Request-ID from request.state because ServerErrorMiddleware wraps outside RequestIDMiddleware.

Docs:
- api-reference.md: align X-Request-ID regex to {1,128}, drop 403 from status code lists, add RECIPE_NOT_FOUND to per-endpoint codes, document flat error body envelope and 422 / 500 shapes.
- operations.md: rewrite error body samples from {"error": {...}} to the actual flat {"detail", "code"} shape, add 422 section.
- security.md: update trust boundary diagram, inference response section, and nginx rate-limit example to /v1/* (zone recotem_v1).
- Remove stale /predict references from registry.py, app.py, metrics.py (Prometheus help text), recipe/models.py, metadata/loader.py, CONTRIBUTING.md, compose.yaml, README and example recipes.

Tests:
- Update v1 unit tests for the flat error envelope.
- Add tests/conftest.py build_v1_app helper that mirrors create_app wiring (RequestIDMiddleware + all three exception handlers).
- New tests/unit/test_v1_error_handling.py covers X-Request-ID consistency, flat error body shape across 401/404/422/500/503, request_id correlation, 405 method-not-allowed, FastAPI auto-404, path-param validation, contextvars cleanup, and exception-handler parity between create_app and build_v1_app.

Full test suite: 1716 passed, 4 deselected. ruff check + format clean.
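
The flattening step described for the HTTPException handler can be sketched as a pure function. This is only an illustration under assumptions: the fallback messages in `_DEFAULT_DETAIL_FOR` are invented, and `flatten_error_body` is a hypothetical name for logic that in the PR lives inside a FastAPI exception handler.

```python
# Hypothetical fallback messages keyed by error code.
_DEFAULT_DETAIL_FOR = {
    "RECIPE_NOT_FOUND": "recipe not found",
    "RECIPE_UNAVAILABLE": "recipe is not loaded",
}


def flatten_error_body(detail):
    """Turn an HTTPException detail into a flat {detail, code} body."""
    if isinstance(detail, dict):
        # Lift the dict to the top level instead of nesting it under "detail".
        body = dict(detail)
        body.setdefault("detail", _DEFAULT_DETAIL_FOR.get(body.get("code"), "error"))
        return body
    # Plain-string details keep the conventional single-key shape.
    return {"detail": str(detail)}


print(flatten_error_body({"code": "RECIPE_NOT_FOUND"}))
```

With this shape a client reads `body["code"]` directly rather than `body["detail"]["code"]`.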

…rity, signals)

Resolves the blockers and important issues from the multi-agent review of the v1 API overhaul:

- Auth: require X-API-Key for /v1/metrics (was unauthenticated).
- Schemas: extra="forbid" on every request/response model except RecommendItem (metadata passthrough). Per-element max_length on seed_items/exclude_items. Aggregate batch-cap validator (sum of limits <= 5000). kind/supported_verbs as Literal. loaded_at validated as ISO-8601 UTC. score allow_inf_nan=False. RecipeDetailResponse no longer inherits from RecipeSummary.
- exclude_items: wired into :recommend and :recommend-related as a post-filter (was accepted but silently ignored).
- Typed responses: handlers return RecommendResponse/BatchRecommendResponse instead of JSONResponse, with X-Recotem-Model-Version set on the response object.
- Batch error handling: batch-recommend-related now has per-element try/except parity with batch-recommend; both broadened to catch generic Exception (MemoryError/RecursionError re-raised), so one bad element no longer 500s the whole batch.
- Error message hygiene: stop echoing user_id/seed_items in error bodies; rely on the machine-readable code field (mitigates membership-oracle).
- Metadata-Degraded signal: _lookup_metadata returns (fields, degraded); single-verb handlers set X-Recotem-Metadata-Degraded: 1 when join fails.
- /v1/recipes/{name} detail restores trained_at, best_class, best_params, best_score, metric, cutoff, tuning, data_stats, recotem_version, irspack_version, recipe_hash (previously dropped from GET /models).
- Handler refactor: _resolve_entry helper + _request_metrics context manager removes ~150 lines of duplicated prelude across the four verbs.
- 500 handler body now includes request_id (symmetric with 422 handler).
- 422 handler strips raw input/ctx from error dicts (prevents echo of client-supplied secrets).
- Path-param regex tightened to require alphanumeric first char.
- _REQUEST_ID_RE narrowed from 128 to 64 chars to match docs.
- last_load_error sanitized (URI redaction + 200-char cap) before storage in the registry / details endpoint.
- _lookup_metadata warning log truncates item_id and rate-limits to 10 per (recipe, kind) tuple to prevent log flooding.
- health() simplified: dead second loop removed.
- watcher sidecar handling: TypeError/OSError no longer silently swallowed.
- registry.models_dict docstring updated (legacy /models endpoint removed).
- Posture warning loop cadence: 5 min outside test env.
- Test fixture: defensively unregister v1 Prometheus collectors by name to survive cross-test state leakage.

CI:
- secrets-in-logs grep: exclude public model_version field and X-Recotem-Model-Version header from sha256:<hex64> false positives.
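
The aggregate batch cap (sum of limits <= 5000) is easy to state as a standalone check. The sketch below assumes each sub-request is a dict with a `limit` key; in the PR this is a Pydantic validator, so `check_batch_cap` is only a hypothetical stand-in for that logic.

```python
MAX_TOTAL_LIMIT = 5000  # aggregate cap quoted in the commit message


def check_batch_cap(requests):
    """Reject a batch whose requested item counts add up past the cap."""
    total = sum(sub_request["limit"] for sub_request in requests)
    if total > MAX_TOTAL_LIMIT:
        raise ValueError(f"sum of limits {total} exceeds {MAX_TOTAL_LIMIT}")
    return total


# Two sub-requests of 2500 items each sit exactly at the cap.
print(check_batch_cap([{"limit": 2500}, {"limit": 2500}]))
```

Capping the sum rather than the element count bounds the total ranking work a single HTTP request can trigger.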

…d 422 sanitization

Add tests covering auth requirements on related/batch endpoints, request_id echoing in error envelopes, 422 error dict sanitization (input/ctx stripped), path regex leading-char rejection, and kid contextvar log binding. Update conftest's build_v1_app to mirror the production 422 sanitization so tests exercise the real response shape.

…alidation

Round of v1 API maturation based on review feedback:

- Add NO_CANDIDATES (404) to distinguish ranker survival failure from UNKNOWN_SEED_ITEMS in :recommend-related and per-element batch.
- Switch unknown-recipe response from 503 recipe_unavailable to 404 RECIPE_NOT_FOUND; clients should treat 404 as hard fail vs 503 retry.
- Uppercase all error codes (MISSING_API_KEY / INVALID_API_KEY / INTERNAL_ERROR / VALIDATION_ERROR) for consistency.
- Validate batch sub-requests per-element (bad ones surface as status=error code=VALIDATION_ERROR); aggregate sum(limit)<=5000 cap now also enforced per-element.
- Expand recotem_v1_requests_total status enum and add recotem_v1_batch_element_errors_total{recipe,verb,code}; add reason label on recotem_artifact_load_failures_total so HMAC failures (a security signal) are alertable independently.
- Drop dead X-Recotem-Metadata-Degraded code path (unreachable since metadata_index is populated at every load).
- Relax X-Request-ID echo regex to {1,128} and /v1/recipes/{name} path regex to ^[A-Za-z0-9_-]{1,64}$ to match recipe-loader constraints.
- Log batch per-element failures with logger.exception and the actual exc_type; raise startup HMAC verify failures to ERROR with exc_info.
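
The 404-versus-503 split matters because it tells clients whether to retry. A minimal sketch of that decision follows, modelling the registry as a plain dict of `{"loaded": bool}` entries; that layout and the `resolve_recipe` name are assumptions for illustration, not the project's `_resolve_entry` helper.

```python
def resolve_recipe(registry, name):
    """Return (http_status, error_code); (200, None) means the recipe is usable."""
    entry = registry.get(name)
    if entry is None:
        # Unknown name: a hard failure that retrying will not fix.
        return 404, "RECIPE_NOT_FOUND"
    if not entry.get("loaded", False):
        # Known but not loaded (stub entry): the client may retry later.
        return 503, "RECIPE_UNAVAILABLE"
    return 200, None


registry = {"quickstart": {"loaded": True}, "pending": {"loaded": False}}
for name in ("quickstart", "pending", "missing"):
    print(name, resolve_recipe(registry, name))
```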

Silent failures & observability
- _any_seed_known + user_known AttributeError: log + recotem_recommender_layout_unexpected_total + INTERNAL_ERROR propagation
- KeyError mis-attribution fixed: pre-check membership; unexpected KeyError -> INTERNAL_ERROR via logger.exception
- _unhandled_500 / validation_failed now structured-log with sanitized errors and request_id
- Auth bypass log carries mode (insecure_no_auth vs loopback_no_keys)
- Watcher: recotem_watcher_state_divergence_total, dir_scan failure reason, sidecar_disappeared transition warning
- inc_metadata_lookup_error wired via on_row_error callback in build_metadata_index (serving/metadata layering preserved)

Schema & type design
- BatchResultEntry as discriminated union (_BatchResultOk | _BatchResultErr) removes anti-pattern
- Sha256Hex / HexHash branded types applied to model_version, config_digest, recipe_hash (None for stub entries)
- ModelEntry.loaded_at returns tz-aware datetime; loaded_at/trained_at typed as AwareDatetime
- RecipeDetailResponse: metric Literal, cutoff Field(ge=1), version Field(pattern=...)
- RecommendRequest.context dropped (restores extra=forbid integrity)
- RecommendItem.item_id bounded matching _ItemStr

Dead code & convention
- inc_metadata_lookup_error / metadata_field_deny removed/wired
- _RecipeName Annotated alias -> name: str = Path(pattern=...) per CLAUDE.md
- _batch_error_entry typed against ErrorCode; drop type-ignore
- _emit_security_posture: logger.exception spec alignment

API contract
- include_metadata: bool = False on batch requests restores batch <-> single shape parity (opt-in)
- batch_user_known re-initialized per loop iteration

Tests +178 (1614 -> 1792): discriminated union, branded types, AwareDatetime, include_metadata opt-in, KeyError attribution, auth bypass mode, watcher dir_scan/sidecar_disappeared, structured 500/422 logs, batch-recommend-related empty list 422, HTTP-layer concurrent hot-swap, recipe_not_found across all 4 verbs.
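
The discriminated-union idea for batch entries can be sketched with standard-library dataclasses: the `status` field selects which variant applies, so an ok entry cannot carry an error code and vice versa. The PR implements this with Pydantic models; the field names `items`, `code`, and `message` and the `parse_batch_result` helper here are assumptions.

```python
from dataclasses import dataclass
from typing import Literal, Union


@dataclass(frozen=True)
class BatchResultOk:
    items: list
    status: Literal["ok"] = "ok"


@dataclass(frozen=True)
class BatchResultErr:
    code: str
    message: str = ""
    status: Literal["error"] = "error"


BatchResultEntry = Union[BatchResultOk, BatchResultErr]


def parse_batch_result(raw: dict) -> BatchResultEntry:
    """Dispatch on the status discriminator to build the matching variant."""
    status = raw.get("status")
    if status == "ok":
        return BatchResultOk(items=list(raw["items"]))
    if status == "error":
        return BatchResultErr(code=raw["code"], message=raw.get("message", ""))
    raise ValueError(f"unknown status: {status!r}")


print(parse_batch_result({"status": "error", "code": "NO_CANDIDATES"}))
```

Compared with one model holding optional `items` and optional `code`, the two-variant form rules out half-filled entries by construction.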

Create docs/migration-v1.md covering endpoint mapping, field renames, flat error envelope, X-Recotem-Metadata-Degraded removal, metrics renames (no dual-emit), /v1/metrics auth requirement with Prometheus scrape-config snippet, batch metadata opt-in, and partial-failure semantics. Add Migration subsection in CHANGELOG.md linking to the guide.

- Add _sanitize_validation_errors() and _format_batch_validation_message() helpers so both batch handlers share the same logic.
- Both batch_recommend and batch_recommend_related now build a human-readable loc+msg string from exc.errors()[0] and log sanitized error details (loc, msg, type only; no user input) at WARNING level.
- Rename _BatchResultOk/_BatchResultErr to BatchResultOk/BatchResultErr (M4: drop underscore prefix since they appear in the public OpenAPI schema).
- Update all tests to use the new public names and assert that VALIDATION_ERROR messages contain the violating field name.
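
A minimal sketch of the two helpers, operating on the list-of-dicts shape that Pydantic's `ValidationError.errors()` returns. The function names mirror the commit message without the leading underscore, but the bodies are an assumption about the behaviour, not the project's code.

```python
def sanitize_validation_errors(errors):
    """Keep only loc, msg and type, dropping input/ctx that may echo user data."""
    return [
        {key: err[key] for key in ("loc", "msg", "type") if key in err}
        for err in errors
    ]


def format_batch_validation_message(errors):
    """Build a short 'field: message' string from the first error."""
    first = errors[0]
    loc = ".".join(str(part) for part in first.get("loc", ()))
    msg = first.get("msg", "")
    return f"{loc}: {msg}" if loc else msg


raw_errors = [
    {
        "type": "string_too_short",
        "loc": ("user_id",),
        "msg": "String should have at least 1 character",
        "input": "",
        "ctx": {"min_length": 1},
    }
]
print(format_batch_validation_message(raw_errors))
print(sanitize_validation_errors(raw_errors))
```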

Add a callout in operations.md (structured-log events section) explaining that the alpha X-Recotem-Metadata-Degraded per-response header is gone in v1 and directing operators to recotem_metadata_lookup_errors_total for load-time metadata join failures.

- Add "timeout" to _LOAD_FAILURE_REASONS in metrics.py (stat hangs in the executor are distinct from read errors: infrastructure vs. data signal).
- Stat-timeout path in _poll_artifacts now passes reason="timeout".
- Both _read_artifact_bytes failure paths in _load_recipe now pass reason="read" (previously defaulted to "unexpected").
- Hoist `import errno` to module top in watcher.py (M8: was lazy import inside _check_sidecar_changed with noqa suppression).

Include exc_type=type(exc).__name__ and error=str(exc)[:200] in the metadata_index_row_error warning so operators can diagnose the root cause (e.g. AttributeError from a non-unique index, TypeError from a non-string column) without enabling debug logging or adding instrumentation.

- Add recipe_name parameter to _build_items() so the warning and metric can be attributed to the correct recipe.
- Wrap RecommendItem.model_validate() in try/except ValidationError: on failure, log a metadata_serialization_failed WARNING with item_id and truncated error (no user input), increment inc_metadata_lookup_error, and skip the item rather than aborting the entire response.

Add integration test that loads artifact A, swaps in artifact B via registry.replace_with_marker, then verifies model_version and X-Recotem-Model-Version header change between calls. Both values must match sha256:<64 hex>, and the header must equal the body field.
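
The invariant this test asserts (body field and header agree, and both look like `sha256:<64 hex>`) can be expressed as a small predicate. This sketch assumes lowercase hex and a plain dict of headers; `check_model_version` is a hypothetical helper, not part of the test suite.

```python
import re

# Assumes lowercase hex digits after the sha256: prefix.
_MODEL_VERSION_RE = re.compile(r"sha256:[0-9a-f]{64}")


def check_model_version(body, headers):
    """True when model_version is well-formed and the header equals the body field."""
    version = body.get("model_version", "")
    if not _MODEL_VERSION_RE.fullmatch(version):
        return False
    return headers.get("X-Recotem-Model-Version") == version


version_a = "sha256:" + "a" * 64
print(check_model_version({"model_version": version_a}, {"X-Recotem-Model-Version": version_a}))
```

After a hot-swap, the same predicate should hold for the new response while the version string itself differs from the earlier one.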

Add parametrized tests for all 4 recommend verbs asserting that a valid-length but wrong X-API-Key returns 401 INVALID_API_KEY (T2). Add key-rotation tests (T8): configure two keys (old + new), verify both authenticate with 200 on :recommend, and a third key gets 401.
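
The behaviour these tests pin down (any configured key authenticates, anything else gets INVALID_API_KEY, and an absent header gets MISSING_API_KEY) can be sketched as below. Storing keys as plaintext strings and comparing with `hmac.compare_digest` are assumptions made for the example; the PR does not describe how the server stores or compares keys.

```python
import hmac


def check_api_key(presented, configured_keys):
    """Return None on success, otherwise the error code to report with a 401."""
    if not presented:
        return "MISSING_API_KEY"
    matched = False
    for key in configured_keys:
        # Constant-time comparison; every key is checked so timing does not
        # reveal which position matched.
        if hmac.compare_digest(presented.encode(), key.encode()):
            matched = True
    return None if matched else "INVALID_API_KEY"


# During rotation both the old and the new key are configured.
keys = ["old-key", "new-key"]
print([check_api_key(candidate, keys) for candidate in ("old-key", "new-key", "third-key", None)])
```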

Add three unit tests verifying that insecure_no_auth=True lets :recommend, :recommend-related, and :batch-recommend succeed without an X-API-Key header. Existing dev-bypass tests only hit /v1/health/details; these cover the actual prediction paths.

Add integration test that starts ArtifactWatcher with a real recipes directory, verifies the entry exists, deletes the YAML file, waits for the watcher to process the deletion, then asserts :recommend returns 404 (RECIPE_NOT_FOUND) or 503 (RECIPE_UNAVAILABLE).

Add test_recommend_related_includes_metadata_fields and test_recommend_related_strips_denied_fields to mirror the existing :recommend coverage for the related verb. Denied fields applied at load time must not appear in :recommend-related items.

Add tests verifying that recotem_v1_requests_total counters accumulate value (one Prometheus line per distinct label-set, not one per request) and that the recipe_not_found status label is correctly assigned for RECIPE_NOT_FOUND 404 responses.

Add three tests verifying the per-recipe shape returned by /v1/health/details: healthy entries include loaded=True, best_class, trained_at, kid, and no error field; stub entries include loaded=False and an error string. Two-recipe scenario (1 healthy + 1 stub) confirms the degraded aggregate status and correct per-recipe fields.

- M1: Add ModelRegistry.health_counts() -> (loaded, total) returning both values under a single lock so /v1/health cannot observe a TOCTOU split between loaded_count() and health_snapshot().
- M2: Add code comments near both health handlers documenting the intentional design difference (probe vs. operator endpoint).
- M3: app.py startup now calls registry.loaded_count() for set_active_recipes instead of len(loaded_entries) to avoid a count desync.
- M5: Consolidate the banner warn_loop: capture flags into local variables so only one asyncio task fires per interval even when both --insecure-no-auth and --dev-allow-unsigned are active.
- M6: Guard best_score in recipe_detail against NaN/Inf by converting non-finite floats to None before returning (matches RecommendItem.score posture).
- M7: Add sidecar_unsupported sentinel to _RecipeWatchState so TypeError in _check_sidecar_changed only warns once per recipe rather than every poll.
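
The M1 fix boils down to reading both counters inside one critical section. A minimal sketch with a toy registry follows; the class layout and the `set_entry` method are assumptions, and only the `health_counts() -> (loaded, total)` shape comes from the commit message.

```python
import threading


class Registry:
    """Toy registry mapping recipe name to a loaded flag."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def set_entry(self, name, loaded):
        with self._lock:
            self._entries[name] = loaded

    def health_counts(self):
        """Return (loaded, total) from one consistent snapshot."""
        with self._lock:
            # Both values are computed before the lock is released, so a
            # concurrent hot-swap cannot land between the two reads.
            total = len(self._entries)
            loaded = sum(1 for is_loaded in self._entries.values() if is_loaded)
        return loaded, total


registry = Registry()
registry.set_entry("quickstart", True)
registry.set_entry("pending", False)
print(registry.health_counts())
```

Calling two separate locked getters instead would let a writer change the registry between them, which is the TOCTOU split the commit describes.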

Apply review fixes across critical, major, and minor categories from the v1 API overhaul review.

Critical:
- Restore response-side degradation signal: emit X-Recotem-Items-Degraded and recotem_v1_metadata_degraded_items_total with fallback/dropped path in _build_items (single recommend verbs only).
- Split recotem_metadata_lookup_errors_total into recotem_metadata_index_build_errors_total{recipe} (load-time) and recotem_metadata_serialization_errors_total{recipe,verb} (request-time) to disambiguate on-call routing.
- Re-evaluate sidecar after recipe YAML mtime change instead of permanent skip on TypeError or repeated transient OSError.

Major:
- Populate ModelEntry.algorithms from header tuning.tried_algorithms when the top-level field is absent.
- Normalize config_digest at the ModelEntry boundary so the sha256: prefix matches the Sha256Hex schema regardless of writer convention.
- Split _resolve_entry's recipe_unavailable warning into recipe_not_found (404) and recipe_not_loaded (503) for distinct alert routing.
- Drop route-level logger.exception in verb handlers; rely on the global _unhandled_exception_handler for a single unhandled_500 emission.
- Reset _post_hmac_failure_streak in the generic-Exception branch of watcher._load_recipe so the streak only counts deserialize failures.

Minor:
- Bind recipe contextvar in recipe_detail / list_recipes.
- Use reason="unexpected" for non-ArtifactError reads in _record_load_failure.
- Replace NUL-byte kid sentinel with "<extract_failed>".
- Include user_id_hash / seed_items_count in 500 logs for debugging.
- Add recotem_v1_validation_errors_outside_verb_total for non-verb 422s.
- Default _classify_artifact_error to "unexpected" with a WARN log.
- Log security.posture even when validate_insecure_flags raises.
- Add signing_key_status="construction_failed" for keyring build errors.
- Suppress sidecar reload storms on non-ENOENT OSError after 3 strikes.
- Centralize stub-name dedup in serving/_naming.py.
- Centralize header field extraction in serving/_header_utils.py.

Browser interop / observability:
- Add CORS expose_headers for X-Request-ID, X-Recotem-Model-Version, and X-Recotem-Items-Degraded so JS clients can read them cross-origin.
- Whitelist kind label values in inc_metadata_degraded_items to bound Prometheus label cardinality.
- Update recotem_recommender_layout_unexpected_total HELP text to mention both user_id_to_index and item_id_to_index probes.

Tests:
- Cover the new degraded fallback / dropped paths via monkeypatched RecommendItem.model_validate.
- Add include_metadata and exclude_items coverage for :batch-recommend-related (mirroring :batch-recommend).
- Assert outer-except partial-failure behavior for batch verbs.
- Assert sidecar_unsupported clears on YAML mtime change.
- Assert _post_hmac_failure_streak resets after generic exceptions.
- Assert 404 / 503 log events use the new recipe_not_found and recipe_not_loaded names; assert exactly one unhandled_500 per failure.
- Add Sha256Hex round-trip for normalize_config_digest with a 64-hex sample.
- Rename stale test names from the old recipe_unavailable wording.

Docs:
- Remove CHANGELOG.md and docs/migration-v1.md (alpha was internal-only; no public migration path needed).
- Update docs/api-reference.md, docs/operations.md, docs/security.md, and signing.py comments to drop CHANGELOG references and document the new X-Recotem-Items-Degraded header.
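
The config_digest normalization mentioned under Major can be illustrated as follows. The name `normalize_config_digest` appears in the test list above, but its signature and behaviour here (accept bare or prefixed hex, lowercase it, pass `None` through for stub entries, raise on anything else) are assumptions.

```python
import re

_HEX64_RE = re.compile(r"[0-9a-fA-F]{64}")
_PREFIX = "sha256:"


def normalize_config_digest(value):
    """Return 'sha256:<64 lowercase hex>' whether or not the writer added the prefix."""
    if value is None:
        return None  # stub entries carry no digest
    hex_part = value[len(_PREFIX):] if value.startswith(_PREFIX) else value
    if not _HEX64_RE.fullmatch(hex_part):
        raise ValueError("config_digest must be 64 hex characters")
    return _PREFIX + hex_part.lower()


sample = "ab" * 32
print(normalize_config_digest(sample) == normalize_config_digest(_PREFIX + sample))
```

Normalizing once at the boundary means every downstream schema check sees a single canonical form.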

marevol added 30 commits May 21, 2026 13:08

docs: spec and implementation plan for v1 API overhaul

9299303

Captures the design decisions (industry survey of major recommendation APIs, business motivation, endpoint catalogue, schemas, status codes, metrics) and the 15-task TDD plan for delivering the v1 surface.

test: POC for AIP-136 colon-verb paths in FastAPI

c212eaf

Confirms /v1/recipes/{name}:recommend style paths route and appear in OpenAPI before refactoring routes.py. Removed in Task 13.

feat(serving): add v1 schema module (Pydantic v2)

a377e56

Introduces RecommendRequest/Response, batch variants, RecipeSummary, etc. Used by the upcoming v1 router. No behaviour change yet.

feat(serving): skeleton v1_router factory

a7ca066

Mirrors the legacy make_router signature so app.py can swap routers in Task 12. Inference endpoints land in Tasks 7-10.

feat(serving): port health + metrics into v1_router

11fa153

feat(serving): POST /v1/recipes/{name}:recommend-related

e55a7ce

feat(serving): POST /v1/recipes/{name}:batch-recommend

e1de8c9

feat(serving): POST /v1/recipes/{name}:batch-recommend-related

20ec9e3

feat(serving): GET /v1/recipes and /v1/recipes/{name}

9f65e78

test(integration): retarget e2e suite at v1 endpoints

99525bd

Replaces /predict/{name} calls with /v1/recipes/{name}:recommend and adds a :recommend-related coverage case using the existing quickstart artifact.

docs: refresh README + getting-started + add api-reference and migration-v1

813a9f3

Aligns published documentation with the v1 API surface.

style: apply ruff lint + format fixes across v1 modules

8f4e595

docs: record v1 API overhaul in CHANGELOG

2a63d3e

chore: stop tracking local working docs under docs/plans and docs/specs

0a0edf6

These directories hold session-scoped planning and design notes that should not ship in the repository. Add them to .gitignore so future edits do not accidentally re-add them.

docs: drop migration-v1 guide

eb1d84b

Alpha→v1 migration page is unnecessary for this PR's scope; the v1 API replaces the alpha endpoints outright and the README/operations docs already cover the current shape.

marevol added 16 commits May 22, 2026 13:12

docs(serving): note /v1/metrics auth requirement

f6d0a46

Add explicit warning in api-reference.md that GET /v1/metrics now requires X-API-Key (the alpha /metrics was unauthenticated) and cross-link to the migration guide for the Prometheus scrape-config snippet.

marevol added this to the 2.0.0a1 milestone May 22, 2026

marevol merged commit b4d6b95 into main May 22, 2026
11 checks passed

marevol merged 46 commits into main from feat/v1-api
