This file provides guidance to coding agents (Claude Code, Cursor, etc.) when working in this repository.
uv sync # install deps (creates .venv)
uv run pytest # all tests (uses testcontainers → needs Docker/OrbStack)
uv run pytest --cov # with coverage gate (fail_under = 85, branch coverage)
uv run pytest tests/test_parsers.py # one file
uv run pytest -k test_revert # one test by keyword
uv run ruff check . && uv run ruff format --check . # lint + format check
uv run ruff format . # auto-format
uv run pyright # type-check (strict mode for src/)
RIPTIDE_DB_URL=... uv run alembic upgrade head # apply migrations
RIPTIDE_DB_URL=... uv run alembic downgrade base # tear down
podman-compose up # local dev: Postgres + migrations + app on :8000

If `docker ps` fails, ask the user to start OrbStack.
- Append-only ingestion. Every webhook handler does `INSERT … ON CONFLICT (delivery_id) DO NOTHING`. Never `UPDATE` or `DELETE` event rows. Webhook retries must be idempotent. `delivery_id` is the dedup key for each source.
- Raw payload always stored. `payload JSONB` keeps the full request body even if fields are extracted into typed columns. Don't drop fields you don't currently use.
- `riptide.json` is config, not data. `openshift/collector/riptide.json` is the in-repo sample; it declares teams (name + `group_email`) and org-wide automation rules. Edits go through PRs. The running pod hot-reloads via mtime in `RiptideConfigStore.maybe_reload()`. Do not propose moving the config into Postgres.
- Per-team bearer keys live in a separate file: in production it is mounted from the `riptide-collector-team-keys` Secret (never committed); the in-repo `openshift/collector/team-keys.json` is a dev sample with deterministic test hashes (raw dev bearers documented in `compose.yaml`). Stored as sha256 hashes; `TeamKeysStore` hot-reloads it the same way as the config. The bearer is the team identity: every webhook is tagged with `team = caller_team`.
- No `service` column. No `service_id` on the wire. Per-source aggregations group by `repo_full_name` / `pipeline_name` / `app_name` / `repo`; org-wide rollups group by `team`. Cross-source joins for BB↔Pipeline use `commit_sha`; Argo CD joins are described in the Commit SHA bullet below. Identifiers are lowercased at ingest (`commit_sha`, `revision`, `repo_full_name`, `branch_name`, `repo`) so SQL joins are case-stable. Do not propose adding a unified `service` column or `service_id` field; it served only single-pane labelling and was dropped.
- `automation` is org-wide. Bot definitions live at the config root, not per team.
- Metrics are computed on read, not at ingest. Don't add aggregation tables or scheduled rollup jobs in v1. Schema additions should preserve raw events; new metrics are SQL queries against existing rows or future materialized views.
- Commit SHA joins Bitbucket↔Pipeline; Argo CD needs `payload->'images'`. `bitbucket_events.commit_sha = pipeline_events.commit_sha` is a deterministic join (App-repo SHA on both sides). `argocd_events.revision` is the GitOps-repo SHA (proven empirically: four Apps for one service share one revision), so it does NOT directly match the other two. The App-repo SHA is embedded in image tags rendered via `.app.status.summary.images`; the receiver stores them in `payload->'images'`. A future correlator extracts SHAs from those tags to bridge Argo CD events to pipeline events. Do not propose adding a `service_id` or hand-coded service-name mappings to fix correlation; the image-tag SHA is the contract.
- `change_type` lives on Bitbucket events only. Don't denormalise it onto pipeline / Argo rows; join Pipeline rows via `commit_sha` and Argo rows via the image-tag SHA in `payload->'images'` at read time.
- CI events are source-tagged, not source-routed. All pipeline events from any CI (Jenkins, Tekton, …) land in the single `pipeline_events` table via `POST /webhooks/pipeline`, distinguished by the `source` column. Do not add per-CI tables or endpoints. The dedup key is `source#pipeline_name#run_id#phase`.
- Noergler events carry finops + reviewer-precision only. The `noergler_events` table is `event_type`-discriminated (`completed` | `feedback`) and is fed by `POST /webhooks/noergler` from optional noergler instances. Do not re-emit PR lifecycle from noergler; `bitbucket_events` already covers open / merged / declined. Dedup keys: `completed#<run_id>` and `feedback#<finding_id>#<verdict>`.
- Senders verify reachability + bearer at startup via `GET /auth/ping`. Authenticated endpoint returning `{"status":"ok","team":"<caller_team>"}`. Use this from any sender (noergler, future ones) to fail-fast on a wrong token. Don't reuse `/health` (unauth liveness) or `/ready` (unauth readiness) for this; those answer different questions.
- `modified_at` has a Postgres trigger (`riptide_set_modified_at`), not just SQLAlchemy `onupdate`. Raw-SQL updates also bump it. Keep the trigger when changing migrations.
- Database is external. `riptide-collector` does NOT manage Postgres. Do not add a Postgres Deployment to `openshift/`.
- Pyright strict for `src/`, standard for `tests/` and `migrations/`. New code under `src/` must satisfy strict mode: no `Any` leaks; narrow `Optional`s with `isinstance` or helpers like `_as_dict()` in `routers/bitbucket.py`.
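
The dedup-key formats listed above can be sketched as small pure functions. This is an illustration only: the function names and the sample values are hypothetical and not taken from the codebase; only the `#`-joined key layouts come from the bullets above.

```python
from __future__ import annotations


def pipeline_delivery_id(source: str, pipeline_name: str, run_id: str, phase: str) -> str:
    """Build the pipeline dedup key: source#pipeline_name#run_id#phase."""
    return "#".join((source, pipeline_name, run_id, phase))


def noergler_completed_id(run_id: str) -> str:
    """Build the noergler dedup key for a `completed` event."""
    return f"completed#{run_id}"


def noergler_feedback_id(finding_id: str, verdict: str) -> str:
    """Build the noergler dedup key for a `feedback` event."""
    return f"feedback#{finding_id}#{verdict}"


# A retried webhook carries the same fields, so it maps to the same key and an
# INSERT … ON CONFLICT (delivery_id) DO NOTHING leaves the first row untouched.
first = pipeline_delivery_id("jenkins", "build-main", "1234", "completed")
retry = pipeline_delivery_id("jenkins", "build-main", "1234", "completed")
assert first == retry
```

Deriving the key only from fields the sender repeats on retry is what keeps ingestion idempotent without ever updating a row.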
- Layering. Routers do HTTP + auth + dispatch only. Source-specific payload extraction lives in `parsers_<source>.py` (e.g. `parsers_bitbucket.py`) as pure functions returning a typed `*EventDraft` dataclass: no HTTP, no DB, no config. The router computes config-derived fields (e.g. `automation_source`) and persists. Don't put extraction logic in routers; don't duplicate JSON-shape coercion (`_as_dict` / `_as_list`-style helpers belong with the extractor that uses them).
- Single Python package, `riptide_collector` (flat top-level, not a namespace package). Future suite components (e.g. `riptide-api`, `riptide-dashboard`) get their own top-level package, e.g. `riptide_dashboard`; leave architectural room for them.
- Webhook routers are factories that return an `APIRouter`. Bitbucket needs the config for automation detection (`make_router(config, session_factory, auth_dep)`); Pipeline, ArgoCD, and Noergler don't, so they take just `(session_factory, auth_dep)`. They're wired up in `src/riptide_collector/main.py::create_app`. Add the config only when a router actually needs `automation` rules or team metadata.
- Pydantic schemas: strict for `/webhooks/pipeline` and `/webhooks/argocd` (we own the contract; invalid payloads must 422); permissive raw-dict parsing for Bitbucket (its payload shapes vary; we best-effort extract).
- Use `_as_dict()` / `_as_list()` helpers in `routers/bitbucket.py` to coerce arbitrary JSON shapes; pyright strict won't accept chained `.get()` on `Optional[dict]`.
- Tests use real Postgres via testcontainers, never SQLite. The `client` fixture in `tests/conftest.py` depends on `session_factory`, which truncates tables per test.
- `.pre-commit-config.yaml` runs ruff + pyright + uv-lock-check; expect CI to enforce the same.
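
As a rough illustration of the layering rule above, a parser can be a pure function that coerces untrusted JSON and returns a typed draft. Everything here is an assumption made for the sketch: `ExampleEventDraft`, `parse_example`, and the payload keys (`repository.full_name`, `commit.hash`, `reviewers`) are hypothetical, and the real `_as_dict` / `_as_list` helpers may differ.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


def _as_dict(value: object) -> dict[str, Any]:
    """Return value if it is a dict, otherwise an empty dict."""
    return value if isinstance(value, dict) else {}


def _as_list(value: object) -> list[Any]:
    """Return value if it is a list, otherwise an empty list."""
    return value if isinstance(value, list) else []


def _lower(value: object) -> str | None:
    """Lowercase string identifiers; anything else becomes None."""
    return value.lower() if isinstance(value, str) else None


@dataclass(frozen=True)
class ExampleEventDraft:
    # Hypothetical draft type: no HTTP, DB, or config state, only extracted fields.
    repo_full_name: str | None
    commit_sha: str | None
    reviewers: list[str]


def parse_example(payload: object) -> ExampleEventDraft:
    """Best-effort extraction from an arbitrary JSON body (assumed shape)."""
    body = _as_dict(payload)
    repo = _as_dict(body.get("repository"))
    commit = _as_dict(body.get("commit"))
    reviewers = [
        name
        for item in _as_list(body.get("reviewers"))
        if isinstance(name := _as_dict(item).get("name"), str)
    ]
    return ExampleEventDraft(
        repo_full_name=_lower(repo.get("full_name")),
        commit_sha=_lower(commit.get("hash")),
        reviewers=reviewers,
    )
```

Because the function touches neither the request nor the session, it can be unit-tested with plain dicts; the router would then add config-derived fields and persist the draft.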
- One JSON object per line, on stdout. OpenShift's Splunk Connect for Kubernetes (SCK) tails the container log; Splunk auto-extracts fields via `KV_MODE=json` for sourcetype `riptide:collector:json` (set as pod annotation in `openshift/collector/deployment.yaml`).
- Stdlib loggers (uvicorn, sqlalchemy, alembic) are bridged through structlog. Do NOT add separate logging handlers or re-init `logging.basicConfig`; `configure_logging()` in `logging_config.py` is the single entry point.
- Splunk-reserved field names are forbidden as kwargs: `source`, `sourcetype`, `host`, `index`, `time`, `_time`, `_raw`, `event`. The CI vendor field is `ci_system` (not `source`); the structlog event name lives in `msg` (renamed from `event`); severity lives in `log_level` (renamed from `level`). A runtime processor (`_strip_reserved`) namespaces accidental reserved kwargs under `splunk_<name>` as a safety net; do not rely on it, pick the right name from the start.
- Field-naming convention for non-reserved kwargs: prefer generic names that mean the same across sources (`event_type`, `status`, `phase`, `delivery_id`, `team`, `repo`, `commit_sha`). Don't pre-namespace with the source name (`noergler_event_type`, `pipeline_status`); `webhook_source` already disambiguates in `stats by webhook_source, event_type`. Only namespace when two sources legitimately mean different things by the same word and would collide in a single Splunk panel.
- Webhook handlers emit exactly one `msg=webhook_processed` log per request with required fields `webhook_source ∈ {bitbucket,pipeline,argocd,noergler}`, `outcome ∈ {accepted,deduped,ignored,skipped}`, `delivery_id`, `team`. Source-specific fields go alongside (e.g. `app`, `revision`, `phase` for argocd). Include `delivery_id` even on `ignored` / `skipped` paths so triage has a key.
- `outcome=deduped` is detected via `RETURNING delivery_id` on the `INSERT ... ON CONFLICT DO NOTHING`; a `None` scalar means the row already existed. Preserve this when adding new sources.
- Persist failures: wrap the `async with session_factory()` block in `try/except Exception: logger.exception("webhook_persist_failed", ...); raise`. Never swallow.
- Access log is emitted by the `access_log` middleware in `main.py` as `msg=http_request` with `request_id`, `method`, `path`, `status_code`, `duration_ms`. `request_id` is bound to contextvars so any log within the request inherits it. `/health` and `/ready` are silenced; `uvicorn.access` is set to WARNING (do not lower it).
- Splunk `props.conf` snippet (owned by platform team, kept here for reference):

      [riptide:collector:json]
      SHOULD_LINEMERGE = false
      LINE_BREAKER = ([\r\n]+)
      KV_MODE = json
      TIME_PREFIX = "timestamp":\s*"
      TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%6NZ
      TRUNCATE = 0
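
To make the reserved-name safety net above concrete, here is a minimal sketch of a structlog-style processor with the standard `(logger, method_name, event_dict)` signature. It is not the repository's `_strip_reserved`; the name `strip_reserved` is hypothetical, and the sketch assumes it runs after `event` and `level` have already been renamed to `msg` and `log_level`, so any reserved key still present is accidental.

```python
from __future__ import annotations

from typing import Any, MutableMapping

# Reserved names copied from the bullet list above.
SPLUNK_RESERVED = frozenset(
    {"source", "sourcetype", "host", "index", "time", "_time", "_raw", "event"}
)


def strip_reserved(
    logger: Any, method_name: str, event_dict: MutableMapping[str, Any]
) -> MutableMapping[str, Any]:
    """Move reserved keys to splunk_<name> so Splunk metadata is not shadowed."""
    # Snapshot the keys first: the dict is mutated while renaming.
    for name in [key for key in event_dict if key in SPLUNK_RESERVED]:
        event_dict[f"splunk_{name}"] = event_dict.pop(name)
    return event_dict


# Hypothetical log call that wrongly used `source` instead of `ci_system`.
cleaned = strip_reserved(
    None, "info", {"msg": "webhook_processed", "source": "jenkins", "team": "a"}
)
```

In this sketch `cleaned` carries `splunk_source` rather than `source`. The renamed key keeps the value searchable, but dashboards expect `ci_system`, which is why the processor is a fallback and not the convention.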
`openshift/` is suite-level, structured per-component. The collector lives in `openshift/collector/`. When adding a new component:

- Create `openshift/<component>/` with its own `kustomization.yaml`.
- Add it to the `resources:` list in `openshift/kustomization.yaml`.
- Every container needs explicit `requests` AND `limits` for cpu and memory, no exceptions.
- Use `runAsNonRoot: true` and `readOnlyRootFilesystem: true`; no fixed `runAsUser` (OpenShift assigns a random UID per project).
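
The container rules above could be checked mechanically. The sketch below is a hypothetical helper, not something that exists in the repository: it takes one container spec already parsed into a dict, and it assumes the security settings sit in the container-level `securityContext` (a manifest that sets `runAsNonRoot` at pod level would need a different check).

```python
from __future__ import annotations

from typing import Any


def container_violations(container: dict[str, Any]) -> list[str]:
    """List which of the rules above a single container spec breaks."""
    problems: list[str] = []

    # Every container needs requests AND limits for both cpu and memory.
    resources = container.get("resources") or {}
    for section in ("requests", "limits"):
        values = resources.get(section) or {}
        for resource in ("cpu", "memory"):
            if not values.get(resource):
                problems.append(f"missing resources.{section}.{resource}")

    # Assumed location: container-level securityContext.
    security = container.get("securityContext") or {}
    for flag in ("runAsNonRoot", "readOnlyRootFilesystem"):
        if security.get(flag) is not True:
            problems.append(f"securityContext.{flag} must be true")
    if "runAsUser" in security:
        problems.append("securityContext.runAsUser must not be fixed")

    return problems
```

A compliant container yields an empty list, so a check like this could run in a test over the rendered Kustomize output if that ever becomes useful.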
If asked to add these, push back unless the user is explicit:
- Change failure rate / failed deployment recovery time (DORA's current term, formerly MTTR) — no reliable incident source yet; schema reserves room for rollback-proxy detection
- Backfill workers (forward-only ingestion only)
- Aggregation API or metric endpoints (collector ingests; reads are SQL or future siblings)
- Helm chart (Kustomize is enough for v1)
- Postgres deployment manifests