alexization
diff --git a/.codex/evals/README.md b/.codex/evals/README.md
Lines changed: 20 additions & 0 deletions
diff --git a/.codex/evals/templates/feature-delivery.md b/.codex/evals/templates/feature-delivery.md
Lines changed: 22 additions & 0 deletions
diff --git a/.gitignore b/.gitignore
Lines changed: 8 additions & 0 deletions
diff --git a/AGENTS.md b/AGENTS.md
Lines changed: 82 additions & 0 deletions
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
Lines changed: 153 additions & 0 deletions
diff --git a/PLANS.md b/PLANS.md
Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,20 @@
+# Evals
+
+Use this directory for repo-local eval definitions that measure whether the AI
+workflow and the product behavior are improving or regressing.
+
+Recommended layout:
+
+```text
+.codex/evals/
+  templates/
+  <feature-name>.md
+  <feature-name>.log
+```
+
+For non-trivial changes, define:
+
+- capability evals for the new behavior
+- regression evals for the old behavior that must keep working
+- clear pass/fail evidence for each eval
+
@@ -0,0 +1,22 @@
+# EVAL: <feature-name>
+
+## Capability evals
+
+- [ ] The intended user-visible behavior works end to end.
+- [ ] The relevant Playwright journey passes.
+- [ ] The expected log evidence is present.
+
+## Regression evals
+
+- [ ] Existing adjacent behavior still works.
+- [ ] No new console or runtime errors appear.
+- [ ] Build, lint, typecheck, and tests still pass.
+
+## Evidence
+
+- Plan:
+- Playwright artifact path:
+- CDP artifact path:
+- Log query:
+- Notes:
+
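A minimal sketch, not part of this change, of how a helper under `scripts/` might tally the checkboxes in an eval file built from this template. The function names `summarize_eval` and `eval_passed` are hypothetical, and the sketch assumes every checkbox sits under a `## ` heading, as in the template above.

```python
import re

# Matches a Markdown task-list item such as "- [ ] text" or "- [x] text".
CHECKBOX = re.compile(r"^- \[( |x|X)\] ")


def summarize_eval(markdown: str) -> dict:
    """Return {section heading: (checked, total)} for sections with checkboxes."""
    summary = {}
    section = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            section = line[3:].strip()
            continue
        match = CHECKBOX.match(line)
        if match and section is not None:
            checked, total = summary.get(section, (0, 0))
            is_checked = 1 if match.group(1) != " " else 0
            summary[section] = (checked + is_checked, total + 1)
    return summary


def eval_passed(markdown: str) -> bool:
    """An eval passes only if it has checkboxes and all of them are ticked."""
    summary = summarize_eval(markdown)
    return bool(summary) and all(c == t for c, t in summary.values())
```

An empty file is treated as a failure rather than a pass, on the assumption that missing evidence should never count as success.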
@@ -0,0 +1,8 @@
+.runtime/
+.worktrees/
+.artifacts/
+.idea/
+playwright-report/
+test-results/
+dist/
+coverage/
@@ -0,0 +1,82 @@
+# git-ranker-workflow AGENTS
+
+This repository is the control plane for the `git-ranker` backend and the
+`git-ranker-client` frontend. Keep this file short. The system of record lives
+in [ARCHITECTURE.md](ARCHITECTURE.md) and [docs/](docs/index.md).
+
+## What this repo owns
+
+- Repository-local knowledge store and operating rules for coding agents
+- Cross-repo feature delivery workflow, QA loop, and observability workflow
+- ExecPlan conventions for long-running tasks
+- Guardrails for frontend/backend coordination across the two submodule repos
+
+## Repo map
+
+- `git-ranker/`: backend repo (submodule)
+- `git-ranker-client/`: frontend repo (submodule)
+- `ARCHITECTURE.md`: top-level control-plane architecture
+- `PLANS.md`: rules for long-running ExecPlans
+- `docs/`: knowledge store; treat this as the source of truth
+- `scripts/`: lightweight verification and scaffolding helpers
+- `harness/`: local observability and QA harness configuration
+- `.codex/evals/`: eval definitions and templates
+
+## How to start a task
+
+1. Read [ARCHITECTURE.md](ARCHITECTURE.md).
+2. Read [docs/index.md](docs/index.md) and the specific docs for the change
+   surface.
+3. If the request spans multiple files, multiple repos, new behavior, or a
+   likely multi-hour effort, create an ExecPlan in
+   `docs/exec-plans/active/<yyyy-mm-dd>-<slug>.md` and follow [PLANS.md](PLANS.md).
+4. Restate the request in terms of:
+   - user-visible outcome
+   - impacted repos
+   - acceptance checks
+   - required Playwright/CDP/Loki evidence
+5. Work inside a task-specific isolated runtime footprint under `.runtime/` and
+   `.worktrees/`.
+
+## System of record
+
+- Product intent: [docs/product-specs/index.md](docs/product-specs/index.md)
+- Architectural rules: [docs/design-docs/index.md](docs/design-docs/index.md)
+- UX and UI behavior: [docs/DESIGN.md](docs/DESIGN.md),
+  [docs/FRONTEND.md](docs/FRONTEND.md)
+- Backend and data behavior: [docs/BACKEND.md](docs/BACKEND.md),
+  [docs/SECURITY.md](docs/SECURITY.md), [docs/RELIABILITY.md](docs/RELIABILITY.md)
+- Quality and cleanup rules: [docs/QUALITY_SCORE.md](docs/QUALITY_SCORE.md)
+- Generated facts: [docs/generated/README.md](docs/generated/README.md)
+- Workflow loop: [docs/workflows/feature-delivery-loop.md](docs/workflows/feature-delivery-loop.md),
+  [docs/workflows/qa-feedback-loop.md](docs/workflows/qa-feedback-loop.md)
+
+## Non-negotiables
+
+- Do not turn `AGENTS.md` into a large manual. Promote durable rules into
+  `docs/` or scripts.
+- Do not implement from vague intent. Convert feature requests into explicit
+  acceptance criteria first.
+- Do not ship a user-visible change without QA evidence from:
+  - automated tests
+  - Playwright
+  - browser inspection via CDP or equivalent
+  - worktree-local logs in Loki or the configured log backend
+- Do not treat Slack, chat history, or memory as source of truth. If it matters
+  later, check it into the repo.
+- Do not hand-wave cross-repo changes. Contract changes must be reflected in
+  backend, frontend, docs, and validation steps.
+
+## Delivery loop
+
+1. Intake and clarify the request.
+2. Write or update an ExecPlan if the task is non-trivial.
+3. Implement in backend/frontend worktrees.
+4. Run build, typecheck, lint, and tests.
+5. Boot the isolated stack for the task.
+6. Run Playwright journeys.
+7. Inspect UI, network, console, and DOM with CDP tooling.
+8. Query logs, metrics, and traces for the same task runtime.
+9. Feed findings back into code, docs, and the ExecPlan.
+10. Record outcomes and remaining debt before handoff or merge.
+
@@ -0,0 +1,153 @@
+# git-ranker Workflow Architecture
+
+## Purpose
+
+This repository is the orchestration layer for an agent-first development
+workflow across two application repositories:
+
+- `git-ranker`: backend system of record for APIs, jobs, persistence, and domain
+  rules
+- `git-ranker-client`: frontend system of record for routes, components, user
+  flows, and client-side state
+
+The control plane in this repo exists to make the product legible to coding
+agents, not to store application logic.
+
+## Current repo facts
+
+The submodules are initialized in this workspace and currently expose these
+high-level facts:
+
+- backend: Spring Boot 3.4, Java 21, JPA, Batch, Security, Actuator, Prometheus,
+  structured JSON logging, Testcontainers, ArchUnit
+- frontend: Next.js App Router, React 19, TypeScript, ESLint, React Query,
+  Zustand, Tailwind, Radix UI
+
+Those facts, rather than generic defaults, should shape the workflow and
+harness choices.
+
+## Core principle
+
+Repository-local knowledge is the system of record. A coding agent should be
+able to understand the product, architecture, quality bar, and execution flow
+from versioned artifacts in this repository plus the checked-out submodules.
+
+## Control-plane flow
+
+```text
+feature request
+  -> request intake and acceptance contract
+  -> ExecPlan for non-trivial work
+  -> backend contract / behavior changes
+  -> frontend integration / UI changes
+  -> isolated task runtime
+  -> Playwright + CDP validation
+  -> logs / metrics / traces review
+  -> fix loop
+  -> PR / merge / debt update
+```
+
+## Worktree model
+
+Every non-trivial task should use an isolated runtime footprint keyed by a task
+slug, for example `rank-comparison-filtering`.
+
+Expected layout:
+
+```text
+.worktrees/
+  backend/<task-slug>/
+  frontend/<task-slug>/
+.runtime/
+  <task-slug>/
+    logs/
+    traces/
+    screenshots/
+    videos/
+    playwright/
+    observability/
+```
+
+The goal matches OpenAI's harness model:
+
+- one isolated app instance per task
+- one isolated observability context per task
+- artifacts are disposable once the task is complete
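The layout above could be derived from a task slug roughly as follows. This is only a sketch: `task_footprint` and `create_runtime_dirs` are hypothetical names, and it assumes the worktree directories are created separately (for example with `git worktree add`), so only the `.runtime/` tree is created here.

```python
from pathlib import Path

# Subdirectories listed under .runtime/<task-slug>/ in the expected layout.
RUNTIME_SUBDIRS = ("logs", "traces", "screenshots", "videos", "playwright", "observability")


def task_footprint(root: Path, task_slug: str) -> list:
    """Return every path that belongs to one task's isolated footprint."""
    worktrees = [root / ".worktrees" / repo / task_slug for repo in ("backend", "frontend")]
    runtime = [root / ".runtime" / task_slug / name for name in RUNTIME_SUBDIRS]
    return worktrees + runtime


def create_runtime_dirs(root: Path, task_slug: str) -> list:
    """Create the .runtime/<task-slug>/ tree; worktree paths are left untouched."""
    created = []
    for name in RUNTIME_SUBDIRS:
        path = root / ".runtime" / task_slug / name
        path.mkdir(parents=True, exist_ok=True)  # safe to re-run for the same slug
        created.append(path)
    return created
```

Because every path hangs off the slug, removing `.runtime/<task-slug>/` discards the task's artifacts without touching any other task.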
+
+## Knowledge-store layout
+
+```text
+AGENTS.md
+ARCHITECTURE.md
+PLANS.md
+docs/
+  design-docs/
+  exec-plans/
+  generated/
+  product-specs/
+  references/
+  workflows/
+```
+
+`AGENTS.md` is only the table of contents. The durable knowledge lives in
+`docs/`.
+
+## Cross-repo contract
+
+The repositories are versioned independently, but the workflow treats them as a
+single product system. A change request must identify which of the following are
+affected:
+
+- backend domain rules
+- backend API or event contracts
+- frontend route or component behavior
+- shared product language and acceptance criteria
+- reliability, security, or QA evidence
+
+Any contract change must update both sides of the boundary plus the knowledge
+store if the change affects future tasks.
+
+## Layering model
+
+The two repos should converge on a single one-way dependency model:
+
+```text
+Types -> Schemas/Contracts -> Repository/Gateway -> Service/Use Case
+      -> Runtime/Delivery -> UI or HTTP surface
+
+Cross-cutting concerns enter only through Providers:
+auth, feature flags, telemetry, configuration, external connectors
+```
+
+This is intentionally rigid. Agents move faster when the allowed edges are
+obvious and mechanically enforceable.
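One way to make the allowed edges mechanically checkable is sketched below. It rests on assumptions the document does not state: that a layer may import only from its own layer or from layers earlier in the chain, and that any layer may import from Providers. The layer keys and the function names are hypothetical.

```python
# Layer order taken from the diagram above; the short keys are illustrative.
LAYERS = ("types", "schemas", "repository", "service", "runtime", "surface")
PROVIDERS = "providers"


def is_allowed_import(importer: str, imported: str) -> bool:
    """Return True if `importer` may depend on `imported` under the assumed rule."""
    if importer not in LAYERS:
        raise ValueError(f"unknown importing layer: {importer}")
    if imported == PROVIDERS:
        return True  # assumption: cross-cutting providers are importable everywhere
    if imported not in LAYERS:
        raise ValueError(f"unknown imported layer: {imported}")
    # Assumption: dependencies point backwards along the chain (or stay in-layer).
    return LAYERS.index(imported) <= LAYERS.index(importer)


def find_violations(edges: list) -> list:
    """Filter (importer, imported) pairs down to the ones that break the rule."""
    return [edge for edge in edges if not is_allowed_import(*edge)]
```

In practice a check like this would probably live in ArchUnit on the backend and in lint rules on the frontend; the Python form only illustrates the rule itself.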
+
+## QA and observability loop
+
+Every user-visible change is expected to produce:
+
+- automated regression evidence
+- a Playwright run over the affected journey
+- CDP evidence for DOM, console, network, and screenshot state
+- log evidence from the isolated task runtime
+- metrics and trace evidence when performance or async flow matters
+
+The recommended local stack is documented in
+[docs/workflows/local-observability-stack.md](docs/workflows/local-observability-stack.md).
+The implementation provided in `harness/` uses Loki, Prometheus, Tempo, and
+Grafana to preserve the same agent-facing query model described by OpenAI:
+LogQL, PromQL, and TraceQL.
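A per-task log query might be assembled as in the sketch below. The `task_slug` label name is an assumption about how the harness labels log streams, and both function names are hypothetical. The `/loki/api/v1/query_range` endpoint and its `query`, `start`, `end`, and `limit` parameters come from Loki's HTTP API. The sketch only builds the URL and sends no request.

```python
from typing import Optional
from urllib.parse import urlencode


def _escape(value: str) -> str:
    """Escape backslashes and double quotes for a LogQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def task_logql(task_slug: str, contains: Optional[str] = None) -> str:
    """Build a LogQL selector for one task, optionally filtered by a substring."""
    query = f'{{task_slug="{_escape(task_slug)}"}}'  # label name is an assumption
    if contains:
        query += f' |= "{_escape(contains)}"'
    return query


def loki_query_range_url(base_url: str, query: str, start_ns: int, end_ns: int,
                         limit: int = 100) -> str:
    """Return a query_range URL; start and end are Unix timestamps in nanoseconds."""
    params = urlencode({"query": query, "start": start_ns, "end": end_ns, "limit": limit})
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{params}"
```

Recording the generated query string in the ExecPlan or eval file keeps the log evidence reproducible for whoever picks the task up next.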
+
+## What stays out of this repo
+
+- application code that belongs in `git-ranker` or `git-ranker-client`
+- private tribal knowledge that should instead be turned into docs
+- ad hoc task notes that never graduate into reusable rules
+
+## Current limitations
+
+- the frontend repo does not yet contain committed Playwright or test config
+- the harness knows the backend metrics endpoint, but frontend metrics and trace
+  export wiring are still generic
+- repo-specific start scripts and local env bootstrapping still need to be
+  codified into the harness
@@ -0,0 +1,83 @@
+# ExecPlans for git-ranker-workflow
+
+This document adapts OpenAI's `PLANS.md` pattern to a two-repository product
+workflow. Use it for any task that is likely to take more than one session,
+spans multiple files or repos, changes contracts, or requires non-trivial QA.
+
+## When to create an ExecPlan
+
+Create an ExecPlan when any of the following are true:
+
+- the request spans backend and frontend
+- the request changes API, schema, routing, or product behavior
+- the work is expected to last more than 30 minutes
+- you need a reproducible QA and feedback loop
+- you expect to stop and resume later
+
+Store plans in `docs/exec-plans/active/<yyyy-mm-dd>-<slug>.md`.
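The triggers above can be read as a simple any-of predicate. The sketch below is illustrative only: `TaskProfile`, its field names, and `exec_plan_reasons` are hypothetical, and the 30-minute threshold is copied from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    """Answers to the intake questions for one request (hypothetical shape)."""
    spans_backend_and_frontend: bool = False
    changes_contract_or_behavior: bool = False  # API, schema, routing, product behavior
    estimated_minutes: int = 0
    needs_reproducible_qa: bool = False
    may_pause_and_resume: bool = False


def exec_plan_reasons(task: TaskProfile) -> list:
    """Return the triggers that apply; an empty list means no ExecPlan is required."""
    reasons = []
    if task.spans_backend_and_frontend:
        reasons.append("spans backend and frontend")
    if task.changes_contract_or_behavior:
        reasons.append("changes API, schema, routing, or product behavior")
    if task.estimated_minutes > 30:
        reasons.append("expected to last more than 30 minutes")
    if task.needs_reproducible_qa:
        reasons.append("needs a reproducible QA and feedback loop")
    if task.may_pause_and_resume:
        reasons.append("expected to stop and resume later")
    return reasons
```

Returning the reasons rather than a bare boolean lets the plan's intake summary state why it exists.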
+
+## Non-negotiable rules
+
+- Every ExecPlan must be self-contained.
+- Every ExecPlan must remain a living document.
+- Every ExecPlan must let a novice continue from only the working tree and the
+  plan file.
+- Every ExecPlan must describe observable outcomes, not just code edits.
+- Every ExecPlan must define the validation loop clearly.
+
+## Repo-specific additions
+
+Every plan in this repository must also include:
+
+- impacted repo list: backend, frontend, or both
+- request intake summary in plain language
+- contract boundary notes
+- exact task runtime slug
+- expected Playwright journeys
+- expected CDP evidence
+- expected Loki or log-backend queries
+- rollback or retry notes for each risky step
+
+## Required sections
+
+Every ExecPlan must keep these sections current:
+
+- `Purpose / Big Picture`
+- `Progress`
+- `Surprises & Discoveries`
+- `Decision Log`
+- `Outcomes & Retrospective`
+- `Context and Orientation`
+- `Plan of Work`
+- `Concrete Steps`
+- `Validation and Acceptance`
+- `Idempotence and Recovery`
+- `Artifacts and Notes`
+- `Interfaces and Dependencies`
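A lightweight check for the section list above could look like the following sketch. It assumes each required section appears in the plan as a Markdown heading line (any number of leading `#` characters). The name `missing_sections` is hypothetical.

```python
# Section names copied from the required-sections list above.
REQUIRED_SECTIONS = (
    "Purpose / Big Picture",
    "Progress",
    "Surprises & Discoveries",
    "Decision Log",
    "Outcomes & Retrospective",
    "Context and Orientation",
    "Plan of Work",
    "Concrete Steps",
    "Validation and Acceptance",
    "Idempotence and Recovery",
    "Artifacts and Notes",
    "Interfaces and Dependencies",
)


def missing_sections(plan_text: str) -> list:
    """Return required section names that have no matching heading in the plan."""
    headings = set()
    for line in plan_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Drop the leading '#' run and surrounding whitespace.
            headings.add(stripped.lstrip("#").strip())
    return [name for name in REQUIRED_SECTIONS if name not in headings]
```

This only verifies that the headings exist. Whether each section is actually kept current still needs human or agent review.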
+
+## Formatting
+
+The plan file itself should contain exactly one fenced code block labeled `md`.
+Do not nest other fenced blocks inside the plan. Use indentation for commands,
+snippets, and transcripts.
+
+## Required execution rhythm
+
+1. Clarify the user's request in product language.
+2. Identify impacted repos and documents.
+3. Research before implementation.
+4. Update the plan before and after every material milestone.
+5. Validate behavior in the isolated task runtime.
+6. Record the evidence path for screenshots, videos, traces, and logs.
+7. Update docs when a new durable rule or system fact is discovered.
+
+## Plan naming
+
+Use a sortable filename:
+
+`docs/exec-plans/active/2026-03-07-rank-comparison-filtering.md`
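The naming rule can be expressed as a small helper, sketched here. The lowercase kebab-case slug constraint is an assumption inferred from the example filename, and `exec_plan_path` is a hypothetical name.

```python
import re
from datetime import date

# Assumed slug shape: lowercase alphanumeric words joined by single hyphens.
SLUG = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def exec_plan_path(day: date, slug: str) -> str:
    """Build the sortable path for an active ExecPlan."""
    if not SLUG.match(slug):
        raise ValueError(f"slug must be lowercase kebab-case: {slug!r}")
    # date.isoformat() yields yyyy-mm-dd, so filenames sort chronologically.
    return f"docs/exec-plans/active/{day.isoformat()}-{slug}.md"
```

For example, `exec_plan_path(date(2026, 3, 7), "rank-comparison-filtering")` reproduces the filename shown above.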
+
+## Template
+
+Start from `docs/exec-plans/_template.md`.
+