AI Shot Analysis — Technical Reference

This document describes the complete data pipeline behind the AI shot analysis feature: how raw shot data is preprocessed, how prompts are assembled, what is sent to the AI API, what response is expected, and what optimisations reduce token consumption and cost.

Supported Models

Provider	Model IDs
Anthropic	`claude-haiku-4-5-20251001`, `claude-sonnet-4-6`, `claude-opus-4-8`
OpenAI	`gpt-4o-mini`, `gpt-4o`

The model and API key are configured per-user in Settings → KI Analyse. The provider is inferred from the model name (gpt-* → OpenAI, everything else → Claude).

Request Flow

flowchart TD
    Browser["Browser\n(Shot Detail Page)"] -->|"POST /api/analysis/shot/:id\n?window=30d&type=detail"| Route["Analysis Route\nanalysis.ts"]
    Route -->|"?regenerate=false"| CacheCheck{ShotAnalysis\ncache hit?}
    CacheCheck -->|"Yes"| Browser
    CacheCheck -->|"No"| Settings["Load Settings\n(API key, model, language,\ncustom context, analysis mode)"]
    Settings --> Preprocess["preprocessShots()\nanalysisService.ts"]
    Preprocess --> DB[("SQLite\n(Shot + context shots)")]
    Preprocess --> Pipeline["Phase detection\nCurve downsampling\nHistorical aggregation"]
    Pipeline --> BuildPrompt["buildDetailPrompt()\nor buildDetailPromptOptimized()"]
    BuildPrompt --> Provider{AI Provider}
    Provider -->|"Claude"| Anthropic["☁ Anthropic API\nclient.messages.create()"]
    Provider -->|"OpenAI"| OpenAI["☁ OpenAI API\nchat.completions.create()"]
    Anthropic --> Parse["extractJson()\nstrip markdown fences\nparse JSON"]
    OpenAI --> Parse
    Parse --> Pricing["getModelPricing()\nOpenRouter → fallback"]
    Pricing --> Store[("Upsert ShotAnalysis\n(barista, roaster,\ntokens, cost, mode)")]
    Store --> Browser

Data Preprocessing

1. Target Shot

The shot is loaded from SQLite with all fields:

Category	Fields
Identity	`id`, `startTime`
Parameters	`beanWeight`, `drinkWeight`, `duration`, `drinkTds`, `drinkEy`, `espressoEnjoyment`
Bean	`beanBrand`, `beanType`, `roastLevel`, `roastDate`, `beanNotes`
Equipment	`profileTitle`, `grinderModel`, `grinderSetting`
Tasting	`espressoNotes`, `acidity`, `sweetness`, `bitterness`, `mouthfeel`, `fragrance`, `aroma`
Sensor curves	`espresso_pressure`, `espresso_pressure_goal`, `espresso_flow`, `espresso_flow_goal`, `espresso_temperature_mix`, `espresso_temperature_basket`, `espresso_flow_weight`, `timeframe`

2. Context Window

Up to 100 recent shots preceding the target shot are loaded within a configurable time window (7 d, 30 d, 90 d, all). They are used only for historical baseline calculation — not included in the user prompt directly.

Tiered matching ensures the baseline is always comparable:

Tier	Condition	Label in prompt
1 — ideal	Same `profileTitle` and same `beanBrand` + `beanType`	`same profile & bean`
2 — fallback	Same `profileTitle` only	`same profile`
—	`profileTitle` is null, or fewer than 2 matches found	no historical context

Comparing across different profiles is meaningless — a Turbo shot (high-flow, low-pressure) and a Blooming Flow shot have entirely different baseline curves. Shots with no profile stored produce no historical section in the prompt at all.

3. Curve Downsampling

Raw sensor curves typically contain 300–600 data points at ~2 Hz for a 30–60 s shot. Before analysis they are downsampled to 50 points per channel using linear interpolation, preserving the overall shape while keeping the payload small.

Downsampled channels (all to 50 points, kept time-aligned via timeframe):

espresso_pressure / espresso_pressure_goal
espresso_flow / espresso_flow_goal
espresso_temperature_mix
espresso_flow_weight

Phase detection runs on the full-resolution data. Only the resulting statistics (avg, min, max, σ, trend) enter the prompt — no raw arrays.

4. Phase Detection

This is the central preprocessing step. It converts the raw sensor streams into named, annotated phases that the model can reason about correctly.

flowchart TD
    Raw["Raw shot data\n(timeframe, pressure/goal, flow/goal)"] --> CtrlType["Classify each sample:\nflowGoal > 0  →  flow-controlled\npressureGoal > 0  →  pressure-controlled"]
    CtrlType --> Boundaries["Detect phase boundaries\n① control type switches\n② goal value jumps by > 1.0 unit"]
    Boundaries --> Skip{"Phase < 3 samples?"}
    Skip -->|"Yes"| Drop["Skip (noise sliver)"]
    Skip -->|"No"| Naming["Name phase:\n'Preinfusion' — first, early, low-pressure\n'Extraction' — pressure-ctrl, avg > 4 bar\n'Flow Phase' — flow-ctrl, mid-shot"]
    Naming --> WholePhase["Whole-phase stats\n(avg / min / max / σ / trend)\nfor pressure and flow"]
    WholePhase --> StableCheck{"goalValue > 0\nAND ≥ 6 samples?"}
    StableCheck -->|"No"| Output["Phase output"]
    StableCheck -->|"Yes, pressure-ctrl"| FlowMin["Find flow minimum\n(bottom of headspace-fill ramp-down)\nStable period: trough → end"]
    StableCheck -->|"Yes, flow-ctrl"| FlowGoal["Find first sample where\nflow ≥ 90% of goal\nStable period: that point → end"]
    FlowMin --> StableStats["Stable sub-phase stats\n(separate from ramp — avoids inflated σ)"]
    FlowGoal --> StableStats
    StableStats --> Output

Why a stable sub-phase? For pressure-controlled phases, flow is low during the headspace fill and ramp-up before the puck is saturated. Including those samples in the standard deviation would produce artificially high σ values and false channeling alerts. The stable sub-phase starts at the flow minimum (end of ramp-down) and uses only the steady extraction window.

Phase output fields:

{
  name: string              // "Preinfusion" | "Extraction" | "Flow Phase" | …
  control: 'flow' | 'pressure'
  goalValue: number | null  // programmed target (ml/s or bar)
  startTime, endTime, durationS: number
  pressure: { avg, min, max, stdDev, tracking? }  // tracking = |actual − goal|
  flow:     { avg, min, max, stdDev, tracking? }
  tempAvg: number | null
  trend: 'stable' | 'rising' | 'falling' | 'peaked'
  stable?: {                // only when a stable sub-phase was found
    startTime, durationS
    pressure: { avg, min, max, stdDev }
    flow:     { avg, min, max, stdDev }
    tempAvg: number | null
    flowTrend: 'stable' | 'rising' | 'falling' | 'peaked'
  }
}

5. Scale Flow Analysis

espresso_flow_weight is the cup-scale output — delayed 5–15 s vs. machine flow, but the true extraction signal. Analysed separately:

First drop time — when the scale detects flow (> 0.05 ml/s)
Peak flow rate
Late-shot average (last third of non-zero values)
Residual instability — deviation from a 5-point moving average. This detects oscillations (channeling signal) while ignoring normal rising/falling trends. Marked UNSTABLE when residual σ > 0.12 ml/s and the trend is neither purely rising nor falling.

6. Historical Aggregation

From the context shots, extraction-phase averages are computed (using stable sub-phase stats where available, otherwise falling back to pressure > 4 bar samples). Included in the prompt only when ≥ 2 context shots exist:

Mean extraction pressure (bar)
Mean extraction flow (ml/s)
Mean basket temperature (°C, when available)

Prompt Construction

System Prompt

The system prompt establishes two analysis perspectives and the critical rules for interpreting sensor data.

Key rules encoded in both variants:

Control-type semantics — flow-controlled phase: pressure is the output (puck resistance), not a stability metric. Pressure-controlled phase: flow is the output, flow spikes indicate channeling.
Channeling threshold — only flag in pressure-controlled phases; σ threshold differs by mode (see table below).
Goal values are authoritative — goal=X is the actual programmed target, never to be replaced by training-data intuition.
No invented history — forbidden to invent historical comparisons unless a "History" line is present in the prompt.
JSON-only output — {"barista":[…],"roaster":[…]}, 3–5 entries each, concrete data references.

Standard system prompt (~850 tokens)

Detailed markdown prose. Each rule explained verbatim with examples. Includes the [HIGH — channeling?] label in the user prompt for flow σ > 0.15 ml/s.

Optimised system prompt (~350 tokens, −60%)

Equivalent rules in terse bullet form. No markdown headings, no examples, no "CRITICAL" callouts. Channeling requires both σ > 0.20 ml/s and a sudden spike (stricter, fewer false positives).

User Prompt — Standard Mode

Markdown-formatted prose. Typical size: 400–800 tokens.

## Shot Analysis

**Shot Date:** 2026-06-04
**Bean:** Gardelli · Ethiopia Wush Wush · Light
**Roast Date:** 2026-05-15 (20 days since roast)
**Parameters:** 18g → 36g (1:2.00) · 28.0s · TDS 9.2% · EY 22.1% · Score 82/100
**Profile:** Blooming Flow
**Grinder:** Timemore Sculptor 078S @ 12.0

### Profile Phases
**Preinfusion** (0.0–8.5s, 8.5s, flow-controlled, goal 2.0 ml/s)
  Stable (3.2–8.5s, 5.3s):
    Puck resistance (pressure output): avg=1.82 bar, min=0.4, max=3.1, trend=rising
    Flow (controlled): avg=1.94 ml/s, min=1.1, max=2.1, trend=stable
    Basket temp: 93.1°C

**Extraction** (8.5–30.2s, 21.7s, pressure-controlled, goal 9.0 bar)
  Ramp: 8.5–11.3s (pressure rising to goal)
  Stable extraction (11.3–30.2s, 18.9s):
    Pressure: avg=9.03 bar, min=8.7, max=9.3
    Flow: avg=1.82 ml/s, min=1.1, max=2.3, trend=falling
    Basket temp: 93.2°C

### Scale Flow (cup output)
- First drop at: 6.2s
- Peak: 2.41 ml/s, late avg: 1.78 ml/s, trend: falling

### Tasting
Notes: "slightly bitter finish" · Acidity 3 · Sweetness 4 · Bitterness 5 · Mouthfeel 3

### Historical Context (28 shots, same profile & bean, extraction-phase averages)
- Avg extraction pressure: 9.1 bar
- Avg extraction flow: 1.9 ml/s
- Avg basket temp: 93.2°C

### Machine & Setup Context
Machine: Decent Espresso DE1
[user-configured custom context from Settings]

Analyze this shot from all three perspectives: Barista, Röster, Analyst.

User Prompt — Optimised Mode

Key=value compact format. Typical size: 200–400 tokens (−50%).

Shot 2026-06-04
Bean: Gardelli · Ethiopia Wush Wush · Light (20d post-roast)
Params: 18g→36g (1:2.0) · 28s · TDS 9.2% · EY 22.1% · Score 82/100
Setup: Blooming Flow | Timemore Sculptor 078S @12.0

### Phases
Preinfusion 0.0–8.5s | flow-ctrl goal=2.0ml/s
  stable 3.2–8.5s: pres=1.8 flow=1.9 trend=stable temp=93.1°C
Extraction 8.5–30.2s | pres-ctrl goal=9.0bar
  ramp 8.5–11.3s
  stable 11.3–30.2s: pres=9.0 flow=1.8 trend=falling temp=93.2°C

Scale flow: first-drop=6.2s peak=2.41ml/s late=1.8ml/s trend=falling

Tasting: "slightly bitter finish" · acid=3 · sweet=4 · bitter=5 · body=3
History (28 shots, same profile & bean): pres=9.1bar · flow=1.9ml/s · temp=93.2°C

Context: Machine: Decent Espresso DE1
[user-configured custom context]

σ values are omitted from the optimised prompt unless they exceed the 0.20 ml/s threshold — reducing noise for "good" shots.

API Request

Claude — Standard Mode

{
  "model": "claude-haiku-4-5-20251001",
  "max_tokens": 2048,
  "system": "<standard system prompt ~850 tokens>",
  "messages": [
    { "role": "user", "content": "<standard user prompt ~400–800 tokens>" }
  ]
}

Claude — Optimised Mode (with Prompt Caching)

{
  "model": "claude-haiku-4-5-20251001",
  "max_tokens": 1024,
  "system": [
    {
      "type": "text",
      "text": "<optimised system prompt ~350 tokens>",
      "cache_control": { "type": "ephemeral" }
    }
  ],
  "messages": [
    { "role": "user", "content": "<optimised user prompt ~200–400 tokens>" }
  ]
}

The cache_control: ephemeral annotation tells the Anthropic API to cache the system prompt for up to 5 minutes. On a cache hit the system prompt tokens are billed at ~10% of the normal input-token rate.

OpenAI

{
  "model": "gpt-4o-mini",
  "max_tokens": 2048,
  "response_format": { "type": "json_object" },
  "messages": [
    { "role": "system", "content": "<system prompt>" },
    { "role": "user",   "content": "<user prompt>" }
  ]
}

response_format: json_object enforces JSON output natively, without relying solely on the prompt instruction. Prompt caching is not used for OpenAI.

Expected Response

The model must return a JSON object with exactly two keys:

{
  "barista": [
    "The grind appears slightly coarse: flow averaged 2.3 ml/s at the end of stable extraction with goal 9.0 bar. Try stepping down 1 click on the Timemore.",
    "The 8.5 s pre-wet in Blooming Flow achieved good puck saturation — ramp from 8.5–11.3 s is clean with no pressure overshoot."
  ],
  "roaster": [
    "At 20 days post-roast this light Ethiopian is past peak degassing. TDS 9.2% at 1:2 ratio suggests mild underextraction — consider +0.5°C or a finer grind.",
    "93°C basket temp suits a light roast well. The slight bitterness likely reflects the natural process character rather than over-extraction at this duration."
  ]
}

3–5 entries per array. Each entry must reference specific data values and phase names. The analyst key was removed from the UI and prompts; it is always returned as an empty array [] for database schema compatibility.

Extraction fallback: extractJson() strips any markdown code fences (```json … ```) before parsing, to tolerate models that add them despite the instruction.

Optimisations

Overview

	Standard	Optimised
System prompt	~850 tokens	~350 tokens (−60%)
User prompt	~400–800 tokens	~200–400 tokens (−50%)
`max_tokens`	2 048	1 024
Prompt caching (Claude)	No	Yes — system prompt cached
Flow σ shown when	> 0.08 ml/s	> 0.20 ml/s only
Channeling flag	σ > 0.15 → `[HIGH]`	σ > 0.20 AND sudden spike
Estimated cost (Haiku, cache miss)	~$0.0003	~$0.00015
Estimated cost (Haiku, cache hit)	—	~$0.00005

Compact Prompt Format

The optimised user prompt uses a key=value shorthand (pres=9.0 flow=1.8) instead of markdown prose (Pressure: avg=9.03 bar, min=8.7, max=9.3). The information density is identical; the token count is roughly half.

Sigma Suppression

In optimised mode, flow σ is only included in the prompt text when it exceeds 0.20 ml/s. For well-executed shots this line simply disappears, removing noise and making anomalies stand out more clearly to the model.

Stable Sub-Phase (both modes)

The ramp-up period of pressure-controlled phases is excluded from all statistics reported to the model. Without this, the natural flow rise from 0 to ~2 ml/s during the first 3–5 s of extraction would inflate σ and produce spurious channeling warnings.

Ramp / Stable Split

For pressure-controlled phases, the prompt explicitly separates the ramp period (ramp 8.5–11.3s) from the stable extraction, so the model understands exactly which window the statistics cover.

`max_tokens` Halved

The optimised response prompt plus the system prompt occupy fewer tokens, and the expected output (3–5 short bullet points) rarely needs more than 400 output tokens. Halving max_tokens from 2 048 to 1 024 reduces cost on long responses and eliminates the rare runaway generation.

Cost Tracking

Pricing is fetched live from the OpenRouter API and cached in memory.

flowchart LR
    Tokens["token counts\nfrom API response"] --> Lookup["getModelPricing(modelName)"]
    Lookup --> CacheCheck{"In-memory cache\nvalid? (24 h TTL)"}
    CacheCheck -->|"Yes"| HitReturn["Return cached price"]
    CacheCheck -->|"No"| Fetch["GET openrouter.ai/api/v1/models\n(5 s timeout)"]
    Fetch -->|"Success"| NewCache["Store fresh prices\nreset TTL"]
    Fetch -->|"Fail, cache exists"| StaleReset["Reset TTL on stale cache\n(prevents retry storm)"]
    Fetch -->|"Fail, no cache"| Fallback["Hardcoded fallback pricing\n(valid as of 2026-06-04)"]
    HitReturn --> Calc
    NewCache --> Calc
    StaleReset --> Calc
    Fallback --> Calc
    Calc["cost = tokens × price_per_token\n(input and output separately)"] --> Store["Stored in ShotAnalysis\nat analysis time — permanent"]

Fallback prices (USD per token):

Model	Input	Output
claude-haiku-4-5	$0.00000025	$0.00000125
claude-sonnet-4-6	$0.000003	$0.000015
claude-opus-4-8	$0.000005	$0.000025
gpt-4o-mini	$0.00000015	$0.0000006
gpt-4o	$0.0000025	$0.000010

Costs are stored at analysis time. Historical entries remain accurate even if prices change later, because the price is captured rather than looked up again when viewing old analyses.

Database Schema

ShotAnalysis {
  id               String    @id  @default(cuid())
  shotId           String    @unique          ← 1:1 with Shot
  analysisType     String                     "detail" | "stats"
  aiModel          String                     full model name used
  barista          String                     JSON-stringified string[]
  roaster          String                     JSON-stringified string[]
  analyst          String                     JSON-stringified string[] (always [])
  tokenInputCount  Int
  tokenOutputCount Int
  costInputUsd     Float?                     null when pricing unavailable
  costOutputUsd    Float?
  analysisMode     String?                    "standard" | "optimized"
  createdAt        DateTime  @default(now())
}

Analysis Cache

A ShotAnalysis row is stored per shot (1:1 relationship). On subsequent requests for the same shot, the cached result is returned immediately without calling the AI API.

The Regenerate button on the shot detail page sends ?regenerate=true, which bypasses the cache lookup and overwrites the stored analysis with a fresh result.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AI Shot Analysis — Technical Reference

Supported Models

Request Flow

Data Preprocessing

1. Target Shot

2. Context Window

3. Curve Downsampling

4. Phase Detection

5. Scale Flow Analysis

6. Historical Aggregation

Prompt Construction

System Prompt

Standard system prompt (~850 tokens)

Optimised system prompt (~350 tokens, −60%)

User Prompt — Standard Mode

User Prompt — Optimised Mode

API Request

Claude — Standard Mode

Claude — Optimised Mode (with Prompt Caching)

OpenAI

Expected Response

Optimisations

Overview

Compact Prompt Format

Sigma Suppression

Stable Sub-Phase (both modes)

Ramp / Stable Split

`max_tokens` Halved

Cost Tracking

Database Schema

Analysis Cache

FilesExpand file tree

ai-analysis.md

Latest commit

History

ai-analysis.md

File metadata and controls

AI Shot Analysis — Technical Reference

Supported Models

Request Flow

Data Preprocessing

1. Target Shot

2. Context Window

3. Curve Downsampling

4. Phase Detection

5. Scale Flow Analysis

6. Historical Aggregation

Prompt Construction

System Prompt

Standard system prompt (~850 tokens)

Optimised system prompt (~350 tokens, −60%)

User Prompt — Standard Mode

User Prompt — Optimised Mode

API Request

Claude — Standard Mode

Claude — Optimised Mode (with Prompt Caching)

OpenAI

Expected Response

Optimisations

Overview

Compact Prompt Format

Sigma Suppression

Stable Sub-Phase (both modes)

Ramp / Stable Split

max_tokens Halved

Cost Tracking

Database Schema

Analysis Cache

`max_tokens` Halved