Commit 35263cd

feat(trace): add normalized trajectory contract (#1331)
* feat(trace): add normalized trajectory contract
* chore(beads): mark normalized trajectory ready for review
* docs(trace): clarify derived trace summary invariant
1 parent 083e08c commit 35263cd

6 files changed

Lines changed: 1307 additions & 12 deletions

.beads/issues.jsonl

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

docs/plans/trace-evaluation-architecture.md

Lines changed: 4 additions & 4 deletions
@@ -110,10 +110,10 @@ flowchart TB
 Q --> R
 ```
 
-The normalized trajectory should have two layers:
+The normalized trace model should keep one canonical source of truth plus derived read models:
 
-- A compact summary for cheap storage and dashboard aggregation: counts, durations, token usage, cost, error count, and tool-call counts.
-- A full trajectory for grading and explanation: ordered model turns, tool calls/results, branch metadata, source event IDs, content redaction state, and raw evidence handles.
+- The full trajectory is the canonical artifact for grading, replay, and explanation: ordered model turns, tool calls/results, branch metadata, source event IDs, content redaction state, and raw evidence handles.
+- The compact summary is a derived compatibility/read model for cheap result storage and dashboard aggregation: counts, durations, token usage, cost, error count, and tool-call counts. It must be recomputable from a full trajectory and should not be authored as separate trace state when the trajectory is available.
 
 Directional wire shape:
 
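The invariant added in the hunk above (the compact summary must be recomputable from a full trajectory) could be sketched roughly as follows. This is only an illustration under assumptions: `Trajectory`, `TrajectoryEvent`, `DerivedSummary`, and `deriveSummary` are hypothetical names, not the project's actual `TraceSummary` schema, and summing per-event durations assumes events run sequentially.

```typescript
// Hypothetical shapes for illustration; the real schema belongs to the implementation.
interface TokenUsage {
  input: number;
  output: number;
}

type TrajectoryEvent =
  | { kind: "model_turn"; durationMs: number; tokenUsage: TokenUsage; costUsd: number }
  | { kind: "tool_call"; toolName: string; durationMs: number; error?: string };

// The canonical artifact: an ordered list of events.
interface Trajectory {
  version: number;
  events: TrajectoryEvent[];
}

// The derived read model: nothing here is authored independently.
interface DerivedSummary {
  eventCount: number;
  durationMs: number;
  tokenUsage: TokenUsage;
  costUsd: number;
  errorCount: number;
  toolCallCounts: Record<string, number>;
}

// Pure function: the same trajectory always yields the same summary.
function deriveSummary(trajectory: Trajectory): DerivedSummary {
  const summary: DerivedSummary = {
    eventCount: trajectory.events.length,
    durationMs: 0,
    tokenUsage: { input: 0, output: 0 },
    costUsd: 0,
    errorCount: 0,
    toolCallCounts: {},
  };
  for (const event of trajectory.events) {
    // Assumption: events are sequential, so durations add up.
    summary.durationMs += event.durationMs;
    if (event.kind === "model_turn") {
      summary.tokenUsage.input += event.tokenUsage.input;
      summary.tokenUsage.output += event.tokenUsage.output;
      summary.costUsd += event.costUsd;
    } else {
      const previous = summary.toolCallCounts[event.toolName] ?? 0;
      summary.toolCallCounts[event.toolName] = previous + 1;
      if (event.error !== undefined) {
        summary.errorCount += 1;
      }
    }
  }
  return summary;
}
```

Because `deriveSummary` reads only the trajectory, a stored summary can be treated as a cache: it can be dropped and rebuilt at any time without losing trace state.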
@@ -178,7 +178,7 @@ The exact schema belongs in implementation, but these concepts should be stable:
 - **Files:** `packages/core/src/evaluation/trace.ts`, `packages/core/src/evaluation/types.ts`, `packages/eval/src/schemas.ts`, new focused files under `packages/core/src/evaluation/trace/` if the existing file becomes too large.
 - **Patterns:** Follow the existing `TraceSummary`, `TokenUsage`, and project wire conversion conventions. Keep internal fields camelCase and wire fields snake_case.
 - **Test Scenarios:** Add tests that validate round-trip conversion, version rejection, missing optional content, inferred duration flags, branch metadata, and raw evidence handles.
-- **Verification:** Unit tests should prove summaries can be derived from full trajectories without changing current summary behavior.
+- **Verification:** Unit tests should prove summaries can be derived from full trajectories without changing current summary behavior, and that normalized trajectory artifacts do not embed a separate summary payload.
 
 ### U2. Trajectory Extraction From AgentV Runs
 
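For the U1 bullets in the hunk above (camelCase internal fields, snake_case wire fields, round-trip conversion, version rejection, and no embedded summary payload), a minimal sketch might look like the following. All names here (`NormalizedTrajectory`, `WireTrajectory`, `toWire`, `fromWire`, `SUPPORTED_SCHEMA_VERSION`) and the field set are assumptions for illustration. Rejecting a `summary` key at parse time is one possible way to enforce the invariant, not necessarily what the commit does.

```typescript
// Hypothetical internal shape (camelCase). Field names are illustrative only.
interface TrajectoryStep {
  sourceEventId: string;
  toolName?: string; // optional content may be absent
  durationMs: number;
  durationInferred: boolean;
}
interface NormalizedTrajectory {
  schemaVersion: number;
  steps: TrajectoryStep[];
}

// Hypothetical wire shape (snake_case). Note that it has no `summary` field.
interface WireStep {
  source_event_id: string;
  tool_name?: string;
  duration_ms: number;
  duration_inferred: boolean;
}
interface WireTrajectory {
  schema_version: number;
  steps: WireStep[];
}

const SUPPORTED_SCHEMA_VERSION = 1; // assumed version number

function toWire(trajectory: NormalizedTrajectory): WireTrajectory {
  return {
    schema_version: trajectory.schemaVersion,
    steps: trajectory.steps.map((step) => ({
      source_event_id: step.sourceEventId,
      // Omit the key entirely when the optional value is missing.
      ...(step.toolName !== undefined ? { tool_name: step.toolName } : {}),
      duration_ms: step.durationMs,
      duration_inferred: step.durationInferred,
    })),
  };
}

function fromWire(raw: unknown): NormalizedTrajectory {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("trajectory payload must be an object");
  }
  const wire = raw as Record<string, unknown>;
  if (wire["schema_version"] !== SUPPORTED_SCHEMA_VERSION) {
    throw new Error(`unsupported schema_version: ${String(wire["schema_version"])}`);
  }
  if ("summary" in wire) {
    // Assumed enforcement of "no separate summary payload".
    throw new Error("trajectory artifacts must not embed a summary payload");
  }
  const steps = wire["steps"];
  if (!Array.isArray(steps)) {
    throw new Error("steps must be an array");
  }
  // Per-step validation is elided to keep the sketch short.
  return {
    schemaVersion: SUPPORTED_SCHEMA_VERSION,
    steps: steps.map((step: WireStep) => ({
      sourceEventId: step.source_event_id,
      ...(step.tool_name !== undefined ? { toolName: step.tool_name } : {}),
      durationMs: step.duration_ms,
      durationInferred: step.duration_inferred,
    })),
  };
}
```

A round-trip test would then assert that `fromWire(toWire(t))` equals `t`, while payloads with an unknown `schema_version` or an embedded `summary` are rejected.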
