feat: Add GET /api/v3/trace-summaries endpoint for lightweight search results #8604
Proposes a new FindTraceSummaries endpoint in the v3 API returning per-trace statistics instead of full span data, with propagation through QueryService, the Go tracestore.SummaryReader optional interface, and the Remote Storage gRPC API. Includes fallback behavior for storage backends that do not implement native summary computation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Yuri Shkuro <github@ysh.us>
The search results row renders a coloured tag per service showing span count and an error indicator (ResultItem.tsx / transformTraceData()). Replace the flat service_names list with a ServiceSummary sub-message carrying name, span_count, and has_errors across all three layers: api_v3 proto, storage/v2 proto, and the Go/TypeScript type definitions. Also expand the Context section with the exact UI fields being replaced. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Yuri Shkuro <github@ysh.us>
Replace the old proto-first milestones with a sequence that ships working software at each step:
1. Backend HTTP endpoint + fallback aggregation (jaeger/ only)
2. UI migration to the new endpoint (jaeger-ui/ only); validates the data model with real rendering before any IDL work
3. Formalise in jaeger-idl as a gRPC RPC
4. Remote storage adapter with UNIMPLEMENTED fallback
5. Native backend implementation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Yuri Shkuro <github@ysh.us>
…ceSummary The UI renders an error icon when any span from the service has StatusCode.ERROR; there is no cross-service error propagation in the search results view (that only exists in the trace timeline). Using an int is consistent with the trace-level error_count field and keeps all information the backend computed anyway during aggregation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Yuri Shkuro <github@ysh.us>
Both TraceSummary and ServiceSummary now use error_span_count consistently across all layers (proto, Go, TypeScript). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Yuri Shkuro <github@ysh.us>
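As a rough illustration of the `error_span_count` decision, the sketch below uses hypothetical Go types (the field set and the `HasErrors` helper are assumptions, not the PR's actual definitions). It shows that an integer count carries strictly more information than a boolean: the UI's error indicator can still be derived from it.

```go
package main

import "fmt"

// ServiceSummary is a hypothetical shape for per-service statistics.
type ServiceSummary struct {
	Name           string
	SpanCount      int
	ErrorSpanCount int
}

// TraceSummary is a hypothetical shape for per-trace statistics.
type TraceSummary struct {
	SpanCount      int
	ErrorSpanCount int
	Services       []ServiceSummary
}

// HasErrors derives the boolean error indicator from the count.
// This helper is an assumption added for illustration.
func (s ServiceSummary) HasErrors() bool { return s.ErrorSpanCount > 0 }

// HasErrors reports whether any span in the trace has an error status.
func (t TraceSummary) HasErrors() bool { return t.ErrorSpanCount > 0 }

func main() {
	t := TraceSummary{
		SpanCount:      3,
		ErrorSpanCount: 1,
		Services: []ServiceSummary{
			{Name: "backend", SpanCount: 1, ErrorSpanCount: 0},
			{Name: "frontend", SpanCount: 2, ErrorSpanCount: 1},
		},
	}
	for _, s := range t.Services {
		fmt.Println(s.Name, s.HasErrors())
	}
}
```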
Each RPC in the existing schema has its own request struct. Introduce FindTraceSummariesRequest (wrapping TraceQueryParameters) instead of reusing FindTracesRequest. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Yuri Shkuro <github@ysh.us>
storage/v2 FindTraceIDs currently reuses FindTracesRequest; fix that inconsistency in the same IDL phase, introducing a dedicated FindTraceIDsRequest with the same field layout (wire-compatible, source-breaking, coordinated update in jaeger/). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Yuri Shkuro <github@ysh.us>
Adds GET /api/v3/trace-summaries endpoint that returns lightweight per-trace statistics instead of full span data, suitable for the search results page.
Changes:
- tracestore.TraceSummary / ServiceSummary types and optional SummaryReader interface (internal/storage/v2/api/tracestore/)
- querysvc.FindTraceSummaries with fallback aggregation via computeSummaries (no IDL changes required for this milestone)
- HTTP handler at GET /api/v3/trace-summaries, reusing existing query-parameter parsing from FindTraces
- Unit tests for computeSummaries (empty, error, multi-service), FindTraceSummaries (fallback path, native SummaryReader path), and the HTTP handler
Relates to jaegertracing#8602 (ADR-010)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Yuri Shkuro <github@ysh.us>
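The fallback aggregation can be pictured with a minimal sketch. It is not the PR's `computeSummaries`: the `span` type and the `summarizeServices` name are hypothetical, and the real code operates on OTLP trace data. The sketch only shows the counting idea: one pass over the spans, accumulating per-service span and error counts, then sorting by service name so the output is deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// span is a hypothetical, flattened stand-in for an OTLP span.
type span struct {
	Service string
	IsError bool
}

// ServiceSummary mirrors the per-service fields of the JSON response.
type ServiceSummary struct {
	Name           string
	SpanCount      int
	ErrorSpanCount int
}

// summarizeServices counts spans and error spans per service.
func summarizeServices(spans []span) []ServiceSummary {
	byName := make(map[string]*ServiceSummary)
	for _, s := range spans {
		st, ok := byName[s.Service]
		if !ok {
			st = &ServiceSummary{Name: s.Service}
			byName[s.Service] = st
		}
		st.SpanCount++
		if s.IsError {
			st.ErrorSpanCount++
		}
	}
	out := make([]ServiceSummary, 0, len(byName))
	for _, st := range byName {
		out = append(out, *st)
	}
	// Map iteration order is random, so sort for stable output.
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	spans := []span{
		{Service: "frontend"},
		{Service: "frontend", IsError: true},
		{Service: "backend"},
	}
	fmt.Printf("%+v\n", summarizeServices(spans))
}
```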
Pull request overview
Adds the first working backend-only implementation of ADR-010’s Trace Summary API by introducing storage/query-layer summary types, computing summaries via a fallback aggregation path, and exposing a new HTTP endpoint at GET /api/v3/trace-summaries (with unit tests and ADR documentation updates).
Changes:
- Introduce `tracestore.TraceSummary` / `ServiceSummary` types and optional `tracestore.SummaryReader` interface.
- Add `QueryService.FindTraceSummaries` plus fallback summary aggregation logic (`computeSummaries` / `summarizeTrace`).
- Add `/api/v3/trace-summaries` HTTP handler returning plain JSON, plus unit tests and ADR-010 documentation.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| internal/storage/v2/api/tracestore/summary.go | Defines Go summary model + optional SummaryReader storage capability. |
| cmd/jaeger/internal/extension/jaegerquery/querysvc/summary.go | Implements FindTraceSummaries and fallback summarization logic. |
| cmd/jaeger/internal/extension/jaegerquery/querysvc/summary_test.go | Unit tests for summary computation and dispatch (fallback vs native). |
| cmd/jaeger/internal/extension/jaegerquery/internal/apiv3/http_gateway.go | Registers and implements GET /api/v3/trace-summaries JSON handler. |
| cmd/jaeger/internal/extension/jaegerquery/internal/apiv3/http_gateway_test.go | Adds handler tests for success and error cases. |
| docs/adr/README.md | Adds ADR-010 to ADR index. |
| docs/adr/010-trace-summary-api.md | Adds ADR-010 describing the API design and milestones. |
Codecov Report
❌ Patch coverage is …
Additional details and impacted files

| Coverage Diff | main | #8604 | +/- |
|---|---|---|---|
| Coverage | 96.56% | 96.60% | +0.03% |
| Files | 332 | 333 | +1 |
| Lines | 17575 | 17678 | +103 |
| Hits | 16972 | 17077 | +105 |
| Misses | 454 | 452 | -2 |
| Partials | 149 | 149 | |
…eaming Aligns SummaryReader.FindTraceSummaries with the iterator contract used by FindTraces and FindTraceIDs, allowing native storage backends to stream summaries incrementally rather than buffering all results. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Yuri Shkuro <github@ysh.us>
…ntation
- Fix multi-chunk bug: use jptrace.AggregateTraces so a trace split across consecutive ptrace.Traces chunks yields a single summary
- Add OrphanSpanCount to TraceSummary and compute it via two-pass span iteration (first pass collects all span IDs, second detects missing parents)
- Sort Services slice by name for deterministic output
- Move JSON types and findTraceSummaries handler from http_gateway.go to summaries.go; add comment noting they are pre-IDL placeholders
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Yuri Shkuro <github@ysh.us>
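The two-pass orphan detection described in this commit can be sketched as follows. The `span` type and `countOrphans` name are hypothetical. The sketch also assumes that a span with an empty parent ID is a root and is not counted as an orphan; the actual implementation may define this differently.

```go
package main

import "fmt"

// span is a hypothetical stand-in carrying only the IDs needed here.
type span struct {
	ID       string
	ParentID string // empty for a root span (assumption)
}

// countOrphans returns the number of spans whose parent ID is set
// but does not match any span in the same trace.
func countOrphans(spans []span) int {
	// First pass: collect every span ID present in the trace.
	ids := make(map[string]struct{}, len(spans))
	for _, s := range spans {
		ids[s.ID] = struct{}{}
	}
	// Second pass: a non-root span whose parent is missing is an orphan.
	orphans := 0
	for _, s := range spans {
		if s.ParentID == "" {
			continue
		}
		if _, ok := ids[s.ParentID]; !ok {
			orphans++
		}
	}
	return orphans
}

func main() {
	spans := []span{
		{ID: "a"},                // root
		{ID: "b", ParentID: "a"}, // parent present
		{ID: "c", ParentID: "x"}, // parent missing
	}
	fmt.Println(countOrphans(spans))
}
```

Two passes are needed because spans may arrive in any order: a child can appear before its parent, so parent lookups are only reliable once all IDs have been collected.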
- Move FindTraceSummaries method into service.go alongside other QueryService methods
- Move findTraceSummaries HTTP handler back to http_gateway.go; summaries.go now holds only the pre-IDL JSON types
- Apply clock-skew adjuster before summarizing in the fallback path, consistent with FindTraces behavior
- Guard zero StartTime in JSON response (emit 0 instead of large negative value)
- Rename local var stats → svcStats per reviewer suggestion
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Yuri Shkuro <github@ysh.us>
…cated alias The canonical proto field is search_depth; accepting num_traces diverged from what the future gRPC-gateway generated binding will produce. Accept query.search_depth as the primary parameter and keep query.num_traces as a backwards-compatible deprecated alias. Fixes jaegertracing#8617 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Yuri Shkuro <github@ysh.us>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Yuri Shkuro <github@ysh.us>
Signed-off-by: Yuri Shkuro <yurishkuro@users.noreply.github.com>
…gertracing#3941)

## Summary
- Introduces `TraceSummary` and `ServiceSummary` types in `src/types/trace-summary.ts` as a lightweight alternative to `IOtelTrace` for the search results view
- Replaces `IOtelTrace` with `TraceSummary` in `ResultItem`, `SearchResults`, and `SearchTracePage`; error counts and per-service error flags are now read from pre-computed fields instead of iterating spans on every render
- Adds `traceToTraceSummary(trace: IOtelTrace): TraceSummary` adapter in `src/model/trace-summary.ts` so the existing `sortedTracesXformer` path continues to work unchanged
- `ResultItemTitle.duration` prop type simplified from `IOtelTrace['duration']` to `Microseconds` directly, removing the dependency on `IOtelTrace`
- Prepares for ADR-010 Milestone 2: once the UI calls `GET /api/v3/trace-summaries`, the reducer can populate `TraceSummary` directly from the API response without the `transformTraceData` aggregation step

## Test plan
- [ ] New unit tests in `src/model/trace-summary.test.ts` cover: empty trace, root service/operation extraction, error span counting (total and per-service), zero-error case, orphan count propagation, and a round-trip through `traceGenerator` + `transformTraceData`
- [ ] Existing `ResultItem.test.jsx` and `SearchResults/index.test.jsx` updated to use `TraceSummary`-shaped fixtures
- [ ] `npm test`, `npm run lint`, `npm run build` all pass

## Related
- Part of jaegertracing/jaeger#8606
- ADR-010: jaegertracing/jaeger#8602
- Backend PR: jaegertracing/jaeger#8604

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Signed-off-by: Yuri Shkuro <github@ysh.us>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…Depth in trace-summaries endpoint (#8633)

## Summary
Three fixes to the `GET /api/v3/trace-summaries` endpoint introduced in #8604:
- **`traceId` field name casing**: the response was returning `traceID` (uppercase D) but the OpenAPI spec and the generated Zod client expect `traceId` (lowercase d, matching proto3 JSON naming). The UI worked around this defensively; this fixes the backend.
- **`SearchDepth` default**: `parseFindTracesQuery` was not applying a default when `query.search_depth` is absent, causing the memory backend to return a 500 on every unauthenticated request. Now defaults to 100, matching the v1 HTTP handler.
- **Snapshot test**: adds a golden-file test for the full `FindTraceSummaries` JSON response so field name and encoding regressions are caught automatically.

## Changes
- `summaries.go`: change JSON tag from `"traceID"` to `"traceId"`
- `query_parser.go`: add `defaultSearchDepth = 100`; apply it when `query.search_depth` is absent
- `gateway_test.go` + `snapshots/FindTraceSummaries.json`: snapshot test for the HTTP response
- `docs/adr/010-trace-summary-api.md`: reflect M1/M2 completion, note that jaeger-idl already has all M3/M4 proto work merged, add PR sequence for remaining work

## Test plan
- [x] `go test ./cmd/jaeger/internal/extension/jaegerquery/...` passes
- [x] `make lint` passes

## Related
- #8604: main implementation this PR fixes
- #8617: tracking issue for `search_depth` naming / default
- #8618: rename `num_traces` → `search_depth` (cherry-picked)
- jaegertracing/jaeger-ui#3941

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Signed-off-by: Yuri Shkuro <github@ysh.us>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
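Two of these fixes are small enough to sketch together. The struct and function names below are hypothetical and the real types carry more fields; only the `"traceId"` tag and the `defaultSearchDepth = 100` constant come from the change list above. Treating a non-positive requested depth as "absent" is an assumption of this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// defaultSearchDepth is applied when the caller supplies no depth.
const defaultSearchDepth = 100

// effectiveSearchDepth treats a non-positive value as "absent" (assumption).
func effectiveSearchDepth(requested int) int {
	if requested <= 0 {
		return defaultSearchDepth
	}
	return requested
}

// summaryJSON is a hypothetical one-field response type; the JSON tag
// uses lowercase "d" to match proto3 JSON naming.
type summaryJSON struct {
	TraceID string `json:"traceId"`
}

// encode marshals the summary and returns it as a string.
func encode(s summaryJSON) string {
	b, err := json.Marshal(s)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(encode(summaryJSON{TraceID: "00000000000000000000000000000001"}))
	fmt.Println(effectiveSearchDepth(0), effectiveSearchDepth(25))
}
```

A golden-file test of the kind described above would compare the full marshalled response against a stored snapshot, so a tag change such as `traceId` back to `traceID` fails the test immediately.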
## Summary
- Bump `idl/` submodule to `jaeger-idl` main (`0daa719`), which adds the `FindTraceSummaries` RPC to `api_v3.QueryService` and introduces the dedicated `FindTraceIDsRequest` type in `storage/v2`. Regenerate Go bindings via `make proto`.
- Implement `Handler.FindTraceSummaries` in `grpc_handler.go`: delegates to `querysvc.QueryService.FindTraceSummaries` (the native `SummaryReader` path or fallback aggregation, both already in place) and converts each `tracestore.TraceSummary` batch to an `api_v3.FindTraceSummariesResponse` chunk for streaming.

## Test plan
- [x] `TestFindTraceSummaries`: success path, mock reader returns a trace, verify the summary's `traceId` field
- [x] `TestFindTraceSummariesQueryNil`: missing query and missing time range return `InvalidArgument`
- [x] `TestFindTraceSummariesStorageError`: storage error propagates to the client
- [x] `make lint` and `make test` pass

## Related
- ADR-010 Milestone 3 (PR B in the sequence table)
- #8604: original `FindTraceSummaries` HTTP endpoint
- jaeger-idl PR #203, #204

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Signed-off-by: Yuri Shkuro <github@ysh.us>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…-trace aggregation (#8638)

## What this PR does
Implements the query-service layer for the Trace Summary API (tracking issue #8606, part C of the PR sequence in [ADR-010](https://github.com/jaegertracing/jaeger/blob/main/docs/adr/010-trace-summary-api.md)).

**Core behaviour:**
- `QueryService.FindTraceSummaries` delegates to `tracestore.SummaryReader` when the underlying storage implements it (native path: efficient, storage-computed summaries).
- If no `SummaryReader` is present, or if it returns `errors.ErrUnsupported` as a direct error (e.g. a remote backend that has not yet implemented the RPC), it transparently falls back to `FindTraces` + client-side aggregation via `jptrace.AggregateTraces`.
- `findSummaryReader` walks the reader chain via `Unwrap()` so decorators wrapping the underlying reader remain transparent.

**Interface design:**
- `tracestore.SummaryReader.FindTraceSummaries` returns `(iter.Seq2[[]TraceSummary, error], error)`; the direct error lets callers detect "not supported" immediately before starting iteration.

**End-to-end integration test:**
- `traceReader` in `cmd/jaeger/internal/integration` implements `tracestore.SummaryReader` by calling the `api_v3.QueryService.FindTraceSummaries` gRPC RPC.
- `StorageIntegration.RunSpanStoreTests` always runs `FindTraceSummaries`; backends opt out by adding `"FindTraceSummaries"` to their `Capabilities.SkipList`.

## Testing
- Unit tests for `computeSummaries`, fallback path, native path, `Unwrap` chain, and `ErrUnsupported` fallback: `querysvc/summary_test.go`
- End-to-end coverage via `TestJaegerQueryService`, `TestGRPCStorage`, `TestMemoryStorage`

## Related
- Tracking issue: #8606
- ADR-010: [docs/adr/010-trace-summary-api.md](https://github.com/jaegertracing/jaeger/blob/main/docs/adr/010-trace-summary-api.md)
- Prerequisite PRs: #8604 (proto), #8633 (gRPC handler), #8634 (HTTP handler)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Signed-off-by: Yuri Shkuro <github@ysh.us>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Implements Milestone 1 from ADR-010 (#8602): a working `GET /api/v3/trace-summaries` HTTP endpoint backed by fallback aggregation, entirely within `jaeger/` with no `jaeger-idl` changes.
- `tracestore.TraceSummary` / `ServiceSummary` types and optional `tracestore.SummaryReader` interface (`internal/storage/v2/api/tracestore/summary.go`); existing storage implementations require no changes
- `querysvc.FindTraceSummaries` with fallback: type-asserts to `SummaryReader`; if absent, calls `FindTraces` and computes summaries via `computeSummaries` (`querysvc/summary.go`)
- `GET /api/v3/trace-summaries` HTTP handler (`apiv3/http_gateway.go`), reusing the existing query-parameter parser from `FindTraces`; response is plain JSON (proto formalisation is Milestone 3)
- Unit tests for `computeSummaries` (empty, error, multi-service with per-service error counts), `FindTraceSummaries` (fallback path + native `SummaryReader` path), HTTP handler (success + storage error)
Response shape
Timestamps are Unix nanoseconds encoded as decimal strings, consistent with OTLP proto3 JSON encoding (`startTimeUnixNano` in `GET /api/v3/traces`). String encoding avoids float64 precision loss in JavaScript for values above 2⁵³ (~9×10¹⁵); nanosecond timestamps (~1.7×10¹⁸) are well above that threshold. Duration is intentionally omitted; callers derive it as `BigInt(maxEndTimeUnixNano) - BigInt(minStartTimeUnixNano)`.

    {
      "summaries": [
        {
          "traceID": "00000000000000000000000000000001",
          "rootServiceName": "frontend",
          "rootOperationName": "HTTP GET /",
          "minStartTimeUnixNano": "1000000000000000000",
          "maxEndTimeUnixNano": "1001000000000000000",
          "spanCount": 3,
          "errorSpanCount": 1,
          "orphanSpanCount": 0,
          "services": [
            {"name": "backend", "spanCount": 1, "errorSpanCount": 0},
            {"name": "frontend", "spanCount": 2, "errorSpanCount": 1}
          ]
        }
      ]
    }

Example
Test plan
- `make fmt lint`: clean
- `go test ./cmd/jaeger/internal/extension/jaegerquery/...`: all pass
- `curl` against a running Jaeger all-in-one: returns correct summaries
Related
🤖 Generated with Claude Code