Trace latency: add observability and benchmark gates #282

Parent: #281

# Goal

Add the observability and benchmark gates needed to make the rest of the DataFusion latency work measurable.

This should be the first implementation step. Without request-level timings, object-store request counts, cache-hit metrics, and repeatable scenarios, a later change can look successful because a single local query got faster, even though p95/p99 latency or cold-path behavior got worse.

# Scope

- Add trace/query spans that separate planning, Delta snapshot refresh, pruning, footer reads, object-store range reads, DataFusion execution, and result materialization.
- Add object-store counters for `get`, `get_range`, `head`, and `list`, split by table or logical workload where practical.
- Add hit/miss metrics for the in-memory (RAM) object-store range cache and for the result cache.
- Add a sentinel metric that increments if Delta refresh runs on the HTTP request path. This should normally stay at zero.
- Add a benchmark harness that records p50/p95/p99, object-store request counts, bytes read, cache hit rates, table/file counts, and storage freshness state.
- Include benchmark scenarios for cold trace lookup, repeated trace lookup, service-filtered dashboard reads, service-agnostic dashboard reads, small-file partitions, compacted partitions, and warmup behavior.
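
As a rough illustration of the first scope item, per-stage timing could be collected with a small recorder like the one below. This is a minimal std-only sketch, not the project's implementation: `Stage` and `StageTimings` are hypothetical names, the stages are taken from the list above, and the closure-based API is synchronous. A real service would more likely attach these stages to its existing tracing spans.

```rust
use std::collections::BTreeMap;
use std::time::{Duration, Instant};

/// Stages named in the scope list. The enum itself is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Stage {
    Planning,
    SnapshotRefresh,
    Pruning,
    FooterRead,
    RangeRead,
    Execution,
    Materialization,
}

/// Accumulates wall-clock time per stage for a single request.
#[derive(Debug, Default)]
pub struct StageTimings {
    totals: BTreeMap<Stage, Duration>,
}

impl StageTimings {
    /// Runs `work`, adds its elapsed time to `stage`, and returns its result.
    pub fn time<T>(&mut self, stage: Stage, work: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = work();
        *self.totals.entry(stage).or_default() += start.elapsed();
        out
    }

    /// Total time recorded for `stage`, or zero if the stage never ran.
    pub fn total(&self, stage: Stage) -> Duration {
        self.totals.get(&stage).copied().unwrap_or_default()
    }

    /// Stages with at least one measurement, in enum declaration order.
    pub fn recorded(&self) -> Vec<Stage> {
        self.totals.keys().copied().collect()
    }
}
```

A caller would wrap each phase, for example `timings.time(Stage::Pruning, || prune(files))`, where `prune` stands in for whatever the query path actually calls. Keeping the stage set as a closed enum means benchmark output cannot silently gain or lose a stage between branches.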

# High-level design

Instrument first, tune second. The benchmark harness should be able to run against local object storage and cloud object storage, but the data it emits should have the same shape in both cases. Prefer JSON artifacts so results can be compared across branches.
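
One possible shape for such a JSON artifact is sketched below. The field names, the `BenchmarkRecord` type, and the nearest-rank percentile definition are all assumptions made for illustration; the issue does not prescribe a schema. The JSON is written by hand to keep the sketch dependency-free, so it assumes scenario and backend names contain no characters that need escaping.

```rust
/// Nearest-rank percentile over an ascending-sorted slice.
/// Returns `None` for an empty slice. `pct` is a whole percent (e.g. 95).
pub fn percentile(sorted_ms: &[f64], pct: usize) -> Option<f64> {
    if sorted_ms.is_empty() {
        return None;
    }
    let n = sorted_ms.len();
    // Integer ceil(pct * n / 100), clamped to a valid 1-based rank.
    let rank = ((pct * n + 99) / 100).clamp(1, n);
    Some(sorted_ms[rank - 1])
}

/// One benchmark run. All field names are hypothetical.
pub struct BenchmarkRecord {
    pub scenario: String,
    pub backend: String, // e.g. "local" or "cloud"
    pub latencies_ms: Vec<f64>,
    pub object_store_requests: u64,
    pub bytes_read: u64,
    pub cache_hits: u64,
    pub cache_misses: u64,
    pub refresh_on_request_path: u64,
}

impl BenchmarkRecord {
    /// Serializes the record as a single JSON object.
    /// Assumes `scenario` and `backend` need no JSON escaping.
    pub fn to_json(&self) -> String {
        let mut sorted = self.latencies_ms.clone();
        sorted.sort_by(|a, b| a.total_cmp(b));
        let p = |pct| percentile(&sorted, pct).unwrap_or(0.0);
        let lookups = self.cache_hits + self.cache_misses;
        let hit_rate = if lookups == 0 {
            0.0
        } else {
            self.cache_hits as f64 / lookups as f64
        };
        format!(
            concat!(
                "{{\"scenario\":\"{}\",\"backend\":\"{}\",",
                "\"p50_ms\":{:.3},\"p95_ms\":{:.3},\"p99_ms\":{:.3},",
                "\"object_store_requests\":{},\"bytes_read\":{},",
                "\"cache_hit_rate\":{:.3},\"refresh_on_request_path\":{}}}"
            ),
            self.scenario,
            self.backend,
            p(50),
            p(95),
            p(99),
            self.object_store_requests,
            self.bytes_read,
            hit_rate,
            self.refresh_on_request_path,
        )
    }
}
```

Because the local and cloud runs would emit the same keys, a comparison tool only needs to diff two objects field by field; the `backend` value is the only thing that distinguishes them.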

Use a counting `ObjectStore` wrapper where possible instead of scattering counters through query code. That keeps object-store behavior visible across trace spans, summaries, GenAI tables, Bifrost tables, and future DataFusion-backed datasets.
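
The wrapper idea can be sketched as follows. To stay self-contained, this uses a simplified, synchronous `Store` trait that is purely hypothetical; the real `ObjectStore` trait is async and has a larger surface, so the actual wrapper would delegate those methods instead. `Counters`, `CountingStore`, and `MemStore` are likewise illustrative names.

```rust
use std::collections::BTreeMap;
use std::ops::Range;
use std::sync::atomic::{AtomicU64, Ordering};

/// Simplified, synchronous stand-in for an object-store interface (hypothetical).
pub trait Store {
    fn get(&self, path: &str) -> Vec<u8>;
    fn get_range(&self, path: &str, range: Range<usize>) -> Vec<u8>;
    fn head(&self, path: &str) -> usize;
    fn list(&self, prefix: &str) -> Vec<String>;
}

/// Per-operation counters plus total bytes returned by reads.
#[derive(Debug, Default)]
pub struct Counters {
    pub get: AtomicU64,
    pub get_range: AtomicU64,
    pub head: AtomicU64,
    pub list: AtomicU64,
    pub bytes_read: AtomicU64,
}

/// Wraps any `Store` and counts every call before delegating to it.
pub struct CountingStore<S> {
    inner: S,
    pub counters: Counters,
}

impl<S: Store> CountingStore<S> {
    pub fn new(inner: S) -> Self {
        Self { inner, counters: Counters::default() }
    }
}

impl<S: Store> Store for CountingStore<S> {
    fn get(&self, path: &str) -> Vec<u8> {
        self.counters.get.fetch_add(1, Ordering::Relaxed);
        let bytes = self.inner.get(path);
        self.counters.bytes_read.fetch_add(bytes.len() as u64, Ordering::Relaxed);
        bytes
    }
    fn get_range(&self, path: &str, range: Range<usize>) -> Vec<u8> {
        self.counters.get_range.fetch_add(1, Ordering::Relaxed);
        let bytes = self.inner.get_range(path, range);
        self.counters.bytes_read.fetch_add(bytes.len() as u64, Ordering::Relaxed);
        bytes
    }
    fn head(&self, path: &str) -> usize {
        self.counters.head.fetch_add(1, Ordering::Relaxed);
        self.inner.head(path)
    }
    fn list(&self, prefix: &str) -> Vec<String> {
        self.counters.list.fetch_add(1, Ordering::Relaxed);
        self.inner.list(prefix)
    }
}

/// In-memory store used only to exercise the wrapper.
pub struct MemStore {
    pub objects: BTreeMap<String, Vec<u8>>,
}

impl Store for MemStore {
    fn get(&self, path: &str) -> Vec<u8> {
        self.objects.get(path).cloned().unwrap_or_default()
    }
    fn get_range(&self, path: &str, range: Range<usize>) -> Vec<u8> {
        let data = self.get(path);
        let end = range.end.min(data.len());
        let start = range.start.min(end);
        data[start..end].to_vec()
    }
    fn head(&self, path: &str) -> usize {
        self.objects.get(path).map_or(0, Vec::len)
    }
    fn list(&self, prefix: &str) -> Vec<String> {
        self.objects.keys().filter(|k| k.starts_with(prefix)).cloned().collect()
    }
}
```

Since the query code only sees the `Store` interface, every dataset routed through the wrapper is counted without per-call-site changes. Splitting counts by table or workload could be done by keeping one `Counters` per path prefix, which this sketch leaves out.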

# Acceptance criteria

- Benchmark output includes latency distribution, object-store operation counts, bytes read, and cache hit rates.
- The refresh-on-request-path sentinel exists and is included in the benchmark output.
- A synthetic request-path refresh increments the sentinel, proving the guard works.
- A normal read workload keeps the sentinel at zero.
- The harness can compare before/after results for the later maintenance, warmup, and cache work.
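
The two sentinel criteria above could be checked with something like the following sketch. How the real code distinguishes a request-path refresh from a background one is not stated in this issue; here it is modeled, as an assumption, by an explicit `Caller` argument. `RefreshSentinel` and `observe_refresh` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Who triggered a Delta refresh. Hypothetical: the real code may derive
/// this from task-local or span context instead of an explicit argument.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Caller {
    Background,
    HttpRequest,
}

/// Counts refreshes that ran on the HTTP request path.
#[derive(Debug, Default)]
pub struct RefreshSentinel {
    on_request_path: AtomicU64,
}

impl RefreshSentinel {
    /// Call at the start of every refresh; only request-path refreshes count.
    pub fn observe_refresh(&self, caller: Caller) {
        if caller == Caller::HttpRequest {
            self.on_request_path.fetch_add(1, Ordering::Relaxed);
        }
    }

    /// Current sentinel value; expected to stay at zero in normal operation.
    pub fn value(&self) -> u64 {
        self.on_request_path.load(Ordering::Relaxed)
    }
}
```

A benchmark run would read `value()` after a normal read workload and expect zero, then inject one synthetic `Caller::HttpRequest` refresh and expect the value to rise by one, which shows the guard is wired up.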

