Commit 38a0c36
Add proposal for per-tenant cardinality API (#7335)
* Add proposal for per-tenant TSDB status API
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
* Extend TSDB status proposal with long-term storage cardinality via store gateways
Add source=blocks query parameter to analyze cardinality from compacted
blocks in object storage. The blocks path fans out to store gateways,
which compute statistics from block index headers (cheap label value
counts) and posting list expansion (exact series counts per metric).
Results are cached per immutable block.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
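Because compacted blocks are immutable, per-block statistics never go stale and can be memoized by block ID. The following is a minimal sketch of that idea, not the proposal's implementation: `blockStats`, `blockStatsCache`, and `getOrCompute` are hypothetical names, and the cache is an unbounded in-memory map with no eviction.

```go
package main

import "sync"

// blockStats is a hypothetical per-block result: series count per metric name.
type blockStats map[string]uint64

// blockStatsCache memoizes statistics by block ULID. Entries are never
// invalidated in this sketch because a compacted block is immutable.
type blockStatsCache struct {
	mu     sync.Mutex
	byULID map[string]blockStats
}

func newBlockStatsCache() *blockStatsCache {
	return &blockStatsCache{byULID: make(map[string]blockStats)}
}

// getOrCompute returns the cached stats for a block, or calls compute and
// stores the result. Concurrent callers for the same uncached block may each
// run compute; the first stored result wins.
func (c *blockStatsCache) getOrCompute(ulid string, compute func() (blockStats, error)) (blockStats, error) {
	c.mu.Lock()
	if s, ok := c.byULID[ulid]; ok {
		c.mu.Unlock()
		return s, nil
	}
	c.mu.Unlock()

	// Compute outside the lock so a slow index scan does not block other blocks.
	s, err := compute()
	if err != nil {
		return nil, err // errors are not cached
	}

	c.mu.Lock()
	defer c.mu.Unlock()
	if existing, ok := c.byULID[ulid]; ok {
		return existing, nil
	}
	c.byULID[ulid] = s
	return s, nil
}
```

A real store gateway would presumably bound the cache size and drop entries for deleted blocks; both are omitted here.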
* Update proposal based on PR review: rename to Cardinality API and simplify
Address feedback from PR #7335 review:
- Rename endpoint from /api/v1/status/tsdb to /api/v1/cardinality
- Drop Prometheus compatibility as a goal
- Add start/end time range query parameters
- Drop head-specific fields (numLabelPairs, memoryInBytesByLabelName,
minTime, maxTime) to unify response across both sources
- Remove API Compatibility and Field Portability sections
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
* Require start/end for blocks path and add per-tenant max query range limit
Make start/end required for source=blocks to prevent unbounded block
scanning. Add cardinality_max_query_range per-tenant limit (default 24h)
to give operators control over the blast radius.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
* Address all findings from proposal review
Critical:
- Fix blocks path aggregation: do not divide by the store gateway
replication factor, since GetClientsFor routes each block to exactly
one store gateway
Significant:
- Add min_time, max_time, block_ids to store gateway CardinalityRequest
- Specify MaxErrors=0 for head path with availability implications
- Add consistency check and retry logic for blocks path
- Document RF division as best-effort approximation
Moderate:
- Wrap responses in standard {status, data} Prometheus envelope
- Change HTTP 422 to HTTP 400 for limit violations
- Add Error Responses section with all validation scenarios
- Add approximated field for block overlap and partial results
- Add Observability section with metrics
- Add per-tenant concurrency limit and query timeout
- Reject start/end for source=head instead of silently ignoring
Low:
- Add Rollout Plan with phased approach and feature flag
- Document rolling upgrade compatibility (Unimplemented handling)
- Document Query Frontend bypass
- Improve caching: full results keyed by ULID, limit at response time
- Add missing files to implementation section
- Move shared proto to pkg/cortexpb/cardinality.proto
- Rename TSDBStatus* to Cardinality* throughout
- Add limit upper bound (max 512)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
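The aggregation fix above comes down to where replication inflates the counts. The sketch below assumes, based on the "RF division" items in this message, that the head path sums per-ingester counts and divides by the replication factor, while the blocks path sums per-store-gateway counts with no division. The function names are hypothetical.

```go
package main

// sumCounts adds up the series counts reported by each backend instance.
func sumCounts(counts []uint64) uint64 {
	var total uint64
	for _, c := range counts {
		total += c
	}
	return total
}

// aggregateHead estimates the tenant's series count from per-ingester counts.
// Each series is assumed to be written to replicationFactor ingesters, so the
// sum is divided by that factor. This is a best-effort approximation: it
// ignores uneven replication, and integer division truncates.
func aggregateHead(perIngester []uint64, replicationFactor int) uint64 {
	total := sumCounts(perIngester)
	if replicationFactor <= 1 {
		return total
	}
	return total / uint64(replicationFactor)
}

// aggregateBlocks sums per-store-gateway counts without dividing, because
// each block is queried on exactly one store gateway.
func aggregateBlocks(perStoreGateway []uint64) uint64 {
	return sumCounts(perStoreGateway)
}
```

For example, three ingesters each reporting 100 series with a replication factor of 3 give an estimate of 100, while two store gateways reporting 40 and 60 series for disjoint blocks give 100 with no division.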
* Rename proposal file to per-tenant-cardinality-api.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
---------
Signed-off-by: Charlie Le <charlie_le@apple.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent: edd0091
1 file changed
Lines changed: 407 additions & 0 deletions