Commit 38a0c36

CharlieTLe and claude authored
Add proposal for per-tenant cardinality API (#7335)
* Add proposal for per-tenant TSDB status API

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>

* Extend TSDB status proposal with long-term storage cardinality via store gateways

Add source=blocks query parameter to analyze cardinality from compacted blocks in object storage. The blocks path fans out to store gateways, which compute statistics from block index headers (cheap label value counts) and posting list expansion (exact series counts per metric). Results are cached per immutable block.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>

* Update proposal based on PR review: rename to Cardinality API and simplify

Address feedback from PR #7335 review:
- Rename endpoint from /api/v1/status/tsdb to /api/v1/cardinality
- Drop Prometheus compatibility as a goal
- Add start/end time range query parameters
- Drop head-specific fields (numLabelPairs, memoryInBytesByLabelName, minTime, maxTime) to unify response across both sources
- Remove API Compatibility and Field Portability sections

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>

* Require start/end for blocks path and add per-tenant max query range limit

Make start/end required for source=blocks to prevent unbounded block scanning. Add cardinality_max_query_range per-tenant limit (default 24h) to give operators control over the blast radius.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>

* Address all review findings from proposal review

Critical:
- Fix blocks path aggregation: no SG RF division since GetClientsFor routes each block to exactly one store gateway

Significant:
- Add min_time, max_time, block_ids to store gateway CardinalityRequest
- Specify MaxErrors=0 for head path with availability implications
- Add consistency check and retry logic for blocks path
- Document RF division as best-effort approximation

Moderate:
- Wrap responses in standard {status, data} Prometheus envelope
- Change HTTP 422 to HTTP 400 for limit violations
- Add Error Responses section with all validation scenarios
- Add approximated field for block overlap and partial results
- Add Observability section with metrics
- Add per-tenant concurrency limit and query timeout
- Reject start/end for source=head instead of silently ignoring

Low:
- Add Rollout Plan with phased approach and feature flag
- Document rolling upgrade compatibility (Unimplemented handling)
- Document Query Frontend bypass
- Improve caching: full results keyed by ULID, limit at response time
- Add missing files to implementation section
- Move shared proto to pkg/cortexpb/cardinality.proto
- Rename TSDBStatus* to Cardinality* throughout
- Add limit upper bound (max 512)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>

* Rename proposal file to per-tenant-cardinality-api.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>

---------

Signed-off-by: Charlie Le <charlie_le@apple.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
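The request validation rules described in the commit message above (start/end required for source=blocks and rejected for source=head, a limit capped at 512, a per-tenant maximum query range defaulting to 24h, and HTTP 400 for violations) could be sketched roughly as follows. This is an illustrative Python sketch, not the proposal's implementation: the function name, the `BadRequest` exception, the use of Unix seconds for timestamps, the lower bound of 1 on `limit`, and the `end > start` check are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

MAX_LIMIT = 512                            # "limit upper bound (max 512)" from the commit message
DEFAULT_MAX_QUERY_RANGE_S = 24 * 60 * 60   # cardinality_max_query_range default (24h)


class BadRequest(Exception):
    """Hypothetical error type; the commit message says violations return HTTP 400."""
    status_code = 400


@dataclass(frozen=True)
class CardinalityQuery:
    """Hypothetical container for validated query parameters."""
    source: str
    limit: int
    start: Optional[int] = None  # Unix seconds (assumed unit)
    end: Optional[int] = None    # Unix seconds (assumed unit)


def validate_cardinality_params(source, limit, start=None, end=None,
                                max_query_range_s=DEFAULT_MAX_QUERY_RANGE_S):
    """Apply the validation rules named in the commit message."""
    if source not in ("head", "blocks"):
        raise BadRequest(f"unknown source {source!r}")
    # The lower bound of 1 is an assumption; only the upper bound is stated.
    if not 1 <= limit <= MAX_LIMIT:
        raise BadRequest(f"limit must be between 1 and {MAX_LIMIT}")

    if source == "head":
        # start/end are rejected for the head path rather than silently ignored.
        if start is not None or end is not None:
            raise BadRequest("start/end are not supported for source=head")
        return CardinalityQuery(source, limit)

    # source == "blocks": start/end are required to prevent unbounded block scanning.
    if start is None or end is None:
        raise BadRequest("start and end are required for source=blocks")
    if end <= start:  # assumed sanity check, not stated in the commit message
        raise BadRequest("end must be after start")
    if end - start > max_query_range_s:
        raise BadRequest("query range exceeds cardinality_max_query_range")
    return CardinalityQuery(source, limit, start, end)
```

For example, `validate_cardinality_params("blocks", 10, start=0, end=3600)` passes under these assumptions, while the same call with `source="head"` raises `BadRequest`.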
1 parent edd0091 commit 38a0c36

1 file changed

Lines changed: 407 additions & 0 deletions
