What
Add compression stats and storage breakdown to the existing GET /tables/{table}/size and GET /tables/{table}/metadata endpoints. No new endpoints.
Modules Changed
Stats are written once at segment creation into metadata.properties, loaded into ColumnMetadata at segment load, and aggregated on-demand per API call in the controller — no background jobs, no separate store.
| Module | Change |
| --- | --- |
| pinot-spi / IndexingConfig | New compressionStatsEnabled boolean (default false) |
| pinot-segment-local writers | BaseChunkForwardIndexWriter, VarByteChunkForwardIndexWriterV4/V5/V6, CLPForwardIndexCreatorV2: track raw byte count during writes |
| pinot-segment-local / BaseSegmentCreator | Persists uncompressed size + codec to segment metadata.properties at creation time |
| pinot-segment-spi / ColumnMetadata | Two new default methods to read persisted stats at segment load |
| pinot-common | New DTOs: ColumnCompressionStatsInfo, CompressionStatsSummary, StorageBreakdownInfo; extended: SegmentSizeInfo, TableMetadataInfo; new ControllerGauge entries |
| pinot-server | /tables/{table}/size and /tables/{table}/metadata: read per-column stats from ColumnMetadata and include in response |
| pinot-controller | TableSizeReader + ServerSegmentMetadataReader aggregate per-segment responses on each API call; emit Prometheus gauges |
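The write-once, read-at-load lifecycle described above can be sketched roughly as follows. This is an illustration only, not the actual Java implementation: the property key suffixes are hypothetical, and returning -1 when a key is missing is an assumption borrowed from the -1 convention used for dictionary columns.

```python
import os
import tempfile

# Hypothetical key suffixes; the real names in metadata.properties may differ.
UNCOMPRESSED_KEY = "forwardIndex.uncompressedSizeInBytes"
CODEC_KEY = "forwardIndex.compressionCodec"


def write_column_stats(path, column, uncompressed_size, codec):
    """Append one column's stats to a properties file (segment creation step)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"column.{column}.{UNCOMPRESSED_KEY} = {uncompressed_size}\n")
        f.write(f"column.{column}.{CODEC_KEY} = {codec}\n")


def load_column_stats(path, column):
    """Read the stats back (segment load step); missing keys mean 'no stats'."""
    props = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            if "=" in line and not line.lstrip().startswith("#"):
                key, value = line.split("=", 1)
                props[key.strip()] = value.strip()
    size = props.get(f"column.{column}.{UNCOMPRESSED_KEY}")
    return {
        # Assumption: -1 marks "not tracked", as for dictionary columns.
        "uncompressedSizeInBytes": int(size) if size is not None else -1,
        "codec": props.get(f"column.{column}.{CODEC_KEY}"),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        props_path = os.path.join(tmp, "metadata.properties")
        write_column_stats(props_path, "url", 120000000, "LZ4")
        print(load_column_stats(props_path, "url"))
```

Because the values live in the segment's own metadata file, a segment written before the flag was enabled simply has no such keys, which is what the partial-coverage handling relies on.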
New Fields (both endpoints, same structure)
"compressionStats": {
"rawForwardIndexSizePerReplicaInBytes": 550000000,
"compressedForwardIndexSizePerReplicaInBytes": 30000000,
"compressionRatio": 18.3,
"segmentsWithStats": 312,
"totalSegments": 801,
"isPartialCoverage": true
},
"columnCompressionStats": [
{
"column": "url",
"codec": "LZ4",
"hasDictionary": false,
"uncompressedSizeInBytes": 120000000,
"compressedSizeInBytes": 8000000,
"compressionRatio": 15.0,
"indexes": ["forward_index"]
},
{
"column": "status_code",
"hasDictionary": true,
"uncompressedSizeInBytes": -1,
"compressedSizeInBytes": 500000,
"compressionRatio": 0,
"codec": null
}
],
"storageBreakdown": {
"tiers": {
"hotTier": { "count": 50, "sizePerReplicaInBytes": 10000000 },
"coldTier": { "count": 262, "sizePerReplicaInBytes": 20000000 }
}
}
Behavior
Feature flag (tableIndexConfig.compressionStatsEnabled, default false):
- false: zero overhead. No tracking in writers, nothing written to disk, and compressionStats and columnCompressionStats are absent from responses.
- true: writers track raw byte counts; codec and uncompressed size are persisted to segment metadata.
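As a rough illustration of where the flag lives, the sketch below sets compressionStatsEnabled under tableIndexConfig in a table config represented as a plain dict. The table name and type are placeholders, and the helper function is hypothetical; it is not part of Pinot.

```python
import copy
import json


def enable_compression_stats(table_config, enabled=True):
    """Return a copy of a table config with the flag set under tableIndexConfig."""
    updated = copy.deepcopy(table_config)
    updated.setdefault("tableIndexConfig", {})["compressionStatsEnabled"] = enabled
    return updated


# Illustrative config; the table name and type are placeholders.
config = {"tableName": "events_OFFLINE", "tableType": "OFFLINE", "tableIndexConfig": {}}
print(json.dumps(enable_compression_stats(config), indent=2))
```

The original dict is left untouched, so the before and after configs can be diffed prior to submitting the update.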
storageBreakdown is always returned regardless of the flag.
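Dictionary-encoded columns appear in columnCompressionStats with hasDictionary=true, uncompressedSizeInBytes=-1, compressionRatio=0, and codec=null, because their raw size is not tracked. Their forward index size is still reported in compressedSizeInBytes.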
Partial coverage: enabling the flag on an existing table only affects new segments. Old segments are excluded from ratio computation (not counted as zero). isPartialCoverage=true and segmentsWithStats < totalSegments signal this.
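The partial-coverage rule can be sketched as below. This is a minimal illustration, assuming each segment is summarized as a dict with hypothetical `raw` and `compressed` keys, and that both size totals cover only segments that carry stats; the actual controller aggregation may differ in detail.

```python
def summarize_compression(segments):
    """Aggregate per-segment sizes; segments without stats are skipped, not zeroed.

    Each segment is a dict with 'compressed' (bytes) and 'raw' (bytes, or None
    when the segment was built before the flag was enabled).
    """
    with_stats = [s for s in segments if s.get("raw") is not None]
    raw = sum(s["raw"] for s in with_stats)
    compressed = sum(s["compressed"] for s in with_stats)
    return {
        "rawForwardIndexSizePerReplicaInBytes": raw,
        "compressedForwardIndexSizePerReplicaInBytes": compressed,
        "compressionRatio": raw / compressed if compressed > 0 else 0.0,
        "segmentsWithStats": len(with_stats),
        "totalSegments": len(segments),
        "isPartialCoverage": len(with_stats) < len(segments),
    }


segments = [
    {"raw": 300, "compressed": 20},
    {"raw": 250, "compressed": 10},
    {"raw": None, "compressed": 40},  # old segment, no stats persisted
]
print(summarize_compression(segments))
```

Counting the old segment's raw size as zero while still adding its 40 compressed bytes would drag the ratio from 550/30 down to 550/70, which is why such segments are left out of both totals in this sketch.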
Realtime: consuming segments are excluded; stats appear only after segment commit.
All ingestion paths covered: offline batch, realtime, and minion tasks all converge at SegmentIndexCreationDriverImpl → BaseSegmentCreator.
Prometheus gauges: TABLE_COMPRESSION_RATIO_PERCENT, TABLE_RAW_FORWARD_INDEX_SIZE_PER_REPLICA, TABLE_COMPRESSED_FORWARD_INDEX_SIZE_PER_REPLICA, TABLE_TIERED_STORAGE_SIZE. The gauges are cleared when the flag is disabled or the table becomes dictionary-only.
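The set-or-clear behavior for the compression gauges might look like the sketch below, which uses a plain dict as a stand-in for the metrics registry. Two things here are assumptions rather than facts from this proposal: that the ratio gauge stores the ratio multiplied by 100 as an integer (suggested only by the _PERCENT suffix), and that a missing summary is how "flag disabled" or "dictionary-only" reaches the emitter. TABLE_TIERED_STORAGE_SIZE is left out because its clearing rules are not spelled out above.

```python
COMPRESSION_GAUGES = (
    "TABLE_COMPRESSION_RATIO_PERCENT",
    "TABLE_RAW_FORWARD_INDEX_SIZE_PER_REPLICA",
    "TABLE_COMPRESSED_FORWARD_INDEX_SIZE_PER_REPLICA",
)


def update_table_gauges(gauges, table, summary):
    """Set or clear per-table compression gauges in a dict keyed by (table, name).

    `summary` is None when the flag is off or no raw-encoded column has stats.
    """
    if summary is None:
        for name in COMPRESSION_GAUGES:
            gauges.pop((table, name), None)
        return
    # Assumption: the ratio is scaled by 100 to fit an integer-valued gauge.
    gauges[(table, "TABLE_COMPRESSION_RATIO_PERCENT")] = round(summary["compressionRatio"] * 100)
    gauges[(table, "TABLE_RAW_FORWARD_INDEX_SIZE_PER_REPLICA")] = summary[
        "rawForwardIndexSizePerReplicaInBytes"
    ]
    gauges[(table, "TABLE_COMPRESSED_FORWARD_INDEX_SIZE_PER_REPLICA")] = summary[
        "compressedForwardIndexSizePerReplicaInBytes"
    ]
```

Clearing rather than zeroing keeps dashboards from showing a misleading ratio of 0 for tables that no longer report stats.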
What's Out of Scope
- Dictionary-encoded column uncompressed size tracking (follow-up): Forward index writers for dict-encoded columns only see dictionary IDs, not raw values — tracking true uncompressed sizes requires instrumenting the stats collection phase before dictionary encoding.
- UI changes (follow-up): Surface compression ratio, per-column stats, and tier breakdown in the Pinot Console table detail page. The API already returns all required data, so this is purely a rendering change.
Use Cases
- COGS estimation: Compression ratio and per-column breakdown for informed storage cost projections
- Codec optimization: Identify columns with poor compression ratios and switch codecs (e.g., LZ4 → ZSTANDARD for cold data)
- Capacity planning: Right-size clusters by understanding true storage footprint with local vs tiered breakdown
- Schema optimization: Identify columns that benefit from dictionary encoding vs raw encoding
- Index cost analysis: Per-column index size visibility to evaluate cost-vs-performance trade-offs when adding or removing indexes
- Monitoring/alerting: Alert when compression ratio degrades after schema changes or data pattern shifts
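For the codec optimization and alerting use cases, a client could scan the columnCompressionStats array for raw-encoded columns with a low ratio. The sketch below works on an already-parsed response dict; the function name and the threshold of 2.0 are illustrative choices, not part of the API.

```python
def poorly_compressed_columns(response, max_ratio=2.0):
    """Return (column, codec, ratio) for raw-encoded columns at or below max_ratio."""
    flagged = []
    # The array is absent when the feature flag is off, hence the default.
    for col in response.get("columnCompressionStats", []):
        # Dictionary columns carry uncompressedSizeInBytes=-1 and no usable ratio.
        if col.get("hasDictionary") or col.get("uncompressedSizeInBytes", -1) < 0:
            continue
        if col["compressionRatio"] <= max_ratio:
            flagged.append((col["column"], col.get("codec"), col["compressionRatio"]))
    return sorted(flagged, key=lambda item: item[2])


# Illustrative response; the "payload" column is made up for this example.
response = {
    "columnCompressionStats": [
        {"column": "url", "codec": "LZ4", "hasDictionary": False,
         "uncompressedSizeInBytes": 120000000, "compressedSizeInBytes": 8000000,
         "compressionRatio": 15.0},
        {"column": "payload", "codec": "LZ4", "hasDictionary": False,
         "uncompressedSizeInBytes": 6000000, "compressedSizeInBytes": 5000000,
         "compressionRatio": 1.2},
        {"column": "status_code", "codec": None, "hasDictionary": True,
         "uncompressedSizeInBytes": -1, "compressedSizeInBytes": 500000,
         "compressionRatio": 0},
    ]
}
print(poorly_compressed_columns(response))
```

Skipping dictionary columns matters here: their reported ratio of 0 would otherwise make every one of them look like the worst offender.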
Related Issues and PRs
Draft PR: #18185