Problem
When CTT (CPU Time Token) admission control is enabled, KV work is correctly routed to the CTT WorkQueue via CPUGrantCoordinators.GetKVWorkQueue. SQL CPU admission, however, is bypassed entirely, because serverless runs multi-process (KV pods do not need SQL CPU admission). For resource manager work, SQL CPU admission cannot be bypassed, and the old slot-based system will not work here. That system nevertheless keeps running for SQL work: SQLKVResponseAdmissionQ and SQLSQLResponseAdmissionQ still use tokenGranter with grant chains, and the old GrantCoordinator continues its CPULoad callback even though KV work is no longer flowing through it.
Why the old system already has problems
The slot-based system uses three WorkKind queues with a hard-wired priority ordering (KVWork > SQLKVResponseWork > SQLSQLResponseWork) connected via grant chains. This has three drawbacks:
Priority inversion. SQLKVResponseWork and SQLSQLResponseWork can cause priority inversion (#85471).
Complexity. Grant chains are subtle. The tokenGranter for SQL response work uses cpuOverloadIndicator (backed by kvSlotAdjuster) as an additional gate, linking SQL admission to KV slot availability in a non-obvious way.
Indirect control. SQL CPU is throttled only indirectly, by gating KV/DistSQL response processing, rather than by directly measuring and controlling SQL CPU consumption.
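The coupling described above can be illustrated with a deliberately simplified sketch. It is not the actual grant-chain code: the type grantCoordinator, its fields, and grantNext are hypothetical names, and the overload check (used slots >= total slots) is an assumption about what a slot-backed overload indicator looks like. It only shows how a strict WorkKind ordering plus a KV-slot-based gate makes SQL admission depend on KV slot availability.

```go
package main

import "fmt"

// workKind mirrors the three WorkKinds named in this issue (hypothetical type).
type workKind int

const (
	kvWork workKind = iota
	sqlKVResponseWork
	sqlSQLResponseWork
	numWorkKinds
)

// grantCoordinator is a hypothetical, heavily simplified stand-in for the
// slot-based system: KV work consumes slots, and SQL work is gated by an
// overload indicator derived from those same slots.
type grantCoordinator struct {
	usedSlots, totalSlots int
	waiting               [numWorkKinds]int // waiting requests per WorkKind
}

// cpuOverloaded is an assumed form of the slot-backed overload indicator.
func (g *grantCoordinator) cpuOverloaded() bool {
	return g.usedSlots >= g.totalSlots
}

// grantNext admits the highest-priority waiting request, if any. When all KV
// slots are in use, nothing is admitted, including SQL work.
func (g *grantCoordinator) grantNext() (workKind, bool) {
	for k := kvWork; k < numWorkKinds; k++ {
		if g.waiting[k] == 0 {
			continue
		}
		if g.cpuOverloaded() {
			return 0, false
		}
		if k == kvWork {
			g.usedSlots++ // only KV work holds a slot
		}
		g.waiting[k]--
		return k, true
	}
	return 0, false
}

func main() {
	g := &grantCoordinator{totalSlots: 1}
	g.waiting[kvWork] = 1
	g.waiting[sqlKVResponseWork] = 1

	k, ok := g.grantNext()
	fmt.Println("granted:", k, ok) // KV work takes the only slot
	k, ok = g.grantNext()
	fmt.Println("granted:", k, ok) // SQL work is blocked by KV slot usage
}
```

In this sketch the SQL request never consumes a slot, yet it cannot be admitted until KV work releases one, which is the non-obvious link the drawbacks above refer to.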
Proposal
Integrate SQL CPU admission into the CTT system so that when CTT is enabled, SQL CPU consumption is admitted through the same CPU time token budget as KV work.
This uses the SQLCPUHandle/SQLCPUProvider infrastructure (sql_cpu_handle.go), which tracks per-goroutine CPU time via grunning.Time(). GoroutineCPUHandle.MeasureAndAdmit is called at regular intervals during SQL execution and, when CTT is enabled, blocks to acquire CPU time tokens from the CTT WorkQueue.
Related issue: admission: priority inversion caused by SQLKVResponseWork and SQLSQLResponseWork #85471
Jira issue: CRDB-62076
Epic CRDB-59016