admission: collapse from two WorkQueues to one#170621
Conversation
|
😎 Merged directly without going through the merge queue, as the queue was empty and the PR was up to date with the target branch - details. |
|
Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
c8909d9 to
65a68de
Compare
65a68de to
762f06f
Compare
|
Apologies for the large diff. |
|
Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs. (run link) |
762f06f to
aabee23
Compare
wenyihu6
left a comment
There was a problem hiding this comment.
nice! glad we are ripping this out in this release. re: commit 2: if the only purpose of the
.app_tenant aliases is to keep existing dashboards rendering, is the right move to update the dashboard as part of the upgrade or create a new dashboard instead?
@wenyihu6 made 1 comment.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on joshimhoff and sumeerbhola).
Metrics change detectedThis PR adds or updates one or more CRDB metrics. If you want these metrics to be exported by CRDB Cloud clusters to Internal CRL Datadog and/or included in the customer metric export integration (Essential metrics for standard deployment, and Essential metrics for advanced deployment), refer to this Installation and Usage guide of a CLI tool that syncs the metric mappings in managed-service. Run this CLI tool after your CRDB PR is merged.
Note: Your metric will appear in Internal CRL Datadog only after the managed-service PR merges and the new OTel configuration rolls out to at least one cluster running a CRDB build that includes this metric. Docs: cockroach-metric-sync Questions: reach out to @obs-india-prs |
|
Re-running CI - hit an unrelated flake https://mesolite.cluster.engflow.com/invocations/default/eaac9317-1811-4624-a344-8d9a97a1379b (#170845). |
I'm OK with this as long as @joshimhoff is |
|
Lets drop the second commit and get the first commit merged first? Don't want this PR to rot. I will stamp if you open a new PR with the second commit, and we can dabate more on it. |
aabee23 to
17b3806
Compare
The CPU Time Token system previously used two work queues, one for system tenant traffic and one for application tenant traffic. While this does allow us to set different top-level utilization targets for the system tenant vs the application tenant, we believe we can get most of the benefit by mapping all serverless-mode system-tenant requests into a group that is almost surely going to be prioritized over any other tenant. This collapse allows us to delete resourceTier, cpuTimeTokenChildGranter, and the per-tier arrays that organized the system into two WorkQueues, two requesters, and four token buckets. The granter now holds two buckets (canBurst, noBurst): and implements the granter interface directly. The coordinator owns a single WorkQueue, and GetKVWorkQueue always returns it when CTT is enabled. Settings are consolidated: KVCPUTimeUtilGoal becomes the canonical target under admission.cpu_time_tokens.target_util, with the old system_tenant key retired. Also delete the activeMode/lastMode indirection that was previously used to track mode transitions across the filler. GetKVWorkQueue and resetInterval read the cpuTimeTokenACMode cluster setting directly. The per-tier CPU time token metrics (suffixed .system_tenant and .app_tenant) are replaced by single, untiered metrics. Epic: none Release note: none Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
17b3806 to
bd4651f
Compare
|
/trunk merge |
Delete resourceTier, cpuTimeTokenChildGranter, and the per-tier
arrays that organized the system into two WorkQueues, two requesters,
and four token buckets. The granter now holds two buckets
(canBurst, noBurst) and implements the granter interface directly.
The coordinator owns a single WorkQueue, and GetKVWorkQueue always
returns it when CTT is enabled (ignoring isSystemTenant).
Settings are consolidated: KVCPUTimeUtilGoal becomes the canonical
target under admission.cpu_time_tokens.target_util.system_tenant is retired.
ConfigSnapshot collapses from per-tier fractions to a single NoBurstFrac.
Epic: none
Release note: None
Co-Authored-By: roachdev-claude roachdev-claude-bot@cockroachlabs.com