roachtest/cdc: add linger-tradeoff perf variants by aerfrei · Pull Request #170586 · cockroachdb/cockroach

aerfrei · 2026-05-19T19:14:48Z

Add cdc/linger/10ms and cdc/linger/5s as Weekly-suite benchmarks of the v2 kafka sink. Both run the same kv workload with --ramp 10m --duration 12m --read-percent 0 against a single-table changefeed and differ only in kafka_sink_config.Flush.Frequency, which the v2 kafka path wires straight into batchingSink.minFlushFrequency (pkg/ccl/changefeedccl/sink_kafka_v2.go:430). Together the two curves trace the linger tradeoff on a real Kafka sink: the tight cadence should win at low load, the slack cadence at saturation. This closes the microbench gap called out in #170200, where the existing batching benchmarks don't capture per-flush network/broker cost.

Server-side changefeed.commit_latency (p50, p99) and the rate of changefeed.emitted_messages are exported to roachperf as per-tick prometheus series via a parameterized startStatsCollection. When the no-linger sink lands, a third registration that flips changefeed.no_linger_sink.enabled will overlay against the same methodology and produce the comparison plot #170574 asks for.

Informs: #170200
Epic: CRDB-62737
Release note: None


trunk-io · 2026-05-19T19:14:53Z

Merging to master in this repository is managed by Trunk.

To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here.


Two fixes for cdc/linger/* roachperf surfaced by the first run:

1. NaN-scrub stats before serialization. The new commit_latency p50/p99 aggregations are histogram_quantile-based, which returns NaN whenever the rate window has zero increments: at the head of the run before the first event, and at the tail after the workload stops. Go's encoding/json refuses to serialize NaN, so the entire perf export crashed at the end of the test. Route the export through dryRun=true, walk the returned ClusterStatRun zeroing any NaN, then call SerializeOutRun manually. online_restore.go has the same workaround inline (with a comment noting PromQL `or vector(0)` does not fix it).

2. Cut --duration from 12m to 2m. kv's --duration is additive with --ramp (workload/cli/run.go:51), so the original 10m+12m gave a 22m workload, twice what the plan called for. The corrected 10m ramp + 2m tail matches the plan: a steady-state segment at peak just long enough to confirm saturation.

Informs: #170200
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Replace cdc/linger/{10ms,5s} with a single cdc/linger/sweep test that walks Flush.Frequency across 10ms, 100ms, 500ms, 1s, 5s back-to-back on one cluster. Each variant runs the same kv --ramp workload; the test cancels the previous changefeed and issues a fresh CREATE CHANGEFEED against the cached sink URI between variants, while the topic consumers spawned by the initial setupSink keep draining throughout. Variant boundaries are logged as t.Status `=== variant N/M ===` lines in test.log, and the Grafana dashboard spans the full run so each variant is identifiable by its time range. No per-variant stats.json is written; Grafana is the source of truth. Lets the whole sweep run unattended from one command rather than queueing N separate tests.

Informs: #170200
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Replace cdc/linger/sweep with five independent registrations cdc/linger/{10ms,100ms,500ms,1s,5s}. Each test runs a single kv --ramp workload against a v2 Kafka changefeed with that Flush.Frequency, restoring the per-test artifact directory, independent failure recovery, and easy single-variant re-runs. The whole sweep can still be driven from one command on a reused cluster:

    bin/roachtest run --cluster <name> --wipe cdc/linger

which matches all five tests by substring and runs them in series, wiping cluster state between each.

Informs: #170200
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Cut concurrency from len(nData)*64 to len(nData)*8 (256 -> 32 workers on a 4-data-node cluster). The first sweep saturated every variant well inside the ramp: throughput climbed, peaked, then collapsed as source-side backpressure kicked in, and commit latency went exponential. We were measuring how each variant breaks rather than the latency-vs-throughput tradeoff the test is meant to expose. Lower offered load keeps the sink ahead of the workload for the bulk of the ramp, so latency reflects flush cadence (the linger tradeoff proper) instead of buffer fill.

Informs: #170200
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Register cdc/linger/loadmatrix, a single test that sweeps a 3x5 matrix of (cadence, offered-rate) cells on one cluster. Each cell runs the kv workload at a fixed --max-rate for 4 minutes, producing a steady-state latency point per (cadence, rate) pair. Cell boundaries are logged as t.Status markers so the Grafana dashboard can be sliced by timestamp.

The existing fixed-load cdc/linger/* variants show the latency cost of slack cadence cleanly but did not actually demonstrate the throughput cost of tight cadence: 10ms and 100ms saturated at indistinguishable points (~3.3 MiB/s) because the bottleneck was upstream, not the sink. The matrix pushes offered load above that point, so if tight cadences collapse first while slack cadences hold up, we have the tradeoff; otherwise we have evidence that for this Kafka config the linger story is mostly latency, not throughput. Either outcome is the answer #170200 wants.

Cadence transitions go via CANCEL JOB + CREATE CHANGEFEED against the cached sink URI (same pattern as the previous sweep prototype). Topic consumers from the initial setupSink stay up across cadence changes.

Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Two fixes for cdc/linger/loadmatrix off the first run:

1. Zoom the load grid on the saturation transition. The first matrix used 12k/25k/40k/60k/80k, but cells at 12k and 25k were indistinguishable across cadences: both well below saturation and below where any cadence-dependent behavior would manifest. The interesting question is the exact rate each cadence's pipeline tips over at, so replace those low cells with finer points in the saturation zone: 40k/45k/50k/55k/60k/70k. Same wall-clock total (6 loads x 3 cadences x 4 min ~= 75 min).

2. Add `resolved = ''` to the CANCEL+CREATE changefeed SQL. The first changefeed (created via ct.newChangefeed) has it; the subsequent ones (created via raw SQL in the loop) did not, so resolved-timestamp emission stopped after the first cadence transition. The row data still flowed and commit_latency was still measured, but the topic consumer's resolved count froze and the Max Changefeed Latency Grafana panel had no highwater to advance. With this fix the lag signal survives all cadence transitions.

Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

The matrix workload was exiting on the first NEW_LEASE_PREVENTS_TXN / ambiguous-result error fired after a cadence transition that followed an over-saturation cell. Cell 6 (10ms x 70k) overdrove the sink for 4 minutes; the cluster was still working off that backpressure when cell 7 started, and the first transient SQL error killed the workload.

Add --tolerate-errors to the kv workload command so transient errors log and continue rather than aborting the run. cdc_bench.go does the same thing conditionally for high-range-count cases: same class of mitigation for the same class of transient.

Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Register a new cdc/linger/control variant alongside the existing cdc/linger/{10ms,100ms,500ms,1s,5s} tests. The control omits kafka_sink_config entirely so the v2 sink falls through to kgo's default producer-side batching with no CDC-level linger trigger.

This is the natural baseline for the linger experiment: it shows the floor the cadenced variants are trying to beat, and lets us check whether very tight cadences (10ms) are effectively pass-through for this setup. If 10ms and control look identical, that confirms the sub-ms per-flush overhead hypothesis directly.

Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
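The difference between the control and the cadenced variants comes down to whether the changefeed statement carries a `kafka_sink_config` option at all. A minimal sketch of that branching follows. The `variant` type, the helper names, the table name, and the sink URI are all hypothetical and do not reproduce the roachtest registration code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// variant describes one test registration. An empty FlushFreq marks the
// control, which sets no kafka_sink_config at all.
type variant struct {
	Name      string
	FlushFreq string
}

// sinkConfigJSON returns the kafka_sink_config value for v, or "" when
// the option should be omitted entirely.
func sinkConfigJSON(v variant) (string, error) {
	if v.FlushFreq == "" {
		return "", nil
	}
	cfg := map[string]map[string]string{"Flush": {"Frequency": v.FlushFreq}}
	b, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// changefeedStmt builds a CREATE CHANGEFEED statement, adding the WITH
// clause only for cadenced variants.
func changefeedStmt(table, sinkURI string, v variant) (string, error) {
	stmt := fmt.Sprintf("CREATE CHANGEFEED FOR TABLE %s INTO '%s'", table, sinkURI)
	cfg, err := sinkConfigJSON(v)
	if err != nil {
		return "", err
	}
	if cfg != "" {
		stmt += fmt.Sprintf(" WITH kafka_sink_config = '%s'", cfg)
	}
	return stmt, nil
}

func main() {
	variants := []variant{
		{Name: "control"},
		{Name: "10ms", FlushFreq: "10ms"},
		{Name: "5s", FlushFreq: "5s"},
	}
	for _, v := range variants {
		// The table name and sink URI are placeholders.
		stmt, err := changefeedStmt("kv.kv", "kafka://kafka-host:9092", v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("cdc/linger/%s: %s\n", v.Name, stmt)
	}
}
```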

Restore the original len(nData)*64 (256 worker) concurrency for all cdc/linger/* variants. The earlier cut to *8 kept the cadenced variants in the sub-saturation regime so latency reflected flush cadence, but it also capped offered load below what's needed to see where the no-batching control tops out. Pushing all variants to saturation makes the control comparable to the matrix data and lets the cadenced variants surface the same upstream-bound ceiling we measured before.

Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

blathers-crl · 2026-05-21T15:42:48Z

Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs. (run link)

Move the kafka sink node to us-west1 while keeping crdb and workload in us-east1, so per-flush RTT to the sink is ~25-35ms instead of sub-ms. The earlier colocated runs showed cadence had no measurable throughput effect because per-flush overhead was rounding error compared to per-event work; the bottleneck was always upstream (encoder CPU at 90%). With a slow sink, per-flush cost becomes a real fraction of total work and Jeff's throughput formula (concurrency / latency * batch_size) should finally make slack cadence visibly beat tight cadence on throughput. That's the regime #170200's premise was about.

Layout (with withNumSinkNodes(1) at runtime):

    node 1-4: crdb       (us-east1-b)
    node 5:   kafka sink (us-west1-b)
    node 6:   workload   (us-east1-b)

CompatibleClouds narrows to OnlyGCE because GCEZones is GCE-specific. A reused single-region cluster won't work with --cluster for these; roachtest provisions a new multi-region one on first run.

Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Add three initial-scan-only changefeed variants (control / 10ms / 100ms) that bulk-load 10M rows via IMPORT, then run an initial_scan='only' changefeed and wait for completion. Throughput is logged as rows_emitted / elapsed.

The initial scan emits events as fast as the source-to-sink pipeline can drain (no per-write transaction overhead, no admission control gating), so the sink is more likely to be the binding constraint than under a live write workload. That's the regime where slack linger cadence should amortize per-flush overhead into a real throughput win.

Also revert cdc/linger/* back to single-region. The multi-region variant didn't produce a measurable change in behavior (kgo pipelining hides the cross-region RTT well enough that the upstream encoder CPU stays the binding constraint), and the extra topology just makes the tests harder to reason about. cdc/linger/* now uses the same single-zone topology as cdc/initial-scan/*.

Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

cockroach-teamcity · 2026-05-26T15:23:54Z

⚪ Sysbench [SQL, 3node, oltp_read_write]

Metric	Old Commit	New Commit	Delta	Note
⚪ sec/op	9.357m ±2%	9.243m ±3%	~	p=0.267 n=15
⚪ allocs/op	6.258k ±0%	6.254k ±0%	~	p=0.830 n=15

Reproduce

benchdiff binaries:

mkdir -p benchdiff/755b3da/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/755b3da2f964ccc7d9509077f3acdd884a442029/bin/pkg_sql_tests benchdiff/755b3da/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/755b3da/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/2ba887e/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/2ba887ef6734ff41f2b1e2ee4efc1aa84760e711/bin/pkg_sql_tests benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/SQL/3node/oltp_read_write$ --old=2ba887e --new=755b3da --memprofile ./pkg/sql/tests

⚪ Sysbench [KV, 3node, oltp_read_only]

Metric	Old Commit	New Commit	Delta	Note
⚪ sec/op	3.045m ±1%	3.058m ±1%	~	p=0.217 n=15
⚪ allocs/op	2.101k ±0%	2.101k ±0%	~	p=0.257 n=15

Reproduce

benchdiff binaries:

mkdir -p benchdiff/755b3da/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/755b3da2f964ccc7d9509077f3acdd884a442029/bin/pkg_sql_tests benchdiff/755b3da/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/755b3da/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/2ba887e/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/2ba887ef6734ff41f2b1e2ee4efc1aa84760e711/bin/pkg_sql_tests benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_read_only$ --old=2ba887e --new=755b3da --memprofile ./pkg/sql/tests

🔴 Sysbench [KV, 3node, oltp_write_only]

Metric	Old Commit	New Commit	Delta	Note
🔴 sec/op	2.799m ±1%	2.833m ±1%	+1.23%	p=0.000 n=15
⚪ allocs/op	4.202k ±0%	4.203k ±0%	~	p=0.059 n=15

Reproduce

benchdiff binaries:

mkdir -p benchdiff/755b3da/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/755b3da2f964ccc7d9509077f3acdd884a442029/bin/pkg_sql_tests benchdiff/755b3da/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/755b3da/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/2ba887e/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/2ba887ef6734ff41f2b1e2ee4efc1aa84760e711/bin/pkg_sql_tests benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_write_only$ --old=2ba887e --new=755b3da --memprofile ./pkg/sql/tests

Artifacts

download:

mkdir -p new
gcloud storage cp gs://cockroach-microbench-ci/artifacts/755b3da2f964ccc7d9509077f3acdd884a442029/26455525551-1/\* new/
mkdir -p old
gcloud storage cp gs://cockroach-microbench-ci/artifacts/2ba887ef6734ff41f2b1e2ee4efc1aa84760e711/26455525551-1/\* old/

built with commit: 755b3da2f964ccc7d9509077f3acdd884a442029

10M produced only ~30s of throughput plateau before the scan completed, too brief to read meaningful differences between variants. 50M (~3.75 GB) gives several minutes of plateau at the throughput rates we've been seeing, which is what's needed to compare the control vs 10ms vs 100ms variants.

Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

cockroach-teamcity · 2026-05-26T15:51:50Z

⚪ Sysbench [SQL, 3node, oltp_read_write]

Metric	Old Commit	New Commit	Delta	Note
⚪ sec/op	10.34m ±10%	10.16m ±9%	~	p=0.436 n=15
⚪ allocs/op	6.269k ±1%	6.267k ±0%	~	p=0.560 n=15

Reproduce

benchdiff binaries:

mkdir -p benchdiff/4711765/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/471176512c71ef925ac0fd323345c07e691a769c/bin/pkg_sql_tests benchdiff/4711765/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/4711765/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/2ba887e/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/2ba887ef6734ff41f2b1e2ee4efc1aa84760e711/bin/pkg_sql_tests benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/SQL/3node/oltp_read_write$ --old=2ba887e --new=4711765 --memprofile ./pkg/sql/tests

⚪ Sysbench [KV, 3node, oltp_read_only]

Metric	Old Commit	New Commit	Delta	Note
⚪ sec/op	3.134m ±1%	3.134m ±1%	~	p=0.967 n=15
⚪ allocs/op	2.102k ±0%	2.102k ±0%	~	p=0.524 n=15

Reproduce

benchdiff binaries:

mkdir -p benchdiff/4711765/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/471176512c71ef925ac0fd323345c07e691a769c/bin/pkg_sql_tests benchdiff/4711765/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/4711765/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/2ba887e/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/2ba887ef6734ff41f2b1e2ee4efc1aa84760e711/bin/pkg_sql_tests benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_read_only$ --old=2ba887e --new=4711765 --memprofile ./pkg/sql/tests

⚪ Sysbench [KV, 3node, oltp_write_only]

Metric	Old Commit	New Commit	Delta	Note
⚪ sec/op	2.959m ±0%	2.945m ±1%	-0.46%	p=0.009 n=15
⚪ allocs/op	4.217k ±0%	4.216k ±0%	~	p=0.599 n=15

Reproduce

benchdiff binaries:

mkdir -p benchdiff/4711765/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/471176512c71ef925ac0fd323345c07e691a769c/bin/pkg_sql_tests benchdiff/4711765/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/4711765/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/2ba887e/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/2ba887ef6734ff41f2b1e2ee4efc1aa84760e711/bin/pkg_sql_tests benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_write_only$ --old=2ba887e --new=4711765 --memprofile ./pkg/sql/tests

Artifacts

download:

mkdir -p new
gcloud storage cp gs://cockroach-microbench-ci/artifacts/471176512c71ef925ac0fd323345c07e691a769c/26458093175-1/\* new/
mkdir -p old
gcloud storage cp gs://cockroach-microbench-ci/artifacts/2ba887ef6734ff41f2b1e2ee4efc1aa84760e711/26458093175-1/\* old/

built with commit: 471176512c71ef925ac0fd323345c07e691a769c

blathers-crl · 2026-05-26T15:54:29Z

Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs. (run link)

Add two variants that combine 100ms linger with an explicit Flush.Messages size cap. The hypothesis under test: at sustained high load, the size cap fires before the linger timer, so the effective flush cadence collapses to cap/throughput rather than the configured 100ms. If 100ms-cap1500 matches 10ms throughput, that confirms the tuning recommendation can be reframed as "set a Messages cap based on amortization needs, treat Frequency as a low-load safety floor" instead of "set Frequency to match RTT."

Refactor the variant registration to iterate over a struct slice since the (name, freq, messages) tuple no longer maps cleanly to a single flushFreq string.

Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

cockroach-teamcity · 2026-05-26T17:14:11Z

⚪ Sysbench [SQL, 3node, oltp_read_write]

Metric	Old Commit	New Commit	Delta	Note
⚪ sec/op	10.11m ±0%	10.10m ±0%	~	p=0.367 n=15
⚪ allocs/op	6.264k ±1%	6.258k ±1%	~	p=0.976 n=15

Reproduce

benchdiff binaries:

mkdir -p benchdiff/8e2b8eb/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/8e2b8eb40bf019ad414e35dc9f36a9ed1f3f7bcb/bin/pkg_sql_tests benchdiff/8e2b8eb/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/8e2b8eb/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/2ba887e/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/2ba887ef6734ff41f2b1e2ee4efc1aa84760e711/bin/pkg_sql_tests benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/SQL/3node/oltp_read_write$ --old=2ba887e --new=8e2b8eb --memprofile ./pkg/sql/tests

⚪ Sysbench [KV, 3node, oltp_read_only]

Metric	Old Commit	New Commit	Delta	Note
⚪ sec/op	3.053m ±1%	3.066m ±0%	+0.40%	p=0.004 n=15
⚪ allocs/op	2.101k ±0%	2.101k ±0%	~	p=0.623 n=15

Reproduce

benchdiff binaries:

mkdir -p benchdiff/8e2b8eb/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/8e2b8eb40bf019ad414e35dc9f36a9ed1f3f7bcb/bin/pkg_sql_tests benchdiff/8e2b8eb/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/8e2b8eb/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/2ba887e/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/2ba887ef6734ff41f2b1e2ee4efc1aa84760e711/bin/pkg_sql_tests benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_read_only$ --old=2ba887e --new=8e2b8eb --memprofile ./pkg/sql/tests

⚪ Sysbench [KV, 3node, oltp_write_only]

Metric	Old Commit	New Commit	Delta	Note
⚪ sec/op	2.822m ±0%	2.835m ±0%	~	p=0.061 n=15
⚪ allocs/op	4.202k ±0%	4.205k ±0%	~	p=0.211 n=15

Reproduce

benchdiff binaries:

mkdir -p benchdiff/8e2b8eb/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/8e2b8eb40bf019ad414e35dc9f36a9ed1f3f7bcb/bin/pkg_sql_tests benchdiff/8e2b8eb/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/8e2b8eb/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/2ba887e/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/2ba887ef6734ff41f2b1e2ee4efc1aa84760e711/bin/pkg_sql_tests benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_write_only$ --old=2ba887e --new=8e2b8eb --memprofile ./pkg/sql/tests

Artifacts

download:

mkdir -p new
gcloud storage cp gs://cockroach-microbench-ci/artifacts/8e2b8eb40bf019ad414e35dc9f36a9ed1f3f7bcb/26462028738-1/\* new/
mkdir -p old
gcloud storage cp gs://cockroach-microbench-ci/artifacts/2ba887ef6734ff41f2b1e2ee4efc1aa84760e711/26462028738-1/\* old/

built with commit: 8e2b8eb40bf019ad414e35dc9f36a9ed1f3f7bcb

aerfrei · 2026-05-27T16:26:57Z

Superseded by #170945

aerfrei and others added 9 commits May 19, 2026 15:49

aerfrei force-pushed the roachtest-cdc-linger-tradeoff-170200 branch from 735c435 to f0cd797 Compare May 21, 2026 15:46

aerfrei force-pushed the roachtest-cdc-linger-tradeoff-170200 branch from c383282 to 755b3da Compare May 26, 2026 14:46

cockroach-teamcity added the X-perf-check Microbenchmarks CI: Added to a PR if a performance regression is detected and should be checked label May 26, 2026

aerfrei closed this May 27, 2026

aerfrei wants to merge 14 commits into cockroachdb:master from aerfrei:roachtest-cdc-linger-tradeoff-170200
