roachtest/cdc: add linger-tradeoff perf variants #170586
Add cdc/linger/10ms and cdc/linger/5s as Weekly-suite benchmarks of the v2 kafka sink. Both run the same kv workload with `--ramp 10m --duration 12m --read-percent 0` against a single-table changefeed and differ only in `kafka_sink_config.Flush.Frequency`, which the v2 kafka path wires straight into `batchingSink.minFlushFrequency` (pkg/ccl/changefeedccl/sink_kafka_v2.go:430). Together the two curves trace the linger tradeoff on a real Kafka sink: the tight cadence should win at low load, the slack cadence at saturation. This closes the microbench gap called out in #170200, where the existing batching benchmarks don't capture per-flush network/broker cost.
Server-side `changefeed.commit_latency` (p50, p99) and the rate of `changefeed.emitted_messages` are exported to roachperf as per-tick prometheus series via a parameterized `startStatsCollection`. When the no-linger sink lands, a third registration that flips `changefeed.no_linger_sink.enabled` will overlay against the same methodology and produce the comparison plot #170574 asks for.
Informs: #170200
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Two fixes for cdc/linger/* roachperf surfaced by the first run:
1. NaN-scrub stats before serialization. The new commit_latency p50/p99 aggregations are histogram_quantile-based, and histogram_quantile returns NaN whenever the rate window has zero increments: at the head of the run before the first event, and at the tail after the workload stops. Go's encoding/json refuses to serialize NaN, so the entire perf export crashed at the end of the test. Route the export through dryRun=true, walk the returned ClusterStatRun zeroing any NaN, then call SerializeOutRun manually. online_restore.go has the same workaround inline (with a comment noting PromQL `or vector(0)` does not fix it).
2. Cut --duration from 12m to 2m. kv's --duration is additive with --ramp (workload/cli/run.go:51), so the original 10m+12m gave a 22m workload, nearly twice what the plan called for. The corrected 10m ramp + 2m tail matches the plan: a steady-state segment at peak just long enough to confirm saturation.
Informs: #170200
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Replace cdc/linger/{10ms,5s} with a single cdc/linger/sweep test that
walks Flush.Frequency across 10ms, 100ms, 500ms, 1s, 5s back-to-back on
one cluster. Each variant runs the same kv --ramp workload; the test
cancels the previous changefeed and issues a fresh CREATE CHANGEFEED
against the cached sink URI between variants, while the topic consumers
spawned by the initial setupSink keep draining throughout.
Variant boundaries are logged as t.Status `=== variant N/M ===` lines
in test.log, and the Grafana dashboard spans the full run so each
variant is identifiable by its time range. No per-variant stats.json is
written — Grafana is the source of truth.
Lets the whole sweep run unattended from one command rather than
queueing N separate tests.
Informs: #170200
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Replace cdc/linger/sweep with five independent registrations
cdc/linger/{10ms,100ms,500ms,1s,5s}. Each test runs a single kv --ramp
workload against a v2 Kafka changefeed with that Flush.Frequency,
restoring the per-test artifact directory, independent failure
recovery, and easy single-variant re-runs.
The whole sweep can still be driven from one command on a reused
cluster:
bin/roachtest run --cluster <name> --wipe cdc/linger
which matches all five tests by substring and runs them in series,
wiping cluster state between each.
Informs: #170200
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Cut concurrency from len(nData)*64 to len(nData)*8 (256 -> 32 workers on a 4-data-node cluster).
The first sweep saturated every variant well inside the ramp: throughput climbed, peaked, then collapsed as source-side backpressure kicked in, and commit latency went exponential. We were measuring how each variant breaks rather than the latency-vs-throughput tradeoff the test is meant to expose. Lower offered load keeps the sink ahead of the workload for the bulk of the ramp, so latency reflects flush cadence (the linger tradeoff proper) instead of buffer fill.
Informs: #170200
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Register cdc/linger/loadmatrix, a single test that sweeps a 3x5 matrix of (cadence, offered-rate) cells on one cluster. Each cell runs the kv workload at a fixed --max-rate for 4 minutes, producing a steady-state latency point per (cadence, rate) pair. Cell boundaries are logged as t.Status markers so the Grafana dashboard can be sliced by timestamp.
The existing fixed-load cdc/linger/* variants show the latency cost of slack cadence cleanly but did not actually demonstrate the throughput cost of tight cadence: 10ms and 100ms saturated at indistinguishable points (~3.3 MiB/s) because the bottleneck was upstream, not the sink. The matrix pushes offered load above that point, so if tight cadences collapse first while slack cadences hold up, we have the tradeoff; otherwise we have evidence that for this Kafka config the linger story is mostly latency, not throughput. Either outcome is the answer #170200 wants.
Cadence transitions go via CANCEL JOB + CREATE CHANGEFEED against the cached sink URI (same pattern as the previous sweep prototype). Topic consumers from the initial setupSink stay up across cadence changes.
Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Two fixes for cdc/linger/loadmatrix off the first run:
1. Zoom the load grid on the saturation transition. The first matrix used 12k/25k/40k/60k/80k, but cells at 12k and 25k were indistinguishable across cadences: both are well below saturation and below where any cadence-dependent behavior would manifest. The interesting question is the exact rate each cadence's pipeline tips over at, so replace those low cells with finer points in the saturation zone: 40k/45k/50k/55k/60k/70k. Measurement time is 6 loads x 3 cadences x 4 min = 72 min, up from 60 min for the original 3x5 grid.
2. Add `resolved = ''` to the CANCEL+CREATE changefeed SQL. The first changefeed (created via ct.newChangefeed) has it; the subsequent ones (created via raw SQL in the loop) did not, so resolved-timestamp emission stopped after the first cadence transition. The row data still flowed and commit_latency was still measured, but the topic consumer's resolved count froze and the Max Changefeed Latency Grafana panel had no highwater to advance. With this fix the lag signal survives all cadence transitions.
Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
The matrix workload was exiting on the first NEW_LEASE_PREVENTS_TXN / ambiguous-result error fired after a cadence transition that followed an over-saturation cell. Cell 6 (10ms x 70k) overdrove the sink for 4 minutes; the cluster was still working off that backpressure when cell 7 started, and the first transient SQL error killed the workload.
Add --tolerate-errors to the kv workload command so transient errors log and continue rather than aborting the run. cdc_bench.go does the same thing conditionally for high-range-count cases; it is the same class of mitigation for the same class of transient.
Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Register a new cdc/linger/control variant alongside the existing
cdc/linger/{10ms,100ms,500ms,1s,5s} tests. The control omits
kafka_sink_config entirely so the v2 sink falls through to kgo's
default producer-side batching with no CDC-level linger trigger.
This is the natural baseline for the linger experiment: it shows the
floor the cadenced variants are trying to beat, and lets us check
whether very tight cadences (10ms) are effectively pass-through for
this setup. If 10ms and control look identical, that confirms the
sub-ms per-flush overhead hypothesis directly.
Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
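A rough sketch of how a control variant might differ from a cadenced one when building changefeed options: the cadenced variants carry a `kafka_sink_config` with `Flush.Frequency`, while the control emits no option at all. The `flushConfig`/`sinkConfig` types and the `sinkConfigOption` helper are hypothetical and not taken from the PR; the JSON shape follows the `Flush.Frequency` field named in the PR description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// flushConfig and sinkConfig model only the part of kafka_sink_config
// that this experiment varies. They are illustrative types.
type flushConfig struct {
	Frequency string `json:"Frequency,omitempty"`
}

type sinkConfig struct {
	Flush flushConfig `json:"Flush"`
}

// sinkConfigOption returns the kafka_sink_config option clause for a
// cadenced variant, or the empty string for the control (flushFreq == "").
func sinkConfigOption(flushFreq string) (string, error) {
	if flushFreq == "" {
		// Control: omit kafka_sink_config entirely.
		return "", nil
	}
	b, err := json.Marshal(sinkConfig{Flush: flushConfig{Frequency: flushFreq}})
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("kafka_sink_config = '%s'", b), nil
}

func main() {
	for _, freq := range []string{"", "10ms", "5s"} {
		opt, err := sinkConfigOption(freq)
		if err != nil {
			panic(err)
		}
		fmt.Printf("freq=%q option=%q\n", freq, opt)
	}
}
```

Keeping the control on the same code path, differing only in the omitted option, is what makes a 10ms-versus-control comparison meaningful.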
Restore the original len(nData)*64 (256 worker) concurrency for all cdc/linger/* variants.
The earlier cut to *8 kept the cadenced variants in the sub-saturation regime so latency reflected flush cadence, but it also capped offered load below what's needed to see where the no-batching control tops out. Pushing all variants to saturation makes the control comparable to the matrix data and lets the cadenced variants surface the same upstream-bound ceiling we measured before.
Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs. (run link)
Move the kafka sink node to us-west1 while keeping crdb and workload in us-east1, so per-flush RTT to the sink is ~25-35ms instead of sub-ms.
The earlier colocated runs showed cadence had no measurable throughput effect because per-flush overhead was rounding error compared to per-event work: the bottleneck was always upstream (encoder CPU at 90%). With a slow sink, per-flush cost becomes a real fraction of total work and Jeff's throughput formula (concurrency / latency * batch_size) should finally make slack cadence visibly beat tight cadence on throughput. That's the regime #170200's premise was about.
Layout (with withNumSinkNodes(1) at runtime):
  node 1-4: crdb (us-east1-b)
  node 5: kafka sink (us-west1-b)
  node 6: workload (us-east1-b)
CompatibleClouds narrows to OnlyGCE because GCEZones is GCE-specific. A reused single-region cluster won't work with --cluster for these; roachtest provisions a new multi-region one on first run.
Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Add three initial-scan-only changefeed variants (control / 10ms / 100ms) that bulk-load 10M rows via IMPORT, then run an initial_scan='only' changefeed and wait for completion. Throughput is logged as rows_emitted / elapsed.
The initial scan emits events as fast as the source-to-sink pipeline can drain, with no per-write transaction overhead and no admission control gating, so the sink is more likely to be the binding constraint than under a live write workload. That's the regime where slack linger cadence should amortize per-flush overhead into a real throughput win.
Also revert cdc/linger/* back to single-region. The multi-region variant didn't produce a measurable change in behavior (kgo pipelining hides the cross-region RTT well enough that the upstream encoder CPU stays the binding constraint), and the extra topology just makes the tests harder to reason about. cdc/linger/* now uses the same single-zone topology as cdc/initial-scan/*.
Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
⚪ Sysbench [SQL, 3node, oltp_read_write]
Reproduce
benchdiff binaries:
mkdir -p benchdiff/755b3da/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/755b3da2f964ccc7d9509077f3acdd884a442029/bin/pkg_sql_tests benchdiff/755b3da/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/755b3da/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/2ba887e/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/2ba887ef6734ff41f2b1e2ee4efc1aa84760e711/bin/pkg_sql_tests benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
benchdiff command:
# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/SQL/3node/oltp_read_write$ --old=2ba887e --new=755b3da --memprofile ./pkg/sql/tests
⚪ Sysbench [KV, 3node, oltp_read_only]
Reproduce
benchdiff binaries:
mkdir -p benchdiff/755b3da/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/755b3da2f964ccc7d9509077f3acdd884a442029/bin/pkg_sql_tests benchdiff/755b3da/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/755b3da/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/2ba887e/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/2ba887ef6734ff41f2b1e2ee4efc1aa84760e711/bin/pkg_sql_tests benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
benchdiff command:
# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_read_only$ --old=2ba887e --new=755b3da --memprofile ./pkg/sql/tests
🔴 Sysbench [KV, 3node, oltp_write_only]
Reproduce
benchdiff binaries:
mkdir -p benchdiff/755b3da/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/755b3da2f964ccc7d9509077f3acdd884a442029/bin/pkg_sql_tests benchdiff/755b3da/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/755b3da/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/2ba887e/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/2ba887ef6734ff41f2b1e2ee4efc1aa84760e711/bin/pkg_sql_tests benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/2ba887e/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
benchdiff command:
# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_write_only$ --old=2ba887e --new=755b3da --memprofile ./pkg/sql/tests
Artifacts
download:
mkdir -p new
gcloud storage cp gs://cockroach-microbench-ci/artifacts/755b3da2f964ccc7d9509077f3acdd884a442029/26455525551-1/\* new/
mkdir -p old
gcloud storage cp gs://cockroach-microbench-ci/artifacts/2ba887ef6734ff41f2b1e2ee4efc1aa84760e711/26455525551-1/\* old/
built with commit: 755b3da2f964ccc7d9509077f3acdd884a442029
10M produced only ~30s of throughput plateau before the scan completed, too brief to read meaningful differences between variants. 50M (~3.75 GB) gives several minutes of plateau at the throughput rates we've been seeing, which is what's needed to compare the control vs 10ms vs 100ms variants.
Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Add two variants that combine 100ms linger with an explicit Flush.Messages size cap.
The hypothesis under test: at sustained high load, the size cap fires before the linger timer, so the effective flush cadence collapses to cap/throughput rather than the configured 100ms. If 100ms-cap1500 matches 10ms throughput, that confirms the tuning recommendation can be reframed as "set a Messages cap based on amortization needs, treat Frequency as a low-load safety floor" instead of "set Frequency to match RTT."
Refactor the variant registration to iterate over a struct slice since the (name, freq, messages) tuple no longer maps cleanly to a single flushFreq string.
Informs: #170200, #170574
Epic: CRDB-62737
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Superseded by #170945