
obs/ash: add BenchmarkSetWorkState microbenchmark#168292

Merged
trunk-io[bot] merged 1 commit into cockroachdb:master from alyshanjahani-crl:alyshan/ash-bench-set-work-state on Apr 22, 2026


@alyshanjahani-crl
Collaborator

Add a microbenchmark for the SetWorkState + clearWorkState hot path. This complements the existing end-to-end BenchmarkASH in pkg/bench by isolating the activeWorkStates sync.Map contention that degrades on high-core machines. Varying GOMAXPROCS via -test.cpu shows how per-operation latency scales with core count due to cache coherence traffic on the map's internal mutex.

Fixes: #164683

Release note: None
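
For readers who want a feel for the shape of such a microbenchmark, here is a minimal sketch. It is not the `BenchmarkSetWorkState` added in this PR: the `registry` and `workState` types, their fields, and the `set` method are hypothetical stand-ins, and the only thing assumed from the description is a shared `sync.Map` that is written on set and deleted on clear, with a `sync.Pool` recycling the state objects.

```go
package main

import (
	"sync"
	"sync/atomic"
	"testing"
)

// workState is a hypothetical stand-in for the per-goroutine state ASH tracks.
type workState struct {
	event string
}

// registry is a hypothetical model of the hot path: a shared sync.Map of
// active states plus a sync.Pool that recycles state objects.
type registry struct {
	active sync.Map // key: int64 worker id, value: *workState
	pool   sync.Pool
}

func newRegistry() *registry {
	r := &registry{}
	r.pool.New = func() any { return new(workState) }
	return r
}

// set registers a state under id and returns a cleanup function that
// removes it from the map and returns the object to the pool.
func (r *registry) set(id int64, event string) (cleanup func()) {
	ws := r.pool.Get().(*workState)
	ws.event = event
	r.active.Store(id, ws)
	return func() {
		r.active.Delete(id)
		*ws = workState{} // reset before reuse
		r.pool.Put(ws)
	}
}

// BenchmarkSetAndClear hammers set + cleanup from GOMAXPROCS goroutines so
// that contention on the shared map shows up in ns/op.
func BenchmarkSetAndClear(b *testing.B) {
	r := newRegistry()
	var nextID atomic.Int64
	b.ReportAllocs()
	b.RunParallel(func(pb *testing.PB) {
		id := nextID.Add(1) // one key per worker goroutine, chosen outside the loop
		for pb.Next() {
			cleanup := r.set(id, "example")
			cleanup()
		}
	})
}
```

Saved in a `_test.go` file, a sketch like this can be run with `go test -bench=SetAndClear -cpu=1,2,4,8 -count=6`, where `-cpu` sets the GOMAXPROCS values to iterate over. Picking the key once per worker keeps the ID counter out of the measured loop, so the loop mostly exercises the map and the pool.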

alyshanjahani-crl requested a review from a team as a code owner on April 13, 2026 21:01
alyshanjahani-crl requested review from kyle-a-wong and removed the request for a team on April 13, 2026 21:01
@trunk-io
Contributor

trunk-io Bot commented Apr 13, 2026

😎 Merged successfully - details.

@cockroach-teamcity
Member

This change is Reviewable

@blathers-crl

blathers-crl Bot commented Apr 13, 2026

Detected infrastructure failure (matched: ). Automatically rerunning failed jobs. (run link)

@alyshanjahani-crl
Collaborator Author

Note to self: update obs/ash/doc.go. I just noticed it's incorrect regarding:

// The on-hot-path cost is limited to SetWorkState and its cleanup
// function: a sync.Pool get/put and a sync.Map store/load. Benchmarks
// show ASH adds a near-fixed ~600-700 bytes and ~17-21 allocations per
// operation with no statistically significant impact on latency or
// throughput.

The ~600-700 bytes and ~17-21 allocations figures refer to the POC benchmarks, so this comment is outdated and no longer accurate.

https://docs.google.com/document/d/164fCBKytBGiRDPPMezQDGGiQClVW0wJLAzCEmwb2QKo/edit?tab=t.7vnc0waxjm6l

alyshanjahani-crl force-pushed the alyshan/ash-bench-set-work-state branch from c2f1815 to 387c8eb on April 21, 2026 19:26
@alyshanjahani-crl
Collaborator Author

alyshanjahani-crl commented Apr 21, 2026


  Raw results (6 iterations, --bench-time=5s, Intel Xeon @ 2.80GHz, 24 vCPUs):

  BenchmarkSetWorkState         34110825               171.1 ns/op
  BenchmarkSetWorkState         34300887               172.2 ns/op
  BenchmarkSetWorkState         33504736               171.0 ns/op
  BenchmarkSetWorkState         34330886               171.1 ns/op
  BenchmarkSetWorkState         34143302               171.2 ns/op
  BenchmarkSetWorkState         33815146               172.3 ns/op
  BenchmarkSetWorkState-2       20995471               287.8 ns/op
  BenchmarkSetWorkState-2       20942814               288.0 ns/op
  BenchmarkSetWorkState-2       20575088               287.6 ns/op
  BenchmarkSetWorkState-2       20778625               287.8 ns/op
  BenchmarkSetWorkState-2       20296778               287.9 ns/op
  BenchmarkSetWorkState-2       20407939               289.6 ns/op
  BenchmarkSetWorkState-4       21831841               252.7 ns/op
  BenchmarkSetWorkState-4       22222024               261.2 ns/op
  BenchmarkSetWorkState-4       23143623               261.8 ns/op
  BenchmarkSetWorkState-4       23113642               255.8 ns/op
  BenchmarkSetWorkState-4       23917795               241.0 ns/op
  BenchmarkSetWorkState-4       24120210               253.7 ns/op
  BenchmarkSetWorkState-8       17623387               308.2 ns/op
  BenchmarkSetWorkState-8       19018402               321.8 ns/op
  BenchmarkSetWorkState-8       16432999               323.9 ns/op
  BenchmarkSetWorkState-8       19350806               312.2 ns/op
  BenchmarkSetWorkState-8       16972249               324.9 ns/op
  BenchmarkSetWorkState-8       19545050               307.9 ns/op
  BenchmarkSetWorkState-12      17055940               347.7 ns/op
  BenchmarkSetWorkState-12      15682952               348.7 ns/op
  BenchmarkSetWorkState-12      17353879               345.0 ns/op
  BenchmarkSetWorkState-12      16530517               353.3 ns/op
  BenchmarkSetWorkState-12      15407740               390.0 ns/op
  BenchmarkSetWorkState-12      17088633               345.7 ns/op
  BenchmarkSetWorkState-24      12323400               479.7 ns/op
  BenchmarkSetWorkState-24      12400852               434.8 ns/op
  BenchmarkSetWorkState-24      12586096               476.2 ns/op
  BenchmarkSetWorkState-24      12506076               475.1 ns/op
  BenchmarkSetWorkState-24      12657166               476.9 ns/op
  BenchmarkSetWorkState-24      12559584               477.7 ns/op

  benchstat summary:

                    │    sec/op     │
  SetWorkState        171.1n ±  1%
  SetWorkState-2      287.9n ±  1%
  SetWorkState-4      254.7n ±  5%
  SetWorkState-8      317.0n ±  3%
  SetWorkState-12     348.2n ± 12%
  SetWorkState-24     476.6n ±  9%
  geomean             295.1n

The contention curve is clear: per-operation latency degrades about 2.8x going from 1 to 24 cores (171.1 ns to 476.6 ns).
The ±12% spread at 12 cores and ±9% at 24 suggest increasing jitter from cache-line bouncing on the syncutil.Map mutex.
The 4-core dip below the 2-core result is interesting; it is likely that the map's read path benefits from parallel Load hits before write contention dominates.

Add a microbenchmark for the SetWorkState + clearWorkState hot path.
This complements the existing end-to-end BenchmarkASH in pkg/bench by
isolating the activeWorkStates sync.Map contention that degrades on
high-core machines. Varying GOMAXPROCS via -test.cpu shows how
per-operation latency scales with core count due to cache coherence
traffic on the map's internal mutex.

Also updates old / inaccurate information in obs/ash/doc.go

Fixes: cockroachdb#164683

Release note: None

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

alyshanjahani-crl force-pushed the alyshan/ash-bench-set-work-state branch from 387c8eb to ea2b7d9 on April 21, 2026 19:38
@alyshanjahani-crl
Collaborator Author

TFTR!

@alyshanjahani-crl
Collaborator Author

/trunk merge

trunk-io Bot merged commit e774873 into cockroachdb:master on Apr 22, 2026
26 checks passed

Development

Successfully merging this pull request may close these issues.

obs/ash: add microbenchmarks for ASH

3 participants