feat: add capped sampling rate increase #8286
Benchmarks
Benchmark execution time: 2026-03-10 21:55:02
Comparing candidate commit ceb762f in PR branch
Found 8 performance improvements and 7 performance regressions! Performance is the same for 154 metrics, 23 unstable metrics.
scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces net6.0
scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleMoreComplexBody netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody net6.0
scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1
scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice net472
scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice netcoreapp3.1
scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync netcoreapp3.1
scenario:Benchmarks.Trace.NLogBenchmark.EnrichedLog net6.0
scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore net6.0
scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore netcoreapp3.1
scenario:Benchmarks.Trace.SpanBenchmark.StartFinishScope net6.0
scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan netcoreapp3.1
Execution-Time Benchmarks Report ⏱️
Execution-time results for samples comparing This PR (8286) and master.
✅ No regressions detected - check the details below
Full Metrics Comparison
FakeDbCommand
HttpMessageHandler
Comparison explanation
Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

Duration charts

Execution time (ms), shown as mean (charted interval) for each sample, runtime, and scenario.

| Sample | Runtime | Scenario | This PR (8286) | master |
|---|---|---|---|---|
| FakeDbCommand | .NET Framework 4.8 | Baseline | 76 (73-78) | 75 (72-77) |
| FakeDbCommand | .NET Framework 4.8 | Bailout | 80 (77-83) | 79 (77-80) |
| FakeDbCommand | .NET Framework 4.8 | CallTarget+Inlining+NGEN | 1,089 (1051-1128) | 1,083 (1042-1124) |
| FakeDbCommand | .NET Core 3.1 | Baseline | 117 (113-121) | 115 (111-118) |
| FakeDbCommand | .NET Core 3.1 | Bailout | 117 (115-120) | 116 (114-119) |
| FakeDbCommand | .NET Core 3.1 | CallTarget+Inlining+NGEN | 775 (717-833) | 777 (718-837) |
| FakeDbCommand | .NET 6 | Baseline | 105 (102-109) | 104 (101-108) |
| FakeDbCommand | .NET 6 | Bailout | 106 (104-109) | 105 (102-109) |
| FakeDbCommand | .NET 6 | CallTarget+Inlining+NGEN | 767 (702-833) | 764 (691-836) |
| FakeDbCommand | .NET 8 | Baseline | 103 (100-107) | 102 (100-105) |
| FakeDbCommand | .NET 8 | Bailout | 107 (103-110) | 104 (101-106) |
| FakeDbCommand | .NET 8 | CallTarget+Inlining+NGEN | 687 (658-716) | 676 (659-693) |
| HttpMessageHandler | .NET Framework 4.8 | Baseline | 195 (188-201) | 195 (190-200) |
| HttpMessageHandler | .NET Framework 4.8 | Bailout | 199 (194-204) | 199 (196-203) |
| HttpMessageHandler | .NET Framework 4.8 | CallTarget+Inlining+NGEN | 1,163 (1091-1235) | 1,153 (1115-1191) |
| HttpMessageHandler | .NET Core 3.1 | Baseline | 282 (274-289) | 280 (275-284) |
| HttpMessageHandler | .NET Core 3.1 | Bailout | 280 (275-285) | 281 (277-285) |
| HttpMessageHandler | .NET Core 3.1 | CallTarget+Inlining+NGEN | 951 (918-984) | 945 (908-983) |
| HttpMessageHandler | .NET 6 | Baseline | 272 (267-277) | 272 (266-278) |
| HttpMessageHandler | .NET 6 | Bailout | 272 (268-275) | 272 (268-275) |
| HttpMessageHandler | .NET 6 | CallTarget+Inlining+NGEN | 931 (882-980) | 937 (906-967) |
| HttpMessageHandler | .NET 8 | Baseline | 272 (267-277) | 272 (266-277) |
| HttpMessageHandler | .NET 8 | Bailout | 272 (267-277) | 274 (268-280) |
| HttpMessageHandler | .NET 8 | CallTarget+Inlining+NGEN | 841 (814-868) | 839 (814-864) |
andrewlock
left a comment
Thanks, but I think there's a bug which can leave you stuck never updating the rates (I haven't checked whether the same issue exists in the other referenced PRs). Also a few stylistic things
    [InlineData(0.2f, 0.8f, false, 0.2f)] // increase blocked during cooldown
    public void CappedRate(float oldRate, float newRate, bool canIncrease, float expected)
    {
        AgentSamplingRule.CappedRate(oldRate, newRate, canIncrease).Should().Be(expected);
Will need to fix this test in line with the suggested changes above
Co-authored-by: Andrew Lock <andrewlock.net@gmail.com>
@codex review
I haven't reworked the PR yet, sorry, I'll do it by EOW (need to get my dotnet setup working)
When the trace-agent restarts, it initially reports a sampling rate of 100%, which dramatically increases the number of traces sampled. A rate can jump from 0.1% to 100% and then drop back to 0.1% once the trace-agent computes the new sampling rate.
In particular, we observed that when the agent restarts, the payload buffering that waits for new container tags breaches its memory limit, and we send spans without container tags.
This PR caps sampling rate increases at x2 per second, so a x10 increase completes in 3-4s:
1% -> 100% takes 7s
0.1% -> 100% takes 10s
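
The timings above follow from repeated doubling, and can be checked with a minimal sketch. This is not the PR's implementation: `RateCapSketch`, `MaxIncreaseFactor` and `StepsToReach` are hypothetical names, rates are expressed as fractions (1.0 = 100%), and the sketch assumes that decreases apply immediately and that one capped increase is allowed per second. Only the `(oldRate, newRate, canIncrease)` parameter shape is borrowed from the test quoted in the review.

```csharp
using System;

public static class RateCapSketch
{
    // Hypothetical constant: the largest factor a rate may grow by in one interval.
    private const double MaxIncreaseFactor = 2.0;

    // Returns the rate to apply, given the current rate and the rate the agent reports.
    // Assumption: decreases are applied immediately; increases are capped at x2 and
    // are only allowed when canIncrease is true (that is, outside the cooldown).
    public static double CappedRate(double oldRate, double newRate, bool canIncrease)
    {
        if (newRate <= oldRate)
        {
            return newRate;
        }

        if (!canIncrease)
        {
            return oldRate;
        }

        return Math.Min(newRate, oldRate * MaxIncreaseFactor);
    }

    // Counts the capped increases needed to move from start to target.
    // A start of 0 can never grow by doubling, so this sketch rejects it.
    public static int StepsToReach(double start, double target)
    {
        if (start <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(start), "start must be positive");
        }

        var rate = start;
        var steps = 0;
        while (rate < target)
        {
            rate = CappedRate(rate, target, canIncrease: true);
            steps++;
        }

        return steps;
    }
}
```

With one allowed increase per second, `RateCapSketch.StepsToReach(0.01, 1.0)` returns 7 and `RateCapSketch.StepsToReach(0.001, 1.0)` returns 10, matching the 7s and 10s figures. `RateCapSketch.CappedRate(0.2, 0.8, false)` returns 0.2, the "increase blocked during cooldown" case from the test.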
RFC: https://docs.google.com/document/d/1h8cnGfOUx688pvRVqbhp6irn8s29LIwNYL1ge17nLyw/edit?usp=sharing
Similar PRs
Other details