Commit 656fac7
Add missing envoy.vhost.vcluster.upstream_rq_time.99_5percentile to metadata.csv (DataDog#23770)
* Add missing envoy.vhost.vcluster.upstream_rq_time.99_5percentile to metadata.csv
* Add changelog entry for DataDog#23770
* Remove changelog entry (metadata-only change)
* Exercise Envoy listener immediately before E2E check scrape
Add a function-scoped exercise_envoy fixture that issues HTTP requests
to the listener right before each E2E test reads /stats. Without this,
the time between env setup (where the conftest's requests previously
lived) and the agent's check invocation can span several of Envoy's
5s flush windows, by which point the histogram interval values have
been reset to nan and the parser silently drops them.
Also temporarily drop the metadata entry for
envoy.vhost.vcluster.upstream_rq_time.99_5percentile to confirm CI
now reliably catches missing metadata.
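The silent drop described above can be illustrated with a minimal sketch. It assumes a `/stats` histogram line of the form `name: P50(interval,cumulative) ...`; the function name `interval_percentiles` is hypothetical, and this is not the integration's actual parser.

```python
import math
import re

# Matches one quantile entry such as "P99.5(12.3,45.6)" or "P50(nan,0)".
# Assumed layout: P<quantile>(<interval value>,<cumulative value>).
QUANTILE = re.compile(r"P(?P<q>[\d.]+)\((?P<interval>[^,]+),(?P<cumulative>[^)]+)\)")


def interval_percentiles(histogram_line):
    """Return {quantile: interval_value}, skipping entries whose interval value is nan."""
    _, _, body = histogram_line.partition(": ")
    values = {}
    for match in QUANTILE.finditer(body):
        interval = float(match.group("interval"))
        if math.isnan(interval):
            # No samples in the last flush window: the percentile is dropped
            # without any error, so a missing metadata entry goes unnoticed.
            continue
        values[match.group("q")] = interval
    return values
```

With an idle listener every interval value is nan and the function returns an empty dict, which is why the test needs traffic shortly before the scrape.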
* Restore conftest warm-up requests for integration tests
The integration test (test_check) relies on Envoy having processed
traffic before the check runs to assert metrics like
envoy.cluster.ext_authz.error.count. Keep the dd_environment warm-up
requests for that and have exercise_envoy re-fire just before each E2E
scrape.
* Use exercise_envoy fixture for integration tests too
Move the Envoy listener warm-up out of dd_environment and into the
function-scoped exercise_envoy fixture so it's shared by both the
integration tests (which previously relied on a side-effect inside
dd_environment) and the E2E tests. Single source of truth for "make
sure Envoy has traffic before this test runs."
* Wait for an Envoy stats flush after exercising the listener
Firing the requests immediately before the agent's scrape isn't enough:
Envoy only rolls samples into the histogram interval view at each 5s
flush, and the parser drops percentiles whose interval value is nan.
Sleep 6s so the scrape lands after the flush that captured the samples
but before the next empty flush resets them.
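The timing argument can be checked with a small model. This is a sketch under simplifying assumptions: flushes happen on a fixed 5s cadence, the request burst is instantaneous, and the scrape happens exactly when the sleep ends. The name `scrape_sees_samples` and the `phase` parameter are hypothetical.

```python
FLUSH_INTERVAL = 5.0  # seconds, Envoy flush window from the commit message
SLEEP = 6.0           # seconds slept between the burst and the scrape


def scrape_sees_samples(phase, sleep=SLEEP, flush=FLUSH_INTERVAL):
    """Report whether a scrape taken `sleep` seconds after the burst sees samples.

    `phase` is how far into the current flush window the burst was sent
    (0 <= phase < flush). All times below are measured from the burst.
    """
    capture_flush = flush - phase        # first flush after the burst; holds the samples
    empty_flush = capture_flush + flush  # following flush; no new samples, values reset
    return capture_flush <= sleep < empty_flush
```

Under this model the 6s sleep succeeds whenever the burst lands less than 4s into a flush window, and misses when it lands in the final second.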
* Add envoy.vhost.vcluster.upstream_rq_time.99_5percentile to metadata.csv
Envoy 1.14+ emits a 99.5th percentile by default for all histograms,
including vhost.vcluster.upstream_rq_time. The other upstream_rq_time
families (cluster, cluster.external, etc.) already carry this entry;
this one was overlooked when those were added.
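The kind of check that catches such an omission can be sketched as a set difference between emitted metric names and the names listed in metadata.csv. This assumes the CSV has a `metric_name` column; the helper `missing_from_metadata` is hypothetical and is not the test framework's own assertion.

```python
import csv
import io


def missing_from_metadata(metadata_csv_text, emitted_metrics):
    """Return the emitted metric names that have no row in the metadata CSV."""
    reader = csv.DictReader(io.StringIO(metadata_csv_text))
    documented = {row["metric_name"] for row in reader}
    return sorted(set(emitted_metrics) - documented)
```

The check can only flag a metric that the test run actually emitted, which is why the fixture changes in this commit matter for catching the missing entry.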
* Drive continuous traffic for one full Envoy flush interval
The previous single burst + 6s sleep relied on Envoy's flush cycle
aligning with the test's request time. While that landed in the safe
window in practice, the alignment isn't guaranteed by design; it depends on
docker_run timing happening to be a multiple of the flush interval.
Spreading requests across the window removes that dependency: the most
recent completed flush always has samples, so the interval percentiles
are never reset to nan.
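One way to spread requests across a flush window, with every timing derived from a single flush-interval constant, might look like the sketch below. The spacing value, the constant names, and the injectable `send`, `sleep`, and `clock` parameters are assumptions made for illustration; the actual fixture may differ.

```python
import time

FLUSH_INTERVAL_S = 5.0   # Envoy flush window from the commit message
REQUEST_SPACING_S = 0.5  # hypothetical gap between requests
# Run slightly longer than one window so a full flush interval is covered.
LOAD_DURATION_S = FLUSH_INTERVAL_S + REQUEST_SPACING_S


def drive_traffic(send, duration=LOAD_DURATION_S, spacing=REQUEST_SPACING_S,
                  sleep=time.sleep, clock=time.monotonic):
    """Call `send` every `spacing` seconds for `duration` seconds; return the call count."""
    deadline = clock() + duration
    sent = 0
    while clock() < deadline:
        send()  # e.g. one HTTP request to the Envoy listener
        sent += 1
        sleep(spacing)
    return sent
```

Because `sleep` and `clock` are parameters, the loop can be exercised with a fake clock instead of waiting several real seconds.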
* Temporarily remove 99_5percentile metadata to validate continuous-load fixture
* Derive exercise_envoy timings from a flush-interval constant
* Restore envoy.vhost.vcluster.upstream_rq_time.99_5percentile in metadata.csv
* Document safe-scrape budget of exercise_envoy
* Move exercise_envoy to a background thread
Replace the synchronous loop+sleep fixture with a threading.Thread +
Event so requests keep firing through the entire test, including while
the agent's check is in flight. This removes the finite "safe scrape
window" the previous approach relied on: every flush window during the
test, including those that close mid-scrape, now has samples.
Also drop the 99_5percentile metadata entry temporarily to validate the
fixture continues to reliably trigger emission on master CI.
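The Thread + Event pattern can be sketched as a context manager. A pytest fixture could wrap its `yield` in it, but the helper name `background_traffic`, the `send` callable, and the spacing are hypothetical and not taken from the commit's diff.

```python
import threading
from contextlib import contextmanager


@contextmanager
def background_traffic(send, spacing=0.5):
    """Keep calling `send` on a background thread until the with-block exits."""
    stop = threading.Event()

    def worker():
        while not stop.is_set():
            try:
                send()  # e.g. one HTTP request to the Envoy listener
            except Exception:
                pass  # a transient request failure should not end the load
            # Event.wait works as an interruptible sleep: it returns early
            # as soon as stop.set() is called.
            stop.wait(spacing)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    try:
        yield
    finally:
        stop.set()
        thread.join(timeout=5)
```

Using `stop.wait(spacing)` instead of `time.sleep` lets teardown finish promptly rather than waiting out the remainder of a sleep.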
* Restore envoy.vhost.vcluster.upstream_rq_time.99_5percentile in metadata.csv
1 parent 967373d commit 656fac7
6 files changed
Lines changed: 51 additions & 7 deletions
File tree
- envoy
- tests
- legacy