Commit f3828fe

kaarolch, cursoragent, and pront authored
enhancement(tag_cardinality_limit transform): add exclude_tags option to bypass cardinality limiting (#25316)
* enhancement(tag_cardinality_limit transform): add top-level per_tag_limits

  Adds a `per_tag_limits` map at the top level of the `tag_cardinality_limit` config, reusing the `PerTagConfig` / `PerTagMode` types introduced in #25360 for the per-metric variant. Each entry uses the same `mode: limit_override` (with a custom `value_limit`) or `mode: excluded` shape, applied to every metric that does not match a `per_metric_limits` entry.

  Per-metric overrides shadow the global per-tag map: a matched metric only consults its own `per_tag_limits` (consistent with how the rest of the transform resolves per-metric vs. global config).

  Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(tag_cardinality_limit transform): preserve untracked passthrough when value_limit is zero

  `tag_limit_exceeded` previously returned `true` for any missing-bucket lookup when `value_limit == 0`, which under `drop_event` caused events to be rejected before `record_tag_value` could detect that `max_tracked_keys` was exhausted. New (metric, tag-key) pairs that cannot be allocated must instead pass through unchecked and emit `tag_cardinality_untracked_events_total`. Guard the `value_limit == 0` rejection on `can_allocate_new_key()` so the untracked path runs whenever the pair would not actually be tracked.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* docs(tag_cardinality_limit transform): explain global per_tag_limits with example

  Address review feedback from @pront: the field doc comment was correct but dense. Move the precedence rules and a worked YAML example into a new "Per-tag overrides" section under "How it works", and trim the field doc-comment to a one-line description that points at the new section.

  Also fix CRLF → LF line endings in the changelog fragment caught by `vdev check fmt`.

  Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Pavlos Rontidis <pavlos.rontidis@gmail.com>
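The precedence rule described in the first commit (a matched metric consults only its own `per_tag_limits`, everything else falls back to the top-level map) can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the transform's code: `SketchConfig`, `Resolved`, and `resolve` are hypothetical names, the metric and tag names in the usage note are made up, and namespace matching and the blanket per-metric `mode: excluded` are left out.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the per-tag mode (variant names follow the diff).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum PerTagMode {
    LimitOverride { value_limit: usize },
    Excluded,
}

/// Hypothetical result type for this sketch only.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Resolved {
    Tracked { value_limit: usize },
    Excluded,
}

/// Hypothetical, trimmed-down config holding just what precedence needs.
pub struct SketchConfig {
    pub global_value_limit: usize,
    /// Top-level per-tag overrides.
    pub per_tag_limits: HashMap<String, PerTagMode>,
    /// Metric name -> (that metric's value_limit, that metric's own per-tag map).
    pub per_metric_limits: HashMap<String, (usize, HashMap<String, PerTagMode>)>,
}

/// Resolve the settings for one (metric, tag) pair.
pub fn resolve(cfg: &SketchConfig, metric: &str, tag: &str) -> Resolved {
    // A matched metric uses only its own map; the global map is not consulted.
    let (base_limit, tag_map) = match cfg.per_metric_limits.get(metric) {
        Some((limit, tags)) => (*limit, tags),
        None => (cfg.global_value_limit, &cfg.per_tag_limits),
    };
    match tag_map.get(tag) {
        Some(PerTagMode::Excluded) => Resolved::Excluded,
        Some(PerTagMode::LimitOverride { value_limit }) => Resolved::Tracked {
            value_limit: *value_limit,
        },
        None => Resolved::Tracked { value_limit: base_limit },
    }
}
```

In this sketch, a global `host: excluded` entry has no effect on a metric named `requests` that has its own `per_metric_limits` entry: `host` on `requests` is tracked with that metric's limit, while `host` on any unmatched metric is excluded.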
1 parent 35c7d5a commit f3828fe

6 files changed

Lines changed: 472 additions & 6 deletions

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+The `tag_cardinality_limit` transform now accepts a top-level `per_tag_limits` map,
+mirroring the per-metric one: `mode: limit_override` to set a per-tag cap, or
+`mode: excluded` to bypass cardinality tracking for that tag on every metric without a
+`per_metric_limits` entry.
+
+authors: kaarolch

src/transforms/tag_cardinality_limit/config.rs

Lines changed: 14 additions & 0 deletions
@@ -49,6 +49,19 @@ pub struct Config {
     )]
     #[serde(default)]
     pub per_metric_limits: HashMap<String, PerMetricConfig>,
+
+    /// Global per-tag-key overrides, applied to every metric that does not match a
+    /// `per_metric_limits` entry. Each entry sets `mode: limit_override` (with a
+    /// per-tag `value_limit`) or `mode: excluded` (bypass tracking for that tag).
+    ///
+    /// See the "Per-tag overrides" section under "How it works" for a worked example
+    /// and the precedence rules.
+    #[configurable(
+        derived,
+        metadata(docs::additional_props_description = "An individual tag configuration.")
+    )]
+    #[serde(default)]
+    pub per_tag_limits: HashMap<String, PerTagConfig>,
 }
 
 /// Controls how tag tracking state is partitioned across metrics.
@@ -311,6 +324,7 @@ impl GenerateConfig for Config {
             tracking_scope: TrackingScope::default(),
             max_tracked_keys: None,
             per_metric_limits: HashMap::default(),
+            per_tag_limits: HashMap::default(),
         })
         .unwrap()
     }
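As a rough picture of the shape the new field takes once populated, the sketch below builds a map with one entry per mode. It assumes simplified stand-ins for `PerTagConfig` and `PerTagMode` (the real types carry `configurable`/serde attributes not reproduced here); `SketchConfig`, `example_config`, and the tag names `request_id` and `host` are hypothetical.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the per-tag mode (variant names follow the diff).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum PerTagMode {
    LimitOverride { value_limit: usize },
    Excluded,
}

/// Simplified stand-in: the diff only shows that the config exposes `mode`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct PerTagConfig {
    pub mode: PerTagMode,
}

/// Hypothetical config holding only the new field. `Default` yields an empty
/// map, the same value `HashMap::default()` gives in the diff.
#[derive(Debug, Default)]
pub struct SketchConfig {
    pub per_tag_limits: HashMap<String, PerTagConfig>,
}

/// Build a config with one excluded tag and one tag with its own cap.
pub fn example_config() -> SketchConfig {
    let mut cfg = SketchConfig::default();
    cfg.per_tag_limits.insert(
        "request_id".to_string(),
        PerTagConfig { mode: PerTagMode::Excluded },
    );
    cfg.per_tag_limits.insert(
        "host".to_string(),
        PerTagConfig { mode: PerTagMode::LimitOverride { value_limit: 50 } },
    );
    cfg
}
```

With the map left empty, every lookup misses and the global settings apply unchanged, which is presumably why the field defaults to an empty map.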

src/transforms/tag_cardinality_limit/mod.rs

Lines changed: 29 additions & 6 deletions
@@ -87,24 +87,30 @@ impl TagCardinalityLimit {
     /// Per-tag entries support two modes:
     /// - `mode: limit_override` — uses the per-tag `value_limit`; all other settings
     ///   (`mode`, `cache_size_per_key`, `limit_exceeded_action`, `internal_metrics`)
-    ///   are inherited from the per-metric config.
+    ///   are inherited from the enclosing per-metric (or, for global overrides, the
+    ///   global) config.
     /// - `mode: excluded` — opts the tag out entirely; all values pass through.
     ///
     /// Per-metric exclusion is blanket: `mode: excluded` on a per-metric entry opts out
     /// every tag on that metric and `per_tag_limits` is ignored.
+    ///
+    /// Per-metric `per_tag_limits` take precedence over the top-level
+    /// `Config::per_tag_limits`: when a metric matches a per-metric entry, the global
+    /// per-tag overrides are not consulted for that metric.
     fn get_config_for_metric_tag(
         &self,
         metric_key: Option<&MetricId>,
         tag_key: &str,
     ) -> TagSettings {
-        // No matching per-metric override → use the global config as-is.
+        // No matching per-metric override → use the global config, with global
+        // per-tag overrides layered on top.
         let Some((metric_namespace, metric_name)) = metric_key else {
-            return TagSettings::Tracked(self.config.global);
+            return self.apply_global_per_tag(tag_key);
         };
         let Some((_, per_metric)) = self.config.per_metric_limits.iter().find(|(name, cfg)| {
             *name == metric_name && (cfg.namespace.is_none() || cfg.namespace == *metric_namespace)
         }) else {
-            return TagSettings::Tracked(self.config.global);
+            return self.apply_global_per_tag(tag_key);
         };
 
         // Per-metric exclusion is blanket — per-tag overrides do not apply.
@@ -140,6 +146,20 @@ impl TagCardinalityLimit {
         })
     }
 
+    /// Apply the top-level `per_tag_limits` (if any) on top of the global `Inner`.
+    /// Used for metrics that do not match any `per_metric_limits` entry.
+    fn apply_global_per_tag(&self, tag_key: &str) -> TagSettings {
+        let global = self.config.global;
+        match self.config.per_tag_limits.get(tag_key).map(|c| c.mode) {
+            Some(PerTagMode::Excluded) => TagSettings::Excluded,
+            Some(PerTagMode::LimitOverride { value_limit }) => TagSettings::Tracked(Inner {
+                value_limit,
+                ..global
+            }),
+            None => TagSettings::Tracked(global),
+        }
+    }
+
     /// Returns the `limit_exceeded_action` that applies to this metric. Decided once per event:
     /// per-metric override if any, else global.
     fn metric_action(&self, metric_key: Option<&MetricId>) -> LimitExceededAction {
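The `..global` struct-update syntax in `apply_global_per_tag` is what makes a `limit_override` entry replace only `value_limit` while inheriting everything else. A standalone sketch of that behavior, assuming a cut-down `Inner` with just two fields (the real struct has more, and its exact field set is not shown in this diff) and a free function in place of the method:

```rust
use std::collections::HashMap;

/// Simplified stand-in; only used to show that non-limit fields are inherited.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum LimitExceededAction {
    DropTag,
    DropEvent,
}

/// Hypothetical subset of the global settings.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Inner {
    pub value_limit: usize,
    pub limit_exceeded_action: LimitExceededAction,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum PerTagMode {
    LimitOverride { value_limit: usize },
    Excluded,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum TagSettings {
    Tracked(Inner),
    Excluded,
}

/// Free-function variant of the method in the diff, taking its inputs explicitly.
pub fn apply_global_per_tag(
    global: Inner,
    per_tag_limits: &HashMap<String, PerTagMode>,
    tag_key: &str,
) -> TagSettings {
    match per_tag_limits.get(tag_key).copied() {
        Some(PerTagMode::Excluded) => TagSettings::Excluded,
        // Only `value_limit` changes; every other field is copied from `global`.
        Some(PerTagMode::LimitOverride { value_limit }) => {
            TagSettings::Tracked(Inner { value_limit, ..global })
        }
        None => TagSettings::Tracked(global),
    }
}
```

Because `Inner` is `Copy` here, the function can return the global value directly for tags without an override, mirroring the `None` arm in the diff.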
@@ -237,9 +257,12 @@ impl TagCardinalityLimit {
             Some(value_set) if value_set.contains(value) => false,
             // Adding this value would push us at or past the configured cap. Treat a
             // missing bucket as an empty set so `value_limit: 0` correctly rejects
-            // the first occurrence too.
+            // the first occurrence too — but only when the (metric, tag) pair would
+            // actually be tracked. If `max_tracked_keys` is exhausted, `record_tag_value`
+            // will pass the tag through unchecked and emit `TagCardinalityLimitUntracked`,
+            // so we must not pre-empt that path by reporting the limit as exceeded here.
             Some(value_set) => value_set.len() >= resolved.value_limit,
-            None => resolved.value_limit == 0,
+            None => resolved.value_limit == 0 && self.can_allocate_new_key(),
         }
     }
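The effect of the extra `can_allocate_new_key()` guard can be seen in a minimal sketch of the three match arms. The `Tracker` struct is hypothetical, and the body of `can_allocate_new_key` is an assumption (the diff does not show it): here it simply compares the number of tracked keys against an optional `max_tracked_keys`.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the transform's tracking state.
pub struct Tracker {
    pub value_limit: usize,
    pub max_tracked_keys: Option<usize>,
    /// Tag key -> set of values accepted so far.
    pub accepted: HashMap<String, HashSet<String>>,
}

impl Tracker {
    /// Assumed behavior: a new key can be allocated while the key budget
    /// (if one is configured) is not yet used up.
    fn can_allocate_new_key(&self) -> bool {
        self.max_tracked_keys
            .map_or(true, |max| self.accepted.len() < max)
    }

    /// Mirrors the three arms shown in the diff.
    pub fn tag_limit_exceeded(&self, key: &str, value: &str) -> bool {
        match self.accepted.get(key) {
            // Already-seen values never exceed the limit.
            Some(values) if values.contains(value) => false,
            // A new value is rejected once the bucket is at the cap.
            Some(values) => values.len() >= self.value_limit,
            // A missing bucket only rejects when the limit is zero AND the key
            // would actually be tracked; otherwise it passes through untracked.
            None => self.value_limit == 0 && self.can_allocate_new_key(),
        }
    }
}
```

With `value_limit: 0` and no key budget, the first value for a new key is reported as exceeding the limit. Once the key budget is exhausted, the same lookup returns `false`, leaving the untracked passthrough path to handle the tag.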
