feat: add reduce metrics for rust data plane #3364

Open: adarsh0728 wants to merge 5 commits into main from reduce-metrics
@adarsh0728 adarsh0728 commented Apr 14, 2026

What this PR does / why we need it

Metrics Added

| Metric | Type | Description |
|---|---|---|
| reduce_active_windows | Gauge | Number of currently open reduce windows |
| reduce_closed_windows | Gauge | Number of closed windows awaiting GC |
| reduce_watermark_lag | Gauge | Difference between wall clock and watermark (ms) |
| reduce_window_processing_time | Histogram | Window open-to-close latency (μs) |
| reduce_pnf_process_time | Histogram | UDF reduce function execution time per window (μs) |
| reduce_pbq_write | Counter | Total messages written to PBQ |

  • Watermark exposed as lag, not raw epoch: `wall_clock - watermark` in milliseconds is directly alertable (e.g., "lag > 5min")

  • Updated metrics.md as well
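
The lag computation described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the function names, the clamp to zero when the watermark is ahead of the wall clock, and the 5-minute threshold constant are all assumptions made for the example.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Lag in milliseconds between a wall-clock reading and a watermark,
/// both given as milliseconds since the Unix epoch.
/// Assumption for this sketch: clamp to 0 if the watermark is ahead.
fn watermark_lag_ms(wall_clock_ms: i64, watermark_ms: i64) -> i64 {
    wall_clock_ms.saturating_sub(watermark_ms).max(0)
}

/// Current wall clock in epoch milliseconds (0 if the clock is before the epoch).
fn now_epoch_ms() -> i64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis() as i64)
        .unwrap_or(0)
}

/// Hypothetical alert threshold mirroring the "lag > 5min" example.
const LAG_ALERT_THRESHOLD_MS: i64 = 5 * 60 * 1000;

/// True when the lag is strictly above the threshold.
fn lag_exceeds_threshold(lag_ms: i64) -> bool {
    lag_ms > LAG_ALERT_THRESHOLD_MS
}
```

A gauge would be set to something like `watermark_lag_ms(now_epoch_ms(), watermark)` on each update. Because the value is already a duration, an alert rule can compare it against a fixed threshold without knowing the current time.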

Related issues

Fixes #3234

Testing

Test Pipelines

  • Aligned (fixed window) — examples/6-reduce-fixed-window.yaml — even-odd-sum with 60s fixed windows, 2 partitions, emptyDir storage
  • Unaligned (session window) — examples/12-simple-session-pipeline.yaml — simple-session-counter with 120s session timeout, 1 partition, PVC storage
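
For readers less familiar with aligned windows, here is a generic sketch of how an event time maps to a fixed window of a given length (60s in the first test pipeline). It shows the standard tumbling-window arithmetic and is not taken from the Numaflow source; the `Window` struct and `fixed_window_for` function are hypothetical names.

```rust
/// Half-open window [start_ms, end_ms) in epoch milliseconds.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct Window {
    start_ms: i64,
    end_ms: i64,
}

/// Assign an event time to the fixed window that contains it.
/// `rem_euclid` keeps the result correct for negative timestamps too.
fn fixed_window_for(event_time_ms: i64, length_ms: i64) -> Window {
    assert!(length_ms > 0, "window length must be positive");
    let start_ms = event_time_ms - event_time_ms.rem_euclid(length_ms);
    Window {
        start_ms,
        end_ms: start_ms + length_ms,
    }
}
```

With a 60s length, every event whose time falls in the same minute lands in the same window, so all keys share the same window boundaries. Session windows, by contrast, are tracked per key.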

Aligned (Fixed Window) — even-odd-sum Pipeline
Total inputs sent: 101 messages (1 priming + 100 batch)

| Metric | Replica 0 | Replica 1 | Verdict |
|---|---|---|---|
| reduce_pbq_write_total | 51 | 50 | ✅ Sum = 101, exactly matches input |
| reduce_active_windows | 1 | 1 | |
| reduce_closed_windows | 0 | 0 | |
| reduce_watermark_lag | 3703 ms | 3705 ms | ✅ ~3.7s lag |

reduce_window_processing_time: count=1, sum=12.0M μs (window not yet closed)
reduce_pnf_process_time: count=1, sum=12.0M μs (window not yet closed)
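
The checks behind these numbers are simple arithmetic: sum the per-replica counter and compare it with the number of inputs, and divide a histogram's sum by its count to get a mean latency. A minimal sketch, with hypothetical helper names that are not part of this PR:

```rust
/// Sum a counter's value across replicas.
fn total_across_replicas(per_replica: &[u64]) -> u64 {
    per_replica.iter().sum()
}

/// Mean observation derived from a histogram's sum and count.
/// Returns None when nothing has been observed yet.
fn histogram_mean(sum: f64, count: u64) -> Option<f64> {
    if count == 0 {
        None
    } else {
        Some(sum / count as f64)
    }
}
```

For the run above, `total_across_replicas(&[51, 50])` gives 101, matching the 101 inputs, and a histogram with count=1 and a sum of about 12.0M μs corresponds to a mean of roughly 12 s.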

Unaligned (Session Window) — simple-session-counter Pipeline
Total inputs sent: 101 messages (1 priming + 100 batch)
After 60s idle: counter stays at 101 — no WMB inflation.

| Metric | Value | Verdict |
|---|---|---|
| reduce_pbq_write_total | 101 | ✅ Exactly matches input |
| reduce_active_windows | 2 | ✅ Two keyed sessions (even/odd) |
| reduce_closed_windows | 0 | |
| reduce_watermark_lag | 33953 ms | ✅ ~34s, matches maxDelay: 30s config |
| reduce_window_processing_time | not emitted | ✅ Aligned-only by design |
| reduce_pnf_process_time | not emitted | ✅ Aligned-only by design |
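
To illustrate why two active windows are expected here, the sketch below counts sessions per key for a list of `(key, event_time_ms)` pairs. It is a generic model of session windowing, not the reducer code in this PR: `count_sessions` is a hypothetical helper, and treating a gap strictly greater than the timeout as a new session is an assumption of the example.

```rust
use std::collections::HashMap;

/// Count sessions per key: events of the same key belong to one session
/// until a gap larger than `timeout_ms` separates two consecutive events.
fn count_sessions(events: &[(&str, i64)], timeout_ms: i64) -> HashMap<String, usize> {
    // Group event times by key.
    let mut by_key: HashMap<String, Vec<i64>> = HashMap::new();
    for (key, ts) in events {
        by_key.entry(key.to_string()).or_default().push(*ts);
    }

    let mut sessions = HashMap::new();
    for (key, mut times) in by_key {
        times.sort_unstable();
        // Each key with at least one event has one session; every gap
        // longer than the timeout starts an additional one.
        let extra = times
            .windows(2)
            .filter(|pair| pair[1] - pair[0] > timeout_ms)
            .count();
        sessions.insert(key, 1 + extra);
    }
    sessions
}
```

With a 120s timeout and a batch sent in quick succession, the `even` and `odd` keys each form a single session, which gives the two open windows reported by `reduce_active_windows`.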

Signed-off-by: adarsh0728 <gooneriitk@gmail.com>
Signed-off-by: adarsh0728 <gooneriitk@gmail.com>
codecov Bot commented Apr 14, 2026

Codecov Report

❌ Patch coverage is 88.14433% with 23 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.51%. Comparing base (ce2ac17) to head (4f04d15).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...aflow-core/src/reduce/reducer/unaligned/reducer.rs | 52.94% | 16 Missing ⚠️ |
| ...umaflow-core/src/reduce/reducer/aligned/reducer.rs | 87.50% | 6 Missing ⚠️ |
| rust/numaflow-core/src/metrics/mod.rs | 96.42% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3364      +/-   ##
==========================================
- Coverage   82.53%   82.51%   -0.03%     
==========================================
  Files         307      307              
  Lines       75325    75517     +192     
==========================================
+ Hits        62172    62313     +141     
- Misses      12595    12649      +54     
+ Partials      558      555       -3     

Signed-off-by: adarsh0728 <gooneriitk@gmail.com>

# Conflicts:
#	rust/numaflow-core/src/reduce/pbq.rs
Signed-off-by: adarsh0728 <gooneriitk@gmail.com>
@adarsh0728 adarsh0728 marked this pull request as ready for review May 1, 2026 10:37
Signed-off-by: adarsh0728 <gooneriitk@gmail.com>
Development

Successfully merging this pull request may close these issues.

Add metrics for Reduce operations

1 participant