feat(perf-monitor): aggregate CPU/RAM across descendant processes #1519

Open

shm11C3 wants to merge 2 commits into develop from claude/include-child-process-metrics-cVM4D

Conversation

shm11C3 (Owner) commented May 12, 2026

Summary

  • perf-monitor previously sampled only the parent PID, so apps that put most of their CPU/RSS in child processes (e.g. the Tauri WebView2 host) were significantly under-reported. This PR aggregates CPU and RSS across the parent process and all of its descendants.
  • Switched the per-tick refresh_processes from Some(&[pid]) to All so children spawned mid-measurement are also picked up. On each tick we build a parent→children adjacency map and DFS from the root PID to sum metrics.
  • Extracted the tree walk into a pure function sum_subtree(root, children, metrics) with a visited set that guards against cycles caused by PID reuse / inconsistent snapshots.
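
Sketch of the tree walk described above: a minimal, illustrative version of sum_subtree, assuming a simple ProcessMetrics sample type with cpu_percent and rss_bytes fields; the names and exact signature are assumptions, not the PR's actual code.

use std::collections::{HashMap, HashSet};

// Hypothetical per-process sample; field names are illustrative only.
#[derive(Default, Clone, Copy)]
struct ProcessMetrics {
    cpu_percent: f32,
    rss_bytes: u64,
}

// Cycle-safe DFS: sums metrics over `root` and every reachable descendant.
// The visited set guards against loops caused by PID reuse or an
// inconsistent snapshot whose parent chain points back into the subtree.
fn sum_subtree(
    root: u32,
    children: &HashMap<u32, Vec<u32>>,
    metrics: &HashMap<u32, ProcessMetrics>,
) -> ProcessMetrics {
    let mut total = ProcessMetrics::default();
    let mut visited = HashSet::new();
    let mut stack = vec![root];
    while let Some(pid) = stack.pop() {
        if !visited.insert(pid) {
            continue; // already counted; this is what breaks cycles
        }
        if let Some(m) = metrics.get(&pid) {
            total.cpu_percent += m.cpu_percent;
            total.rss_bytes += m.rss_bytes;
        }
        if let Some(kids) = children.get(&pid) {
            stack.extend(kids.iter().copied());
        }
    }
    total
}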

Changes

  • perf-monitor/src/monitor.rs
    • Added aggregate_descendants: builds per-PID metrics and a parent→children map from the System snapshot, then returns a ProcessMetrics summed over the subtree rooted at the target PID.
    • Added sum_subtree: pure, cycle-safe DFS that totals CPU and RSS.
    • Initial refresh, warmup, and measurement loop all use ProcessesToUpdate::All now.
  • perf-monitor/README.md
    • Updated the Metrics section to document the parent-plus-descendants aggregation and explain why it matters for multi-process layouts like the Tauri WebView2 host.
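
For orientation, a rough sketch of how aggregate_descendants could assemble the per-PID metrics and the parent→children map from a sysinfo snapshot and feed them to the sum_subtree sketch above. It assumes a recent sysinfo (0.31+) where Process::parent() returns Option<Pid>, cpu_usage() is a percentage, and memory() is in bytes; the PR's real helper also takes a num_cpus argument (visible in the review's quoted code), which this sketch omits.

use std::collections::HashMap;
use sysinfo::{Pid, System};

fn aggregate_descendants(system: &System, root: Pid) -> Option<ProcessMetrics> {
    // If the root is gone, return None so the caller can report "process disappeared".
    system.process(root)?;

    let mut metrics: HashMap<u32, ProcessMetrics> = HashMap::new();
    let mut children: HashMap<u32, Vec<u32>> = HashMap::new();
    for (pid, process) in system.processes() {
        metrics.insert(
            pid.as_u32(),
            ProcessMetrics {
                cpu_percent: process.cpu_usage(),
                rss_bytes: process.memory(),
            },
        );
        if let Some(parent) = process.parent() {
            children.entry(parent.as_u32()).or_default().push(pid.as_u32());
        }
    }
    Some(sum_subtree(root.as_u32(), &children, &metrics))
}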

Test plan

  • cargo test --manifest-path perf-monitor/Cargo.toml — 13/13 PASS, including 5 new unit tests for the descendant aggregation (root only, multi-level descendants, sibling tree excluded, cyclic parent chain, missing root).
  • cargo clippy --manifest-path perf-monitor/Cargo.toml --all-targets -- -D warnings — clean.
  • cargo build --manifest-path perf-monitor/Cargo.toml — succeeds.
  • Verify the perf-test CI workflow on Windows / Linux / macOS to confirm the new aggregated values do not unexpectedly trip thresholds; adjust perf-thresholds.toml in a follow-up PR if needed.
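
As an illustration of the cyclic-parent-chain case, a test against the hypothetical sum_subtree sketch above (not the PR's actual test) might look like this:

#[test]
fn cyclic_parent_chain_is_counted_once() {
    use std::collections::HashMap;

    // Hypothetical cycle 1 -> 2 -> 1, e.g. from PID reuse between snapshots.
    let mut children = HashMap::new();
    children.insert(1u32, vec![2u32]);
    children.insert(2u32, vec![1u32]);

    let mut metrics = HashMap::new();
    metrics.insert(1u32, ProcessMetrics { cpu_percent: 10.0, rss_bytes: 100 });
    metrics.insert(2u32, ProcessMetrics { cpu_percent: 5.0, rss_bytes: 50 });

    let total = sum_subtree(1, &children, &metrics);
    assert_eq!(total.rss_bytes, 150); // each PID counted exactly once
    assert!((total.cpu_percent - 15.0).abs() < f32::EPSILON);
}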

https://claude.ai/code/session_01XnYxTyEBg9CarpcTbEArjn

Summary by CodeRabbit

  • Documentation

    • Clarified CPU and memory metrics to state aggregation across a process tree.
  • New Features

    • Added per-process breakdowns in text and JSON reports and explicit failure reasons when checks fail.
    • Report output now shows observed process count and a process-tree summary.
  • Bug Fixes

    • Measurement now aggregates CPU and memory across all descendant processes and handles inconsistent parent chains safely.
    • Improved error output on monitoring failure.
  • Tests

    • Expanded unit tests covering subtree traversal and per-process statistics.

Review Change Stack

Apps that spawn helper or renderer processes (e.g. Tauri's WebView2
host) put the bulk of their resource usage in children, so sampling
only the parent PID under-reported CPU and RSS. Refresh all processes
each tick and sum the metrics over the parent and its full descendant
tree, with a visited-set guard so a pathological parent chain can't
double-count or loop.

https://claude.ai/code/session_01XnYxTyEBg9CarpcTbEArjn
Copilot AI review requested due to automatic review settings May 12, 2026 23:33
coderabbitai bot commented May 12, 2026

📝 Walkthrough

Aggregates CPU and RSS metrics across the target process and its descendants each tick, records per-PID samples, computes per-process avg/max/p95 CPU and memory, surfaces per-process breakdowns and failure reasons in text/JSON reports, and updates docs and CLI error text.

Changes

Process Tree Aggregation

  • Types and baseline/warmup setup (perf-monitor/src/monitor.rs): Add ProcessStats and internal snapshot/track types; extend MonitorResult with per_process; refresh CPU deltas for all processes during initial baseline and warmup; initialize per-PID tracks.
  • Measurement loop and subtree capture (perf-monitor/src/monitor.rs): Each tick refreshes all processes; capture_subtree/walk_subtree traverse the root PID subtree with a visited set; subtree CPU/RSS are summed for the tick; per-PID samples are accumulated; and monitoring errors if the root PID disappears.
  • Per-process aggregation (perf-monitor/src/monitor.rs): compute_result accepts per_pid_tracks; compute_per_process_stats computes avg/max/p95 CPU and memory (MB) per PID, filters empty tracks, and sorts root-first then by descending avg CPU.
  • Unit tests (perf-monitor/src/monitor.rs): Add pid()/snap() helpers; add walk_subtree tests (root-only, descendants, sibling exclusion, cycle handling, missing root); extend compute_result tests to validate per_process.
  • Reporting and JSON output (perf-monitor/src/report.rs): Import ProcessStats; add failure_reasons to CheckResult; extend JsonReport and add PerProcessReport; populate per_process and failure_reasons in JSON; print observed process counts, the per-process breakdown, and failure reasons in text output.
  • Docs and CLI error output (perf-monitor/README.md, perf-monitor/src/main.rs): README Metrics section rewritten to document process-tree aggregation semantics; main now emits a standardized multi-line "Performance Test Failed" message with a fixed failure reason and underlying error details on monitoring failures.
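
One plausible shape for the avg/max/p95 computation mentioned above; the PR's actual compute_per_process_stats may round differently or use another percentile rule, so treat this as an assumption.

fn avg_max_p95(samples: &[f64]) -> Option<(f64, f64, f64)> {
    if samples.is_empty() {
        return None; // empty tracks are filtered out, per the walkthrough
    }
    let avg = samples.iter().sum::<f64>() / samples.len() as f64;
    let max = samples.iter().copied().fold(f64::MIN, f64::max);

    // Nearest-rank 95th percentile over a sorted copy.
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let rank = ((0.95 * sorted.len() as f64).ceil() as usize).max(1) - 1;
    let p95 = sorted[rank.min(sorted.len() - 1)];

    Some((avg, max, p95))
}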

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 From root PID down to every sprout,

I add their CPU and memory out.
I guard against cycles, count each leaf,
report their stats and list the grief.
Hooray—per-process tales, summed neat and stout.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 60.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Title check: ✅ Passed. The title clearly and concisely summarizes the main change: aggregating CPU/RAM metrics across descendant processes in perf-monitor.
  • Description check: ✅ Passed. The PR description provides a comprehensive summary of changes, includes a detailed test plan with results, and covers the motivation and implementation approach.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.


coderabbitai bot left a review comment


Actionable comments posted: 1


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 7d29325f-1a22-4016-9ffb-4286720278cc

📥 Commits

Reviewing files that changed from the base of the PR and between d4d9d0b and 9d5c6c4.

📒 Files selected for processing (2)
  • perf-monitor/README.md
  • perf-monitor/src/monitor.rs

Comment thread perf-monitor/src/monitor.rs Outdated
Comment on lines 90 to 93
    match aggregate_descendants(&system, pid, num_cpus) {
      Some(metrics) => samples.push(metrics),
      None => return Err("Process disappeared during measurement".into()),
    }

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Preserve exit diagnostics when aggregation returns None.

If the process exits between the earlier try_wait and aggregate_descendants, this path drops exit status/stderr details and returns only a generic message. Re-checking try_wait here and delegating to format_exit_error improves debuggability.

Suggested patch
-    match aggregate_descendants(&system, pid, num_cpus) {
-      Some(metrics) => samples.push(metrics),
-      None => return Err("Process disappeared during measurement".into()),
-    }
+    match aggregate_descendants(&system, pid, num_cpus) {
+      Some(metrics) => samples.push(metrics),
+      None => {
+        if guard.0.try_wait()?.is_some() {
+          return Err(format_exit_error(&mut guard, "measurement"));
+        }
+        return Err("Process disappeared during measurement".into());
+      }
+    }

Copilot AI left a comment

Pull request overview

This change extends perf-monitor's measurement scope from the parent PID alone to the parent plus all descendant processes, so that CPU/RSS is no longer under-reported for multi-process setups such as Tauri (WebView2).

Changes:

  • Switch refresh_processes to ProcessesToUpdate::All so descendant processes spawned during measurement are also observed continuously
  • Add aggregate_descendants / sum_subtree, which build the parent→children map and sum CPU/RSS with a cycle-safe DFS
  • Add unit tests for the descendant aggregation logic and update the Metrics section of the README

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
  • perf-monitor/src/monitor.rs: Adds CPU/RSS aggregation over the whole process tree, switches the refresh to All, and adds unit tests
  • perf-monitor/README.md: Documents that Metrics are now summed over the parent plus its descendant processes, and explains why

Surfaces what's actually consuming CPU/RSS so that GHA logs are
diagnosable at a glance:

- Aggregate tree stats are now followed by a Per-Process Breakdown
  block (one entry per PID observed during the measurement window,
  with sample_count, avg/max/P95 CPU, and avg/max/P95 RSS). The root
  process is listed first and tagged "(root)"; descendants follow in
  descending order of average CPU.
- A Failure Reasons section enumerates exactly which thresholds were
  exceeded (and by how much) when the run fails, instead of leaving
  CI to diff the numbers against the config.
- The JSON output mirrors both: new "per_process" array and
  "failure_reasons" array.
- A pre-sampling abort (binary crashed during warmup, etc.) now
  prints under a "=== Performance Test Failed ===" banner with the
  captured child stderr so the cause is visible at the tail of the
  log.

https://claude.ai/code/session_01XnYxTyEBg9CarpcTbEArjn
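
Sketch of the JSON shape implied by this message: "per_process" and "failure_reasons" are named above, but the concrete fields of each per-process entry are guesses, not the PR's actual serde layout.

use serde::Serialize;

#[derive(Serialize)]
struct PerProcessReport {
    pid: u32,
    is_root: bool,
    sample_count: usize,
    avg_cpu: f64,
    max_cpu: f64,
    p95_cpu: f64,
    avg_memory_mb: f64,
    max_memory_mb: f64,
    p95_memory_mb: f64,
}

#[derive(Serialize)]
struct JsonReport {
    // ...existing aggregate fields...
    per_process: Vec<PerProcessReport>,
    failure_reasons: Vec<String>,
}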
coderabbitai bot left a review comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
perf-monitor/src/monitor.rs (1)

126-129: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve exit diagnostics when subtree capture fails

At Line 128, returning a generic error can drop useful exit status/stderr if the child exited between checks. Re-check try_wait here and delegate to format_exit_error before the fallback message.

Suggested patch
-    let snapshots = match capture_subtree(&system, pid) {
-      Some(s) => s,
-      None => return Err("Process disappeared during measurement".into()),
-    };
+    let snapshots = match capture_subtree(&system, pid) {
+      Some(s) => s,
+      None => {
+        if guard.0.try_wait()?.is_some() {
+          return Err(format_exit_error(&mut guard, "measurement"));
+        }
+        return Err("Process disappeared during measurement".into());
+      }
+    };
🧹 Nitpick comments (1)
perf-monitor/src/main.rs (1)

75-78: ⚡ Quick win

Use project logging macros in the new failure path

These new direct eprintln! calls should use the repository logging macros for consistency with Rust logging policy.

As per coding guidelines, "Use logging macros (log_debug!, log_info!, log_warn!, log_error!) from core/src/utils/logger.rs instead of direct logging calls."
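
If log_error! follows the usual format-macro convention, the swap could look like the lines below; the macro's exact interface in core/src/utils/logger.rs has not been verified here, so this is only a sketch.

// Assumes log_error! accepts format-style arguments like the std log macros.
log_error!("=== Performance Test Failed ===");
log_error!("Failure reason: monitoring aborted before metrics could be collected");
log_error!("Details: {e}");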


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 968b52b3-e435-4ef4-9216-a38e1c05695f

📥 Commits

Reviewing files that changed from the base of the PR and between 9d5c6c4 and 04271a5.

📒 Files selected for processing (4)
  • perf-monitor/README.md
  • perf-monitor/src/main.rs
  • perf-monitor/src/monitor.rs
  • perf-monitor/src/report.rs

Comment thread perf-monitor/src/main.rs
Comment on lines +77 to +78
eprintln!("Failure reason: monitoring aborted before metrics could be collected");
eprintln!("Details: {e}");

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Failure reason text can be misleading for mid-run aborts

At Line 77, the message says metrics could not be collected, but aborts can also happen after some ticks. Use neutral wording (e.g., “monitoring aborted before completion”) to avoid misleading CI triage.

