Add resource monitor by jmcphers · Pull Request #58 · posit-dev/kallichore

jmcphers · 2026-01-06T17:07:37Z

Adds resource monitoring. By default, resources are monitored and reported every 1 second. Resource usage information is emitted on the kernel's websocket as kernel messages (rather than Jupyter messages, but on the same protocol), and is also returned when querying the active session.

The interval at which resources are collected is configurable (both at the command line and in settings), and can be set to 0 to disable resource monitoring entirely if desired for performance reasons.

Copilot

Pull request overview

This PR adds comprehensive resource monitoring capabilities to Kallichore, enabling real-time tracking of kernel resource usage. The implementation monitors CPU usage, memory consumption, and thread counts for kernel processes and their children. By default, resources are sampled every 1 second and can be configured via command-line arguments or runtime settings. Setting the interval to 0 disables monitoring entirely.

Adds a global resource monitor that periodically collects metrics for all active kernel sessions
Exposes resource usage data through both WebSocket messages and HTTP session queries
Provides configuration API to adjust sampling intervals dynamically (minimum 100ms when enabled)

Reviewed changes

Copilot reviewed 17 out of 18 changed files in this pull request and generated 17 comments.

Show a summary per file

File	Description
`kallichore.json`	Adds `resourceUsage` schema and `resource_sample_interval_ms` configuration field to OpenAPI spec
`crates/kcshared/src/kernel_message.rs`	Defines `ResourceUpdate` struct and adds `ResourceUsage` variant to `KernelMessage` enum
`crates/kcserver/tests/resource_usage_test.rs`	Adds integration tests verifying resource usage data in HTTP responses and WebSocket messages
`crates/kcserver/src/server.rs`	Threads `resource_sample_interval_ms` parameter through server initialization and configuration updates
`crates/kcserver/src/resource_monitor.rs`	Implements global resource monitoring task with process tree traversal and metric collection
`crates/kcserver/src/main.rs`	Adds `--resource-sample-interval` CLI argument with 1000ms default
`crates/kcserver/src/lib.rs`	Exports new `resource_monitor` module
`crates/kcserver/src/kernel_state.rs`	Adds `resource_usage` field to store latest metrics
`crates/kcserver/src/kernel_session/mod.rs`	Includes `resource_usage` in session info responses
`crates/kcserver/Cargo.toml`	Bumps version to 0.1.62
`crates/kallichore_api/src/models.rs`	Adds `ResourceUsage` model and updates `ActiveSession` and `ServerConfiguration` models
`crates/kallichore_api/docs/*`	Adds documentation for new API models
`crates/kallichore_api/api/openapi.yaml`	Updates OpenAPI specification with resource usage schemas
`crates/kallichore_api/README.md`	Updates build date metadata
`Cargo.lock`	Updates version lock for kcserver

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-06T17:12:33Z

+#[derive(Debug, Clone, PartialEq, serde::Serialize, serde::Deserialize, validator::Validate)]
+#[cfg_attr(feature = "conversion", derive(frunk::LabelledGeneric))]
+pub struct ResourceUsage {
+    /// The percentage of CPU used by the kernel process


The description says "The percentage of CPU used by the kernel process" but should clarify that this includes child processes, consistent with the actual implementation.

Suggested change

/// The percentage of CPU used by the kernel process

/// The percentage of CPU used by the kernel process, including its child processes

Copilot · 2026-01-06T17:12:34Z

+            #[cfg(not(any(target_os = "linux")))]
+            {
+                // On macOS and Windows, tasks() is not available
+                // Assume 1 thread per process as a baseline
+                total_threads += 1;
+            }


On non-Linux platforms (macOS, Windows), the thread count logic assumes 1 thread per process, which significantly underestimates the actual thread count. Modern Python kernels typically use multiple threads. Consider:

Documenting this limitation in the API

Using platform-specific APIs to get accurate thread counts

Making the field optional when accurate data isn't available
This could mislead users who rely on thread count metrics for capacity planning or debugging.

Updated docs.

Copilot · 2026-01-06T17:12:34Z

+------------ | ------------- | ------------- | -------------
+**cpu_percent** | **i64** | The percentage of CPU used by the kernel process | 
+**memory_bytes** | **i64** | The amount of memory used by the kernel process in bytes | 
+**thread_count** | **i64** | The number of threads used by the kernel process | 


The description says "The number of threads used by the kernel process" but should clarify that this includes child processes.

Suggested change

**thread_count** | **i64** | The number of threads used by the kernel process |

**thread_count** | **i64** | The number of threads used by the kernel process and its child processes |

Copilot · 2026-01-06T17:12:38Z

+    #[serde(rename = "memory_bytes")]
+    pub memory_bytes: i64,
+
+    /// The number of threads used by the kernel process


The description says "The number of threads used by the kernel process" but should clarify that this includes child processes.

Suggested change

/// The number of threads used by the kernel process

/// The number of threads used by the kernel process, including its child processes

Copilot · 2026-01-06T17:12:38Z

+    // Collect all PIDs in process tree using BFS
+    let mut pids_to_check = vec![pid];
+    let mut visited = HashSet::new();
+
+    while let Some(check_pid) = pids_to_check.pop() {
+        if !visited.insert(check_pid) {
+            continue; // Already visited
+        }
+
+        // Find children by checking parent_pid
+        for (child_pid, proc) in system.processes() {
+            if proc.parent() == Some(check_pid) && !visited.contains(child_pid) {
+                pids_to_check.push(*child_pid);
+            }
+        }
+    }


The BFS process tree collection doesn't actually accumulate metrics for visited processes. The code collects all PIDs using BFS, but then only sums metrics in the second loop. However, there's a subtle issue: the visited set is built without checking if processes exist in the system. Then when summing, some PIDs might not have corresponding processes. While this won't cause errors due to the if let Some check, the logic could be clearer. Consider combining the BFS traversal with metric collection to make it more efficient and clearer.

We know that the PIDs have corresponding processes because we just found them! I think this is OK as is.

Copilot · 2026-01-06T17:12:38Z

+            // Enforce a minimum interval of 100 ms so we don't try to sample
+            // too frequently
+            let new_interval_ms = if interval_ms > 0 && interval_ms < 100 {
+                log::warn!("Requested resource_sample_interval_ms is too low: {}. Setting to minimum of 100 ms.", interval_ms);
+                100
+            } else {
+                interval_ms as u64
+            };


Inconsistent validation logic: The code checks for negative values and enforces a minimum of 100ms, but the warning message at line 1976 says "Setting to minimum of 100 ms" while the actual enforcement allows 0 (disabled). The validation should either:

Allow 0 or >= 100 (current behavior), and update the warning to clarify this

Treat 0 specially before the minimum check
The current logic is: if value is between 1-99, set to 100; if 0, keep as 0. This is correct but the warning message doesn't reflect this nuance.

100 is the minimum when metrics collection is enabled, not 0, so I stand by the warning.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

jmcphers added 10 commits December 22, 2025 16:38

initial resource sample impl

fbb1a22

make it configurable

5331ece

format

68e2ed0

don't allow busy loops for sample collection

b098140

include resource usage in active session info

eff9c99

store and emit most recent usage info

64bb79a

use int64s since these values can be big

121a09b

add basic resource usage tests

f35a445

bump version

84e1950

clean up comments

aa4194f

jmcphers requested a review from Copilot January 6, 2026 17:07

Copilot started reviewing on behalf of jmcphers January 6, 2026 17:08 View session

remove unused import

cde1fd5

Copilot AI reviewed Jan 6, 2026

View reviewed changes

jmcphers and others added 2 commits January 6, 2026 09:19

Apply suggestions from code review

26a54b5

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

update documentation for resource fields

31d90ff

jmcphers merged commit 6a425cb into main Jan 6, 2026
5 checks passed

github-actions Bot locked and limited conversation to collaborators Jan 6, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add resource monitor#58

Add resource monitor#58
jmcphers merged 13 commits into
mainfrom
feature/resource-usage

jmcphers commented Jan 6, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Copilot AI Jan 6, 2026

Uh oh!

Copilot AI Jan 6, 2026

Uh oh!

jmcphers Jan 6, 2026

Uh oh!

Copilot AI Jan 6, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI Jan 6, 2026

Uh oh!

Copilot AI Jan 6, 2026

Uh oh!

jmcphers Jan 6, 2026

Uh oh!

Copilot AI Jan 6, 2026

Uh oh!

jmcphers Jan 6, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	/// The percentage of CPU used by the kernel process
	/// The percentage of CPU used by the kernel process, including its child processes

	thread_count \| i64 \| The number of threads used by the kernel process \|
	thread_count \| i64 \| The number of threads used by the kernel process and its child processes \|

	/// The number of threads used by the kernel process
	/// The number of threads used by the kernel process, including its child processes

Conversation

jmcphers commented Jan 6, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Copilot AI Jan 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 6, 2026

Choose a reason for hiding this comment

Uh oh!

jmcphers Jan 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 6, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI Jan 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 6, 2026

Choose a reason for hiding this comment

Uh oh!

jmcphers Jan 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 6, 2026

Choose a reason for hiding this comment

Uh oh!

jmcphers Jan 6, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants