feat: Enhanced CLI Watcher with automatic terminal process monitoring by vrubezhny · Pull Request #363 · eclipse-che/che-machine-exec

vrubezhny · 2026-06-27T13:51:59Z

CLI Watcher now automatically detects and tracks terminal-based development activity, distinguishing between interactive CLIs (vim, REPLs) and work processes (builds, deploys) without manual configuration.

Key Features:

Fully automatic: Only requires 'enabled: true' - auto-detects ALL terminal processes
Smart defaults: Adapts activityWindow and gracePeriod to workspace idle timeout
Automatic classification: Interactive (vim, python REPL) vs work (builds, deploys)
Backward compatible: Old configs continue working with deprecation warnings
Safety mechanisms: Grace period (5min), max age limit (6h), activity monitoring
Flexible duration parsing: Accepts 6h, 25m, 60 (seconds), or duration strings

Configuration:

Minimal: enabled: true (that's it!)
Smart defaults: Auto-calculated from workspace idle timeout
Time settings: checkPeriod, activityWindow, gracePeriod, maxProcessAge
Override options: watchedCommands (fix misclassification), ignoredCommands (skip entirely)

Technical Implementation:

User process filtering: Parent TTY chain analysis (filters shells, system processes)
Interactive detection: Foreground process group + TTY read analysis
Activity tracking: TTY access time (Atime) monitoring for user input
Multicall binary support: Uses /proc/comm for proper command detection
Self-skip: che-machine-exec doesn't watch itself
Config validation: Warns about misconfigurations without breaking

Documentation:

timeout/CLI-WATCHER.md - Comprehensive guide with testing scenarios
timeout/.noidle.minimal - Minimal config example
timeout/.noidle.example - Full config with all options

Issue: https://redhat.atlassian.net/browse/CRW-11505

tolusha · 2026-06-29T11:39:49Z

Hi! I'm che-ai-assistant — I help with your pull requests.

Available commands:

/che-ai-assistant generate-che-doc — Generate a documentation PR based on this PR's changes
/che-ai-assistant ok-pr-review — Run a comprehensive PR review (summary, code review, deep review, impact analysis)
/che-ai-assistant help — Show this help message

tolusha · 2026-06-29T11:40:10Z

/che-ai-assistant ok-pr-review

Review is complete. Please check the review comments below.

tolusha

Review Summary

Comprehensive review completed across standard code quality, deep design analysis, and system-level impact assessment.

Critical Issues (Must Fix Before Merge)

1. Hardcoded clock ticks causes incorrect process age on non-x86_64 systems

Location: timeout/cli-watcher.go line ~1112 in getProcessStartTime
Issue: clockTicks := 100 is hardcoded but can be 250 on ARM systems
Additionally, fmt.Sscanf errors are not checked - if parsing fails, startTicks remains 0, making processes appear ancient and incorrectly skipped
Impact: Grace period and max process age checks fail, potentially idling workspaces during active builds
Fix: Check fmt.Sscanf return value and use C.sysconf(C._SC_CLK_TCK) or read from /proc/self/auxv

2. Infinite loop risk in getMainUserProcess

Location: timeout/cli-watcher.go line ~1045
Issue: No cycle detection or maximum depth limit when walking parent chain
Impact: Corrupted /proc/[pid]/stat or race conditions cause goroutine hang
Fix: Add visited set or max depth (64) check

3. Duplicate policy logic in applyCommandPolicy vs applyDefaultPolicy

Location: timeout/cli-watcher.go lines 950-1011
Issue: Identical decision logic duplicated in two functions
Impact: Bug fixes must be applied twice; maintenance burden for code controlling workspace lifecycle
Fix: Merge into single function with InteractiveMode parameter

High Priority Warnings

4. Performance: getSystemBootTime re-reads /proc/stat hundreds of times

Location: timeout/cli-watcher.go line ~1119
Boot time never changes but is read once per process per scan
Fix: Cache with sync.Once

5. Documentation mismatch: per-command activityWindow not implemented

Location: timeout/CLI-WATCHER.md line ~561
Documentation shows activityWindow: 120 in watchedCommands but struct has no such field
Config would be silently ignored
Fix: Remove from docs or implement the feature

6. Breaking change: expanded alwaysIgnoredCommands

Location: timeout/cli-watcher.go line ~71
Adding watch, top, htop to exclusion list changes behavior for existing users
Users who configured watch will find it no longer prevents idling after upgrade
Fix: Warn at config load time when user explicitly lists a globally-excluded command

7. No unit tests for 698 lines of complex logic

Pure functions (parseDuration, applyDefaults, UnmarshalYAML, ignoreExclusions) are untested
Process tree walking, classification, and validation logic all untested
Fix: Add table-driven tests for at minimum the pure configuration functions

Minor Issues

8. Sscanf errors ignored in isInForegroundProcessGroup (safe but inconsistent)
9. Comment-code mismatch in isUserInitiatedProcess (line ~1071)

Positive Observations

✅ Excellent backward compatibility via UnmarshalYAML custom unmarshaler
✅ Smart defaults calculated from workspace idle timeout reduce config complexity
✅ Comprehensive CLI-WATCHER.md documentation with examples and test scenarios
✅ Configuration validation with warnings (doesn't override user values)
✅ Proper goroutine lifecycle management and resource cleanup

Recommendation

🔄 Request Changes - Critical issues #1 (clock ticks + Sscanf), #2 (infinite loop), and #3 (duplicate policy logic) should be fixed before merge. Issue #5 (documentation mismatch) and #6 (breaking change warning) should also be addressed to avoid user confusion. Tests (#7) would significantly improve long-term maintainability but could follow in a subsequent PR.

vrubezhny · 2026-06-29T18:14:33Z

The fixes added to the issues pointed to in /che-ai-assistant 's review:

✅ Platform-portable clock ticks detection (ARM/x86 support)
✅ Infinite loop protection in getMainUserProcess
✅ Merged duplicate policy functions
✅ Boot time cached at startup (performance)
✅ Documentation fix (removed per-command activityWindow)
✅ Warning for globally-excluded commands
✅ Comprehensive unit tests (59 test cases)
✅ Error checking in isInForegroundProcessGroup
✅ Clarified comments in isUserInitiatedProcess

Files changed:

timeout/cli-watcher.go - All fixes implemented
timeout/cli-watcher_test.go - 59 new test cases (NEW)
timeout/CLI-WATCHER.md - Documentation updates + developer testing section
Plus existing: .noidle.example, .noidle.minimal

vrubezhny · 2026-06-29T18:15:09Z

/che-ai-assistant ok-pr-review

Review is complete. Please check the review comments below.

tolusha

Code Review Summary

Critical Issues (must fix before merge):

Big-endian byte order bug in detectClockTicks() - will silently corrupt process age calculations on s390x
YAML parse error swallowing - breaks error handling contract, leaves watcher in silent no-op state on first-load failure

Warnings (should fix):

Duration parsing ambiguity (typos like "30min" silently become "30s")
Unnecessary unsafe package import
Dead code in hasRecentActivity()
Missing unit tests for core detection logic
init() side effects break portability (non-Linux CI)
Repeated /proc/[pid]/stat parsing (3x per process)
Verbose INFO logging (every tick logged, could flood logs)
Behavioral change needs upgrade documentation

Positive Highlights:

Excellent backward compatibility design with deprecation warnings
Smart defaults that adapt to workspace idle timeout
Comprehensive documentation and config parsing test coverage
Defensive programming with cycle detection and error handling

See inline comments for details and suggested fixes.

tolusha · 2026-06-29T18:37:47Z

Additional Observations

Missing Test Coverage for Core Logic

The test file has excellent coverage for configuration parsing (19.1% overall coverage), but the most critical functions have no tests:

applyPolicy() - the core decision function
isUserInitiatedProcess() - the key filter
isInteractiveProcess() - the classification logic
getMainUserProcess() - process tree walking

Consider adding at least tests for applyPolicy() since it is a pure function where the process-dependent parts could be abstracted behind an interface for testability.

Performance Optimization Opportunity

getParentPID(), isInForegroundProcessGroup(), and getProcessStartTime() each independently read and parse /proc/[pid]/stat. For a process that passes all checks, this file is read 3 times. Consider creating a single parseProcStat(pid) function that returns a struct with ppid, pgrp, tpgid, and startTicks, then refactor the individual functions to use this shared parser.

This would eliminate redundant I/O on the hot path and make the code cleaner.

tolusha · 2026-06-30T08:50:03Z

/che-ai-assistant generate-che-doc

Created documentation PR: eclipse-che/che-docs#3137

tolusha · 2026-06-30T10:47:16Z

@vrubezhny
Is there image available for testing ?

tolusha · 2026-06-30T10:57:19Z

/che-ai-assistant check-pr-test-failures

Review is complete. Please check the review comments below.

tolusha · 2026-06-30T11:03:45Z

PR #363 -- eclipse-che/che-machine-exec -- feat: Enhanced CLI Watcher with automatic terminal process monitoring

Check	Category	Status	Duration	CI
build	Build	FAILED	11m9s	GHA
DEPENDENCIES.md file validation	Validation	PASSED	46s	GHA
License	Validation	PASSED	17s	GHA
WIP	External	PASSED	—	External
add-link	External	PASSED	3s	GHA
eclipsefdn/eca	External	PASSED	—	External
unit-tests	Unit Tests	PASSED	1m33s	GHA
workspaces.openshift.com	External	PASSED	—	External

FAILED build (GHA)

The build fails during the multi-platform Docker image build, specifically on the linux/arm64 platform during the yarn installation and build step. The error is caused by an illegal instruction (SIGILL) when running under QEMU emulation: qemu: uncaught target signal 4 (Illegal instruction) - core dumped. This typically occurs when QEMU ARM64 emulation encounters incompatible CPU instructions, causing the process to fail with exit code 132 during the execution of yarn install --immutable && yarn run build.

Log: build
Rerun: Re-run failed jobs

Analysis Summary:
The only failing check is the multi-platform Docker build. The linux/amd64 build appears to succeed, but the linux/arm64 build fails due to QEMU emulation issues. This is likely an infrastructure/compatibility issue rather than a code problem. All other checks including unit tests, validation, and license checks pass successfully.

vrubezhny · 2026-06-30T12:44:36Z

@tolusha I've made another validation pass and added some more tests.

I thought some GH Action had been creating an image for testing and providing an URL to it... Don't see it anymore... Or do I mixing the things up?

vrubezhny · 2026-06-30T13:08:21Z

Building the docker image locally succeeds for me. And as far as I can see the build on GH is passing with no errors now - so it looks like a temporary glitch..

If you need I can build and publish an image from this PR.

But I'd still like to add some changes to the Dockerfile in some following PRs to fix the warnings and some cloud-shell related things, f.i.:

=> [cloud_shell_builder 5/5] RUN yarn install --immutable &&     yarn run build &&     mkdir /app &&     cp -rf index.html dist node_modules /app                                                           26.0s

we don't need copying node_modules for the cloud-shell as it's bundled and has all the required dependencies bundled together - so it just unnecessarily to have that folder on the image

Plus some more minor issues

vrubezhny · 2026-06-30T13:22:27Z

/che-ai-assistant help

tolusha · 2026-06-30T13:31:06Z

Hi! I'm che-ai-assistant — I help with your pull requests.

Available commands:

/che-ai-assistant generate-che-doc — Generate a documentation PR based on this PR's changes
/che-ai-assistant ok-pr-review — Run a comprehensive PR review (summary, code review, deep review, impact analysis)
/che-ai-assistant check-pr-test-failures — Analyze failing CI checks, identify root causes, and suggest fixes
/che-ai-assistant help — Show this help message

vrubezhny · 2026-06-30T15:10:03Z

@vrubezhny Is there image available for testing ?

You can try it from https://quay.io/repository/vrubezhny/che-machine-exec:

# Pull the image
docker pull quay.io/vrubezhny/che-machine-exec:cli-watcher-test

vrubezhny · 2026-07-01T10:00:45Z

/che-ai-assistant ok-pr-review

Review is complete. Please check the review comments below.

tolusha

Review Summary

Comprehensive review completed (standard + deep + impact analysis). The PR implements a valuable feature with good defensive programming and backward compatibility. However, there are critical issues that need to be addressed before merge.

Critical Issues

ignoredCommands config broken - documented but non-functional (see inline comments)
findUpward symlink asymmetry - config search fails if PROJECTS_ROOT is a symlink
Atime dependency undocumented - interactive detection silently breaks with noatime mounts

Key Findings

Behavioral change (watch ALL processes) well-documented for users but needs platform admin awareness
Core decision pipeline lacks test coverage (isWatchedProcessRunning, isUserInitiatedProcess, etc.)
Configuration struct conflates concerns (YAML input + resolved state + bookkeeping)

Positive Highlights

Excellent backward compatibility via custom UnmarshalYAML
Smart adaptive defaults from workspace idle timeout
Defensive process tree traversal with cycle detection
All prior review feedback addressed

See inline comments for specific fixes and recommendations.

Verdict: 🔄 Request Changes - Fix critical bugs (#1-3) before merge.

tolusha · 2026-07-01T12:37:53Z

@vrubezhny

docker pull quay.io/vrubezhny/che-machine-exec:cli-watcher-test

I'm not sure how to test using this image. I thought it would be a VS Code image.

vrubezhny · 2026-07-01T13:03:32Z

@vrubezhny

docker pull quay.io/vrubezhny/che-machine-exec:cli-watcher-test

I'm not sure how to test using this image. I thought it would be a VS Code image.

No, I have no VSCode image built from this change (need to find instructions on how to create one).

I'm testing it by starting a WS from https://workspaces.openshift.com#https://github.com/vrubezhny/che-machine-exec/tree/feat-idling-interative-cli-activity-tracking, then executing in terminal (Run Task -> devfile):

build cloud-shell
compile

Then exporting LOG_LEVEL=debug and running ./che-machine-exec --url 0.0.0.0:5555 from /projects/che-machine-exec folder (it would be default folder in terminal).

then adding .noidle config - minimal or with some preferences, like setting up some minutes for time-values to make it to start/stop watching earlier than defults.

Then in other terminal tab (I split it to have both tabs visible) I execute interactive (like more or claude) commands and non-interactive (like sleep <timeout> or building something (./compile.sh) - and see reactions in che-machine-exec output.

CLI Watcher now automatically detects and tracks terminal-based development activity, distinguishing between interactive CLIs (vim, REPLs) and work processes (builds, deploys) without manual configuration. Key Features: - Fully automatic: Only requires 'enabled: true' - auto-detects ALL terminal processes - Smart defaults: Adapts activityWindow and gracePeriod to workspace idle timeout - Automatic classification: Interactive (vim, python REPL) vs work (builds, deploys) - Backward compatible: Old configs continue working with deprecation warnings - Safety mechanisms: Grace period (5min), max age limit (6h), activity monitoring - Flexible duration parsing: Accepts 6h, 25m, 60 (seconds), or duration strings Configuration: - Minimal: enabled: true (that's it!) - Smart defaults: Auto-calculated from workspace idle timeout - Time settings: checkPeriod, activityWindow, gracePeriod, maxProcessAge - Override options: watchedCommands (fix misclassification), ignoredCommands (skip entirely) Technical Implementation: - User process filtering: Parent TTY chain analysis (filters shells, system processes) - Interactive detection: Foreground process group + TTY read analysis - Activity tracking: TTY access time (Atime) monitoring for user input - Multicall binary support: Uses /proc/comm for proper command detection - Self-skip: che-machine-exec doesn't watch itself - Config validation: Warns about misconfigurations without breaking - Platform portability: Detects clock ticks from /proc/self/auxv with platform-specific fallbacks - Performance: Boot time and clock ticks cached at startup (not per process) - Robustness: Infinite loop protection, cycle detection, error checking on all parsing Code Quality Improvements: - Added comprehensive unit tests (79 test cases covering pure functions and concurrency) - Tests cover: duration parsing, YAML unmarshaling, defaults, validation, platform detection, applyPolicy, race conditions - Merged duplicate policy logic into single unified function - Added error checking to all fmt.Sscanf calls for consistency - Fixed comment-code mismatches for clarity - Warning on breaking changes (globally-excluded commands) Test Coverage: - Added concurrent Start/Stop tests (catches race conditions) - Added clock skew tests (documents negative age handling) - Enhanced /proc parsing integration tests - All 79 tests pass with -race flag - No data races detected All issues verified with test-driven approach: wrote tests first to detect bugs, then fixed implementation. Thread-safety verified with Go race detector (-race flag). Documentation: - timeout/CLI-WATCHER.md - Comprehensive guide with testing scenarios - timeout/.noidle.minimal - Minimal config example - timeout/.noidle.example - Full config with all options - Added auto-detection test scenario for manual testing - Added comprehensive upgrade guide for breaking behavioral changes - Developer testing section added Issue: https://redhat.atlassian.net/browse/CRW-11505 Assisted-By: Claude Sonnet 4.5 <noreply@anthropic.com> Signed-off-by: Victor Rubezhny <vrubezhny@redhat.com>

vrubezhny · 2026-07-01T23:20:23Z

I'm not sure how to test using this image. I thought it would be a VS Code image.

CheCode Editor images for the PR are published ✨

Editor amd64: quay.io/che-incubator-pull-requests/che-code:pr-742-amd64
Editor arm64: quay.io/che-incubator-pull-requests/che-code:pr-742-arm64
Dev image: quay.io/che-incubator-pull-requests/che-code-dev:pr-742-dev-amd64

sbouchet · 2026-07-02T08:50:04Z

same/similar error as #370 :

tolusha · 2026-07-02T09:15:00Z

I have the same error

Activity tracker: Failed to ping che-machine-exec: fetch failed

2026-07-02 09:12:38.591 [error] Activating extension eclipse-che.terminal failed due to an error:
2026-07-02 09:12:38.591 [error] Error: Failed to connect to machine-exec after 30 attempts: 
	at e.init (/checode/checode-linux-libc/ubi9/extensions/che-terminal/dist/extension.js:44:850)
	at process.processTicksAndRejections (node:internal/process/task_queues:103:5)
	at async jh (/checode/checode-linux-libc/ubi9/extensions/che-terminal/dist/extension.js:44:3959)
	at async dS.n (file:///checode/checode-linux-libc/ubi9/out/vs/workbench/api/node/extensionHostProcess.js:116:13398)
	at async dS.m (file:///checode/checode-linux-libc/ubi9/out/vs/workbench/api/node/extensionHostProcess.js:116:13361)
	at async dS.l (file:///checode/checode-linux-libc/ubi9/out/vs/workbench/api/node/extensionHostProcess.js:116:12817)
2026-07-02 09:12:38.592 [info] Eager extensions activated

sbouchet · 2026-07-02T10:51:09Z

same comment than #370 (review) . test using devspace-code downstream editor definition.

vrubezhny requested review from amisevsk, azatsarynnyy and l0rd as code owners June 27, 2026 13:51

vrubezhny requested review from akurinnoy and tolusha June 27, 2026 14:12

ibuziuk reviewed Jun 29, 2026

View reviewed changes

Comment thread timeout/.noidle.example Outdated

vrubezhny force-pushed the feat-idling-interative-cli-activity-tracking branch from 43e8ae4 to 6513728 Compare June 29, 2026 09:53

vrubezhny requested a review from ibuziuk June 29, 2026 09:55