feat: Enhanced CLI Watcher with automatic terminal process monitoring#363
feat: Enhanced CLI Watcher with automatic terminal process monitoring#363vrubezhny wants to merge 1 commit into
Conversation
43e8ae4 to
6513728
Compare
|
Hi! I'm che-ai-assistant — I help with your pull requests. Available commands:
|
|
/che-ai-assistant ok-pr-review Review is complete. Please check the review comments below. |
tolusha
left a comment
There was a problem hiding this comment.
Review Summary
Comprehensive review completed across standard code quality, deep design analysis, and system-level impact assessment.
Critical Issues (Must Fix Before Merge)
1. Hardcoded clock ticks causes incorrect process age on non-x86_64 systems
- Location:
timeout/cli-watcher.goline ~1112 ingetProcessStartTime - Issue:
clockTicks := 100is hardcoded but can be 250 on ARM systems - Additionally,
fmt.Sscanferrors are not checked - if parsing fails,startTicksremains 0, making processes appear ancient and incorrectly skipped - Impact: Grace period and max process age checks fail, potentially idling workspaces during active builds
- Fix: Check
fmt.Sscanfreturn value and useC.sysconf(C._SC_CLK_TCK)or read from/proc/self/auxv
2. Infinite loop risk in getMainUserProcess
- Location:
timeout/cli-watcher.goline ~1045 - Issue: No cycle detection or maximum depth limit when walking parent chain
- Impact: Corrupted
/proc/[pid]/stator race conditions cause goroutine hang - Fix: Add visited set or max depth (64) check
3. Duplicate policy logic in applyCommandPolicy vs applyDefaultPolicy
- Location:
timeout/cli-watcher.golines 950-1011 - Issue: Identical decision logic duplicated in two functions
- Impact: Bug fixes must be applied twice; maintenance burden for code controlling workspace lifecycle
- Fix: Merge into single function with InteractiveMode parameter
High Priority Warnings
4. Performance: getSystemBootTime re-reads /proc/stat hundreds of times
- Location:
timeout/cli-watcher.goline ~1119 - Boot time never changes but is read once per process per scan
- Fix: Cache with
sync.Once
5. Documentation mismatch: per-command activityWindow not implemented
- Location:
timeout/CLI-WATCHER.mdline ~561 - Documentation shows
activityWindow: 120in watchedCommands but struct has no such field - Config would be silently ignored
- Fix: Remove from docs or implement the feature
6. Breaking change: expanded alwaysIgnoredCommands
- Location:
timeout/cli-watcher.goline ~71 - Adding
watch,top,htopto exclusion list changes behavior for existing users - Users who configured
watchwill find it no longer prevents idling after upgrade - Fix: Warn at config load time when user explicitly lists a globally-excluded command
7. No unit tests for 698 lines of complex logic
- Pure functions (
parseDuration,applyDefaults,UnmarshalYAML,ignoreExclusions) are untested - Process tree walking, classification, and validation logic all untested
- Fix: Add table-driven tests for at minimum the pure configuration functions
Minor Issues
8. Sscanf errors ignored in isInForegroundProcessGroup (safe but inconsistent)
9. Comment-code mismatch in isUserInitiatedProcess (line ~1071)
Positive Observations
✅ Excellent backward compatibility via UnmarshalYAML custom unmarshaler
✅ Smart defaults calculated from workspace idle timeout reduce config complexity
✅ Comprehensive CLI-WATCHER.md documentation with examples and test scenarios
✅ Configuration validation with warnings (doesn't override user values)
✅ Proper goroutine lifecycle management and resource cleanup
Recommendation
🔄 Request Changes - Critical issues #1 (clock ticks + Sscanf), #2 (infinite loop), and #3 (duplicate policy logic) should be fixed before merge. Issue #5 (documentation mismatch) and #6 (breaking change warning) should also be addressed to avoid user confusion. Tests (#7) would significantly improve long-term maintainability but could follow in a subsequent PR.
6513728 to
c954c98
Compare
|
The fixes added to the issues pointed to in ✅ Platform-portable clock ticks detection (ARM/x86 support) Files changed: timeout/cli-watcher.go - All fixes implemented |
|
/che-ai-assistant ok-pr-review Review is complete. Please check the review comments below. |
tolusha
left a comment
There was a problem hiding this comment.
Code Review Summary
Critical Issues (must fix before merge):
- Big-endian byte order bug in
detectClockTicks()- will silently corrupt process age calculations on s390x - YAML parse error swallowing - breaks error handling contract, leaves watcher in silent no-op state on first-load failure
Warnings (should fix):
- Duration parsing ambiguity (typos like "30min" silently become "30s")
- Unnecessary
unsafepackage import - Dead code in
hasRecentActivity() - Missing unit tests for core detection logic
init()side effects break portability (non-Linux CI)- Repeated
/proc/[pid]/statparsing (3x per process) - Verbose INFO logging (every tick logged, could flood logs)
- Behavioral change needs upgrade documentation
Positive Highlights:
- Excellent backward compatibility design with deprecation warnings
- Smart defaults that adapt to workspace idle timeout
- Comprehensive documentation and config parsing test coverage
- Defensive programming with cycle detection and error handling
See inline comments for details and suggested fixes.
Additional ObservationsMissing Test Coverage for Core Logic The test file has excellent coverage for configuration parsing (19.1% overall coverage), but the most critical functions have no tests:
Consider adding at least tests for Performance Optimization Opportunity
This would eliminate redundant I/O on the hot path and make the code cleaner. |
c954c98 to
3f961ce
Compare
|
/che-ai-assistant generate-che-doc Created documentation PR: eclipse-che/che-docs#3137 |
3f961ce to
d08a2d4
Compare
|
@vrubezhny |
|
/che-ai-assistant check-pr-test-failures Review is complete. Please check the review comments below. |
|
PR #363 -- eclipse-che/che-machine-exec -- feat: Enhanced CLI Watcher with automatic terminal process monitoring
FAILED build (GHA)The build fails during the multi-platform Docker image build, specifically on the linux/arm64 platform during the yarn installation and build step. The error is caused by an illegal instruction (SIGILL) when running under QEMU emulation: Log: build Analysis Summary: |
d08a2d4 to
919ce97
Compare
|
@tolusha I've made another validation pass and added some more tests. I thought some GH Action had been creating an image for testing and providing an URL to it... Don't see it anymore... Or do I mixing the things up? |
|
Building the docker image locally succeeds for me. And as far as I can see the build on GH is passing with no errors now - so it looks like a temporary glitch.. If you need I can build and publish an image from this PR. But I'd still like to add some changes to the Dockerfile in some following PRs to fix the warnings and some cloud-shell related things, f.i.:
Plus some more minor issues |
|
/che-ai-assistant help |
|
Hi! I'm che-ai-assistant — I help with your pull requests. Available commands:
|
You can try it from https://quay.io/repository/vrubezhny/che-machine-exec: |
|
/che-ai-assistant ok-pr-review Review is complete. Please check the review comments below. |
tolusha
left a comment
There was a problem hiding this comment.
Review Summary
Comprehensive review completed (standard + deep + impact analysis). The PR implements a valuable feature with good defensive programming and backward compatibility. However, there are critical issues that need to be addressed before merge.
Critical Issues
ignoredCommandsconfig broken - documented but non-functional (see inline comments)findUpwardsymlink asymmetry - config search fails if PROJECTS_ROOT is a symlink- Atime dependency undocumented - interactive detection silently breaks with noatime mounts
Key Findings
- Behavioral change (watch ALL processes) well-documented for users but needs platform admin awareness
- Core decision pipeline lacks test coverage (isWatchedProcessRunning, isUserInitiatedProcess, etc.)
- Configuration struct conflates concerns (YAML input + resolved state + bookkeeping)
Positive Highlights
- Excellent backward compatibility via custom UnmarshalYAML
- Smart adaptive defaults from workspace idle timeout
- Defensive process tree traversal with cycle detection
- All prior review feedback addressed
See inline comments for specific fixes and recommendations.
Verdict: 🔄 Request Changes - Fix critical bugs (#1-3) before merge.
I'm not sure how to test using this image. I thought it would be a VS Code image. |
No, I have no VSCode image built from this change (need to find instructions on how to create one). I'm testing it by starting a WS from https://workspaces.openshift.com#https://github.com/vrubezhny/che-machine-exec/tree/feat-idling-interative-cli-activity-tracking, then executing in terminal (Run Task -> devfile):
Then exporting then adding Then in other terminal tab (I split it to have both tabs visible) I execute interactive (like |
919ce97 to
afaca0b
Compare
CLI Watcher now automatically detects and tracks terminal-based development activity, distinguishing between interactive CLIs (vim, REPLs) and work processes (builds, deploys) without manual configuration. Key Features: - Fully automatic: Only requires 'enabled: true' - auto-detects ALL terminal processes - Smart defaults: Adapts activityWindow and gracePeriod to workspace idle timeout - Automatic classification: Interactive (vim, python REPL) vs work (builds, deploys) - Backward compatible: Old configs continue working with deprecation warnings - Safety mechanisms: Grace period (5min), max age limit (6h), activity monitoring - Flexible duration parsing: Accepts 6h, 25m, 60 (seconds), or duration strings Configuration: - Minimal: enabled: true (that's it!) - Smart defaults: Auto-calculated from workspace idle timeout - Time settings: checkPeriod, activityWindow, gracePeriod, maxProcessAge - Override options: watchedCommands (fix misclassification), ignoredCommands (skip entirely) Technical Implementation: - User process filtering: Parent TTY chain analysis (filters shells, system processes) - Interactive detection: Foreground process group + TTY read analysis - Activity tracking: TTY access time (Atime) monitoring for user input - Multicall binary support: Uses /proc/comm for proper command detection - Self-skip: che-machine-exec doesn't watch itself - Config validation: Warns about misconfigurations without breaking - Platform portability: Detects clock ticks from /proc/self/auxv with platform-specific fallbacks - Performance: Boot time and clock ticks cached at startup (not per process) - Robustness: Infinite loop protection, cycle detection, error checking on all parsing Code Quality Improvements: - Added comprehensive unit tests (79 test cases covering pure functions and concurrency) - Tests cover: duration parsing, YAML unmarshaling, defaults, validation, platform detection, applyPolicy, race conditions - Merged duplicate policy logic into single unified function - Added error checking to all fmt.Sscanf calls for consistency - Fixed comment-code mismatches for clarity - Warning on breaking changes (globally-excluded commands) Test Coverage: - Added concurrent Start/Stop tests (catches race conditions) - Added clock skew tests (documents negative age handling) - Enhanced /proc parsing integration tests - All 79 tests pass with -race flag - No data races detected All issues verified with test-driven approach: wrote tests first to detect bugs, then fixed implementation. Thread-safety verified with Go race detector (-race flag). Documentation: - timeout/CLI-WATCHER.md - Comprehensive guide with testing scenarios - timeout/.noidle.minimal - Minimal config example - timeout/.noidle.example - Full config with all options - Added auto-detection test scenario for manual testing - Added comprehensive upgrade guide for breaking behavioral changes - Developer testing section added Issue: https://redhat.atlassian.net/browse/CRW-11505 Assisted-By: Claude Sonnet 4.5 <noreply@anthropic.com> Signed-off-by: Victor Rubezhny <vrubezhny@redhat.com>
afaca0b to
5598331
Compare
CheCode Editor images for the PR are published ✨ Editor amd64: quay.io/che-incubator-pull-requests/che-code:pr-742-amd64 |
|
same/similar error as #370 : |
|
I have the same error |
|
same comment than #370 (review) . test using |

CLI Watcher now automatically detects and tracks terminal-based development activity, distinguishing between interactive CLIs (vim, REPLs) and work processes (builds, deploys) without manual configuration.
Key Features:
Configuration:
Technical Implementation:
Documentation:
Issue: https://redhat.atlassian.net/browse/CRW-11505