
feat: full Linux support for unbound discover and onboard #128

Open
AakashVelusamy wants to merge 3 commits into staging from feat/linux-full-support

Conversation

@AakashVelusamy AakashVelusamy commented May 16, 2026

Summary

Adds full Linux support for unbound discover and unbound onboard, eliminating ValueError("Unsupported operating system: Linux") crashes on Ubuntu agent runners.

47 new factory branches covering all 16 AI coding tools, including Claude Code, Cursor, Windsurf, Codex, Gemini CLI, Cursor CLI, Cline, Roo Code, Antigravity, Kilo Code, OpenCode, OpenClaw, Replit, JetBrains, and GitHub Copilot. The branches span detectors, rules extractors, MCP config extractors, settings extractors, and skills extractors.


Key Linux path adaptations

| macOS | Linux |
| --- | --- |
| ~/Library/Application Support/<IDE>/ | ~/.config/<IDE>/ |
| ~/Library/Application Support/JetBrains/ | ~/.config/JetBrains/ |
| Device ID via ioreg / SMBIOS | /etc/machine-id |
| /Users/<username> | /home/<username> |
| Multi-user scanning via dscl | Multi-user scanning via /etc/passwd |
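
As an illustration of the /etc/machine-id row above, reading the device ID on Linux might look roughly like the following. This is a minimal sketch, not the code from this PR: the function name `read_machine_id` and its `None` fallback are assumptions made for the example.

```python
from pathlib import Path
from typing import Optional


def read_machine_id(path: Path = Path("/etc/machine-id")) -> Optional[str]:
    """Return the machine ID as a stripped string, or None if unavailable.

    Hypothetical helper for illustration; the PR's real function may differ.
    """
    try:
        value = path.read_text(encoding="utf-8").strip()
    except OSError:
        # File missing or unreadable, e.g. in a minimal container image.
        return None
    # Treat an empty file the same as a missing one.
    return value or None
```

Returning `None` instead of raising lets the caller decide how to handle hosts where the file is absent.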

Multi-user / root correctness — all P1 bugs fixed

Root scanning

  • get_linux_user_homes(): returns /home/* dirs AND /root when root — /root was previously omitted when /home had other users
  • scan_user_directories(): always checks /root unconditionally when running as root
  • is_user_level_tool_dir(): recognises both /home/<user>/.<tool> and /root/.<tool> as user-scope

Docker / CI root-only containers (/home absent)

  • get_linux_user_homes(): falls back to [Path("/root")] when /home does not exist
  • get_all_users_linux(): falls back to ["root"] when /home is absent or empty
  • ai_tools_discovery.py: both user-home loop sites special-case "root" → Path("/root") instead of building the non-existent /home/root
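
The home-directory rules described in the two lists above can be sketched as follows. This is an approximation for readers, not the PR's get_linux_user_homes(): the name `linux_user_homes` and its three parameters are hypothetical, introduced so the behaviour can be exercised without touching the real /home or /root.

```python
from pathlib import Path
from typing import List


def linux_user_homes(home_root: Path, root_home: Path, running_as_root: bool) -> List[Path]:
    """Sketch of the described behaviour (hypothetical signature).

    - /home absent (root-only Docker/CI container): fall back to root's home.
    - /home present: return every user directory, plus root's home when
      running as root, even if other users exist.
    """
    if not home_root.is_dir():
        return [root_home]
    homes = sorted(p for p in home_root.iterdir() if p.is_dir())
    if running_as_root:
        homes.append(root_home)
    return homes
```

A caller on a real host would presumably pass `Path("/home")`, `Path("/root")`, and `os.geteuid() == 0`.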

System path skipping

  • Added Linux-aware walk_for_tool_directories override using _LINUX_SKIP_SYSTEM_DIRS; the macOS version had /home in SKIP_SYSTEM_DIRS, silently dropping all project-level configs under /home/*
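
To show why a skip list containing /home hides every project-level config, here is a small sketch of a pruning directory walk. The function name `walk_for_tool_dirs` and the contents of `SKIP_DIRS` are assumptions for illustration; the actual entries of _LINUX_SKIP_SYSTEM_DIRS are not shown in this PR description.

```python
import os
from pathlib import Path
from typing import FrozenSet, Iterator

# Hypothetical skip set for illustration; note that /home and /root are absent.
SKIP_DIRS: FrozenSet[str] = frozenset({"/proc", "/sys", "/dev", "/run"})


def walk_for_tool_dirs(
    root: Path, tool_dir_name: str, skip: FrozenSet[str] = SKIP_DIRS
) -> Iterator[Path]:
    """Yield every directory named `tool_dir_name` under `root`."""
    for current, dirnames, _files in os.walk(root):
        # Prune in place so os.walk never descends into skipped trees.
        dirnames[:] = [d for d in dirnames if os.path.join(current, d) not in skip]
        if tool_dir_name in dirnames:
            yield Path(current) / tool_dir_name
```

With /home in the skip set, the walk prunes the whole subtree, so nothing under /home/* is ever yielded, which matches the symptom described above.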

Skills extractors — user-scope classification

  • Replaced is_user_level_claude_subdir() with is_user_level_tool_dir() in all Linux skills extractors. The macOS version computes users_root = Path.home().parent = / when running as root, so /home/*/ skills dirs were misclassified as project-scope and duplicated.
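
The classification rule can be illustrated with pure path arithmetic. The sketch below is an assumption about the shape of the check, not the PR's is_user_level_tool_dir(); the name `looks_like_user_level_tool_dir` is hypothetical.

```python
from pathlib import PurePosixPath


def looks_like_user_level_tool_dir(path: PurePosixPath) -> bool:
    """True for /home/<user>/.<tool> and /root/.<tool>, False otherwise.

    Hypothetical stand-in for the helper named in this PR.
    """
    parts = path.parts
    if not parts or parts[0] != "/" or not parts[-1].startswith("."):
        return False
    # /home/<user>/.<tool> -> ('/', 'home', '<user>', '.<tool>')
    if len(parts) == 4 and parts[1] == "home":
        return True
    # /root/.<tool> -> ('/', 'root', '.<tool>')
    return len(parts) == 3 and parts[1] == "root"
```

For comparison, `PurePosixPath("/root").parent` is `/`, which is why a check based on the parent of the current user's home cannot recognise /home/* directories when the process runs as root.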

Extractor correctness (multi-user)

  • All global-config extractors (Antigravity, Codex, Gemini CLI, OpenCode, Cursor CLI) fixed to accumulate configs for all users instead of returning after the first match
  • All 14 rules/settings/skills extractors converted from scan_user_directories(void_callback) to explicit get_linux_user_homes() loops — scan_user_directories short-circuits on truthy returns, which would silently skip users if any callback ever returned one
  • Cursor/Windsurf MCP extractors: global_*_dir now computed per-user inside the loop (was always pointing at the first user or root)
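
The difference between a short-circuiting scan and an explicit per-user loop can be shown in a few lines. Both functions below are hypothetical simplifications (homes are plain strings and the callback returns a config name or None); they only model the control flow described above.

```python
from typing import Callable, Iterable, List, Optional


def scan_first_match(
    homes: Iterable[str], check: Callable[[str], Optional[str]]
) -> Optional[str]:
    """Short-circuiting scan: stops at the first truthy callback result."""
    for home in homes:
        result = check(home)
        if result:
            return result
    return None


def collect_all(
    homes: Iterable[str], check: Callable[[str], Optional[str]]
) -> List[str]:
    """Explicit loop: visits every home and accumulates all results."""
    results: List[str] = []
    for home in homes:
        result = check(home)
        if result:
            results.append(result)
    return results
```

With three homes where the first and last have a config, `scan_first_match` reports only the first user's config, while `collect_all` returns both.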

JetBrains multi-user deduplication

  • _filter_old_versions now applied per-user before extending the combined list. The previous cross-user dedup silently dropped any user whose IDE version was older than another user's (e.g. alice's WebStorm2024.3 was discarded because bob had WebStorm2025.1, taking her Copilot plugin detection with it).
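
A rough sketch of the per-user filtering follows. It assumes config directory names of the form `<Product><year>.<minor>` (as in the WebStorm example above); `newest_per_product` and `detect_per_user` are hypothetical names, not the PR's _filter_old_versions.

```python
import re
from typing import Dict, List, Tuple

_VERSIONED = re.compile(r"([A-Za-z]+)(\d+)\.(\d+)")


def newest_per_product(ide_dirs: List[str]) -> List[str]:
    """Keep only the newest directory per product; skip unversioned names."""
    newest: Dict[str, Tuple[Tuple[int, int], str]] = {}
    for name in ide_dirs:
        match = _VERSIONED.fullmatch(name)
        if not match:
            continue
        product = match.group(1)
        version = (int(match.group(2)), int(match.group(3)))
        if product not in newest or version > newest[product][0]:
            newest[product] = (version, name)
    return sorted(name for _version, name in newest.values())


def detect_per_user(per_user: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Filter within each user first, then combine, so no user is dropped."""
    combined: List[Tuple[str, str]] = []
    for user, dirs in per_user.items():
        combined.extend((user, d) for d in newest_per_product(dirs))
    return combined
```

Filtering the merged list of all users instead would keep only WebStorm2025.1 and lose alice's entry, which is the failure mode described above.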

mcp_extraction_helpers platform guards

  • All 5 root-support helpers extended with elif platform.system() == "Linux" branches

All Greptile P1 threads resolved (17 total)

| # | Issue |
| --- | --- |
| 1–2 | JetBrains detector and MCP extractor: wrong base dir (~/.local/share → ~/.config) |
| 3–4 | mcp_extraction_helpers: Linux missing from all 5 platform guards; Cursor missing Linux guard |
| 5–6 | Cursor/Windsurf MCP: static global_*_dir duplication across users |
| 7 | Codex/Gemini/OpenCode/CursorCLI global configs: return after first user |
| 8 | GitHub Copilot: _detect_jetbrains_for_user rescanned all users per call |
| 9 | /root was in _LINUX_SKIP_SYSTEM_DIRS |
| 10 | get_linux_user_homes omitted /root on multi-user systems |
| 11 | scan_user_directories returned None when /home absent (Docker/CI) |
| 12 | get_all_users_linux returned [] when /home absent (Docker/CI) |
| 13 | is_user_level_tool_dir returned False for /root/.<tool> |
| 14 | scan_user_directories skipped /root on multi-user machines |
| 15 | walk_for_tool_directories used macOS skip-list (/home blocked all project configs); is_user_level_claude_subdir misclassified dirs when root; Antigravity early return; 14 void-callback misuses |
| 16 | ai_tools_discovery.py: username "root" built /home/root instead of /root |
| 17 | JetBrains _filter_old_versions: cross-user dedup dropped users with older IDE versions |

Test plan

  • All 504 tests pass (pytest tests/ -v)
  • End-to-end validated on Ubuntu agent runner — Claude Code v2.1.110 detected at /usr/bin/claude
  • All 17 Greptile P1 review threads resolved

Greptile Summary

This PR adds full Linux support for unbound discover and unbound onboard by introducing 47 new factory branches across all 16 AI coding tools, a linux_extraction_helpers.py module, and extending setup-scheduled-scan.sh to install via systemd --user timer or crontab on Linux.

  • Linux extraction layer: New linux_extraction_helpers.py provides Linux-aware get_linux_user_homes(), scan_user_directories(), is_user_level_tool_dir(), walk_for_tool_directories(), and should_skip_system_path() — with correct /root handling for Docker/CI root-only containers and multi-user systems.
  • 47 new extractor/detector classes: Each tool (Claude Code, Cursor, Windsurf, Codex, Gemini CLI, Cursor CLI, Cline, Roo Code, etc.) now has a Linux factory branch using direct get_linux_user_homes() iteration to avoid the per-user early-return and cross-user dedup bugs fixed from previous reviews.
  • setup-scheduled-scan.sh: Extended to support macOS and Linux, using a systemd timer (with Persistent=true) or a crontab fallback on Linux. The NPM_BIN assignment is now guarded with || true and set conditionally, so an npm-absent host neither aborts the setup nor leaves a leading : in PATH. Separately, the Junie MCP parent_levels is corrected from 2 to 3.

Confidence Score: 5/5

Safe to merge — all previously-flagged blocking issues are resolved and no new defects were found.

The two remaining items called out in the reviewer instructions are both confirmed fixed: the NPM_BIN assignment is now guarded with || true and is only written when non-empty (no leading : in the wrapper PATH), and the Junie MCP parent_levels is correctly set to 3. Every other previously-flagged concern — JetBrains base directory, cross-user version deduplication, /root omission in get_linux_user_homes, sys.exit dead-code gate, Windsurf unused import, Cursor/Windsurf per-user global dir, GitHub Copilot rescan, mcp_extraction_helpers platform guards, and ai_tools_discovery.py root home path — has also been addressed in this commit. No new defects were introduced.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| setup-scheduled-scan.sh | Fully rewritten to support both macOS (launchd) and Linux (systemd --user timer with crontab fallback). NPM_BIN assignment now guarded with |
| scripts/coding_discovery_tools/linux_extraction_helpers.py | New Linux-specific extraction helper module. Correctly excludes /root from _LINUX_SKIP_SYSTEM_DIRS, includes /root in get_linux_user_homes(), adds /root check to scan_user_directories(), and recognises both /home/<user>/.<tool> and /root/.<tool> as user-scope in is_user_level_tool_dir(). |
| scripts/coding_discovery_tools/mcp_extraction_helpers.py | Five mcp helper functions extended with elif platform.system() == "Linux" branches. extract_dual_path_configs_with_root_support and extract_claudeai_mcp_servers_with_root_support both include a separate check for root's own config path after the /home/* loop. |
| scripts/coding_discovery_tools/ai_tools_discovery.py | Linux branches added to main() for user enumeration (get_all_users_linux()) and home-path resolution. The prior sys.exit(3) platform guard is removed; Linux code is no longer dead. |
| scripts/coding_discovery_tools/utils.py | New get_all_users_linux() parses /etc/passwd (UID >= 1000, interactive shell) with /home directory fallback; always includes root when running as root; falls back to [Path.home().name] when /home is absent. |
| scripts/coding_discovery_tools/linux/jetbrains/jetbrains.py | Base directory corrected to ~/.config/JetBrains. _filter_old_versions now applied per-user before extending all_detected_ides, preventing cross-user version deduplication that silently dropped users with older IDEs. |
| scripts/coding_discovery_tools/linux/github_copilot/detect_copilot.py | JetBrains detection now calls LinuxJetBrainsDetector().detect() once outside the per-user loop, eliminating the O(N^2) rescan and triplicated results bug. |
| scripts/coding_discovery_tools/linux/cursor/mcp_config_extractor.py | Replaced extract_global_mcp_config_with_root_support with a direct get_linux_user_homes() loop. global_cursor_dir now computed per-user inside the project-level walk. |
| scripts/coding_discovery_tools/linux/windsurf/windsurf_rules_extractor.py | is_user_level_tool_dir is now correctly used in _extract_rules_from_windsurf_directory to skip user-level ~/.windsurf dirs. Global Windsurf rules path correctly targets ~/codeium/.windsurf/memories/global_rules.md. |
| scripts/coding_discovery_tools/linux/junie/mcp_config_extractor.py | _PARENT_LEVELS = 3 (fixed from prior 2), so ~/.junie/mcp/mcp.json keys correctly to ~ (home), consistent with every other global MCP config tool. |
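
For readers unfamiliar with the /etc/passwd filter mentioned in the utils.py row (UID >= 1000, interactive shell), a simplified parser might look like this. It is a sketch under stated assumptions: the function name `parse_interactive_users` and the `NON_INTERACTIVE_SHELLS` tuple are hypothetical, and the root and /home fallbacks described in the table are deliberately left out.

```python
from typing import List

# Assumption for this sketch: shells with these basenames are non-interactive.
NON_INTERACTIVE_SHELLS = ("nologin", "false")


def parse_interactive_users(passwd_text: str, min_uid: int = 1000) -> List[str]:
    """Return login names with UID >= min_uid and an interactive shell."""
    users: List[str] = []
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # A passwd entry has exactly seven colon-separated fields.
        if line.startswith("#") or len(fields) != 7:
            continue
        name, _password, uid, _gid, _gecos, _home, shell = fields
        if not uid.isdigit() or int(uid) < min_uid:
            continue
        if shell.rsplit("/", 1)[-1] in NON_INTERACTIVE_SHELLS:
            continue
        users.append(name)
    return users
```

The shell check matters because accounts such as nobody typically have a high UID but a nologin shell, so a UID threshold alone would include them.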

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[setup-scheduled-scan.sh] --> B{OS?}
    B -->|macOS| C[install_macos - launchd daily 09:00]
    B -->|Linux| D[install_linux]
    D --> E{systemd user available?}
    E -->|Yes| F[install_linux_systemd - Persistent timer]
    E -->|No| G[install_linux_crontab - cron 0 9 daily]
    F -->|Fails| G
    C --> H[store_credentials_macos - macOS Keychain]
    D --> I[store_credentials_linux - ~/.unbound/scheduled-creds.json mode 0600]
    C & D --> J[create_wrapper_script - NPM_BIN guarded with OR true]
    J --> K{COMMAND?}
    K -->|discover| L[Download install.sh run with UNBOUND_API_KEY env var]
    K -->|onboard| M[Run unbound onboard with UNBOUND_API_KEY and UNBOUND_DISCOVERY_KEY]

    subgraph Linux Discovery
        N[main - enumerate_users] --> O[get_all_users_linux - /etc/passwd UID ge 1000]
        O --> P[resolve user_home - root to /root others to /home/user]
        P --> Q[detect_tool_for_user per user]
        Q --> R[47 Linux extractor factories]
        R --> S[get_linux_user_homes - always includes /root]
    end

Comments Outside Diff (1)

  1. scripts/coding_discovery_tools/ai_tools_discovery.py, line 1763-1770 (link)

    P1 Linux branches in main() are dead code

    The sys.exit(3) guard at the top of main() exits for every non-Darwin/Windows platform before execution ever reaches the elif platform.system() == "Linux" branches added later in this same function (user enumeration via get_all_users_linux(), root home-path resolution, etc.). On a Linux machine, unbound discover will always terminate at the guard and never run any of the Linux-specific extraction logic. Either the guard should be narrowed to exclude Linux (e.g., if current_platform not in ("Darwin", "Windows", "Linux"):) so that discover actually works on Linux, or the unreachable Linux branches in main() should be removed to match the declared intent that only macOS/Windows are supported for discovery.
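
    The first option suggested above, narrowing the guard so Linux passes, could be sketched like this. The names `SUPPORTED_PLATFORMS` and `platform_exit_code` are hypothetical; only the exit code 3 and the platform strings come from the comment.

```python
import platform
import sys

# Linux added alongside the two previously supported platforms.
SUPPORTED_PLATFORMS = ("Darwin", "Windows", "Linux")


def platform_exit_code(current_platform: str) -> int:
    """Return 0 when discovery may proceed, 3 for unsupported platforms."""
    return 0 if current_platform in SUPPORTED_PLATFORMS else 3


def main() -> None:
    code = platform_exit_code(platform.system())
    if code:
        sys.exit(code)
    # Platform-specific branches (including Linux) are reachable from here.
```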

Reviews (28): Last reviewed commit: "fix: scheduler setup no longer aborts on..." | Re-trigger Greptile

@AakashVelusamy
Author

@copilot resolve the merge conflicts in this pull request

Comment thread scripts/coding_discovery_tools/linux/jetbrains/jetbrains.py
Comment thread scripts/coding_discovery_tools/linux/cursor/mcp_config_extractor.py Outdated
Comment thread scripts/coding_discovery_tools/linux/cursor/mcp_config_extractor.py
Comment thread scripts/coding_discovery_tools/linux/codex/mcp_config_extractor.py Outdated
Comment thread scripts/coding_discovery_tools/linux/github_copilot/detect_copilot.py Outdated
@websentry-ai websentry-ai deleted a comment from Copilot AI May 16, 2026
Comment thread scripts/coding_discovery_tools/linux_extraction_helpers.py
Comment thread scripts/coding_discovery_tools/linux_extraction_helpers.py
@AakashVelusamy
Author

@greptile review my pr

Comment thread scripts/coding_discovery_tools/utils.py
@AakashVelusamy
Author

All Greptile P1 issues addressed

Resolved all 4 remaining issues from the latest Greptile review (commit 7f18bac):

P1: /root omitted from user homes on multi-user systems (id=3252993206)

  • get_linux_user_homes() now always appends /root when running as root, even when /home contains other user directories. Previously /root was silently excluded unless /home was empty.

P1: get_all_users_linux() returns empty list on Docker/CI (id=3253004885)

  • Added fallback to [Path.home().name] when /home is absent (early-return path) and when /home exists but is empty — both cases occur in root-only Docker/CI containers.

P1: scan_user_directories no fallback when /home absent (id=3252993219)

  • Already fixed in commit d449225: function checks dirs_checked == 0 after iterating (or skipping) /home and falls back to check_func(Path.home()).

P1: /root in _LINUX_SKIP_SYSTEM_DIRS (id=3252975152)

  • Already fixed in commit 19df603: /root removed from the skip-set with an explanatory comment.

All 504 tests pass.

Comment thread scripts/coding_discovery_tools/linux_extraction_helpers.py
Comment thread scripts/coding_discovery_tools/linux_extraction_helpers.py Outdated
Comment thread scripts/coding_discovery_tools/ai_tools_discovery.py Outdated
@AakashVelusamy
Author

@greptile review the pr

Comment thread scripts/coding_discovery_tools/linux/jetbrains/jetbrains.py Outdated
@AakashVelusamy
Author

@greptile review the pr

Comment thread scripts/coding_discovery_tools/utils.py Outdated
@AakashVelusamy AakashVelusamy changed the base branch from main to staging May 23, 2026 20:10
Comment thread setup-scheduled-scan.sh Outdated
Adds Linux implementations for every AI coding tool detector and
extractor, bringing Linux to full parity with macOS and Windows
(detectors, rules, MCP config, settings, skills), plus cross-platform
--set-cron support for user-level onboard/discover.

Highlights:
- linux/ package mirroring macos/windows for all 17 tools, wired through
  coding_tool_factory and ai_tools_discovery (multi-user /home + /root,
  Docker/CI root-only containers, /etc/machine-id device id).
- linux_extraction_helpers with Linux-aware system-path skipping and a
  walk_for_tool_directories that does not blocklist /home.
- Claude Cowork (~/.config/Claude) and Junie (~/.junie) Linux parity.
- Plugin provenance (plugin_lookup) threaded through Linux Claude/Cursor
  skills and Claude MCP extractors.
- setup-scheduled-scan cron support hardened across Linux/macOS/Windows.
- tests/test_cowork_skills_extraction_linux.py.

Rebased cleanly onto staging (which already contains plugin provenance
#129); main-only plan-detection commits are intentionally excluded — they
reach staging via the normal main->staging path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@AakashVelusamy AakashVelusamy force-pushed the feat/linux-full-support branch from e575843 to 12d56dc Compare May 23, 2026 20:42
@AakashVelusamy
Author

@greptile review the pr thoroughly

@AakashVelusamy AakashVelusamy force-pushed the feat/linux-full-support branch from 055ae82 to 12d56dc Compare May 23, 2026 21:58
@AakashVelusamy
Author

@greptile review

This PR was rebased onto staging as a single clean commit. Changes since the last review:

  • Rebased onto staging, which already contains plugin provenance (#129, "[WEB-4358] Add plugin provenance detection for Claude Code and Cursor"); dropped the merge commits and the main-only plan-detection commits so the diff is clean against the new base.
  • Added Linux parity for Claude Cowork (~/.config/Claude) and Junie (~/.junie).
  • Threaded plugin provenance (plugin_lookup) through the Linux Claude/Cursor skills and Claude MCP extractors.
  • Added tests/test_cowork_skills_extraction_linux.py.
  • Windows Junie was split out into a separate PR (#137, "feat: Junie support for Windows"), so it's no longer here.

All 527 tests pass; validated end-to-end on a real Ubuntu VM (multi-user + root).

~/.junie/mcp/mcp.json needs 3 parent levels to resolve its reported path
to ~ (home), matching every other global MCP config. Same fix applied to
Windows Junie in PR #137.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
create_wrapper_script ran NPM_BIN=$(npm config get prefix 2>/dev/null)/bin
with no failure guard, so under 'set -euo pipefail' a host without npm
aborted the entire scheduler setup before the wrapper or cron entry was
written. Guard the lookup with '|| true' and only append the bin dir when
a prefix resolves. NPM_BIN now carries its own trailing colon when set and
is empty otherwise, so an npm-absent host no longer leaves a leading ':'
in the wrapper PATH (which would have put the CWD on PATH).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@AakashVelusamy
Author

Addressed (commit a712ef1)

P1 — NPM_BIN assignment aborts scheduler setup on npm-absent hosts

create_wrapper_script ran NPM_BIN=$(npm config get prefix 2>/dev/null)/bin with no failure guard, so under set -euo pipefail a Linux host without npm aborted the entire setup before the wrapper or cron entry was written.

  • Guarded the lookup with || true.
  • Only append /bin when a prefix actually resolves; NPM_BIN now carries its own trailing colon when set and is empty otherwise — so an npm-absent host no longer leaves a leading : in the wrapper PATH (which would have put the CWD on PATH).

Verified both paths under set -euo pipefail: npm-present prepends the bin dir, npm-absent produces a clean PATH and does not abort.
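
The PATH handling lives in a shell script, but the string logic is easy to mirror in Python for illustration. The functions below are hypothetical and only model how the value is assembled; they are not a port of create_wrapper_script.

```python
def npm_bin_prefix(npm_prefix: str) -> str:
    """Return '<prefix>/bin:' when a prefix resolved, else an empty string.

    The trailing colon travels with the value, so concatenating an empty
    result never introduces a stray ':' separator.
    """
    prefix = npm_prefix.strip()
    return f"{prefix}/bin:" if prefix else ""


def wrapper_path(npm_prefix: str, base_path: str) -> str:
    """Build the wrapper PATH from an optional npm prefix and a base PATH."""
    return f"{npm_bin_prefix(npm_prefix)}{base_path}"
```

By contrast, joining an empty value with a fixed separator, as in `"" + ":" + base_path`, yields a leading `:`. An empty PATH entry is treated as the current directory, which is the risk the fix avoids.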

@AakashVelusamy
Author

@greptile review

Addressed the remaining P1 from the last review (commit a712ef1): the NPM_BIN assignment in create_wrapper_script (setup-scheduled-scan.sh) is now guarded with || true so an npm-absent Linux host no longer aborts scheduler setup under set -euo pipefail. It also no longer leaves a leading : in the wrapper PATH when npm is absent (avoids putting CWD on PATH).

Also fixed an off-by-one in the Linux Junie MCP parent_levels (2 → 3) so the config path keys to ~ like every other tool.
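
The off-by-one is easiest to see by counting parent directories. The helper below is a hypothetical illustration (`config_key` is not a name from this PR), using an example home of /home/alice.

```python
from pathlib import PurePosixPath


def config_key(config_path: PurePosixPath, parent_levels: int) -> PurePosixPath:
    """Walk `parent_levels` directories up from the config file.

    parent_levels must be at least 1; parents[0] is the containing directory.
    """
    return config_path.parents[parent_levels - 1]


junie_config = PurePosixPath("/home/alice/.junie/mcp/mcp.json")
print(config_key(junie_config, 2))  # /home/alice/.junie
print(config_key(junie_config, 3))  # /home/alice
```

Two levels stop at the .junie directory, while three reach the home directory itself.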

All review threads are resolved; 527 tests pass.

@AakashVelusamy
Author

@copilot check whether the cron runs once a day or for every login in macos.


Copilot AI commented May 25, 2026

@copilot check whether the cron runs once a day or for every login in macos.

On macOS it’s configured as a LaunchAgent (not cron) with both:

  • StartCalendarInterval at 09:00 daily
  • RunAtLoad=true

So it runs once per day at 09:00 and also triggers when the agent loads (install/login).
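
To make the two keys concrete, here is a sketch that builds an equivalent LaunchAgent property list with Python's standard plistlib. The label and program arguments in the usage lines are placeholders, not the values used by setup-scheduled-scan.sh.

```python
import plistlib
from typing import List


def build_launch_agent(
    label: str, program_args: List[str], hour: int = 9, minute: int = 0
) -> bytes:
    """Serialize a LaunchAgent plist with a daily schedule and RunAtLoad."""
    plist = {
        "Label": label,
        "ProgramArguments": program_args,
        # Fires once per day at the given local time.
        "StartCalendarInterval": {"Hour": hour, "Minute": minute},
        # Also fires whenever launchd loads the agent (install or login).
        "RunAtLoad": True,
    }
    return plistlib.dumps(plist)


# Placeholder label and script path, for illustration only.
xml_bytes = build_launch_agent("com.example.scan", ["/bin/sh", "/tmp/scan.sh"])
print(xml_bytes.decode("utf-8"))
```

If a run at every login is not wanted, omitting RunAtLoad or setting it to false would leave only the 09:00 schedule.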


2 participants