feat(discovery): early "detected tools" pre-report for snappy live UI by rajaramsrinivas · Pull Request #130 · websentry-ai/coding-discovery-tool

rajaramsrinivas · 2026-05-20T20:44:20Z

Summary

After the client's detection phase (every detected tool already known), send one lightweight tools-only report per user — listing every detected tool with empty projects — before the slow per-tool extraction (rules / MCP / skills / projects) begins.

Why

The existing flow serially fully-extracts each tool then sends one full report per tool — typically ~80 s per tool / ~5 min total on a real Mac. The onboarding panel currently shows one tile every ~minute as each report lands. After this change, the dashboard shows every detected agent tile within seconds of scan start, then enriches each tile with its full data as per-tool reports trickle in.

Implementation

Reuses the existing send_report_to_backend path (curl-only, per CLAUDE.md Zscaler-cert constraint — no urllib, system cert store).
Per-tool preview entries = the detected tool dicts with internal keys (_*) stripped and projects = [].
Best-effort, wrapped in try / except per user — a preview-send failure must never break the rest of discovery.
Zero server-side changes needed. The existing per-(device, tool, home_user) retire-then-recreate in process_tool_report supersedes the preview entries when the full per-tool reports later land, so this changes when a tile first appears, not the final data.

Test plan

unbound-cli discover --api-key <key> --domain <host> against any tenant.
Within seconds of scan start, observe every detected agent tile in the onboarding "+ Agents" panel.
As each per-tool extraction completes, tiles enrich (rules / MCP / project counts populate). Final state is unchanged from prior.

Cross-references

Companion server-side PR in ai-gateway-data lands the live per-run aggregates that surface the preview installs interactively (no cross-repo runtime dependency — the server already accepts a tools-only payload).
Separate customer-impacting finding discovered while testing this end-to-end: get_claude_subscription_type invokes claude auth status, which can silently downgrade Pro/Max → API on the user's Mac. Team has been notified; reportedly already resolved on main. Write-up: /home/raj/Projects/CODING-DISCOVERY-CLAUDE-SUBSCRIPTION-P0.md and Notion: https://www.notion.so/3667e55b6e678142968eef7b7b9224fd. Out of scope for this PR; flagged for visibility.

🤖 Generated with Claude Code

Greptile Summary

This PR inserts an early "detected-tools preview" report for each user immediately after the detection phase completes, before the slow per-tool extraction loop. It also fixes the previously flagged phantom-tile bug by tracking per-user tool-key sets (tool_keys_by_user) so each user only receives stubs for tools their own detection pass found.

Preview report dispatch (lines 1799–1858): iterates all_users, builds a stripped preview stub for each per-user tool set (internal _* keys and projects omitted), and calls send_report_to_backend best-effort inside a try/except.
Per-user scoping fix (lines 1750–1778): introduces tool_keys_by_user populated during the detection loop to replace the prior device-wide union approach.

Confidence Score: 3/5

The preview path dispatches reports that the server must handle as previews using logic that only exists in companion PR ai-gateway-data#1945; without that change deployed first, re-scanning users will see existing project data temporarily cleared.

The client-side per-user scoping is correct, but the preview payload omits the projects key and relies on a server-side short-circuit that requires ai-gateway-data#1945 to be live. The PR description says zero server-side changes are needed, yet the inline comment directly contradicts this for returning users.

scripts/coding_discovery_tools/ai_tools_discovery.py — specifically the preview dispatch block and its dependency on the ai-gateway-data companion change.

Important Files Changed

Filename	Overview
scripts/coding_discovery_tools/ai_tools_discovery.py	Adds early per-user detected-tools preview report before the slow per-tool extraction loop; per-user scoping is correctly implemented using tool_keys_by_user, but the preview relies on an undeclared server-side companion change (ai-gateway-data#1945) that must be deployed first to avoid corrupting returning users' project data during the preview window.

_{Reviews (4): Last reviewed commit: "fix(discovery): omit projects key from p..." | Re-trigger Greptile}

Greptile also left 1 inline comment on this PR.

After the client's detection phase (every detected tool already known), send one lightweight tools-only report per user listing every detected tool with empty projects — BEFORE the slow per-tool extraction (rules / MCP / skills / projects) begins. Why --- With the existing flow the client serially fully-extracts each tool then sends one full report per tool, taking ~80s per tool / ~5 min total on a typical Mac. The onboarding panel shows one tile every ~minute as each report lands. After this change the dashboard shows every detected agent tile within seconds of scan start, then enriches each with its full data as the per-tool reports trickle in. Implementation -------------- - Reuses the existing send_report_to_backend path (curl-only, per repo CLAUDE.md — no urllib, system cert store). - Per-tool entries are the detected tool dicts minus internal keys minus projects (projects=[]). - Best-effort, wrapped in try/except per user — a preview-send failure must never break the rest of discovery. - Zero server-side changes needed. The existing per-(device, tool, home_user) retire-then-recreate in process_tool_report supersedes the preview entries when the full per-tool reports later land, so this changes *when* a tile first appears, not the final data. Test plan --------- - Run unbound-cli discover --api-key <key> --domain <host> against any tenant. - Within seconds of scan start, observe every detected agent tile in the onboarding "+ Agents" panel. - As each per-tool extraction completes, tiles enrich (rules / MCP / project counts populate). Final state is unchanged from prior. Notes ----- - Discovered while testing agent-scanning-always-present end-to-end locally. Companion server-side PR in ai-gateway-data lands the live per-run aggregates that surface the preview installs interactively. - A separate, customer-impacting issue was found during this work: get_claude_subscription_type invokes `claude auth status` which can silently downgrade Pro/Max -> API on the user's Mac. Team has been notified; reportedly already resolved on main. Write-up: /home/raj/Projects/CODING-DISCOVERY-CLAUDE-SUBSCRIPTION-P0.md (also on Notion). Out of scope for this PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

greptile-apps · 2026-05-20T20:51:37Z

+        if tools:
+            logger.info("Sending early detected-tools report (for live UI)...")
+            preview_tools = []
+            for t in tools:
+                pt = {k: v for k, v in t.items()
+                      if not k.startswith('_') and k != 'projects'}
+                pt['projects'] = []
+                preview_tools.append(pt)
+            with time_step("send_detected_preview", "send"):
+                for user_name in all_users:
+                    preview_report = {
+                        "home_user": user_name,
+                        "system_user": system_user,
+                        "device_id": device_id,
+                        "run_id": run_id,
+                        "tools": preview_tools,
+                    }
+                    try:
+                        ok, _ = send_report_to_backend(
+                            args.domain, args.api_key, preview_report,
+                            args.app_name, sentry_context=sentry_ctx,
+                        )
+                        logger.info(
+                            f"  {'OK' if ok else 'XX'} detected-tools preview "
+                            f"for {user_name}"
+                        )
+                    except Exception as e:
+                        logger.warning(
+                            f"  detected-tools preview failed for "
+                            f"{user_name}: {e}"
+                        )
+            logger.info("")


Preview wipes existing data for re-scanning users

The preview fires the backend's retire-then-recreate for every (device, tool, home_user) keyed record with projects=[]. On a first-time scan this is fine — tiles go from absent → empty → enriched. But on any subsequent scan, all existing per-user install records (which may hold rules, MCP servers, project counts from the previous run) are retired and replaced with empty-project stubs immediately, then each tile sits in that degraded state for up to ~80 s per tool while the main extraction loop re-populates them. The PR description frames this as an onboarding feature, but there is no guard (e.g. checking whether this device_id has ever reported before) preventing it from running on every invocation and briefly erasing existing user data each time.

greptile-apps · 2026-05-20T20:51:38Z

+            preview_tools = []
+            for t in tools:
+                pt = {k: v for k, v in t.items()
+                      if not k.startswith('_') and k != 'projects'}
+                pt['projects'] = []
+                preview_tools.append(pt)
+            with time_step("send_detected_preview", "send"):
+                for user_name in all_users:
+                    preview_report = {
+                        "home_user": user_name,
+                        "system_user": system_user,
+                        "device_id": device_id,
+                        "run_id": run_id,
+                        "tools": preview_tools,


preview_tools sends all device-detected tools to every user without per-user scoping

preview_tools is built from all_tools, which is the deduplicated union of every tool found across all users on the device. Every user_name in all_users then receives the full list. On a shared Mac where User A has Claude Code at /Users/alice/.nvm/bin/claude and User B has only Cursor, User B's preview report will include a Claude Code entry whose install_path points into Alice's home directory. The backend would then create an install record for (device, claude_code, bob) with Alice's path — before the main loop has a chance to correct it. The tools_by_user dict comment says "Track which tools belong to which user" but it only maps tool_key → tool and that mapping is never consulted here.

sumit-badsara

Review

Right idea (snappy live UI) — making tiles appear within seconds of scan start is a real UX win. Two correctness concerns in the implementation are worth fixing before merge:

P1 — Phantom cross-user tool associations

scripts/coding_discovery_tools/ai_tools_discovery.py:1792-1823. preview_tools is the device-wide union of all detected tools, and the loop broadcasts it to every user in all_users. So if user A has Cursor and user B has Claude, user B's preview report says they have both Cursor and Claude — with user A's install_path embedded in the Cursor entry.

The later full per-tool reports only retire (device, tool, home_user) rows they actually carry, so the phantom row for user B's Cursor persists. Cleanup relies on reconcile_device_scan_tools at scan completion — but that has a device_id_trust == strong gate, and no client currently emits device_id_source, so trust is never strong on current data → phantoms persist indefinitely.

Fix: filter preview_tools per user the same way filter_tool_projects_by_user does later in the same loop — only preview tools detected in that user's scope.

P1 — Returning-user data hollowing

Server-side process_tool_report retires the old is_current row and creates a fresh one with projects=[] for every tool the preview carries. For a returning user re-running discovery, every tile drops to "0 projects, 0 MCP servers, 0 rules" for the ~80s/tool window until the full per-tool report lands. For a 5-tool scan that's a multi-minute regression on data that was previously displayed correctly.

A few possible fixes (in rough order of effort):

Client-side: only emit the preview on a first-ever scan for the device (track via local marker), or
Server-side: treat a tools-only payload (empty projects) as a no-op when an is_current row already exists, or as a detected_at bump + is_current=True restore (similar to the dedup-skip path in ai-gateway-data#1945), or
Server-side: have the preview path skip the retire-and-recreate entirely.

Option 2 feels cleanest because it keeps the FE/CLI contract uniform — worth coordinating with the companion server PR.

P3 — Outer `time_step` reports success even if every per-user send failed

The per-user try/except swallows individual failures (good — best-effort), but the wrapping time_step("send_detected_preview", "send") has no awareness of whether any send actually succeeded. Consider tracking a success count and surfacing it (or at least logging "0/N previews delivered") so a fully-broken preview path is observable.

Companion to ai-gateway-data#1945 (server hardening + run grouping) and unbound-fe#1498 (FE rewrite). The phantom-user issue is the highest-risk piece across the trio — recommend addressing before merging.

Addresses #130 review (Sumit + Greptile P1 "phantom tiles for the wrong user"). The early detected-tools preview (sent before per-tool extraction so the dashboard shows tiles within seconds) was building a single device-wide `preview_tools` list — the union of every user's tools — and posting that same list under each `home_user` in turn. On a multi-user box this created phantom tiles: Alice would get a Cursor tile because Bob has Cursor, and vice-versa. The full per-tool reports sent later are correctly per-user, but until they landed (or for tools the wrong user genuinely has nothing of) the phantom tile stuck around. - Track per-user tool-key sets during the detection loop (`tool_keys_by_user[user] = {tool_name:install_path, ...}`), in parallel to the existing device-wide `tools_by_user` dedup map. - When sending the preview, look up each user's own set and post only those stubs. Users with no detected tools skip the preview send entirely — no empty payload, no empty tile. Pairs with the server-side preview-only short-circuit on `ai-gateway-data#1945` (preview no longer hollows a returning user's existing rich row), but is independently sufficient: the wrong user never receives a preview for a tool they don't own in the first place. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rajaramsrinivas · 2026-05-20T22:15:58Z

Review feedback addressed — pushed 0069f6d on feat/early-detected-tools-preview. @greptile-apps please re-review.

P1 (Sumit + Greptile) — phantom tiles for the wrong user.

The early detected-tools preview was building a single device-wide preview_tools list (the union of every user's tools) and posting that same list under each home_user in turn. On a multi-user box this created phantom tiles: Alice would get a Cursor tile because Bob has Cursor, and vice-versa.

The fix tracks per-user attribution during detection and scopes the preview send to it:

Added tool_keys_by_user[user] = {tool_name:install_path, ...} alongside the existing device-wide tools_by_user dedup map. Populated in the same detect-loop pass — no extra detection work.
When sending the preview, look up each user's own set and post only those stubs. Users with no detected tools skip the preview send entirely (no empty payload, no empty tile).

This pairs with the server-side preview-only short-circuit landing on ai-gateway-data#1945 (commit 4) for the returning-user data-hollowing case, but the two fixes are independently sufficient: the wrong user no longer receives a preview for a tool they don't own in the first place.

No new HTTP code, no new dependencies — still curl-only (per CLAUDE.md Zscaler constraint).

Companion to ai-gateway-data#1945 Greptile P1. The server's preview-only short-circuit uses key-absence as the disambiguator: a payload missing the `projects` key is a preview, a payload with `projects: <list>` (even empty) is a full per-tool report. Setting `pt['projects'] = []` here defeated that disambiguation — making a preview indistinguishable on the wire from a real full-report-with-zero-projects, which would either misroute a real report into the short-circuit (stale projects forever) or, if the server tightened the check, defeat Item 3 entirely. Drop the explicit empty list. The preview payload now contains only the detector's identity fields (name, install_path, version, etc.). The server short-circuits on key-absence, the full per-tool report that lands later always supplies `projects`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rajaramsrinivas · 2026-05-20T22:59:26Z

Companion to ai-gateway-data#1945 Greptile P1 on the preview-only short-circuit — pushed b5ad81a.

The server's preview-only short-circuit (returning-user hollowing fix) needs an unambiguous "this is a preview" signal. projects: [] on the wire is ambiguous — a returning user with zero projects for a tool also sends projects: [] in a full per-tool report, and the server can't tell the two apart.

Drop pt['projects'] = [] from the preview generation; the server now uses key-absence as the disambiguator ('projects' not in t). Preview payload contains only the detector's identity fields.

Wire-contract after both PRs:

preview → no projects key
full per-tool report → projects key always present (list, possibly empty)

The two PRs must land together — either alone ships the bug. Also note: #1945 and #130 still need Greptile re-review on the latest commits (82351e92 and b5ad81a).

greptile-apps · 2026-05-20T23:02:11Z

Want your agent to iterate on Greptile's feedback? Try greploops.

greptile-apps · 2026-05-20T23:02:15Z

+            # IMPORTANT: the preview must OMIT the `projects` key entirely (not
+            # set it to []). The server uses key-absence as the only
+            # unambiguous "this is a preview" signal — a full per-tool report
+            # from a user with zero projects legitimately sends `projects: []`,
+            # so an explicit empty list here would be indistinguishable from a
+            # real zero-projects report and the server's short-circuit would
+            # misroute it, leaving stale projects on returning users.
+            # Companion change: ai-gateway-data#1945
+            # (webapp/services/ai_tools_service.py preview-only short-circuit).


Undeclared server-side deployment dependency

The code comment explicitly states the server needs ai-gateway-data#1945 to distinguish a preview payload (absent projects key) from a genuine zero-project report. Without that companion change the server has no "preview-only short-circuit," so it processes the incoming payload as a regular tool report — which on a returning user's device would either overwrite or drop their existing project data for ~80 s per tool while the full extraction runs. The PR description claims "zero server-side changes needed," but the inline comment contradicts that for the re-scan case. There is no guard, feature-flag, or version check here that ensures ai-gateway-data#1945 is deployed before this code path is exercised.

greptile-apps Bot reviewed May 20, 2026

View reviewed changes

sumit-badsara reviewed May 20, 2026

View reviewed changes

rajaramsrinivas changed the base branch from main to staging May 20, 2026 21:43

greptile-apps Bot reviewed May 20, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(discovery): early "detected tools" pre-report for snappy live UI#130

feat(discovery): early "detected tools" pre-report for snappy live UI#130
rajaramsrinivas wants to merge 3 commits into
stagingfrom
feat/early-detected-tools-preview

rajaramsrinivas commented May 20, 2026 •

edited by greptile-apps Bot

Loading

Uh oh!

greptile-apps Bot May 20, 2026

Uh oh!

greptile-apps Bot May 20, 2026

Uh oh!

sumit-badsara left a comment

Uh oh!

rajaramsrinivas commented May 20, 2026

Uh oh!

rajaramsrinivas commented May 20, 2026

Uh oh!

greptile-apps Bot commented May 20, 2026

Uh oh!

greptile-apps Bot May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

rajaramsrinivas commented May 20, 2026 • edited by greptile-apps Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why

Implementation

Test plan

Cross-references

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Uh oh!

greptile-apps Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

sumit-badsara left a comment

Choose a reason for hiding this comment

Review

P1 — Phantom cross-user tool associations

P1 — Returning-user data hollowing

P3 — Outer time_step reports success even if every per-user send failed

Uh oh!

rajaramsrinivas commented May 20, 2026

Uh oh!

rajaramsrinivas commented May 20, 2026

Uh oh!

greptile-apps Bot commented May 20, 2026

Uh oh!

greptile-apps Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

rajaramsrinivas commented May 20, 2026 •

edited by greptile-apps Bot

Loading

P3 — Outer `time_step` reports success even if every per-user send failed