WEB-4679: declare a scan manifest so the backend can prune uninstalled tools#187
Conversation
…d tools The discovery agent never told the backend which tools were still installed, so an uninstalled tool's row persisted on the dashboard forever. The completed scan event now carries a manifest of the (home_user, tool_name) pairs successfully scanned this run plus covered_home_users, letting the backend reconcile by set-difference and soft-delete what's gone. - ai_tools_discovery.py: accumulate scanned_manifest on the send-success and dedup hash-match paths only (a tool that errored on read is left out so it is never mistaken for uninstalled); pass manifest + covered_home_users (= all enumerated OS users, so a user who removed their last tool is still in scope) to the completed send_scan_event. - utils.py: send_scan_event gains optional manifest / covered_home_users, inserted into the payload only when present (backward compatible). Stdlib-only; the accumulation is pure in-memory and cannot raise. Pairs with the ai-gateway-data WEB-4679 reconcile change (forward/backward compatible: an old backend simply ignores the new fields). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…covery-never-prunes-uninstalled-tools-removed-tools
Addresses Greptile P2: accumulate the manifest as a set of tuples (a pair can never be double-recorded) and serialize to a sorted list of dicts at the send site. Functionally equivalent today (the success and hash-match branches are mutually exclusive, one entry per (tool, user)), but removes the latent duplicate risk and makes the output deterministic. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tighten comments Greptile flag: a tool whose READ succeeded but whose upload failed transiently was dropped from the manifest, so the backend could mistake a network blip for an uninstall and prune a live tool (enforce mode). Record it in the send-failure branch too — the manifest tracks what was SEEN, not what uploaded. Adds a regression test (read-success + upload-failure stays in the manifest). Also condense the WEB-4679 comments to concise one-liners. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ve tool A tool whose read errors without a scan_event=failed (the generic per-user except, the per-tool except, or a PermissionError whose failed-event send itself fails) leaves the manifest possibly missing an installed tool — the backend would then set-diff it as uninstalled and wrongly prune it. Track scanned_manifest_complete and send manifest=None (backend treats as legacy = no prune) when the run wasn't fully read, deferring cleanup to a clean run. Adds a test forcing a generic (non-Permission) read error -> manifest=None. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… success A read/extraction error no longer drops a live tool from the manifest (presence is recorded before extraction) and never fail-closes the whole manifest to None — so one tool's failure can't block pruning every other tool on the device. A detector that errors is kept in the manifest (presence unknown, not an uninstall). Removes the global scanned_manifest_complete fail-close. Tests rewritten to the new contract. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
🛡️ Automated Security Review (consensus)1 finding — 0 high-confidence, 1 to triage. Reviewers: Cursor, Claude, Semgrep, Gitleaks. Findings🟡 TRIAGE: Device-wide error over-expands manifest across all users
Impact: On a device-wide processing exception, the handler adds Fix: On device-wide failure, add manifest entries only for users where the tool was actually detected this run (e.g., track per-user detection before the outer Flagged by: Cursor Notes
🤖 consensus review · reviewers: Cursor,Claude,Semgrep,Gitleaks · head |
…talled-tools-removed-tools Conflicts were purely additive — union both sides: - utils.send_scan_event: keep staging's system_user param alongside WEB-4679's manifest + covered_home_users (and their docstrings); body already adds all three. - ai_tools_discovery completed event: pass system_user + manifest + covered_home_users. Manifest-from-presence fix auto-merged cleanly. Manifest tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
🛡️ Automated Security Review (consensus)1 finding — 0 high-confidence, 1 to triage. Reviewers: Cursor, Claude, Semgrep, Gitleaks. Findings🟡 TRIAGE: Device-wide
|
…n detector error
Addresses two Greptile P1s on the scan-manifest:
- Phantom ownership: the manifest was added for every (user x globally-deduped
tool), so a user-scoped tool one user has was attributed to co-resident users
who never detected it -> backend could never prune their stale rows. Build the
manifest from per-user detection, and only emit a report for a (user, tool)
the user actually detected (gate on manifest membership). Add the missing
Copilot CLI ownership discard (only Augment had it).
- Umbrella names: a detector failure recorded detector.tool_name (e.g.
'GitHub Copilot'), not the concrete row names ('GitHub Copilot (VS Code)'),
so the set-diff would prune the real surface rows. Instead mark the scan
unclean (a 'failed' event before 'completed') so the backend skips pruning.
Also add manifest_size + scan_incomplete to discovery metrics.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using high effort and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 83a19b3. Configure here.
| args.domain, args.api_key, device_id, run_id, "completed", | ||
| args.app_name, sentry_context=sentry_ctx, system_user=system_user | ||
| args.app_name, sentry_context=sentry_ctx, system_user=system_user, | ||
| manifest=manifest, covered_home_users=all_users, |
There was a problem hiding this comment.
Incomplete scan still sends prune manifest
Medium Severity
When incomplete_reasons is non-empty, the client sends a failed event with ScanIncomplete and then always sends completed with manifest and covered_home_users. The failed send’s success is ignored, and the completed payload carries no incompleteness flag. If the failed event never reaches the backend, reconcile may run on a partial manifest and soft-delete tools that were skipped because of detector errors.
Reviewed by Cursor Bugbot for commit 83a19b3. Configure here.
🛡️ Automated Security Review (consensus)0 findings — 0 high-confidence, 0 to triage. Reviewers: Cursor, Claude, Semgrep, Gitleaks. ✅ Security consensus: no issues found. (reviewers: Cursor, Claude, Semgrep, Gitleaks) Previously acknowledged (not re-flagged)
🤖 consensus review · reviewers: Cursor,Claude,Semgrep,Gitleaks · head |
Regression guard for the discovery prune key: the JetBrains tool name must stay the bare display_name (version + license/plan kept in separate fields), so a version bump or Free<->Licensed change never orphans the install row against the manifest. Fails CI if a future change re-embeds version/plan into the name. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
🛡️ Automated Security Review (consensus)1 finding — 0 high-confidence, 1 to triage. Reviewers: Cursor, Claude, Semgrep, Gitleaks. Findings🟡 TRIAGE: Incomplete scan still ships prune manifest on
|
| send_scan_event( | ||
| args.domain, args.api_key, device_id, run_id, "failed", | ||
| args.app_name, | ||
| scan_error={ | ||
| "error_type": "ScanIncomplete", | ||
| "message": "; ".join(incomplete_reasons[:20]), | ||
| "timestamp": datetime.utcnow().isoformat() + "Z", | ||
| }, | ||
| sentry_context=sentry_ctx, system_user=system_user, | ||
| ) |
There was a problem hiding this comment.
Unchecked failure marker When a detector errors, this branch relies on the backend receiving the
failed event before the following completed event triggers reconciliation. The return value is ignored, so if this POST exhausts its retries or times out while the next completed POST succeeds, the backend can receive an incomplete manifest with no ScanIncomplete marker. That can make a transient detector failure look like an uninstall and soft-delete live tools. Please only send the prune-bearing completed payload after the unclean marker is persisted, or include an explicit skip-prune signal on the completed event itself.


Summary
The discovery agent never told the backend which tools were still installed, so an uninstalled tool's row persisted on the dashboard forever. The
completedscan event now carries a manifest of the(home_user, tool_name)pairs successfully scanned this run, pluscovered_home_users, so the backend can reconcile by set-difference and soft-delete what's gone.Changes
ai_tools_discovery.py: accumulatescanned_manifeston the send-success and dedup hash-match paths only — a tool that errored on read is deliberately left out so it's never mistaken for uninstalled. Passmanifest+covered_home_users(= all enumerated OS users, so a user who removed their last tool is still in scope) to the completedsend_scan_event.utils.py:send_scan_eventgains optionalmanifest/covered_home_users, inserted into the payload only when present (backward compatible).Stdlib-only; the accumulation is pure in-memory and cannot raise (runs on customer machines).
Cross-repo
Pairs with the ai-gateway-data WEB-4679 PR (reconcile +
removed_atsoft-delete). Forward/backward compatible — an old backend simply ignores the new fields, deploy order independent.Test plan
tests/test_scan_completed_manifest.py(8 tests): manifest carried on the completed event; success + hash-match included, errored read excluded;covered_home_usersincludes a zero-tool user; legacy call omits both keys; non-completed events carry no manifest.🤖 Generated with Claude Code
Note
Medium Risk
Changes drive backend inventory reconciliation; incorrect manifest or pruning could soft-delete live rows, though unclean scans skip pruning and presence-based manifest reduces false negatives from transient failures.
Overview
Enables the backend to soft-delete uninstalled tools by sending a per-scan manifest of
(home_user, tool_name)pairs on thecompletedlifecycle event, pluscovered_home_usersto bound which accounts were scanned.Manifest semantics (shift from upload success): entries are recorded during per-user detection, so a tool stays listed even when later read or upload fails (avoiding false “uninstalled” pruning). Reporting is gated so only users who actually detected a tool get rows—fixing phantom multi-user attribution from globally deduped
all_tools. Copilot CLI and Augment ownership skips remove manifest entries for non-owners.Safety when presence is unknown: detector exceptions feed a
failuresset; any detector error marks the run unclean (ScanIncompletefailedevent beforecompleted) so reconcile can skip pruning. Discovery metrics addmanifest_sizeandscan_incomplete.API:
send_scan_eventoptionally includesmanifest/covered_home_users(backward compatible). New tests lock manifest behavior and stable JetBrainstool_nameprune keys (no version/plan in the name).Reviewed by Cursor Bugbot for commit a62873d. Bugbot is set up for automated code reviews on this repo. Configure here.
Greptile Summary
This PR adds scan manifests so the backend can prune tools that are no longer installed. The main changes are:
(home_user, tool_name)pairs detected during a scan.manifestandcovered_home_userson completed scan events.Confidence Score: 4/5
This is close, but the incomplete-scan path should be fixed before merging.
scripts/coding_discovery_tools/ai_tools_discovery.py
Important Files Changed
Comments Outside Diff (2)
scripts/coding_discovery_tools/ai_tools_discovery.py, line 2646-2649 (link)When
send_report_to_backendreturns(False, retryable=True), the report is queued for the next run but the tool is never appended toscanned_manifest. The backend then receives acovered_home_usersentry for the user but no manifest entry for this tool, which it may interpret as "tool uninstalled" and issue a soft-delete — even though the tool is still present; the upload just failed transiently.The PR description states the exclusion criterion as "a tool that errored on read", and the comment on line 2644 echoes "Successfully read and uploaded." Upload failures are not read failures, so by the stated design intent the tool should still be manifested. The fix is to record the tool in
scanned_manifestas soon asfilter_tool_projects_by_userandgenerate_single_tool_reportsucceed (i.e., after the report is built), independent of whether the HTTP upload succeeds.scripts/coding_discovery_tools/ai_tools_discovery.py, line 2712-2736 (link)The existing
sentry_metrics_payload(sent viasend_discovery_metrics) trackstool_countanduser_count, but adds nothing for the new manifest. Without manifest-specific metrics it will be impossible to diagnose backend pruning anomalies in production. Concrete metrics to add to themetadatablock:manifest_size:len(scanned_manifest)— how many (tool, user) pairs were recorded this run.manifest_excluded_reads: tools that errored on read (the key correctness signal for the errored-read exclusion logic).Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Reviews (8): Last reviewed commit: "test(web-4679): lock JetBrains prune-key..." | Re-trigger Greptile