fix: expose search delegate dedup failures #559
Record dedup model availability, provider/model selection, and failure details in tracing spans instead of masking them as normal allow decisions. Also adds the internal Probe Server architecture and scaling research plan under docs/internal.
PR Overview: Expose Search Delegate Dedup Failures

Summary

This PR improves observability for the search delegate's LLM-based deduplication system by exposing detailed failure information in OpenTelemetry tracing spans. Previously, dedup failures (model unavailable, API errors, malformed responses) were masked behind generic "allow" actions. Now these failures are explicitly recorded with structured error codes and messages while preserving the existing allow-on-failure behavior.

Files Changed

npm/src/tools/vercel.js (+25, -7)

Key changes to checkDelegateDedup function (lines 143-212):
Key changes to dedup span attributes (lines 715-760):
npm/tests/unit/search-delegate.test.js (+66, -1)

Added a test for dedup failure tracing. It verifies that the span captures provider/model availability and error details when generateText rejects with a "503 model overloaded" error.

docs/internal/probe-server-plan.md (+658, -0)

New internal architecture document for the proposed Probe Server, a company-wide code answering system. While valuable context, this document is orthogonal to the dedup tracing changes.

Architecture & Impact Assessment

What This PR Accomplishes

Primary Goal: Expose deduplication failure details in observability traces without changing runtime behavior.

Before: Dedup failures all returned a generic allow action, with no way to distinguish normal operation from configuration issues.

After: Three distinct failure modes are recorded: dedup_model_unavailable, unexpected_response, and API error details. Traces capture provider/model configuration and availability.

Key Technical Changes
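To make the three failure modes concrete, here is a minimal sketch of the allow-on-failure pattern described above. It is not the code in npm/src/tools/vercel.js: the function names, the result shape, the `allow`/`block` reply format, the `api_error` code, and the `dedup.*` attribute keys are all assumptions made for illustration. Only `dedup_model_unavailable` and `unexpected_response` come from the PR overview.

```javascript
// Hypothetical sketch of allow-on-failure dedup with structured error details.
// `generate` stands in for an LLM call such as generateText; it is injected
// so the sketch runs without any provider SDK.
async function checkDedupSketch({ model, generate, query, previous }) {
  // Failure mode 1: no dedup model could be created.
  if (!model) {
    return allowWithError('dedup_model_unavailable', 'no dedup model available');
  }

  let text;
  try {
    ({ text } = await generate({ model, prompt: buildPrompt(query, previous) }));
  } catch (err) {
    // Failure mode 2: the provider call rejected, e.g. "503 model overloaded".
    // 'api_error' is an assumed code, not one named in the PR.
    const message = err && err.message ? err.message : String(err);
    return allowWithError('api_error', message);
  }

  // Failure mode 3: the reply is not one of the expected actions.
  const action = String(text || '').trim().toLowerCase();
  if (action !== 'allow' && action !== 'block') {
    return allowWithError('unexpected_response', `unparseable reply: ${action.slice(0, 80)}`);
  }
  return { action, failed: false };
}

// Every failure still resolves to "allow", but carries its cause.
function allowWithError(errorCode, errorMessage) {
  return { action: 'allow', failed: true, errorCode, errorMessage };
}

function buildPrompt(query, previous) {
  return `Previous searches:\n${previous.join('\n')}\nNew search: ${query}\nReply "allow" or "block".`;
}

// Flatten a result into span attributes (attribute keys are assumptions).
function toSpanAttributes(result) {
  const attrs = { 'dedup.action': result.action, 'dedup.failed': result.failed };
  if (result.failed) {
    attrs['dedup.error_code'] = result.errorCode;
    attrs['dedup.error_message'] = result.errorMessage;
  }
  return attrs;
}
```

The design point is that the caller's control flow never changes (a failure is still an allow), while the span gains enough detail to tell a healthy allow from a misconfigured or overloaded dedup model.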
Affected System Components

The changes affect the searchTool function in vercel.js, specifically the checkDelegateDedup function and the tracer.withSpan callback that records dedup decisions. The dedup model is created via createLanguageModel from provider.js, and spans are recorded by SimpleAppTracer.

Scope Discovery & Context Expansion

Immediate Impact

Directly affected: npm/src/tools/vercel.js (core search/delegate tool), npm/tests/unit/search-delegate.test.js (unit test coverage).

Indirectly affected: any observability dashboards consuming OTEL traces from search.delegate.dedup spans.

Related Configuration

Environment variables for dedup model: PROBE_SEARCH_DELEGATE_PROVIDER, PROBE_SEARCH_DELEGATE_MODEL.

Fallback chain: options to env vars to parent provider/model to FORCE_PROVIDER.

Entry Points & Callers

Primary entry point: searchTool({ searchDelegate: true, ... }) creates tool with delegate mode enabled. Tool's execute method triggers dedup on subsequent calls when previousDelegations.length > 0.

References

Code References

Modified files:
Related infrastructure:
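For reference, the fallback chain listed under Related Configuration (options, then env vars, then parent provider/model, then FORCE_PROVIDER) could be resolved roughly as follows. This is a sketch under assumptions: the function name and the option keys (`searchDelegateProvider`, `searchDelegateModel`, `provider`, `model`) are hypothetical, and treating FORCE_PROVIDER as a provider-only fallback is a guess. Only the environment variable names come from the overview.

```javascript
// Hypothetical sketch of the dedup model fallback chain. Option keys are
// assumptions; env var names are taken from the PR overview.
function resolveDedupModelConfig(options = {}, env = process.env) {
  const provider =
    options.searchDelegateProvider ||      // 1. explicit options
    env.PROBE_SEARCH_DELEGATE_PROVIDER ||  // 2. dedicated env var
    options.provider ||                    // 3. parent provider
    env.FORCE_PROVIDER ||                  // 4. global override as last resort
    null;

  const model =
    options.searchDelegateModel ||
    env.PROBE_SEARCH_DELEGATE_MODEL ||
    options.model ||
    null;

  // A null provider or model is the case a trace would surface as
  // "model unavailable" rather than as a silent allow.
  return { provider, model };
}
```

Passing `env` as a parameter keeps the function easy to exercise in a unit test without mutating `process.env`.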
Metadata
Powered by Visor from Probelabs. Last updated: 2026-05-06T06:14:42.712Z | Triggered by: pr_opened | Commit: 0ac81bb
Security Issues (1)
Architecture Issues (1)
Security Issues (1)
Quality Issues (1)
Powered by Visor from Probelabs. Last updated: 2026-05-06T06:07:15.387Z | Triggered by: pr_opened | Commit: 0ac81bb
Summary
Tests