fix: expose search delegate dedup failures #559

Merged: buger merged 1 commit into main from fix/search-dedup-tracing-docs on May 6, 2026

@buger buger commented May 6, 2026

Summary

  • expose delegate dedup model/provider availability and error details in tracing spans
  • preserve allow-on-failure behavior while distinguishing unavailable model, malformed responses, and LLM errors
  • add targeted unit coverage for dedup failure tracing
  • include internal Probe Server architecture/scaling research doc under docs/internal

Tests

  • npm test -- --runInBand npm/tests/unit/search-delegate.test.js

Record dedup model availability, provider/model selection, and failure details in tracing spans instead of masking them as normal allow decisions.

Also adds the internal Probe Server architecture and scaling research plan under docs/internal.
@buger buger merged commit 1c2f799 into main May 6, 2026
13 checks passed

probelabs Bot commented May 6, 2026

PR Overview: Expose Search Delegate Dedup Failures

Summary

This PR improves observability for the search delegate's LLM-based deduplication system by exposing detailed failure information in OpenTelemetry tracing spans. Previously, dedup failures (model unavailable, API errors, malformed responses) were masked behind generic "allow" actions. Now these failures are explicitly recorded with structured error codes and messages while preserving the existing allow-on-failure behavior.

Files Changed

npm/src/tools/vercel.js (+25, -7)

Key changes to checkDelegateDedup function (lines 143-212):

  1. Split model unavailable handling - Separated "no previous queries" from "dedup model unavailable" into distinct early returns with specific error codes
  2. Enhanced LLM response parsing - Added explicit handling for allow action and malformed responses with structured error codes
  3. Structured error capture - Changed catch block to return detailed error information including error name and message
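
The three changes above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in vercel.js: the name `checkDedupSketch`, its parameters, the injected `generate` callback, the JSON response shape, and the `reason` field are all hypothetical. Only the `dedup_model_unavailable` and `unexpected_response` codes and the allow-on-failure behavior come from this PR.

```javascript
// Hypothetical sketch of allow-on-failure dedup with structured errors.
// `generate` stands in for the LLM call and is injected so the sketch runs offline.
async function checkDedupSketch({ query, previousQueries, model, generate }) {
  // No history: nothing to deduplicate against, so this is a normal allow.
  if (!previousQueries || previousQueries.length === 0) {
    return { action: 'allow', reason: 'no_previous_queries' };
  }
  // Distinct early return: the dedup model could not be created.
  if (!model) {
    return { action: 'allow', error: 'dedup_model_unavailable' };
  }
  try {
    const text = await generate({ model, query, previousQueries });
    let parsed;
    try {
      parsed = JSON.parse(text);
    } catch {
      // The model answered, but not in the expected shape.
      return { action: 'allow', error: 'unexpected_response' };
    }
    // Assumed response shape: { "action": "allow" | "block" }.
    if (parsed && (parsed.action === 'allow' || parsed.action === 'block')) {
      return { action: parsed.action };
    }
    return { action: 'allow', error: 'unexpected_response' };
  } catch (err) {
    // The LLM call itself failed: keep the error name and message, still allow.
    return { action: 'allow', error: `${err.name}: ${err.message}` };
  }
}
```

Every failure path returns `action: 'allow'`, so a caller that only inspects `action` behaves as before, while a caller that also reads `error` can tell the failure modes apart.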

Key changes to dedup span attributes (lines 715-760):

  1. Moved provider/model resolution before model creation for better traceability
  2. Added input span attributes: dedup.provider, dedup.model, dedup.model_available
  3. Added output span attribute: dedup.error field
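
To illustrate how the input and output attributes listed above might end up on a span, here is a small self-contained sketch. The recording tracer, its `withSpan(name, fn, attributes)` signature, the `setAttributes` helper, and the `dedup.action` attribute are assumptions made for the example and may not match SimpleAppTracer. The span name `search.delegate.dedup` and the `dedup.provider`, `dedup.model`, `dedup.model_available`, and `dedup.error` attributes are the ones named in this PR.

```javascript
// Stand-in tracer that records spans in memory (hypothetical, for illustration only).
function createRecordingTracer() {
  const spans = [];
  return {
    spans,
    async withSpan(name, fn, attributes = {}) {
      const span = { name, attributes: { ...attributes } };
      spans.push(span);
      // Hand the callback a helper for adding output attributes later.
      return fn({ setAttributes: (attrs) => Object.assign(span.attributes, attrs) });
    },
  };
}

// Resolve provider/model first, so they are recorded even when no model exists.
async function tracedDedup(tracer, { provider, modelName, model, runDedup }) {
  return tracer.withSpan(
    'search.delegate.dedup',
    async (span) => {
      const result = await runDedup(model);
      span.setAttributes({
        'dedup.action': result.action, // assumed attribute name
        ...(result.error ? { 'dedup.error': result.error } : {}),
      });
      return result;
    },
    {
      'dedup.provider': provider || 'unknown',
      'dedup.model': modelName || 'unknown',
      'dedup.model_available': Boolean(model),
    }
  );
}
```

Recording the input attributes before the dedup call means a trace still shows which provider and model were selected when the call never succeeds.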

npm/tests/unit/search-delegate.test.js (+66, -1)

Added a test for dedup failure tracing. It verifies that the span captures the provider, model availability, and error details when generateText rejects with a "503 model overloaded" error.

docs/internal/probe-server-plan.md (+658, -0)

New internal architecture document for the proposed Probe Server, a company-wide code answering system. It is useful context, but unrelated to the dedup tracing changes.

Architecture & Impact Assessment

What This PR Accomplishes

Primary Goal: Expose deduplication failure details in observability traces without changing runtime behavior.

Before: All dedup failures returned a generic allow action, with no way to distinguish normal operation from configuration issues.

After: Three distinct failure modes with structured error codes - dedup_model_unavailable, unexpected_response, and API error details. Traces capture provider/model configuration and availability.

Key Technical Changes

  1. Error taxonomy - Introduced structured error field in dedup result object
  2. Provider/model observability - Added span attributes for resolved provider name, model name, and availability
  3. Graceful degradation preserved - All error paths still return action 'allow' to prevent blocking searches

Affected System Components

The changes affect the searchTool function in vercel.js, specifically the checkDelegateDedup function and the tracer.withSpan callback that records dedup decisions. The dedup model is created via createLanguageModel from provider.js, and spans are recorded by SimpleAppTracer.

Scope Discovery & Context Expansion

Immediate Impact

Directly affected: npm/src/tools/vercel.js (core search/delegate tool), npm/tests/unit/search-delegate.test.js (unit test coverage). Indirectly affected: any observability dashboards consuming OTEL traces from search.delegate.dedup spans.

Related Configuration

Environment variables for the dedup model: PROBE_SEARCH_DELEGATE_PROVIDER, PROBE_SEARCH_DELEGATE_MODEL. Fallback chain, in order: options, then env vars, then the parent provider/model, then FORCE_PROVIDER.
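
One way to read that fallback chain is sketched below. The option names (`searchDelegateProvider`, `searchDelegateModel`, `provider`, `model`) are hypothetical, and treating FORCE_PROVIDER as an environment variable is an assumption. Only the two PROBE_SEARCH_DELEGATE_* variable names and the order of the chain come from the comment above.

```javascript
// Hypothetical resolver: explicit options, then env vars, then the parent
// provider/model, then FORCE_PROVIDER (assumed here to be an env var).
function resolveDedupModelConfig(options = {}, env = process.env) {
  const provider =
    options.searchDelegateProvider ||
    env.PROBE_SEARCH_DELEGATE_PROVIDER ||
    options.provider ||
    env.FORCE_PROVIDER ||
    null;
  const model =
    options.searchDelegateModel ||
    env.PROBE_SEARCH_DELEGATE_MODEL ||
    options.model ||
    null;
  return { provider, model };
}
```

Passing `env` as a parameter keeps the function easy to exercise in a unit test without mutating `process.env`.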

Entry Points & Callers

Primary entry point: searchTool({ searchDelegate: true, ... }) creates the tool with delegate mode enabled. The tool's execute method triggers dedup on subsequent calls, when previousDelegations.length > 0.
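
The "dedup only on subsequent calls" behavior can be pictured with the sketch below. It is not the searchTool implementation: `createDelegatingSearchSketch`, the injected `dedup` and `runSearch` callbacks, and the returned object shape are hypothetical. Only the `previousDelegations.length > 0` condition is taken from the description above.

```javascript
// Hypothetical tool factory that keeps per-tool delegation history.
function createDelegatingSearchSketch({ dedup, runSearch }) {
  const previousDelegations = [];
  return {
    async execute({ query }) {
      let decision = { action: 'allow' };
      // The first call has no history, so the dedup check is skipped entirely.
      if (previousDelegations.length > 0) {
        decision = await dedup(query, previousDelegations.map((d) => d.query));
      }
      if (decision.action !== 'allow') {
        return { skipped: true, decision };
      }
      const result = await runSearch(query);
      previousDelegations.push({ query });
      return { skipped: false, result };
    },
  };
}
```

Because dedup failures resolve to `allow`, a broken dedup model in this sketch never prevents `runSearch` from being called.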

References

Code References

Modified files:

  • npm/src/tools/vercel.js:143-212 - checkDelegateDedup function with enhanced error handling
  • npm/src/tools/vercel.js:715-760 - Dedup span creation with new attributes
  • npm/tests/unit/search-delegate.test.js:6-73 - New test for dedup failure tracing

Related infrastructure:

  • npm/src/utils/provider.js - createLanguageModel function
  • npm/src/agent/simpleTelemetry.js - SimpleAppTracer.withSpan implementation
  • npm/src/tools/vercel.js:471-530 - Search tool state initialization

Metadata

  • Review Effort: 2 / 5
  • Primary Label: bug

Powered by Visor from Probelabs

Last updated: 2026-05-06T06:14:42.712Z | Triggered by: pr_opened | Commit: 0ac81bb

💡 TIP: You can chat with Visor using /visor ask <your question>


probelabs Bot commented May 6, 2026

Security Issues (1)

Severity | Location | Issue
🟠 Error | contract:0 | Output schema validation failed: must have required property 'issues'

Architecture Issues (1)

Severity | Location | Issue
🟠 Error | contract:0 | Output schema validation failed: must have required property 'issues'

Quality Issues (1)

Severity | Location | Issue
🟠 Error | contract:0 | Output schema validation failed: must have required property 'issues'

Powered by Visor from Probelabs

Last updated: 2026-05-06T06:07:15.387Z | Triggered by: pr_opened | Commit: 0ac81bb
