feat(llm): pluggable web search providers (Exa, Tavily, SearXNG)#9556
tgonzalezc5 wants to merge 1 commit into
Introduces a search-provider registry that mirrors the existing `llmProviders` pattern, letting users configure one or more third-party web search engines for the LLM agent. When any provider is configured it replaces each LLM provider's built-in web search; when empty, behaviour is unchanged (Anthropic, OpenAI and Google continue to use their native search).
- New `apps/server/src/services/search_providers/` module with a common `SearchProvider` interface, raw-fetch implementations of Exa, Tavily and SearXNG, and an AI-SDK tool wrapper.
- Single `searchProviders` option (JSON array of setups) persisted and synced, whitelisted in the options API, added to `OptionDefinitions`.
- `BaseProvider.buildTools` prefers a configured search provider and falls back to the provider-native web search if none is set.
- New `SearchProviderSettings` section in the AI / LLM options page with an add-modal that shows per-provider fields (API key for Exa and Tavily, base URL for SearXNG) and a list widget for managing configured instances.
- Unit specs for each provider covering response parsing, content-field fallbacks, error handling and the registry's caching and unknown-type behaviour.
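The fallback described in the `BaseProvider.buildTools` bullet can be sketched as follows. This is a minimal illustration rather than the PR's code: the `ToolMap` type, the simplified one-argument `search` signature and the `addNativeWebSearch` callback are assumptions made so the example is self-contained.

```typescript
// Hypothetical, simplified types; the PR's real interfaces may differ.
interface SearchResult {
    title: string;
    url: string;
    snippet: string;
}

interface SearchProvider {
    search(query: string): Promise<SearchResult[]>;
}

// Stand-in for the tool set handed to the model (assumed shape).
type ToolMap = Record<string, unknown>;

/**
 * Prefer a configured search provider; fall back to the LLM provider's
 * native web search only when nothing is configured.
 */
function buildTools(
    configured: SearchProvider | undefined,
    addNativeWebSearch: (tools: ToolMap) => void
): ToolMap {
    const tools: ToolMap = {};
    if (configured) {
        // Route web_search through the user-configured provider.
        tools["web_search"] = {
            execute: (query: string) => configured.search(query)
        };
    } else {
        // Empty configuration: behaviour stays as it was before.
        addNativeWebSearch(tools);
    }
    return tools;
}
```

Keeping the decision in one place means an empty configuration never touches the new code path, which matches the backward-compatibility claim above.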
Code Review
This pull request introduces pluggable web search providers (Exa, Tavily, and SearXNG) for the AI agent, allowing it to use third-party search engines instead of relying solely on built-in LLM search capabilities. The changes include new UI components for managing search providers, server-side registry and tool integration, and comprehensive unit tests. Feedback includes a correction for a CSS class typo and recommendations to add array validation when parsing search provider configurations to prevent potential runtime crashes.
if (!providersJson) {
    return [];
}
return JSON.parse(providersJson) as SearchProviderSetup[];
The result of `JSON.parse` should be validated to ensure it is an array before returning. If the `searchProviders` option is manually set to a non-array JSON value (like an object or `null`), subsequent calls to `.length` or `.find` on the returned value will throw an error.
const parsed = JSON.parse(providersJson);
return Array.isArray(parsed) ? parsed : [];

const [providersJson, setProvidersJson] = useTriliumOption("searchProviders");
const providers = useMemo<SearchProviderConfig[]>(() => {
    try {
        return providersJson ? JSON.parse(providersJson) : [];
Similar to the server-side registry, the parsed JSON should be validated as an array. If the option contains a non-array value, the `useMemo` hook will return a value that causes `providers.filter` or `providers.map` to crash the UI.
const parsed = providersJson ? JSON.parse(providersJson) : [];
return Array.isArray(parsed) ? parsed : [];
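
Both suggestions apply the same guard, so it could live in one shared helper. The sketch below is an illustration only: the name `parseSearchProviders` and the `id`/`type` fields of `SearchProviderSetup` are assumptions, and the `try`/`catch` for malformed JSON goes slightly beyond what the review asked for.

```typescript
// Hypothetical shape: the PR does not spell out the fields of a stored setup.
interface SearchProviderSetup {
    id: string;
    type: string;
}

/**
 * Parse the `searchProviders` option defensively.
 * Returns an empty list for empty input, malformed JSON,
 * or any JSON value that is not an array (object, null, number, ...).
 */
function parseSearchProviders(
    providersJson: string | null | undefined
): SearchProviderSetup[] {
    if (!providersJson) {
        return [];
    }
    try {
        const parsed: unknown = JSON.parse(providersJson);
        // Element shapes are not validated here, only the outer array.
        return Array.isArray(parsed) ? (parsed as SearchProviderSetup[]) : [];
    } catch {
        // Malformed JSON is treated the same as "nothing configured".
        return [];
    }
}
```

With a helper like this, a hand-edited option value such as `null` or `{}` yields an empty list, and the caller falls back to native search.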

return (
    <div style={{ overflow: "auto" }}>
        <table className="table table-stripped">
Hi, what's the difference compared to #9342?
Hello @eliandoran! I saw your feedback in the other PR asking for a provider-type abstraction, multiple instances, and a few other things. I generally did the following:
I also made sure it's backward compatible, with a default empty config. Happy to have you fold this into #9342, whichever you prefer.
@tgonzalezc5, what's your opinion of the LLM chat we have so far?
Summary
Adds a pluggable web-search provider abstraction for the LLM agent. Each provider is a user-configured instance stored in a new `searchProviders` option, mirroring the shape of the existing `llmProviders` registry so multiple instances of the same provider (for example two SearXNG URLs) can coexist and the same UI patterns are reused.
Three providers ship with the abstraction: Exa, Tavily and SearXNG. When any provider is configured the agent routes `web_search` through it; when the list is empty the existing per-provider built-in search (Anthropic, OpenAI, Google) is preserved, so this change is fully backward-compatible with current installs.
What's in it
- `apps/server/src/services/search_providers/`: new module containing:
  - `base_search_provider.ts`: `SearchProvider` interface + normalized `SearchResult` shape (`title`, `url`, `snippet`, optional `publishedDate`, `author`) and a `SearchOptions` type exposing `numResults`, domain and date filters, and category.
  - `exa.ts`, `tavily.ts`, `searxng.ts`: raw-fetch implementations. Each returns typed `SearchResult[]`. Exa requests highlights, a summary and a 500-char text extract in a single call and cascades through them so the LLM always sees a meaningful snippet even when one field is missing.
  - `index.ts`: registry with `getSearchProvider(id?)`, `getFirstSearchProvider()`, `hasConfiguredSearchProviders()`, `clearSearchProviderCache()`, and graceful handling of unknown provider types or bad config.
  - `tool.ts`: AI-SDK `tool(...)` wrapper that exposes any `SearchProvider` as the `web_search` tool.
- `base_provider.ts`: `buildTools()` now consults the search registry first and only falls through to `addWebSearchTool()` (native) when nothing is configured.
- `packages/commons/src/lib/options_interface.ts`: new `searchProviders: string` option (JSON-serialized array).
- `apps/server/src/services/options_init.ts`: default `searchProviders: "[]"` (synced).
- `apps/server/src/routes/api/options.ts`: option whitelisted for API updates.
- `apps/client/src/widgets/type_widgets/options/llm.tsx`: new `SearchProviderSettings` section with add/delete flow that mirrors the existing `ProviderSettings` widget.
- `apps/client/src/widgets/type_widgets/options/llm/AddSearchProviderModal.tsx`: add-provider modal; fields are driven by provider metadata (`requiresApiKey` for Exa/Tavily, `requiresBaseUrl` for SearXNG).
- […] `llm.*`.
- `exa.spec.ts`, `tavily.spec.ts`, `searxng.spec.ts` and `index.spec.ts` covering response parsing, content-field fallbacks, filter propagation, error paths, disabled state, registry caching and unknown-type handling.
Backward compatibility
`searchProviders` defaults to `"[]"`, so every existing install falls through to the current provider-native search with no behaviour change.
Usage
Adding a provider is done through Options → AI / LLM → Web Search Providers → Add Search Provider.
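
The snippet cascade described for the Exa implementation can be illustrated with a small normalizer. This is a sketch under stated assumptions: the raw field names (`highlights`, `summary`, `text`) and the highlights-then-summary-then-text order follow the wording of the description, not the PR's actual code, and `RawExaResult`, `pickSnippet` and `normalizeExaResult` are hypothetical names.

```typescript
// Normalized shape as listed in the PR description.
interface SearchResult {
    title: string;
    url: string;
    snippet: string;
    publishedDate?: string;
    author?: string;
}

// Assumed subset of a raw Exa result; any content field may be absent.
interface RawExaResult {
    title?: string | null;
    url: string;
    highlights?: string[];
    summary?: string;
    text?: string;
    publishedDate?: string;
    author?: string;
}

/** Try highlights first, then the summary, then a 500-char text extract. */
function pickSnippet(raw: RawExaResult): string {
    const highlight = raw.highlights?.find((h) => h.trim().length > 0);
    return (
        highlight?.trim() ||
        raw.summary?.trim() ||
        raw.text?.trim().slice(0, 500) ||
        ""
    );
}

function normalizeExaResult(raw: RawExaResult): SearchResult {
    return {
        // Fall back to the URL when the page has no usable title.
        title: raw.title || raw.url,
        url: raw.url,
        snippet: pickSnippet(raw),
        ...(raw.publishedDate ? { publishedDate: raw.publishedDate } : {}),
        ...(raw.author ? { author: raw.author } : {})
    };
}
```

Because every provider maps into the same `SearchResult` shape, the `web_search` tool wrapper does not need to know which engine produced a result.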
Files changed
- `apps/client/src/translations/en/translation.json`
- `apps/client/src/widgets/type_widgets/options/llm.tsx`
- `apps/client/src/widgets/type_widgets/options/llm/AddSearchProviderModal.tsx` (new)
- `apps/server/src/routes/api/options.ts`
- `apps/server/src/services/llm/providers/base_provider.ts`
- `apps/server/src/services/options_init.ts`
- `apps/server/src/services/search_providers/*.ts` (new module, 10 files incl. specs)
- `packages/commons/src/lib/options_interface.ts`
Test plan
- `pnpm -C apps/server exec vitest run src/services/search_providers/`: 28/28 new tests pass.
- `pnpm -C apps/server exec vitest run` (full server suite): 725 passed / 1 expected fail / 26 skipped (same as main).
- […] `search_providers/` or `AddSearchProviderModal.tsx`).
- […] `web_search` tool hits Exa and returns results.
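
The registry behaviour that the `search_providers/` specs exercise (instance caching and graceful handling of unknown provider types) might look roughly like the sketch below. It is not the PR's `index.ts`: the class name, the factory map and the `loadSetups` callback are hypothetical stand-ins chosen to keep the example self-contained.

```typescript
// Hypothetical minimal shapes for a stored setup and a built provider.
interface SearchProviderSetup {
    id: string;
    type: string;
}

interface SearchProvider {
    readonly id: string;
}

type ProviderFactory = (setup: SearchProviderSetup) => SearchProvider;

class SearchProviderRegistry {
    private readonly cache = new Map<string, SearchProvider>();
    private readonly factories: Map<string, ProviderFactory>;
    private readonly loadSetups: () => SearchProviderSetup[];

    constructor(
        factories: Map<string, ProviderFactory>,
        loadSetups: () => SearchProviderSetup[]
    ) {
        this.factories = factories;
        this.loadSetups = loadSetups;
    }

    /** Returns the provider with the given id, or the first configured one. */
    get(id?: string): SearchProvider | undefined {
        const setups = this.loadSetups();
        const setup = id ? setups.find((s) => s.id === id) : setups[0];
        if (!setup) {
            return undefined;
        }
        const cached = this.cache.get(setup.id);
        if (cached) {
            return cached;
        }
        const factory = this.factories.get(setup.type);
        if (!factory) {
            // Unknown provider type: report "not available" instead of throwing.
            return undefined;
        }
        const provider = factory(setup);
        this.cache.set(setup.id, provider);
        return provider;
    }

    /** Call after the option changes so stale instances are rebuilt. */
    clearCache(): void {
        this.cache.clear();
    }
}
```

Returning `undefined` for an unknown type lets the caller fall back to native search instead of failing the whole chat request.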