diff --git a/docs/_advanced/upgrading.md b/docs/_advanced/upgrading.md index 092f75f7f..279e51763 100644 --- a/docs/_advanced/upgrading.md +++ b/docs/_advanced/upgrading.md @@ -21,6 +21,16 @@ redirect_from: {:toc} --- +# Upgrade to 1.15 + +## How to Upgrade + +1.15 adds tool search in a fully additive way. No generator, no migration — upgrade the gem and continue using RubyLLM as before. + +## What's New in 1.15 + +- **Tool Search (Anthropic)** — `RubyLLM::Chat#with_tool` / `#with_tools` accept a new `defer:` keyword argument, and `RubyLLM::Tool` exposes a class-level `deferred` DSL. On Anthropic this translates to the native `defer_loading: true` flag plus the `tool_search_tool_bm25_20251119` primitive: deferred tools stay out of the system-prompt prefix and Claude loads the ones it actually needs server-side. On other providers `defer:` is ignored with a one-time warning. If you don't use `defer:` or `deferred`, nothing changes. See [Tool Search]({% link _core_features/tool-search.md %}). + # Upgrade to 1.14 ## How to Upgrade diff --git a/docs/_core_features/tool-search.md b/docs/_core_features/tool-search.md new file mode 100644 index 000000000..ffdd7e6cf --- /dev/null +++ b/docs/_core_features/tool-search.md @@ -0,0 +1,113 @@ +--- +layout: default +title: Tool Search +nav_order: 9 +description: Keep large tool catalogs out of Claude's prompt prefix. Mark tools as deferred and let Anthropic's server-side tool-search primitive load them on demand. +redirect_from: + - /guides/tool-search +--- + +# {{ page.title }} +{: .d-inline-block .no_toc } + +New in 1.15 +{: .label .label-green } + +{{ page.description }} +{: .fs-6 .fw-300 } + +## Table of contents +{: .no_toc .text-delta } + +1. TOC +{:toc} + +--- + +After reading this guide, you will know: + +* When deferred tool loading helps. +* How to mark tools as deferred. +* How Anthropic loads deferred tools at runtime. +* How to observe which tools the model loaded. 
+ +## When to use it + +When a `RubyLLM::Chat` is wired to many tools — especially across one or more MCP servers — every tool's full JSON Schema ships in the system-prompt prefix on every turn. Three real costs follow: + +1. **Token bloat.** Hundreds of tools can add tens of thousands of tokens per request. +2. **Prompt-cache eviction.** Adding or removing tools changes the prefix and invalidates the cache. +3. **Selection accuracy.** Models choose worse tools when the menu is long. + +RubyLLM builds on Anthropic's [tool search tool](https://platform.claude.com/docs/en/agents-and-tools/tool-use/tool-search-tool) feature: mark tools as `deferred` and RubyLLM forwards `defer_loading: true` to Anthropic's API, which hides the schemas from Claude until a server-side BM25 primitive loads the tools the conversation actually needs. + +**This feature currently supports only Anthropic.** On other providers, `defer: true` falls back to regular registration and a warning is logged once per chat. + +## Marking tools as deferred + +### Per-class DSL + +```ruby +class DeepResearchTool < RubyLLM::Tool + description "Runs a multi-step web search..." + deferred # class-level DSL + + param :query, desc: "..." + def execute(query:); ...; end +end +``` + +### Per-call, for bulk registration (MCP case) + +```ruby +chat = RubyLLM.chat(model: "claude-sonnet-4-6") +chat.with_tools(*mcp_client.tools, defer: true) +``` + +Per-call `defer: true` overrides a non-deferred class; `defer: false` overrides a `deferred` class. + +## How Claude loads deferred tools + +On Anthropic, `defer: true` translates to two things in the request payload: + +1. `defer_loading: true` on each deferred tool's function entry. +2. A `tool_search_tool_bm25_20251119` primitive appended to the tools array. + +Claude then runs the search server-side, loads the matching tools via a `tool_reference` mechanism, and calls them directly. 
RubyLLM parses the `tool_search_tool_result` blocks, records the referenced tools in `chat.tool_catalog.loaded_tools`, and adds them to the active `chat.tools` so the next turn can dispatch them normally. + +## Observing what was loaded + +```ruby +chat.on_tool_search do |event| + # event.query # nil for Anthropic-native — Claude runs the search server-side + # event.results # Array of promoted tool name Symbols + Rails.logger.info("tool_search loaded: #{event.results}") +end +``` + +Inspect state: + +```ruby +chat.tool_catalog # => #<RubyLLM::ToolCatalog deferred=N loaded=M> +chat.tool_catalog.deferred_tools # Hash of tool name => Tool for every tool registered as deferred, including promoted ones +chat.tool_catalog.loaded_tools # Set of promoted tool name symbols +``` + +## Kill switch + +```ruby +RubyLLM.configure do |c| + c.tool_search_enabled = false # default true +end +``` + +When false, `defer: true` is coerced to regular registration and a warning is logged once per chat. + +## Non-Anthropic providers + +On OpenAI, Gemini, and Bedrock, `defer: true` is ignored and a warning is logged once — the tool registers normally. A follow-up release may add client-side emulation for these providers. + +## Further reading + +* [Anthropic tool search tool](https://platform.claude.com/docs/en/agents-and-tools/tool-use/tool-search-tool) +* [Tools guide]({% link _core_features/tools.md %}) diff --git a/docs/_core_features/tools.md b/docs/_core_features/tools.md index 26cca90b0..3974a888c 100644 --- a/docs/_core_features/tools.md +++ b/docs/_core_features/tools.md @@ -509,6 +509,8 @@ end For MCP server integration, check out the community-maintained [`ruby_llm-mcp`](https://github.com/patvice/ruby_llm-mcp) gem. +When a chat is wired to many tools — especially across MCP servers — see [Tool Search]({% link _core_features/tool-search.md %}) for how to defer tool schemas and let the model load only the ones it needs. + ## Debugging Tools Set the `RUBYLLM_DEBUG` environment variable to see detailed logging, including tool calls and results. 
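Reviewer sketch (not part of the patch): the promotion flow described in the guide above can be tried in isolation. This is a minimal stand-in under stated assumptions: `MiniCatalog` and `FakeTool` are hypothetical names, not RubyLLM classes, and the logic only approximates the `ToolCatalog` introduced in `lib/ruby_llm/tool_catalog.rb` in this diff.

```ruby
require 'set'

# Hypothetical stand-in for a tool; only the name matters here.
FakeTool = Struct.new(:name)

# Hypothetical miniature of the catalog behaviour: deferred tools stay
# registered, and promotion records the name in a separate set.
class MiniCatalog
  attr_reader :deferred_tools, :loaded_tools

  def initialize
    @deferred_tools = {}
    @loaded_tools = Set.new
  end

  def add(tool)
    @deferred_tools[tool.name.to_sym] = tool
    self
  end

  # Deferred tools that have not been promoted yet.
  def available
    @deferred_tools.reject { |name, _tool| @loaded_tools.include?(name) }
  end

  # Returns the tool and marks it loaded, or nil for an unknown name.
  def promote(name)
    sym = name.to_sym
    tool = @deferred_tools[sym]
    return nil unless tool

    @loaded_tools << sym
    tool
  end
end

catalog = MiniCatalog.new
catalog.add(FakeTool.new('weather_lookup')).add(FakeTool.new('stock_price'))

active = {}
# Names as they might arrive from a tool_search_tool_result block;
# 'unknown_tool' is included to show that unknown names are skipped.
%w[weather_lookup unknown_tool].each do |name|
  tool = catalog.promote(name)
  active[tool.name.to_sym] = tool if tool
end
```

After this runs, `active` holds only `weather_lookup`, `catalog.available` still lists `stock_price` as deferred, and `catalog.deferred_tools` keeps both entries, which is why a promoted tool is tracked through `loaded_tools` rather than by deletion.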
diff --git a/lib/ruby_llm/chat.rb b/lib/ruby_llm/chat.rb index afc572342..35d8fd674 100644 --- a/lib/ruby_llm/chat.rb +++ b/lib/ruby_llm/chat.rb @@ -5,7 +5,7 @@ module RubyLLM class Chat include Enumerable - attr_reader :model, :messages, :tools, :tool_prefs, :params, :headers, :schema + attr_reader :model, :messages, :tools, :tool_prefs, :params, :headers, :schema, :tool_catalog def initialize(model: nil, provider: nil, assume_model_exists: false, context: nil) if assume_model_exists && !provider @@ -19,6 +19,7 @@ def initialize(model: nil, provider: nil, assume_model_exists: false, context: n @temperature = nil @messages = [] @tools = {} + @tool_catalog = ToolCatalog.new @tool_prefs = { choice: nil, calls: nil } @params = {} @headers = {} @@ -28,7 +29,8 @@ def initialize(model: nil, provider: nil, assume_model_exists: false, context: n new_message: nil, end_message: nil, tool_call: nil, - tool_result: nil + tool_result: nil, + tool_search: nil } end @@ -51,18 +53,18 @@ def with_instructions(instructions, append: false, replace: nil) self end - def with_tool(tool, choice: nil, calls: nil) - unless tool.nil? - tool_instance = tool.is_a?(Class) ? tool.new : tool - @tools[tool_instance.name.to_sym] = tool_instance - end + def with_tool(tool, defer: nil, choice: nil, calls: nil) + register_tool(tool, defer: defer) unless tool.nil? 
update_tool_options(choice:, calls:) self end - def with_tools(*tools, replace: false, choice: nil, calls: nil) - @tools.clear if replace - tools.compact.each { |tool| with_tool tool } + def with_tools(*tools, replace: false, defer: nil, choice: nil, calls: nil) + if replace + @tools.clear + @tool_catalog = ToolCatalog.new + end + tools.compact.each { |tool| with_tool tool, defer: defer } update_tool_options(choice:, calls:) self end @@ -132,6 +134,11 @@ def on_tool_result(&block) self end + def on_tool_search(&block) + @on[:tool_search] = block + self + end + def each(&) messages.each(&) end @@ -139,7 +146,7 @@ def each(&) def complete(&) # rubocop:disable Metrics/PerceivedComplexity response = @provider.complete( messages, - tools: @tools, + tools: effective_tools, tool_prefs: @tool_prefs, temperature: @temperature, model: @model, @@ -161,6 +168,7 @@ def complete(&) # rubocop:disable Metrics/PerceivedComplexity end add_message response + promote_from_tool_references(response) @on[:end_message]&.call(response) if response.tool_call? @@ -186,6 +194,25 @@ def instance_variables private + # Promotes deferred tools that a provider's native tool-search primitive + # loaded via +message.tool_references+. The resulting +SearchEvent+ + # carries +query: nil+ to signal the native path. + def promote_from_tool_references(message) + names = Array(message.tool_references) + return self if names.empty? || @tool_catalog.empty? + + promoted = names.filter_map do |name| + tool = @tool_catalog.promote(name) + next unless tool + + @tools[tool.name.to_sym] = tool + tool.name.to_sym + end + + @on[:tool_search]&.call(Tool::SearchEvent.new(nil, promoted)) unless promoted.empty? + self + end + def normalize_schema_payload(raw_schema) return nil if raw_schema.nil? 
return raw_schema unless raw_schema.is_a?(Hash) @@ -329,6 +356,48 @@ def content_like?(object) object.is_a?(Content) || object.is_a?(Content::Raw) end + def effective_tools + active = @tools.transform_values { |t| Tool::Registration.new(t, deferred: false) } + return active if @tool_catalog.empty? + + deferred = @tool_catalog.available.transform_values { |t| Tool::Registration.new(t, deferred: true) } + deferred.merge(active) + end + + def register_tool(tool, defer:) + tool_instance = tool.is_a?(Class) ? tool.new : tool + + if defer_allowed?(tool_instance, defer) + @tool_catalog.add(tool_instance) + else + @tools[tool_instance.name.to_sym] = tool_instance + end + end + + def defer_allowed?(tool, explicit) + return false unless explicit.nil? ? tool.deferred? : explicit == true + + unless @config.tool_search_enabled + warn_deferred_ignored('tool_search_enabled is false') + return false + end + + unless @provider.respond_to?(:supports_deferred_loading?) && @provider.supports_deferred_loading? + warn_deferred_ignored("provider #{@provider.slug} does not support deferred tool loading") + return false + end + + true + end + + def warn_deferred_ignored(reason) + @deferred_warnings ||= Set.new + return if @deferred_warnings.include?(reason) + + @deferred_warnings << reason + RubyLLM.logger.warn("Ignoring defer: true — #{reason}") + end + def append_system_instruction(instructions) system_messages, non_system_messages = @messages.partition { |msg| msg.role == :system } system_messages << Message.new(role: :system, content: instructions) diff --git a/lib/ruby_llm/configuration.rb b/lib/ruby_llm/configuration.rb index de6202686..62c5b6c85 100644 --- a/lib/ruby_llm/configuration.rb +++ b/lib/ruby_llm/configuration.rb @@ -56,6 +56,8 @@ def defaults = @defaults ||= {} option :log_stream_debug, -> { ENV['RUBYLLM_STREAM_DEBUG'] == 'true' } option :log_regexp_timeout, -> { Regexp.respond_to?(:timeout) ? 
(Regexp.timeout || 1.0) : nil } + option :tool_search_enabled, true + def initialize self.class.send(:defaults).each do |key, default| value = default.respond_to?(:call) ? instance_exec(&default) : default diff --git a/lib/ruby_llm/message.rb b/lib/ruby_llm/message.rb index eefb93e55..f1f4c1478 100644 --- a/lib/ruby_llm/message.rb +++ b/lib/ruby_llm/message.rb @@ -5,7 +5,7 @@ module RubyLLM class Message ROLES = %i[system user assistant tool].freeze - attr_reader :role, :model_id, :tool_calls, :tool_call_id, :raw, :thinking, :tokens + attr_reader :role, :model_id, :tool_calls, :tool_call_id, :raw, :thinking, :tokens, :tool_references attr_writer :content def initialize(options = {}) @@ -24,6 +24,7 @@ def initialize(options = {}) ) @raw = options[:raw] @thinking = options[:thinking] + @tool_references = Array(options[:tool_references]) ensure_valid_role end diff --git a/lib/ruby_llm/providers/anthropic.rb b/lib/ruby_llm/providers/anthropic.rb index a0686036f..3668284b5 100644 --- a/lib/ruby_llm/providers/anthropic.rb +++ b/lib/ruby_llm/providers/anthropic.rb @@ -22,6 +22,10 @@ def headers } end + def supports_deferred_loading? + true + end + class << self def capabilities Anthropic::Capabilities diff --git a/lib/ruby_llm/providers/anthropic/chat.rb b/lib/ruby_llm/providers/anthropic/chat.rb index 9926fe98b..a84bed07d 100644 --- a/lib/ruby_llm/providers/anthropic/chat.rb +++ b/lib/ruby_llm/providers/anthropic/chat.rb @@ -65,7 +65,7 @@ def build_base_payload(chat_messages, model, stream, thinking) def add_optional_fields(payload, system_content:, tools:, tool_prefs:, temperature:, schema: nil) # rubocop:disable Metrics/ParameterLists if tools.any? - payload[:tools] = tools.values.map { |t| Tools.function_for(t) } + payload[:tools] = Tools.format_tools(tools) unless tool_prefs[:choice].nil? && tool_prefs[:calls].nil? 
payload[:tool_choice] = Tools.build_tool_choice(tool_prefs) end @@ -90,8 +90,10 @@ def parse_completion_response(response) thinking_content = extract_thinking_content(content_blocks) thinking_signature = extract_thinking_signature(content_blocks) tool_use_blocks = Tools.find_tool_uses(content_blocks) + tool_references = Tools.find_tool_references(content_blocks) - build_message(data, text_content, thinking_content, thinking_signature, tool_use_blocks, response) + build_message(data, text_content, thinking_content, thinking_signature, tool_use_blocks, tool_references, + response) end def extract_text_content(blocks) @@ -111,7 +113,7 @@ def extract_thinking_signature(blocks) thinking_block&.dig('signature') || thinking_block&.dig('data') end - def build_message(data, content, thinking, thinking_signature, tool_use_blocks, response) # rubocop:disable Metrics/ParameterLists + def build_message(data, content, thinking, thinking_signature, tool_use_blocks, tool_references, response) # rubocop:disable Metrics/ParameterLists usage = data['usage'] || {} cached_tokens = usage['cache_read_input_tokens'] cache_creation_tokens = usage['cache_creation_input_tokens'] @@ -128,6 +130,7 @@ def build_message(data, content, thinking, thinking_signature, tool_use_blocks, content: content, thinking: Thinking.build(text: thinking, signature: thinking_signature), tool_calls: Tools.parse_tool_calls(tool_use_blocks), + tool_references: tool_references, input_tokens: usage['input_tokens'], output_tokens: usage['output_tokens'], cached_tokens: cached_tokens, diff --git a/lib/ruby_llm/providers/anthropic/tools.rb b/lib/ruby_llm/providers/anthropic/tools.rb index 246561024..a0b5df632 100644 --- a/lib/ruby_llm/providers/anthropic/tools.rb +++ b/lib/ruby_llm/providers/anthropic/tools.rb @@ -11,6 +11,14 @@ def find_tool_uses(blocks) blocks.select { |c| c['type'] == 'tool_use' } end + def find_tool_references(blocks) + results = blocks.select { |c| c['type'] == 'tool_search_tool_result' } + 
results.flat_map do |block| + refs = block.dig('content', 'tool_references') || [] + refs.filter_map { |r| r['tool_name'] } + end + end + def format_tool_call(msg) return { role: 'assistant', content: msg.content.value } if msg.content.is_a?(RubyLLM::Content::Raw) @@ -55,6 +63,12 @@ def format_tool_result_block(msg) } end + # Anthropic's native server-side BM25 tool-search primitive. + NATIVE_TOOL_SEARCH = { + type: 'tool_search_tool_bm25_20251119', + name: 'tool_search_tool_bm25' + }.freeze + def function_for(tool) input_schema = tool.params_schema || RubyLLM::Tool::SchemaDefinition.from_parameters(tool.parameters)&.json_schema @@ -65,11 +79,19 @@ def function_for(tool) input_schema: input_schema || default_input_schema } + declaration[:defer_loading] = true if tool.deferred? + return declaration if tool.provider_params.empty? RubyLLM::Utils.deep_merge(declaration, tool.provider_params) end + def format_tools(tools) + formatted = tools.values.map { |t| function_for(t) } + formatted << NATIVE_TOOL_SEARCH if formatted.any? { |t| t[:defer_loading] } + formatted + end + def extract_tool_calls(data) if json_delta?(data) { nil => ToolCall.new(id: nil, name: nil, arguments: data.dig('delta', 'partial_json')) } diff --git a/lib/ruby_llm/tool.rb b/lib/ruby_llm/tool.rb index 08ba3c82b..52ccf21e7 100644 --- a/lib/ruby_llm/tool.rb +++ b/lib/ruby_llm/tool.rb @@ -17,6 +17,34 @@ def initialize(name, type: 'string', desc: nil, required: true) # Base class for creating tools that AI models can use class Tool + # @api private + # + # Pins a per-registration +deferred?+ value on a tool without mutating the + # tool itself, so one instance can be safely shared across chats. + class Registration + def initialize(tool, deferred:) + @tool = tool + @deferred = deferred + end + + def deferred? 
= @deferred + def name = @tool.name + def description = @tool.description + def parameters = @tool.parameters + def provider_params = @tool.provider_params + def params_schema = @tool.params_schema + end + + # Event yielded to +Chat#on_tool_search+ callbacks. + class SearchEvent + attr_reader :query, :results + + def initialize(query, results) + @query = query + @results = results + end + end + # Stops conversation continuation after tool execution class Halt attr_reader :content @@ -60,6 +88,15 @@ def with_params(**params) def provider_params @provider_params ||= {} end + + def deferred(value = true) # rubocop:disable Style/OptionalBooleanParameter + @deferred = value ? true : false + self + end + + def deferred? + @deferred == true + end end def name @@ -85,6 +122,10 @@ def provider_params self.class.provider_params end + def deferred? + self.class.deferred? + end + def params_schema return @params_schema if defined?(@params_schema) diff --git a/lib/ruby_llm/tool_catalog.rb b/lib/ruby_llm/tool_catalog.rb new file mode 100644 index 000000000..b62b46c16 --- /dev/null +++ b/lib/ruby_llm/tool_catalog.rb @@ -0,0 +1,48 @@ +# frozen_string_literal: true + +require 'set' + +module RubyLLM + # Pool of deferred tools for a Chat. Anthropic's +defer_loading: true+ + # keeps them out of the model's visible menu; when the native + # +tool_search_tool_bm25+ primitive loads one, +promote+ moves it into the + # active tool list so the normal tool-dispatch path can execute it. + class ToolCatalog + attr_reader :deferred_tools, :loaded_tools + + def initialize + @deferred_tools = {} + @loaded_tools = Set.new + end + + def empty? + @deferred_tools.empty? + end + + def any? + !empty? 
+ end + + def add(tool) + @deferred_tools[tool.name.to_sym] = tool + self + end + + def available + @deferred_tools.except(*@loaded_tools) + end + + def promote(name) + sym = name.to_sym + tool = @deferred_tools[sym] + return nil unless tool + + @loaded_tools << sym + tool + end + + def inspect + "#<#{self.class} deferred=#{@deferred_tools.size} loaded=#{@loaded_tools.size}>" + end + end +end diff --git a/lib/ruby_llm/version.rb b/lib/ruby_llm/version.rb index 76646edcf..152276f9f 100644 --- a/lib/ruby_llm/version.rb +++ b/lib/ruby_llm/version.rb @@ -1,5 +1,5 @@ # frozen_string_literal: true module RubyLLM - VERSION = '1.14.1' + VERSION = '1.15.0' end diff --git a/spec/fixtures/vcr_cassettes/chat_end-to-end_tool_search_with_anthropic_promotes_a_deferred_tool_via_the_native_tool-search_primitive_and_answers_from_its_output.yml b/spec/fixtures/vcr_cassettes/chat_end-to-end_tool_search_with_anthropic_promotes_a_deferred_tool_via_the_native_tool-search_primitive_and_answers_from_its_output.yml new file mode 100644 index 000000000..9d1cd5dd9 --- /dev/null +++ b/spec/fixtures/vcr_cassettes/chat_end-to-end_tool_search_with_anthropic_promotes_a_deferred_tool_via_the_native_tool-search_primitive_and_answers_from_its_output.yml @@ -0,0 +1,205 @@ +--- +http_interactions: +- request: + method: post + uri: https://api.anthropic.com/v1/messages + body: + encoding: UTF-8 + string: '{"model":"claude-haiku-4-5-20251001","messages":[{"role":"user","content":[{"type":"text","text":"What''s + the current weather in Berlin? Please use your tools to look it up."}]}],"stream":false,"max_tokens":64000,"tools":[{"name":"weather_lookup","description":"Looks + up the current weather (temperature, wind) for a city by name.","input_schema":{"type":"object","properties":{"city":{"type":"string","description":"City + name, e.g. 
\"Berlin\""}},"required":["city"],"additionalProperties":false,"strict":true},"defer_loading":true},{"name":"stock_price","description":"Fetches + the latest equity stock price in USD for a ticker symbol.","input_schema":{"type":"object","properties":{"ticker":{"type":"string","description":"Uppercase + ticker symbol, e.g. \"AAPL\""}},"required":["ticker"],"additionalProperties":false,"strict":true},"defer_loading":true},{"name":"translate_text","description":"Translates + a short piece of text from one natural language to another.","input_schema":{"type":"object","properties":{"text":{"type":"string","description":"Text + to translate"},"target_language":{"type":"string","description":"Target language + name, e.g. \"Spanish\""}},"required":["text","target_language"],"additionalProperties":false,"strict":true},"defer_loading":true},{"name":"calculator","description":"Evaluates + a basic arithmetic expression involving +, -, *, /.","input_schema":{"type":"object","properties":{"expression":{"type":"string","description":"Arithmetic + expression, e.g. 
\"2 + 2\""}},"required":["expression"],"additionalProperties":false,"strict":true},"defer_loading":true},{"name":"dictionary","description":"Looks + up the dictionary definition of an English word.","input_schema":{"type":"object","properties":{"word":{"type":"string","description":"The + word to define"}},"required":["word"],"additionalProperties":false,"strict":true},"defer_loading":true},{"name":"current_time","description":"Returns + the current server time as an ISO-8601 string.","input_schema":{"type":"object","properties":{},"required":[],"additionalProperties":false,"strict":true}},{"type":"tool_search_tool_bm25_20251119","name":"tool_search_tool_bm25"}]}' + headers: + User-Agent: + - Faraday v2.14.1 + X-Api-Key: + - "" + Anthropic-Version: + - '2023-06-01' + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 200 + message: OK + headers: + Date: + - Fri, 24 Apr 2026 10:28:11 GMT + Content-Type: + - application/json + Transfer-Encoding: + - chunked + Connection: + - keep-alive + Anthropic-Ratelimit-Input-Tokens-Limit: + - '4000000' + Anthropic-Ratelimit-Input-Tokens-Remaining: + - '3999000' + Anthropic-Ratelimit-Input-Tokens-Reset: + - '2026-04-24T10:28:10Z' + Anthropic-Ratelimit-Output-Tokens-Limit: + - '800000' + Anthropic-Ratelimit-Output-Tokens-Remaining: + - '800000' + Anthropic-Ratelimit-Output-Tokens-Reset: + - '2026-04-24T10:28:11Z' + Anthropic-Ratelimit-Requests-Limit: + - '20000' + Anthropic-Ratelimit-Requests-Remaining: + - '19999' + Anthropic-Ratelimit-Requests-Reset: + - '2026-04-24T10:28:09Z' + Anthropic-Ratelimit-Tokens-Limit: + - '4800000' + Anthropic-Ratelimit-Tokens-Remaining: + - '4799000' + Anthropic-Ratelimit-Tokens-Reset: + - '2026-04-24T10:28:10Z' + Request-Id: + - "" + Strict-Transport-Security: + - max-age=31536000; includeSubDomains; preload + Anthropic-Organization-Id: + - "" + Server: + - cloudflare + X-Envoy-Upstream-Service-Time: + - 
'1810' + Vary: + - Accept-Encoding + Server-Timing: + - x-originResponse;dur=1812 + Cf-Cache-Status: + - DYNAMIC + Set-Cookie: + - "" + Content-Security-Policy: + - default-src 'none'; frame-ancestors 'none' + X-Robots-Tag: + - none + Cf-Ray: + - "" + body: + encoding: ASCII-8BIT + string: '{"model":"claude-haiku-4-5-20251001","id":"msg_01QZW4D1YFPxkYVMqNdJBAQr","type":"message","role":"assistant","content":[{"type":"text","text":"I''ll + search for tools that can help me look up the weather in Berlin."},{"type":"server_tool_use","id":"srvtoolu_01YVGvr3gcEu1xSvPPFtBZq4","name":"tool_search_tool_bm25","input":{"query":"weather + Berlin"}},{"type":"tool_search_tool_result","tool_use_id":"srvtoolu_01YVGvr3gcEu1xSvPPFtBZq4","content":{"type":"tool_search_tool_search_result","tool_references":[{"type":"tool_reference","tool_name":"weather_lookup"}]}},{"type":"text","text":"Great! + I found a weather lookup tool. Let me use it to get the current weather in + Berlin."},{"type":"tool_use","id":"toolu_01N3kiTWWYU5vCgfY6KJHuU6","name":"weather_lookup","input":{"city":"Berlin"},"caller":{"type":"direct"}}],"stop_reason":"tool_use","stop_sequence":null,"stop_details":null,"usage":{"input_tokens":1731,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":152,"service_tier":"standard","inference_geo":"not_available","server_tool_use":{"web_search_requests":0,"web_fetch_requests":0}}}' + recorded_at: Fri, 24 Apr 2026 10:28:11 GMT +- request: + method: post + uri: https://api.anthropic.com/v1/messages + body: + encoding: UTF-8 + string: '{"model":"claude-haiku-4-5-20251001","messages":[{"role":"user","content":[{"type":"text","text":"What''s + the current weather in Berlin? Please use your tools to look it up."}]},{"role":"assistant","content":[{"type":"text","text":"I''ll + search for tools that can help me look up the weather in Berlin.Great! I found + a weather lookup tool. 
Let me use it to get the current weather in Berlin."},{"type":"tool_use","id":"toolu_01N3kiTWWYU5vCgfY6KJHuU6","name":"weather_lookup","input":{"city":"Berlin"}}]},{"role":"user","content":[{"type":"tool_result","tool_use_id":"toolu_01N3kiTWWYU5vCgfY6KJHuU6","content":[{"type":"text","text":"Weather + in Berlin: 15C, wind 10 km/h, clear skies"}]}]}],"stream":false,"max_tokens":64000,"tools":[{"name":"stock_price","description":"Fetches + the latest equity stock price in USD for a ticker symbol.","input_schema":{"type":"object","properties":{"ticker":{"type":"string","description":"Uppercase + ticker symbol, e.g. \"AAPL\""}},"required":["ticker"],"additionalProperties":false,"strict":true},"defer_loading":true},{"name":"translate_text","description":"Translates + a short piece of text from one natural language to another.","input_schema":{"type":"object","properties":{"text":{"type":"string","description":"Text + to translate"},"target_language":{"type":"string","description":"Target language + name, e.g. \"Spanish\""}},"required":["text","target_language"],"additionalProperties":false,"strict":true},"defer_loading":true},{"name":"calculator","description":"Evaluates + a basic arithmetic expression involving +, -, *, /.","input_schema":{"type":"object","properties":{"expression":{"type":"string","description":"Arithmetic + expression, e.g. 
\"2 + 2\""}},"required":["expression"],"additionalProperties":false,"strict":true},"defer_loading":true},{"name":"dictionary","description":"Looks + up the dictionary definition of an English word.","input_schema":{"type":"object","properties":{"word":{"type":"string","description":"The + word to define"}},"required":["word"],"additionalProperties":false,"strict":true},"defer_loading":true},{"name":"current_time","description":"Returns + the current server time as an ISO-8601 string.","input_schema":{"type":"object","properties":{},"required":[],"additionalProperties":false,"strict":true}},{"name":"weather_lookup","description":"Looks + up the current weather (temperature, wind) for a city by name.","input_schema":{"type":"object","properties":{"city":{"type":"string","description":"City + name, e.g. \"Berlin\""}},"required":["city"],"additionalProperties":false,"strict":true}},{"type":"tool_search_tool_bm25_20251119","name":"tool_search_tool_bm25"}]}' + headers: + User-Agent: + - Faraday v2.14.1 + X-Api-Key: + - "" + Anthropic-Version: + - '2023-06-01' + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 200 + message: OK + headers: + Date: + - Fri, 24 Apr 2026 10:28:12 GMT + Content-Type: + - application/json + Transfer-Encoding: + - chunked + Connection: + - keep-alive + Anthropic-Ratelimit-Input-Tokens-Limit: + - '4000000' + Anthropic-Ratelimit-Input-Tokens-Remaining: + - '4000000' + Anthropic-Ratelimit-Input-Tokens-Reset: + - '2026-04-24T10:28:12Z' + Anthropic-Ratelimit-Output-Tokens-Limit: + - '800000' + Anthropic-Ratelimit-Output-Tokens-Remaining: + - '800000' + Anthropic-Ratelimit-Output-Tokens-Reset: + - '2026-04-24T10:28:12Z' + Anthropic-Ratelimit-Requests-Limit: + - '20000' + Anthropic-Ratelimit-Requests-Remaining: + - '19999' + Anthropic-Ratelimit-Requests-Reset: + - '2026-04-24T10:28:11Z' + Anthropic-Ratelimit-Tokens-Limit: + - '4800000' + 
Anthropic-Ratelimit-Tokens-Remaining: + - '4800000' + Anthropic-Ratelimit-Tokens-Reset: + - '2026-04-24T10:28:12Z' + Request-Id: + - "" + Strict-Transport-Security: + - max-age=31536000; includeSubDomains; preload + Anthropic-Organization-Id: + - "" + Server: + - cloudflare + X-Envoy-Upstream-Service-Time: + - '840' + Vary: + - Accept-Encoding + Server-Timing: + - x-originResponse;dur=842 + Cf-Cache-Status: + - DYNAMIC + Set-Cookie: + - "" + Content-Security-Policy: + - default-src 'none'; frame-ancestors 'none' + X-Robots-Tag: + - none + Cf-Ray: + - "" + body: + encoding: ASCII-8BIT + string: !binary |- + eyJtb2RlbCI6ImNsYXVkZS1oYWlrdS00LTUtMjAyNTEwMDEiLCJpZCI6Im1zZ18wMURXaWl1bnRNenZqdkJXSFZoQkpKZWgiLCJ0eXBlIjoibWVzc2FnZSIsInJvbGUiOiJhc3Npc3RhbnQiLCJjb250ZW50IjpbeyJ0eXBlIjoidGV4dCIsInRleHQiOiJUaGUgY3VycmVudCB3ZWF0aGVyIGluIEJlcmxpbiBpczpcbi0gKipUZW1wZXJhdHVyZToqKiAxNcKwQ1xuLSAqKldpbmQ6KiogMTAga20vaFxuLSAqKkNvbmRpdGlvbnM6KiogQ2xlYXIgc2tpZXNcblxuSXQncyBhIHBsZWFzYW50IGRheSB3aXRoIGNsZWFyIHdlYXRoZXIgYW5kIG1pbGQgdGVtcGVyYXR1cmVzISJ9XSwic3RvcF9yZWFzb24iOiJlbmRfdHVybiIsInN0b3Bfc2VxdWVuY2UiOm51bGwsInN0b3BfZGV0YWlscyI6bnVsbCwidXNhZ2UiOnsiaW5wdXRfdG9rZW5zIjo5ODQsImNhY2hlX2NyZWF0aW9uX2lucHV0X3Rva2VucyI6MCwiY2FjaGVfcmVhZF9pbnB1dF90b2tlbnMiOjAsImNhY2hlX2NyZWF0aW9uIjp7ImVwaGVtZXJhbF81bV9pbnB1dF90b2tlbnMiOjAsImVwaGVtZXJhbF8xaF9pbnB1dF90b2tlbnMiOjB9LCJvdXRwdXRfdG9rZW5zIjo1Miwic2VydmljZV90aWVyIjoic3RhbmRhcmQiLCJpbmZlcmVuY2VfZ2VvIjoibm90X2F2YWlsYWJsZSJ9fQ== + recorded_at: Fri, 24 Apr 2026 10:28:12 GMT +recorded_with: VCR 6.4.0 diff --git a/spec/fixtures/vcr_cassettes/chat_end-to-end_tool_search_with_anthropic_sends_defer_loading_true_and_the_native_tool_search_tool_bm25_entry_on_the_first_request.yml b/spec/fixtures/vcr_cassettes/chat_end-to-end_tool_search_with_anthropic_sends_defer_loading_true_and_the_native_tool_search_tool_bm25_entry_on_the_first_request.yml new file mode 100644 index 000000000..d97617e8c --- /dev/null +++ 
b/spec/fixtures/vcr_cassettes/chat_end-to-end_tool_search_with_anthropic_sends_defer_loading_true_and_the_native_tool_search_tool_bm25_entry_on_the_first_request.yml @@ -0,0 +1,107 @@ +--- +http_interactions: +- request: + method: post + uri: https://api.anthropic.com/v1/messages + body: + encoding: UTF-8 + string: '{"model":"claude-haiku-4-5-20251001","messages":[{"role":"user","content":[{"type":"text","text":"What''s + the current weather in Berlin?"}]}],"stream":false,"max_tokens":64000,"tools":[{"name":"weather_lookup","description":"Looks + up the current weather (temperature, wind) for a city by name.","input_schema":{"type":"object","properties":{"city":{"type":"string","description":"City + name, e.g. \"Berlin\""}},"required":["city"],"additionalProperties":false,"strict":true},"defer_loading":true},{"name":"stock_price","description":"Fetches + the latest equity stock price in USD for a ticker symbol.","input_schema":{"type":"object","properties":{"ticker":{"type":"string","description":"Uppercase + ticker symbol, e.g. \"AAPL\""}},"required":["ticker"],"additionalProperties":false,"strict":true},"defer_loading":true},{"name":"translate_text","description":"Translates + a short piece of text from one natural language to another.","input_schema":{"type":"object","properties":{"text":{"type":"string","description":"Text + to translate"},"target_language":{"type":"string","description":"Target language + name, e.g. \"Spanish\""}},"required":["text","target_language"],"additionalProperties":false,"strict":true},"defer_loading":true},{"name":"calculator","description":"Evaluates + a basic arithmetic expression involving +, -, *, /.","input_schema":{"type":"object","properties":{"expression":{"type":"string","description":"Arithmetic + expression, e.g. 
\"2 + 2\""}},"required":["expression"],"additionalProperties":false,"strict":true},"defer_loading":true},{"name":"dictionary","description":"Looks + up the dictionary definition of an English word.","input_schema":{"type":"object","properties":{"word":{"type":"string","description":"The + word to define"}},"required":["word"],"additionalProperties":false,"strict":true},"defer_loading":true},{"name":"current_time","description":"Returns + the current server time as an ISO-8601 string.","input_schema":{"type":"object","properties":{},"required":[],"additionalProperties":false,"strict":true}},{"type":"tool_search_tool_bm25_20251119","name":"tool_search_tool_bm25"}]}' + headers: + User-Agent: + - Faraday v2.14.1 + X-Api-Key: + - "" + Anthropic-Version: + - '2023-06-01' + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 200 + message: OK + headers: + Date: + - Fri, 24 Apr 2026 10:28:14 GMT + Content-Type: + - application/json + Transfer-Encoding: + - chunked + Connection: + - keep-alive + Anthropic-Ratelimit-Input-Tokens-Limit: + - '4000000' + Anthropic-Ratelimit-Input-Tokens-Remaining: + - '4000000' + Anthropic-Ratelimit-Input-Tokens-Reset: + - '2026-04-24T10:28:13Z' + Anthropic-Ratelimit-Output-Tokens-Limit: + - '800000' + Anthropic-Ratelimit-Output-Tokens-Remaining: + - '800000' + Anthropic-Ratelimit-Output-Tokens-Reset: + - '2026-04-24T10:28:14Z' + Anthropic-Ratelimit-Requests-Limit: + - '20000' + Anthropic-Ratelimit-Requests-Remaining: + - '19999' + Anthropic-Ratelimit-Requests-Reset: + - '2026-04-24T10:28:12Z' + Anthropic-Ratelimit-Tokens-Limit: + - '4800000' + Anthropic-Ratelimit-Tokens-Remaining: + - '4800000' + Anthropic-Ratelimit-Tokens-Reset: + - '2026-04-24T10:28:13Z' + Request-Id: + - "" + Strict-Transport-Security: + - max-age=31536000; includeSubDomains; preload + Anthropic-Organization-Id: + - "" + Server: + - cloudflare + X-Envoy-Upstream-Service-Time: + - 
'1299' + Vary: + - Accept-Encoding + Server-Timing: + - x-originResponse;dur=1301 + Cf-Cache-Status: + - DYNAMIC + Set-Cookie: + - "" + Content-Security-Policy: + - default-src 'none'; frame-ancestors 'none' + X-Robots-Tag: + - none + Cf-Ray: + - "" + body: + encoding: ASCII-8BIT + string: '{"model":"claude-haiku-4-5-20251001","id":"msg_015ea6m9mLKokbPqCqek9ZrS","type":"message","role":"assistant","content":[{"type":"text","text":"I + don''t have access to a weather tool. The tools available to me are:\n\n1. + **current_time** - Returns the current server time\n2. **tool_search_tool_bm25** + - Searches for available functions/tools\n\nTo get the current weather in + Berlin, you would need to:\n- Use a weather service API (like OpenWeatherMap, + WeatherAPI, etc.)\n- Check a weather website (like weather.com, weather.gov, + etc.)\n- Use a voice assistant with weather capabilities\n\nIs there anything + else I can help you with, such as checking the current time?"}],"stop_reason":"end_turn","stop_sequence":null,"stop_details":null,"usage":{"input_tokens":762,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":135,"service_tier":"standard","inference_geo":"not_available"}}' + recorded_at: Fri, 24 Apr 2026 10:28:13 GMT +recorded_with: VCR 6.4.0 diff --git a/spec/ruby_llm/chat_deferred_tools_spec.rb b/spec/ruby_llm/chat_deferred_tools_spec.rb new file mode 100644 index 000000000..b8a75f47d --- /dev/null +++ b/spec/ruby_llm/chat_deferred_tools_spec.rb @@ -0,0 +1,96 @@ +# frozen_string_literal: true + +require 'spec_helper' + +RSpec.describe RubyLLM::Chat do + subject(:chat) { described_class.new(model: 'claude-haiku-4-5', provider: :anthropic) } + + include_context 'with configured RubyLLM' + + def tool_class(const_name, desc, declared: false) + klass = Class.new(RubyLLM::Tool) do + description desc + deferred if declared + end + stub_const(const_name, klass) + klass + 
end + + let(:regular) { tool_class('RegularTool', 'plain tool') } + let(:heavy) { tool_class('HeavyTool', 'heavy deferred tool', declared: true) } + let(:another) { tool_class('OtherTool', 'another deferred tool') } + + describe '#with_tool' do + it 'puts non-deferred tools on the active list' do + chat.with_tool(regular) + expect(chat.tools.keys).to include(:regular) + expect(chat.tool_catalog).to be_empty + end + + it 'routes class-declared deferred tools into the catalog without adding to active tools' do + chat.with_tool(heavy) + expect(chat.tool_catalog.deferred_tools.keys).to eq([:heavy]) + expect(chat.tools).to be_empty + end + + it 'per-call defer: true overrides a non-declared class' do + chat.with_tool(another, defer: true) + expect(chat.tool_catalog.deferred_tools.keys).to eq([:other]) + end + + it 'per-call defer: false overrides a declared class' do + chat.with_tool(heavy, defer: false) + expect(chat.tools.keys).to include(:heavy) + expect(chat.tool_catalog).to be_empty + end + end + + describe '#with_tools' do + it 'routes each tool based on effective defer value' do + chat.with_tools(regular, heavy, another, defer: true) + expect(chat.tool_catalog.deferred_tools.keys).to match_array(%i[regular heavy other]) + expect(chat.tools).to be_empty + end + + it 'clears both active and catalog when replace: true' do + chat.with_tools(heavy, another, defer: true) + chat.with_tools(regular, replace: true) + expect(chat.tool_catalog).to be_empty + expect(chat.tools.keys).to contain_exactly(:regular) + end + end + + describe 'tool_search_enabled kill switch' do + before { allow(RubyLLM.logger).to receive(:warn) } + + around do |example| + RubyLLM.config.tool_search_enabled = false + example.run + ensure + RubyLLM.config.tool_search_enabled = true + end + + it 'silently coerces defer: true to false and logs a warning once' do + chat.with_tools(heavy, another, defer: true) + expect(chat.tool_catalog).to be_empty + expect(chat.tools.keys).to match_array(%i[heavy 
other]) + expect(RubyLLM.logger).to have_received(:warn).with(/tool_search_enabled is false/).once + end + end + + describe 'provider without deferred loading support' do + subject(:openai_chat) do + described_class.new(model: 'gpt-5.4', assume_model_exists: true, provider: :openai) + end + + before { allow(RubyLLM.logger).to receive(:warn) } + + it 'coerces defer: true to regular registration and logs a warning once' do + openai_chat.with_tools(heavy, another, defer: true) + + expect(openai_chat.tool_catalog).to be_empty + expect(openai_chat.tools.keys).to match_array(%i[heavy other]) + expect(RubyLLM.logger).to have_received(:warn).with(/openai.*does not support deferred/i).once + end + end +end diff --git a/spec/ruby_llm/chat_tool_search_scale_spec.rb b/spec/ruby_llm/chat_tool_search_scale_spec.rb new file mode 100644 index 000000000..5c55e54d5 --- /dev/null +++ b/spec/ruby_llm/chat_tool_search_scale_spec.rb @@ -0,0 +1,79 @@ +# frozen_string_literal: true + +require 'spec_helper' + +RSpec.describe RubyLLM::Chat do + include_context 'with configured RubyLLM' + + let(:themes) do + %w[ + weather stock translate calculator dictionary news sports currency + flight hotel restaurant movie book recipe article code git docker + kubernetes aws gcp azure sql redis postgres mongo kafka rabbitmq + elastic prometheus grafana jira github slack teams zoom gmail + calendar drive notion linear pagerduty datadog + ] + end + let(:tool_classes) do + themes.map.with_index do |theme, i| + klass = build_tool_class(theme) + stub_const("ScaleTool#{i}", klass) + klass + end + end + let(:chat_anthropic) do + RubyLLM.chat(model: 'claude-haiku-4-5', provider: :anthropic).tap do |c| + c.with_tools(*tool_classes, defer: true) + end + end + + def build_tool_class(theme) + Class.new(RubyLLM::Tool) do + description "Looks up #{theme} information for a given query." 
+ param :query, desc: "Input for the #{theme} lookup" + + # Override #name so the tool registers under its theme, independent of + # the auto-generated class-name derivation. Keeps assertions readable. + define_method(:name) { theme } + define_method(:execute) { |query:| "#{theme}: #{query}" } + end + end + + describe 'Anthropic deferred tool loading at scale (40+ tools)' do + it 'registers 40+ deferred tools in the catalog without error' do + expect(tool_classes.size).to be >= 40 + expect(chat_anthropic.tool_catalog.deferred_tools.size).to eq(tool_classes.size) + expect(chat_anthropic.tools).to be_empty + end + + it 'flags every deferred tool with defer_loading: true and appends the native primitive exactly once' do + payload = RubyLLM::Providers::Anthropic::Chat.render_payload( + [RubyLLM::Message.new(role: :user, content: 'hi')], + tools: chat_anthropic.send(:effective_tools), + temperature: nil, + model: chat_anthropic.model, + stream: false + ) + + entries = payload[:tools] + deferred = entries.select { |t| t[:defer_loading] == true } + expect(deferred.size).to eq(tool_classes.size) + + native = entries.select { |t| t[:type] == 'tool_search_tool_bm25_20251119' } + expect(native.size).to eq(1) + expect(native.first[:name]).to eq('tool_search_tool_bm25') + end + + it 'promotes deferred tools when the native path surfaces tool_references' do + referenced = %w[flight hotel restaurant] + message = RubyLLM::Message.new(role: :assistant, content: '', tool_references: referenced) + + chat_anthropic.send(:promote_from_tool_references, message) + + expect(chat_anthropic.tool_catalog.loaded_tools).to match_array(referenced.map(&:to_sym)) + referenced.each do |name| + expect(chat_anthropic.tools.keys).to include(name.to_sym) + end + end + end +end diff --git a/spec/ruby_llm/chat_tool_search_spec.rb b/spec/ruby_llm/chat_tool_search_spec.rb new file mode 100644 index 000000000..cf0269083 --- /dev/null +++ b/spec/ruby_llm/chat_tool_search_spec.rb @@ -0,0 +1,124 @@ +# 
frozen_string_literal: true + +require 'spec_helper' + +RSpec.describe RubyLLM::Chat do + include_context 'with configured RubyLLM' + + def define_tool_classes! + stub_const('WeatherLookupTool', Class.new(RubyLLM::Tool) do + description 'Looks up the current weather (temperature, wind) for a city by name.' + deferred + param :city, desc: 'City name, e.g. "Berlin"' + def execute(city:) = "Weather in #{city}: 15C, wind 10 km/h, clear skies" + end) + stub_const('StockPriceTool', Class.new(RubyLLM::Tool) do + description 'Fetches the latest equity stock price in USD for a ticker symbol.' + deferred + param :ticker, desc: 'Uppercase ticker symbol, e.g. "AAPL"' + def execute(ticker:) = "Last trade for #{ticker}: $123.45 USD" + end) + stub_const('TranslateTextTool', Class.new(RubyLLM::Tool) do + description 'Translates a short piece of text from one natural language to another.' + deferred + param :text, desc: 'Text to translate' + param :target_language, desc: 'Target language name, e.g. "Spanish"' + def execute(text:, target_language:) = "(#{target_language}) #{text.reverse}" + end) + stub_const('CalculatorTool', Class.new(RubyLLM::Tool) do + description 'Evaluates a basic arithmetic expression involving +, -, *, /.' + deferred + param :expression, desc: 'Arithmetic expression, e.g. "2 + 2"' + def execute(expression:) = "Result of #{expression} is 42" + end) + stub_const('DictionaryTool', Class.new(RubyLLM::Tool) do + description 'Looks up the dictionary definition of an English word.' + deferred + param :word, desc: 'The word to define' + def execute(word:) = "Definition of #{word}: a placeholder entry used for testing." + end) + stub_const('CurrentTimeTool', Class.new(RubyLLM::Tool) do + description 'Returns the current server time as an ISO-8601 string.' 
+ def execute = '2026-04-24T12:00:00Z' + end) + end + + def deferred_tool_classes + [WeatherLookupTool, StockPriceTool, TranslateTextTool, CalculatorTool, DictionaryTool] + end + + def register_catalog(chat) + chat.with_tools(*deferred_tool_classes).with_tool(CurrentTimeTool) + end + + def request_body_for(message) + JSON.parse(message.raw.env.request_body) + end + + before { define_tool_classes! } + + describe 'end-to-end tool search with Anthropic' do + let(:model) { 'claude-haiku-4-5' } + let(:provider) { :anthropic } + + it 'promotes a deferred tool via the native tool-search primitive and answers from its output' do + chat = RubyLLM.chat(model: model, provider: provider) + register_catalog(chat) + + search_events = [] + chat.on_tool_search { |event| search_events << event } + + response = chat.ask("What's the current weather in Berlin? Please use your tools to look it up.") + + # Anthropic's server-side primitive surfaces loaded tools on the + # assistant message as tool_references — our parser exposes them so + # Chat can promote the tool into @tools for the next turn. + assistant_messages = chat.messages.select { |m| m.role == :assistant } + native_refs = assistant_messages.flat_map { |m| Array(m.tool_references) } + expect(native_refs).to include('weather_lookup') + + expect(chat.tool_catalog.loaded_tools).to include(:weather_lookup) + # Robust against paraphrase: fixture temperature is "15C"; the model + # often rewrites as "15°C", "15 C", or "15 degrees Celsius". + expect(response.content).to match(/15\s*(°|degrees?)?\s*C/i) + expect(response.content).to include('10 km/h') + + # No-retry-storm guard: a working feature converges in a handful of + # roundtrips. Not a strict contract — the model may take an extra turn. + request_count = chat.messages.count { |m| m.role == :assistant } + expect(request_count).to be <= 5 + + # Native path fires on_tool_search with query: nil. 
+ expect(search_events).not_to be_empty + expect(search_events.first.query).to be_nil + expect(search_events.flat_map(&:results)).to include(:weather_lookup) + end + + it 'sends defer_loading: true and the native tool_search_tool_bm25 entry on the first request' do + chat = RubyLLM.chat(model: model, provider: provider) + register_catalog(chat) + + chat.ask("What's the current weather in Berlin?") + + first_assistant = chat.messages.find { |m| m.role == :assistant } + tool_entries = request_body_for(first_assistant).fetch('tools') + + deferred_entries = tool_entries.select { |t| t['defer_loading'] == true } + expect(deferred_entries.map { |t| t['name'] }).to include( + 'weather_lookup', 'stock_price', 'translate_text', 'calculator', 'dictionary' + ) + + current_time = tool_entries.find { |t| t['name'] == 'current_time' } + expect(current_time).not_to be_nil + expect(current_time).not_to have_key('defer_loading') + + native = tool_entries.find { |t| t['type'] == 'tool_search_tool_bm25_20251119' } + expect(native).not_to be_nil + expect(native['name']).to eq('tool_search_tool_bm25') + + # Deferred tools are sent by name with defer_loading for Anthropic's + # server-side handling, so no client-side search_tools entry is expected. + expect(tool_entries.map { |t| t['name'] }).not_to include('search_tools') + end + end +end diff --git a/spec/ruby_llm/providers/anthropic/chat_native_tool_search_spec.rb b/spec/ruby_llm/providers/anthropic/chat_native_tool_search_spec.rb new file mode 100644 index 000000000..e5346b4f9 --- /dev/null +++ b/spec/ruby_llm/providers/anthropic/chat_native_tool_search_spec.rb @@ -0,0 +1,99 @@ +# frozen_string_literal: true + +require 'spec_helper' + +RSpec.describe RubyLLM::Providers::Anthropic::Chat do + include_context 'with configured RubyLLM' + + let(:chat) do + RubyLLM.chat(model: 'claude-haiku-4-5', provider: :anthropic).tap do |c| + stub_const('WeatherLookup', Class.new(RubyLLM::Tool) do + description 'Looks up current weather for a city.' 
+ param :city, desc: 'City name' + def execute(city:) = "Weather for #{city}: 15C" + end) + stub_const('StockQuote', Class.new(RubyLLM::Tool) do + description 'Fetches the latest stock quote for a ticker.' + param :ticker, desc: 'Ticker symbol' + def execute(ticker:) = "Quote: #{ticker} = $42" + end) + c.with_tools(WeatherLookup, StockQuote, defer: true) + end + end + + def response_double(body) + env = Struct.new(:request_body).new('{}') + resp = Object.new + resp.define_singleton_method(:body) { body } + resp.define_singleton_method(:env) { env } + resp + end + + describe 'parse_completion_response with tool_search_tool_result blocks' do + it 'exposes referenced tool names on the resulting Message' do + response = response_double( + 'model' => 'claude-haiku-4-5-20251001', + 'content' => [ + { 'type' => 'text', 'text' => 'Searching…' }, + { 'type' => 'server_tool_use', 'id' => 'srv_1', 'name' => 'tool_search_tool_bm25', + 'input' => { 'query' => 'weather' } }, + { 'type' => 'tool_search_tool_result', + 'tool_use_id' => 'srv_1', + 'content' => { + 'type' => 'tool_search_tool_search_result', + 'tool_references' => [ + { 'type' => 'tool_reference', 'tool_name' => 'weather_lookup' }, + { 'type' => 'tool_reference', 'tool_name' => 'stock_quote' } + ] + } } + ], + 'usage' => { 'input_tokens' => 10, 'output_tokens' => 5 } + ) + + message = described_class.parse_completion_response(response) + expect(message).to respond_to(:tool_references) + expect(message.tool_references).to match_array(%w[weather_lookup stock_quote]) + end + + it 'returns an empty list when the response has no tool_search_tool_result block' do + response = response_double( + 'model' => 'claude-haiku-4-5-20251001', + 'content' => [{ 'type' => 'text', 'text' => 'hello' }], + 'usage' => { 'input_tokens' => 1, 'output_tokens' => 1 } + ) + + message = described_class.parse_completion_response(response) + expect(message.tool_references).to eq([]) + end + end + + describe 'Chat promotes catalog entries when a 
response carries tool_references' do + it 'moves the referenced tools from deferred_tools into @tools and fires on_tool_search' do + events = [] + chat.on_tool_search { |e| events << e } + + message_with_refs = RubyLLM::Message.new( + role: :assistant, + content: 'Searching…', + tool_references: %w[weather_lookup] + ) + + # The Chat should recognize tool_references and promote via its catalog. + # We simulate the post-response path by invoking the hook directly via send. + chat.send(:promote_from_tool_references, message_with_refs) + + expect(chat.tool_catalog.loaded_tools).to include(:weather_lookup) + expect(chat.tools.keys).to include(:weather_lookup) + expect(events.map(&:results)).to eq([[:weather_lookup]]) + # Native-path events carry a nil query; consumers can use this to + # distinguish promotion by the server-side primitive from promotion by + # the Ruby search_tools function. + expect(events.first.query).to be_nil + end + + it 'is a no-op when there are no references' do + message = RubyLLM::Message.new(role: :assistant, content: 'plain', tool_references: []) + expect { chat.send(:promote_from_tool_references, message) }.not_to(change { chat.tool_catalog.loaded_tools.dup }) + end + end +end diff --git a/spec/ruby_llm/providers/anthropic/tools_spec.rb b/spec/ruby_llm/providers/anthropic/tools_spec.rb index 695d1c4cd..a7ffe6ecd 100644 --- a/spec/ruby_llm/providers/anthropic/tools_spec.rb +++ b/spec/ruby_llm/providers/anthropic/tools_spec.rb @@ -222,4 +222,35 @@ expect(described_class.parse_tool_calls([])).to be_nil end end + + describe '.function_for' do + it 'omits defer_loading when the tool is not deferred' do + tool = instance_double(RubyLLM::Tool, name: 'x', description: 'd', params_schema: nil, + parameters: {}, provider_params: {}, deferred?: false) + expect(described_class.function_for(tool)).not_to have_key(:defer_loading) + end + + it 'emits defer_loading: true when the tool is deferred' do + tool = instance_double(RubyLLM::Tool, name: 'x', description: 'd', 
params_schema: nil, + parameters: {}, provider_params: {}, deferred?: true) + expect(described_class.function_for(tool)[:defer_loading]).to be(true) + end + end + + describe '.format_tools' do + def tool(name, deferred:) + instance_double(RubyLLM::Tool, name: name, description: "#{name} desc", params_schema: nil, + parameters: {}, provider_params: {}, deferred?: deferred) + end + + it 'does not append the native search tool when nothing is deferred' do + formatted = described_class.format_tools(a: tool('a', deferred: false), b: tool('b', deferred: false)) + expect(formatted.map { |t| t[:name] }).to contain_exactly('a', 'b') + end + + it 'appends the native BM25 search tool when any tool is deferred' do + formatted = described_class.format_tools(a: tool('a', deferred: false), b: tool('b', deferred: true)) + expect(formatted.last).to eq(described_class::NATIVE_TOOL_SEARCH) + end + end end diff --git a/spec/ruby_llm/tool_catalog_spec.rb b/spec/ruby_llm/tool_catalog_spec.rb new file mode 100644 index 000000000..1671625b3 --- /dev/null +++ b/spec/ruby_llm/tool_catalog_spec.rb @@ -0,0 +1,82 @@ +# frozen_string_literal: true + +require 'spec_helper' + +RSpec.describe RubyLLM::ToolCatalog do + let(:foo) { build_tool('FooTool', 'finds foo things') } + let(:bar) { build_tool('BarTool', 'finds bar things') } + + def build_tool(const_name, desc) + stub_const(const_name, Class.new(RubyLLM::Tool) { description(desc) }) + Object.const_get(const_name).new + end + + describe '#empty? / #any?' 
do + it 'starts empty' do + catalog = described_class.new + expect(catalog).to be_empty + expect(catalog.any?).to be(false) + end + + it 'reports non-empty after add' do + catalog = described_class.new.add(foo) + expect(catalog).not_to be_empty + expect(catalog.any?).to be(true) + end + end + + describe '#add' do + it 'keys tools by their snake_case name' do + catalog = described_class.new + catalog.add(foo) + expect(catalog.deferred_tools.keys).to eq([:foo]) + end + + it 'overwrites on duplicate name' do + stub_const('FooTool', Class.new(RubyLLM::Tool) { description('v1') }) + first = FooTool.new + stub_const('FooTool', Class.new(RubyLLM::Tool) { description('v2') }) + second = FooTool.new + + catalog = described_class.new.add(first).add(second) + expect(catalog.deferred_tools.size).to eq(1) + expect(catalog.deferred_tools[:foo].description).to eq('v2') + end + end + + describe '#promote' do + it 'records the tool as loaded and returns it' do + catalog = described_class.new.add(foo).add(bar) + expect(catalog.promote(:foo)).to eq(foo) + expect(catalog.loaded_tools).to contain_exactly(:foo) + end + + it 'accepts string names' do + catalog = described_class.new.add(foo) + expect(catalog.promote('foo')).to eq(foo) + expect(catalog.loaded_tools).to include(:foo) + end + + it 'returns nil for unknown names and does not record them' do + catalog = described_class.new.add(foo) + expect(catalog.promote(:missing)).to be_nil + expect(catalog.loaded_tools).to be_empty + end + end + + describe '#available' do + it 'excludes tools already promoted' do + catalog = described_class.new.add(foo).add(bar) + catalog.promote(:foo) + expect(catalog.available.keys).to contain_exactly(:bar) + end + end + + describe '#inspect' do + it 'reports counts' do + catalog = described_class.new.add(foo).add(bar) + catalog.promote(:foo) + expect(catalog.inspect).to eq('#') + end + end +end diff --git a/spec/ruby_llm/tool_spec.rb b/spec/ruby_llm/tool_spec.rb index fb1b6349f..25793a5d4 100644 --- 
a/spec/ruby_llm/tool_spec.rb +++ b/spec/ruby_llm/tool_spec.rb @@ -91,4 +91,32 @@ def execute(questions:) .to raise_error(ArgumentError, 'bad value provided') end end + + describe '.deferred' do + it 'defaults to false' do + stub_const('UndeclaredTool', Class.new(described_class)) + expect(UndeclaredTool.deferred?).to be(false) + expect(UndeclaredTool.new.deferred?).to be(false) + end + + it 'marks the class as deferred when called without arguments' do + stub_const('HeavyTool', Class.new(described_class) { deferred }) + expect(HeavyTool.deferred?).to be(true) + expect(HeavyTool.new.deferred?).to be(true) + end + + it 'accepts explicit true/false' do + stub_const('TrueTool', Class.new(described_class) { deferred(true) }) + stub_const('FalseTool', Class.new(described_class) { deferred(false) }) + expect(TrueTool.deferred?).to be(true) + expect(FalseTool.deferred?).to be(false) + end + + it 'does not propagate to unrelated classes' do + stub_const('ParentTool', Class.new(described_class) { deferred }) + stub_const('SiblingTool', Class.new(described_class)) + expect(ParentTool.deferred?).to be(true) + expect(SiblingTool.deferred?).to be(false) + end + end end