
Commit b484893

swistaczek and claude committed
Add tool search abstraction (defer_loading + search_tools)
Translates Anthropic's native deferred tool loading into a Ruby-idiomatic API that works across providers. Developers opt in via a `deferred` class DSL on RubyLLM::Tool, or by passing `defer: true` to `Chat#with_tool` / `#with_tools`. When any tool is deferred, a built-in `search_tools` function is exposed to the model so it can load schemas on demand.

On Anthropic, this forwards `defer_loading: true` and appends the native `tool_search_tool_bm25_20251119` primitive; on OpenAI/Gemini/Bedrock (and OpenAI/Gemini-compatible providers), deferred tools are emitted as name+description stubs and a pure-Ruby BM25 ranker drives the client-side search.

Zero new runtime dependencies. Fully non-breaking: absent `defer:` or `deferred`, behaviour is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 4371a1b commit b484893

26 files changed

Lines changed: 1089 additions & 18 deletions

docs/_advanced/upgrading.md

Lines changed: 10 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -21,6 +21,16 @@ redirect_from:
2121
{:toc}
2222

2323
---
24+
# Upgrade to 1.15
25+
26+
## How to Upgrade
27+
28+
1.15 adds tool search in a fully additive way. No generator, no migration — upgrade the gem and continue using RubyLLM as before.
29+
30+
## What's New in 1.15
31+
32+
- **Tool Search**: `RubyLLM::Chat#with_tool` and `#with_tools` now accept a `defer:` keyword argument, and `RubyLLM::Tool` exposes a class-level `deferred` DSL. Marking a tool as deferred keeps its JSON schema out of the system-prompt prefix; the model loads it on demand via a built-in `search_tools` function. On Anthropic, this translates to the native `defer_loading: true` + `tool_search_tool_bm25_20251119`. On other providers (OpenAI, Gemini, Bedrock, and their OpenAI/Gemini-compatible descendants), the behavior is emulated client-side using a pure-Ruby BM25 ranker. If you don't use `defer:` or `deferred`, nothing changes. See [Tool Search]({% link _core_features/tool-search.md %}).
33+
2434
# Upgrade to 1.14
2535

2636
## How to Upgrade

docs/_core_features/tool-search.md

Lines changed: 181 additions & 0 deletions
@@ -0,0 +1,181 @@
1+
---
2+
layout: default
3+
title: Tool Search
4+
nav_order: 9
5+
description: Scale to hundreds of tools without blowing up your token budget. Defer tool schemas and let the model search for what it needs.
6+
redirect_from:
7+
- /guides/tool-search
8+
---
9+
10+
# {{ page.title }}
11+
{: .d-inline-block .no_toc }
12+
13+
New in 1.15
14+
{: .label .label-green }
15+
16+
{{ page.description }}
17+
{: .fs-6 .fw-300 }
18+
19+
## Table of contents
20+
{: .no_toc .text-delta }
21+
22+
1. TOC
23+
{:toc}
24+
25+
---
26+
27+
After reading this guide, you will know:
28+
29+
* When tool search helps (and when it doesn't).
30+
* How to mark tools as deferred.
31+
* How the model discovers and loads deferred tools at runtime.
32+
* How providers differ: Anthropic's native path vs. client-side emulation.
33+
* How to plug in a custom search function (e.g. embeddings).
34+
35+
## When to use it
36+
37+
When a `RubyLLM::Chat` is wired to many tools — especially across one or more MCP servers — every tool's full JSON Schema ships in the system-prompt prefix on every turn. Three real costs follow:
38+
39+
1. **Token bloat.** Hundreds of tools can add tens of thousands of tokens per request.
40+
2. **Prompt-cache eviction.** Adding or removing tools changes the prefix and invalidates the cache.
41+
3. **Selection accuracy.** Models choose worse tools when the menu is long.
42+
43+
Tool search solves this by withholding the schemas of "deferred" tools from the prompt prefix and exposing a built-in `search_tools` function the model can call to load the schemas it actually needs.
44+
45+
Reach for it when:
46+
47+
* You use MCP servers with more than ~20 tools.
48+
* You have a handful of rarely-needed heavy tools (deep-research, large schemas) bloating your prefix.
49+
* You run on Anthropic and want to preserve prompt cache across long conversations.
50+
51+
Skip it when:
52+
53+
* You have fewer than a dozen tools and no prompt-cache concerns.
54+
* All your tools are always relevant to every turn.
55+
56+
## Marking tools deferred
57+
58+
### Per-call (best for MCP bulk registration)
59+
60+
```ruby
61+
chat = RubyLLM.chat(model: "claude-sonnet-4-6")
62+
chat.with_tools(*mcp_client.tools, defer: true)
63+
```
64+
65+
### Per-class (best for tools that are intrinsically heavy)
66+
67+
```ruby
68+
class DeepResearchTool < RubyLLM::Tool
69+
description "Runs a multi-step web research task."
70+
deferred
71+
param :query, desc: "Research question"
72+
def execute(query:)
73+
# ...
74+
end
75+
end
76+
77+
chat.with_tool(DeepResearchTool)
78+
```
79+
80+
The per-call value wins when both are specified, so you can override:
81+
82+
```ruby
83+
chat.with_tool(DeepResearchTool, defer: false) # force it onto the active list
84+
```
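
The precedence rule above can be sketched in isolation. The helper below follows the shape of `defer_requested?` in this commit's `lib/ruby_llm/chat.rb`, but `resolve_defer` and its keyword arguments are hypothetical names used only for illustration:

```ruby
# Illustrative only: resolve_defer is a hypothetical helper, not part of RubyLLM.
# explicit      - the defer: value passed to with_tool / with_tools (nil when omitted)
# class_default - whether the tool class declared `deferred`
# enabled       - the tool_search_enabled kill switch
def resolve_defer(explicit:, class_default:, enabled: true)
  # An omitted defer: falls back to the class-level setting.
  requested = explicit.nil? ? class_default : explicit == true
  requested && enabled
end

resolve_defer(explicit: nil, class_default: true)                   # => true
resolve_defer(explicit: false, class_default: true)                 # => false
resolve_defer(explicit: true, class_default: false, enabled: false) # => false
```

Only a literal `true` counts as an explicit request, so any other non-nil value behaves like `defer: false`.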
85+
86+
## How the model discovers deferred tools
87+
88+
When the catalog contains at least one deferred tool, `RubyLLM::Chat` automatically adds a built-in `search_tools` function to the active tool list. The model sees a compact description like:
89+
90+
> **search_tools** — Search and load deferred tools by keyword, or load specific tools by name using `select:Name1,Name2`.
91+
92+
The typical flow:
93+
94+
```
95+
user : "Delete the stale S3 bucket"
96+
assistant : tool_call(search_tools, query: "delete s3 bucket")
97+
tool : { loaded: [:s3_delete_bucket], descriptions: {...} }
98+
assistant : tool_call(s3_delete_bucket, bucket: "stale-bucket-2023")
99+
tool : "deleted"
100+
assistant : "Done — stale-bucket-2023 has been deleted."
101+
```
102+
103+
Once a tool is promoted by `search_tools`, it stays active for the rest of the conversation.
104+
105+
## The `select:` shortcut
106+
107+
For deterministic loading — "when event X happens, load tools Y and Z" — prefix the query with `select:` and list the names exactly. No ranking, no LLM reasoning required:
108+
109+
```ruby
110+
chat.tools[:search_tools].execute(query: "select:s3_delete_bucket,s3_list_buckets")
111+
```
112+
113+
You can also drive `select:` entirely from application code when you know which tools a task needs, skipping the discovery turn.
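
As a rough sketch of how a `select:` query could be told apart from a keyword query, consider the helper below. `parse_select` is a hypothetical function written for this guide; the actual `search_tools` implementation is not shown here and may parse the prefix differently:

```ruby
# Hypothetical parser for the select: prefix; not the shipped implementation.
# Returns an array of tool-name symbols for a select: query, or nil when the
# query should go through ranked search instead.
def parse_select(query)
  prefix = 'select:'
  return nil unless query.start_with?(prefix)

  query.delete_prefix(prefix)
       .split(',')
       .map(&:strip)
       .reject(&:empty?)
       .map(&:to_sym)
end

parse_select('select:s3_delete_bucket, s3_list_buckets') # => [:s3_delete_bucket, :s3_list_buckets]
parse_select('delete s3 bucket')                         # => nil
```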
114+
115+
## Anthropic: native server-side search
116+
117+
On Anthropic models, ruby_llm forwards `defer_loading: true` to the API on every deferred tool and appends the native `tool_search_tool_bm25_20251119` server-side search primitive. The practical win: deferred tool schemas never enter the cached system-prompt prefix, so prompt caching stays intact across turns.
118+
119+
## Other providers: client-side emulation
120+
121+
On OpenAI, Gemini, Bedrock, and providers that inherit from them (Azure, DeepSeek, Mistral, OpenRouter, Perplexity, xAI, Ollama, GPUStack, Vertex AI), ruby_llm emits deferred tools as name-plus-description stubs with an empty parameters schema. The model cannot invoke a stub directly — it must call `search_tools` first to load the full schema, at which point the tool becomes callable normally.
122+
123+
The default client-side ranker is a pure-Ruby BM25 implementation over `"#{name} #{description}"`. No external gem, no embedding infrastructure.
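
For readers unfamiliar with BM25, here is a minimal sketch of the scoring idea over `"#{name} #{description}"`. It is not the ranker shipped in this release: the method name `bm25_rank`, the tokenizer, and the `k1` / `b` defaults are assumptions chosen for illustration.

```ruby
# A minimal BM25 sketch, not the ranker shipped with ruby_llm.
# docs maps a tool name (Symbol) to its description (String).
def bm25_rank(query, docs, max: 5, k1: 1.5, b: 0.75)
  tokenize = ->(text) { text.downcase.scan(/[a-z0-9]+/) }

  # Index "#{name} #{description}" for every tool.
  tokens = docs.to_h { |name, desc| [name, tokenize.call("#{name} #{desc}")] }
  return [] if tokens.empty?

  avg_len = tokens.values.sum(&:size) / tokens.size.to_f
  terms = tokenize.call(query).uniq

  scores = tokens.transform_values do |doc|
    terms.sum(0.0) do |term|
      tf = doc.count(term) # term frequency in this document
      next 0.0 if tf.zero?

      df = tokens.values.count { |other| other.include?(term) } # document frequency
      idf = Math.log(1 + (tokens.size - df + 0.5) / (df + 0.5))
      idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc.size / avg_len))
    end
  end

  # Drop non-matching tools, then return the best names first.
  scores.select { |_, score| score.positive? }
        .sort_by { |_, score| -score }
        .first(max)
        .map(&:first)
end

docs = {
  s3_delete_bucket: 'Delete an S3 bucket',
  s3_list_buckets: 'List all S3 buckets',
  send_email: 'Send an email message'
}
bm25_rank('delete s3 bucket', docs) # => [:s3_delete_bucket, :s3_list_buckets]
```

The tokenizer does no stemming, so `bucket` does not match `buckets`; a production ranker may normalize terms differently.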
124+
125+
## Custom search
126+
127+
If BM25 over name + description is not enough — for example, you want to rank by your own tool metadata or use embeddings — supply your own ranker:
128+
129+
```ruby
130+
# Per-chat
131+
chat.with_tool_search do |query, candidates, max:|
132+
MyEmbeddingIndex.rank(candidates, query, k: max) # returns Array<Symbol>
133+
end
134+
135+
# Or, globally, once per process
136+
RubyLLM.configure do |c|
137+
c.tool_search_function = lambda do |query, candidates, max:|
138+
MyEmbeddingIndex.rank(candidates, query, k: max)
139+
end
140+
end
141+
```
142+
143+
The block receives the query string, the hash of candidate tools (keyed by name symbol), and `max:`. It must return an array of tool-name symbols ordered by relevance.
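
To make the contract concrete, the sketch below is a simple word-overlap ranker that takes the same three inputs and returns an array of symbols. `fake_tool` is a hypothetical stand-in for real tool instances, and the sketch assumes candidates respond to `#description`:

```ruby
# fake_tool is a hypothetical stand-in for RubyLLM::Tool instances;
# only #description is assumed here.
fake_tool = Struct.new(:description)

# Ranks candidates by how many query words appear in the name or description.
overlap_ranker = lambda do |query, candidates, max:|
  words = query.downcase.split
  scored = candidates.map do |name, tool|
    haystack = "#{name} #{tool.description}".downcase
    [name, words.count { |word| haystack.include?(word) }]
  end

  scored.reject { |_, hits| hits.zero? }
        .sort_by { |_, hits| -hits }
        .first(max)
        .map(&:first)
end

candidates = {
  s3_delete_bucket: fake_tool.new('Delete an S3 bucket'),
  send_email: fake_tool.new('Send an email message')
}
overlap_ranker.call('delete bucket', candidates, max: 3) # => [:s3_delete_bucket]
```

A lambda of this shape could be assigned to `tool_search_function`, or its body used in a `with_tool_search` block.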
144+
145+
## Kill switch
146+
147+
Tool search is on by default. To disable it globally, set `tool_search_enabled` to `false`. With the switch off, `defer: true` and the `deferred` DSL become no-ops, and a warning is logged once per chat:
148+
149+
```ruby
150+
RubyLLM.configure { |c| c.tool_search_enabled = false }
151+
```
152+
153+
## Observing activation
154+
155+
```ruby
156+
chat.on_tool_search do |query:, results:|
157+
Rails.logger.info("tool_search: #{query} -> #{results.join(', ')}")
158+
end
159+
160+
chat.tool_catalog.loaded_tools # => #<Set: {:s3_delete_bucket}>
161+
chat.tool_catalog.deferred_tools.keys # all tools still hidden
162+
```
163+
164+
## Configuration reference
165+
166+
| Location | Setting | Default | Purpose |
167+
|---|---|---|---|
168+
| `RubyLLM.config` | `tool_search_enabled` | `true` | Global on/off kill switch |
169+
| `RubyLLM.config` | `tool_search_function` | `nil` (uses BM25) | Global default ranker |
170+
| `RubyLLM::Tool` | `deferred` (class DSL) | not set | Marks every instance deferred by default |
171+
| `RubyLLM::Chat#with_tool` | `defer:` kwarg | `nil` (inherits class) | Per-registration override |
172+
| `RubyLLM::Chat#with_tools` | `defer:` kwarg | `nil` | Bulk per-registration override |
173+
| `RubyLLM::Chat#with_tool_search` | block | n/a | Per-chat ranker |
174+
| `RubyLLM::Chat#on_tool_search` | block | n/a | Activation callback |
175+
| `RubyLLM::Chat#tool_catalog` | reader | n/a | Inspect deferred / loaded sets |
176+
177+
## Further reading
178+
179+
* [Tools guide]({% link _core_features/tools.md %})
180+
* [Agents guide]({% link _core_features/agents.md %})
181+
* [Anthropic tool search tool](https://platform.claude.com/docs/en/agents-and-tools/tool-use/tool-search-tool)

docs/_core_features/tools.md

Lines changed: 2 additions & 0 deletions
@@ -509,6 +509,8 @@ end
509509

510510
For MCP server integration, check out the community-maintained [`ruby_llm-mcp`](https://github.com/patvice/ruby_llm-mcp) gem.
511511

512+
When a chat is wired to many tools — especially across MCP servers — see [Tool Search]({% link _core_features/tool-search.md %}) for how to defer tool schemas and let the model load only the ones it needs.
513+
512514
## Debugging Tools
513515

514516
Set the `RUBYLLM_DEBUG` environment variable to see detailed logging, including tool calls and results.

lib/ruby_llm.rb

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@
1919
'UI' => 'UI',
2020
'api' => 'API',
2121
'bedrock' => 'Bedrock',
22+
'bm25' => 'BM25',
2223
'deepseek' => 'DeepSeek',
2324
'gpustack' => 'GPUStack',
2425
'llm' => 'LLM',

lib/ruby_llm/chat.rb

Lines changed: 75 additions & 11 deletions
@@ -5,7 +5,7 @@ module RubyLLM
55
class Chat
66
include Enumerable
77

8-
attr_reader :model, :messages, :tools, :tool_prefs, :params, :headers, :schema
8+
attr_reader :model, :messages, :tools, :tool_prefs, :params, :headers, :schema, :tool_catalog
99

1010
def initialize(model: nil, provider: nil, assume_model_exists: false, context: nil)
1111
if assume_model_exists && !provider
@@ -19,6 +19,7 @@ def initialize(model: nil, provider: nil, assume_model_exists: false, context: n
1919
@temperature = nil
2020
@messages = []
2121
@tools = {}
22+
@tool_catalog = ToolCatalog.new
2223
@tool_prefs = { choice: nil, calls: nil }
2324
@params = {}
2425
@headers = {}
@@ -28,7 +29,8 @@ def initialize(model: nil, provider: nil, assume_model_exists: false, context: n
2829
new_message: nil,
2930
end_message: nil,
3031
tool_call: nil,
31-
tool_result: nil
32+
tool_result: nil,
33+
tool_search: nil
3234
}
3335
end
3436

@@ -51,22 +53,27 @@ def with_instructions(instructions, append: false, replace: nil)
5153
self
5254
end
5355

54-
def with_tool(tool, choice: nil, calls: nil)
55-
unless tool.nil?
56-
tool_instance = tool.is_a?(Class) ? tool.new : tool
57-
@tools[tool_instance.name.to_sym] = tool_instance
58-
end
56+
def with_tool(tool, defer: nil, choice: nil, calls: nil)
57+
register_tool(tool, defer: defer) unless tool.nil?
5958
update_tool_options(choice:, calls:)
6059
self
6160
end
6261

63-
def with_tools(*tools, replace: false, choice: nil, calls: nil)
64-
@tools.clear if replace
65-
tools.compact.each { |tool| with_tool tool }
62+
def with_tools(*tools, replace: false, defer: nil, choice: nil, calls: nil)
63+
if replace
64+
@tools.clear
65+
@tool_catalog = ToolCatalog.new
66+
end
67+
tools.compact.each { |tool| with_tool tool, defer: defer }
6668
update_tool_options(choice:, calls:)
6769
self
6870
end
6971

72+
def with_tool_search(&block)
73+
@tool_catalog.search_function = block
74+
self
75+
end
76+
7077
def with_model(model_id, provider: nil, assume_exists: false)
7178
@model, @provider = Models.resolve(model_id, provider:, assume_exists:, config: @config)
7279
@connection = @provider.connection
@@ -132,14 +139,19 @@ def on_tool_result(&block)
132139
self
133140
end
134141

142+
def on_tool_search(&block)
143+
@on[:tool_search] = block
144+
self
145+
end
146+
135147
def each(&)
136148
messages.each(&)
137149
end
138150

139151
def complete(&) # rubocop:disable Metrics/PerceivedComplexity
140152
response = @provider.complete(
141153
messages,
142-
tools: @tools,
154+
tools: effective_tools,
143155
tool_prefs: @tool_prefs,
144156
temperature: @temperature,
145157
model: @model,
@@ -176,6 +188,15 @@ def add_message(message_or_attributes)
176188
message
177189
end
178190

191+
def register_loaded_tool(tool)
192+
@tools[tool.name.to_sym] = tool
193+
self
194+
end
195+
196+
def emit_tool_search(query:, results:)
197+
@on[:tool_search]&.call(Tool::SearchEvent.new(query, results))
198+
end
199+
179200
def reset_messages!
180201
@messages.clear
181202
end
@@ -329,6 +350,49 @@ def content_like?(object)
329350
object.is_a?(Content) || object.is_a?(Content::Raw)
330351
end
331352

353+
def effective_tools
354+
active = @tools.transform_values { |t| Tool::Registration.new(t, deferred: false) }
355+
return active if @tool_catalog.empty?
356+
357+
deferred = @tool_catalog.available.transform_values { |t| Tool::Registration.new(t, deferred: true) }
358+
deferred.merge(active)
359+
end
360+
361+
def register_tool(tool, defer:)
362+
tool_instance = tool.is_a?(Class) ? tool.new : tool
363+
364+
if defer_requested?(tool_instance, defer)
365+
@tool_catalog.add(tool_instance)
366+
ensure_search_tool!
367+
else
368+
@tools[tool_instance.name.to_sym] = tool_instance
369+
end
370+
end
371+
372+
def defer_requested?(tool, explicit)
373+
requested = explicit.nil? ? tool.deferred? : explicit == true
374+
return false unless requested
375+
return true if @config.tool_search_enabled
376+
377+
warn_tool_search_disabled_once
378+
false
379+
end
380+
381+
def ensure_search_tool!
382+
return if @tools.key?(:search_tools)
383+
384+
@tools[:search_tools] = Tools::SearchTools.new(self)
385+
end
386+
387+
def warn_tool_search_disabled_once
388+
return if @tool_search_warning_emitted
389+
390+
RubyLLM.logger.warn(
391+
'tool_search_enabled is false; ignoring defer: true / deferred DSL for the rest of this chat'
392+
)
393+
@tool_search_warning_emitted = true
394+
end
395+
332396
def append_system_instruction(instructions)
333397
system_messages, non_system_messages = @messages.partition { |msg| msg.role == :system }
334398
system_messages << Message.new(role: :system, content: instructions)

lib/ruby_llm/configuration.rb

Lines changed: 3 additions & 0 deletions
@@ -56,6 +56,9 @@ def defaults = @defaults ||= {}
5656
option :log_stream_debug, -> { ENV['RUBYLLM_STREAM_DEBUG'] == 'true' }
5757
option :log_regexp_timeout, -> { Regexp.respond_to?(:timeout) ? (Regexp.timeout || 1.0) : nil }
5858

59+
option :tool_search_enabled, true
60+
option :tool_search_function, nil
61+
5962
def initialize
6063
self.class.send(:defaults).each do |key, default|
6164
value = default.respond_to?(:call) ? instance_exec(&default) : default

lib/ruby_llm/error.rb

Lines changed: 1 addition & 0 deletions
@@ -22,6 +22,7 @@ class ConfigurationError < StandardError; end
2222
class PromptNotFoundError < StandardError; end
2323
class InvalidRoleError < StandardError; end
2424
class InvalidToolChoiceError < StandardError; end
25+
class InvalidToolConfigurationError < StandardError; end
2526
class ModelNotFoundError < StandardError; end
2627
class UnsupportedAttachmentError < StandardError; end
2728

lib/ruby_llm/providers/anthropic/chat.rb

Lines changed: 1 addition & 1 deletion
@@ -65,7 +65,7 @@ def build_base_payload(chat_messages, model, stream, thinking)
6565

6666
def add_optional_fields(payload, system_content:, tools:, tool_prefs:, temperature:, schema: nil) # rubocop:disable Metrics/ParameterLists
6767
if tools.any?
68-
payload[:tools] = tools.values.map { |t| Tools.function_for(t) }
68+
payload[:tools] = Tools.format_tools(tools)
6969
unless tool_prefs[:choice].nil? && tool_prefs[:calls].nil?
7070
payload[:tool_choice] = Tools.build_tool_choice(tool_prefs)
7171
end
