
Commit fd38dbe

Merge remote-tracking branch 'upstream/main' into add-bedrock-embedding-support
# Conflicts:
#   lib/ruby_llm/providers/bedrock.rb
2 parents cf9b8f9 + 80fe294 commit fd38dbe

802 files changed

Lines changed: 84144 additions & 56912 deletions


.rubocop.yml

Lines changed: 9 additions & 5 deletions
@@ -14,15 +14,19 @@ AllCops:
   SuggestExtensions: false

 Metrics/ClassLength:
-  Enabled: false
+  Max: 550
 Metrics/AbcSize:
-  Enabled: false
+  Max: 80
 Metrics/CyclomaticComplexity:
-  Enabled: false
+  Max: 25
 Metrics/MethodLength:
-  Enabled: false
+  Max: 80
+  CountAsOne:
+    - array
+    - hash
+    - heredoc
 Metrics/ModuleLength:
-  Enabled: false
+  Max: 550
 Performance/CollectionLiteralInLoop:
   Exclude:
     - spec/**/*

README.md

Lines changed: 7 additions & 4 deletions
@@ -5,9 +5,10 @@
 <img src="/docs/assets/images/logotype.svg" alt="RubyLLM" height="120" width="250">
 </picture>

-<strong>One *beautiful* Ruby API for GPT, Claude, Gemini, and more.</strong>
+<strong>One <em>delightful</em> Ruby framework for every major AI provider. Build AI agents, chatbots, RAG apps, and multimodal workflows in beautiful, expressive code.
+</strong>

-Battle tested at [<picture><source media="(prefers-color-scheme: dark)" srcset="https://chatwithwork.com/logotype-dark.svg"><img src="https://chatwithwork.com/logotype.svg" alt="Chat with Work" height="30" align="absmiddle"></picture>](https://chatwithwork.com) *Your AI coworker*
+Battle tested at [<picture><source media="(prefers-color-scheme: dark)" srcset="https://chatwithwork.com/logotype-dark.svg"><img src="https://chatwithwork.com/logotype.svg" alt="Chat with Work" height="30" align="absmiddle"></picture>](https://chatwithwork.com) - *Fully private work AI*

 [![Gem Version](https://badge.fury.io/rb/ruby_llm.svg)](https://badge.fury.io/rb/ruby_llm)
 [![Ruby Style Guide](https://img.shields.io/badge/code_style-rubocop-brightgreen.svg)](https://github.com/rubocop/rubocop)
@@ -24,15 +25,15 @@ Battle tested at [<picture><source media="(prefers-color-scheme: dark)" srcset="

 Build chatbots, AI agents, RAG applications. Works with OpenAI, xAI, Anthropic, Google, AWS, local models, and any OpenAI-compatible API.

-## From zero to AI chat app in under two minutes
+## Build a working Ruby AI chat in two minutes

 https://github.com/user-attachments/assets/65422091-9338-47da-a303-92b918bd1345

 ## Why RubyLLM?

 Every AI provider ships their own bloated client. Different APIs. Different response formats. Different conventions. It's exhausting.

-RubyLLM gives you one beautiful API for all of them. Same interface whether you're using GPT, Claude, or your local Ollama. Just three dependencies: Faraday, Zeitwerk, and Marcel. That's it.
+RubyLLM gives you one beautiful framework for all of them. Same interface whether you're using GPT, Claude, or your local Ollama. Just three dependencies: Faraday, Zeitwerk, and Marcel. That's it.

 ## Show me the code

@@ -138,6 +139,8 @@ response = chat.with_schema(ProductSchema).ask "Analyze this product", with: "pr
 * **Async:** Fiber-based concurrency
 * **Model registry:** 800+ models with capability detection and pricing
 * **Extended thinking:** Control, view, and persist model deliberation
+* **Citations:** Normalized source citations from documents, search, and grounding
+* **Batches:** Provider-side batch processing at half price with `RubyLLM.batch`
 * **Providers:** OpenAI, xAI, Anthropic, Gemini, VertexAI, Bedrock, DeepSeek, Mistral, Ollama, OpenRouter, Perplexity, GPUStack, and any OpenAI-compatible API

 ## Installation

bin/test

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+#!/usr/bin/env bash
+set -euo pipefail
+IFS=$'\n\t'
+export RUBYOPT="${RUBYOPT:-} -w"
+
+if [ "$#" -eq 0 ]; then
+  exec bundle exec bin/rspec-queue
+else
+  exec bundle exec rspec "$@"
+fi

docs/_advanced/agentic-workflows.md

Lines changed: 57 additions & 1 deletion
@@ -1,7 +1,7 @@
 ---
 layout: default
 title: Agentic Workflows
-nav_order: 5
+nav_order: 6
 description: Build workflow-oriented AI systems with plain Ruby orchestration, from routing and parallelization to RAG
 ---

@@ -108,6 +108,62 @@ workflow = ModelRouterWorkflow.new
 response = workflow.call("Write a Ruby function to parse JSON")
 ```

+### Handoff Workflow
+
+Routing picks an agent up front. A handoff goes further: it hands the *same ongoing conversation* to a different agent, so the specialist sees the full history and the user keeps one continuous thread. `Agent.find(chat_id)` is what makes this work. It loads a persisted chat and applies a different agent's instructions and tools at runtime, leaving the messages intact, so whichever agent takes over inherits the whole conversation.
+
+The simplest version is a single turn. Give a router an inline schema so it answers with the specialist to hand to, then let that specialist continue the persisted chat:
+
+```ruby
+class SupportRouter < RubyLLM::Agent
+  schema do
+    string :specialist, enum: %w[billing technical]
+  end
+  instructions "Pick the specialist best suited to the latest message."
+end
+
+SPECIALISTS = { "billing" => BillingAgent, "technical" => TechnicalAgent }
+
+def handle(chat_id, message)
+  specialist = SupportRouter.new.ask(message).content["specialist"]
+  SPECIALISTS.fetch(specialist).find(chat_id).ask(message)
+end
+```
+
+The router responds immediately with its choice (no tool round-trip), and the specialist takes over the same chat with full history. Outside Rails, pass the chat object with `SpecialistAgent.new(chat: chat)` instead of `find(chat_id)`.
+
+If you would rather the agent decide mid-conversation, give it a single handoff tool that returns the specialist to switch to. Drive the [loop]({% link _core_features/chat.md %}#driving-the-loop-yourself) a step at a time, and when a tool result names an agent, hand off:
+
+```ruby
+class Handoff < RubyLLM::Tool
+  description "Hand the conversation to the specialist who should answer next"
+  param :specialist, desc: "billing or technical"
+  def execute(specialist:) = specialist
+end
+
+class SupportRouter < RubyLLM::Agent
+  chat_model Chat
+  instructions "Call handoff with the specialist who should take over."
+  tools Handoff
+end
+
+AGENTS = { "billing" => BillingAgent, "technical" => TechnicalAgent }
+
+agent = SupportRouter.find(chat_id)
+agent.ask_later(message)
+
+until agent.complete?
+  agent.step
+  last = agent.messages.reload.last
+  specialist = AGENTS[last.content] if last.role == "tool"
+  agent = specialist.find(chat_id) if specialist
+end
+```
+
+The handoff tool just returns the name; the orchestrator owns the routing, watching each tool result and swapping agents when one names a specialist. Because the routing lives in the loop and not in any agent, this extends to a multi-router for free: give the specialists the same `Handoff` tool and they can route onward, with every hop handled the same way.
+
+(A tool cannot reconfigure the chat it runs inside, since its `execute` never receives the chat, so the switch belongs in the loop, not in the tool.)
+
 ### Parallel Workflow

 Use this pattern when independent analyses can run at the same time.
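
The orchestrator loop in the Handoff Workflow hunk can be traced without calling a model. The sketch below is a plain-Ruby simulation under stated assumptions: `Message`, `StubAgent`, and the scripted steps are hypothetical stand-ins rather than RubyLLM classes, and only the loop shape (step, inspect the last message, swap agents when a tool result names a specialist) follows the diff.

```ruby
# Hypothetical stand-ins for illustration only; no LLM or RubyLLM code runs here.
Message = Struct.new(:role, :content)

class StubAgent
  attr_reader :name

  # history: the shared conversation (plays the role of the persisted chat)
  # script:  the messages this agent will append, one per step
  def initialize(name, history, script)
    @name = name
    @history = history
    @script = script.dup
  end

  def complete?
    @script.empty?
  end

  def step
    @history << @script.shift
  end
end

history = [Message.new('user', 'I was charged twice')]

router  = StubAgent.new('router', history, [Message.new('tool', 'billing')])
billing = StubAgent.new('billing', history, [Message.new('assistant', 'Refund issued.')])
agents  = { 'billing' => billing }

agent = router
hops = [agent.name]

until agent.complete?
  agent.step
  last = history.last
  # Only a tool result that names a known agent triggers a handoff.
  next_agent = last.role == 'tool' ? agents[last.content] : nil
  if next_agent
    agent = next_agent
    hops << agent.name
  end
end
```

Because both stub agents append to the same `history` array, the specialist sees everything the router saw, which is the property the diff attributes to `find(chat_id)`. Resetting `next_agent` on every pass keeps a stale handoff from being re-applied on later steps.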

docs/_advanced/instrumentation.md

Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
+---
+layout: default
+title: Instrumentation
+nav_order: 5
+description: Observe RubyLLM requests, chats, tool calls, embeddings, and model refreshes
+redirect_from:
+  - /guides/instrumentation
+---
+
+# {{ page.title }}
+{: .no_toc .d-inline-block }
+
+v1.16.0+
+{: .label .label-green }
+
+{{ page.description }}.
+{: .fs-6 .fw-300 }
+
+## Table of contents
+{: .no_toc .text-delta }
+
+1. TOC
+{:toc}
+
+---
+
+After reading this guide, you will know:
+
+* How to subscribe to RubyLLM events in Rails.
+* How to connect RubyLLM instrumentation outside Rails.
+* Which events RubyLLM emits.
+* Which payload fields may contain sensitive application data.
+
+## Rails
+
+Rails apps automatically emit RubyLLM events through `ActiveSupport::Notifications`. Subscribe to them the same way you would subscribe to Rails framework events:
+
+```ruby
+# config/initializers/ruby_llm_instrumentation.rb
+ActiveSupport::Notifications.subscribe('chat.ruby_llm') do |_name, _start, _finish, _id, payload|
+  Rails.logger.info(
+    provider: payload[:provider],
+    model: payload[:model],
+    input_tokens: payload[:input_tokens],
+    output_tokens: payload[:output_tokens]
+  )
+end
+```
+
+When an instrumented block raises, Rails adds the standard `:exception` and `:exception_object` payload keys.
+
+## Outside Rails
+
+Outside Rails, set `config.instrumenter` to any object that responds to `instrument(name, payload) { ... }`:
+
+```ruby
+class AppInstrumenter
+  def instrument(name, payload)
+    started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
+
+    result = yield if block_given?
+    result
+  rescue StandardError => error
+    payload = payload.merge(
+      exception: [error.class.name, error.message],
+      exception_object: error
+    )
+    raise
+  ensure
+    duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at
+    Observability.record(name, payload.merge(duration: duration))
+  end
+end
+
+RubyLLM.configure do |config|
+  config.instrumenter = AppInstrumenter.new
+end
+```
+
+You can also set `instrumenter` on a [context]({% link _getting_started/configuration.md %}#contexts-isolated-configurations) when you only want instrumentation around a specific operation.
+
+## Events
+
+RubyLLM emits these events:
+
+* `request.ruby_llm` - HTTP request metadata such as provider, method, URL, and status
+* `chat.ruby_llm` - chat completion metadata including model, provider, messages, response, and token usage
+* `tool_call.ruby_llm` - tool name, arguments, and result
+* `embedding.ruby_llm` - embedding model, input, result, token usage, and vector dimensions
+* `image.ruby_llm` - image generation model, prompt, size, and result
+* `moderation.ruby_llm` - moderation model, input, result, and flagged status
+* `transcription.ruby_llm` - transcription model, language, result, and token usage
+* `models.refresh.ruby_llm` - model registry refresh metadata
+
+## Payloads
+
+Payloads include the Ruby objects needed by observability adapters, but message content, tool arguments, and provider responses may be sensitive. Only export or log those fields when your application policy allows it.
+
+Non-Rails instrumenters control their own error payload behavior. If your instrumenter records exceptions, keep those payloads consistent with the rest of your observability stack.
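
The `instrument(name, payload) { ... }` contract described under "Outside Rails" can be exercised on its own before wiring it into `config.instrumenter`. The following is a minimal sketch, assuming an in-memory sink: `RecordingInstrumenter` and its `events` array are hypothetical names that replace the guide's `Observability.record` call, and the payload keys passed in are placeholders rather than the fields RubyLLM actually sends.

```ruby
# Hypothetical in-memory instrumenter; not part of RubyLLM.
class RecordingInstrumenter
  attr_reader :events

  def initialize
    @events = []
  end

  # Same call shape the guide describes: instrument(name, payload) { ... }
  def instrument(name, payload = {})
    started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield if block_given?
  rescue StandardError => e
    # Mirror the Rails convention for blocks that raise.
    payload = payload.merge(exception: [e.class.name, e.message], exception_object: e)
    raise
  ensure
    duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at
    @events << { name: name, payload: payload, duration: duration }
  end
end

instrumenter = RecordingInstrumenter.new

# A successful block: its return value passes through unchanged.
result = instrumenter.instrument('chat.ruby_llm', model: 'example-model') { :ok }

# A failing block: the error is recorded, then re-raised to the caller.
begin
  instrumenter.instrument('tool_call.ruby_llm', tool: 'example_tool') do
    raise ArgumentError, 'bad input'
  end
rescue ArgumentError
  # Expected: the instrumenter must not swallow the exception.
end
```

Recording in `ensure` means one event is stored per call whether or not the block raised, and re-raising in `rescue` keeps the instrumenter transparent to the calling code.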

docs/_advanced/models.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 layout: default
 title: Model Registry
 nav_order: 4
-description: Access hundreds of AI models from all major providers with one simple API
+description: Access hundreds of AI models from all major AI providers with one Ruby framework
 redirect_from:
   - /guides/models
 ---

docs/_advanced/rails.md

Lines changed: 12 additions & 14 deletions
@@ -285,14 +285,19 @@ RubyLLM.configure do |config|
   config.anthropic_api_key = ENV['ANTHROPIC_API_KEY']
   config.gemini_api_key = ENV['GEMINI_API_KEY']

-  # New apps: Use modern API (generator adds this)
-  config.use_new_acts_as = true
-
   # For custom Model class names (defaults to 'Model')
   # config.model_registry_class = 'AIModel'
 end
 ```

+### Instrumentation
+{: .d-inline-block }
+
+v1.16.0+
+{: .label .label-green }
+
+Rails apps automatically emit RubyLLM events through `ActiveSupport::Notifications`. See [Instrumentation]({% link _advanced/instrumentation.md %}) for events, payloads, and non-Rails instrumenters.
+
 ### Fiber-Safe ActiveRecord Connections for Async/Fiber Workloads
 {: .d-inline-block }

@@ -313,13 +318,6 @@ Why: Rails defaults to thread-based connection isolation. In fiber-heavy flows,

 ### Setting Up Models with `acts_as` Helpers

-> **New in v1.7.0:** Rails-like `acts_as` API with association names!
-> - **New apps**: Generator sets `config.use_new_acts_as = true` for modern API
-> - **Existing apps**: Continue using legacy API (with deprecation warning)
-> - **Migrate today**: Set `config.use_new_acts_as = true` to use the better API
-> - **Legacy API removed in 2.0**: The new API will become the only option
-{: .warning }
-
 Add RubyLLM capabilities to your models:

 #### With Model Registry (Default for new apps)
@@ -372,7 +370,7 @@ end
 Pre-1.7.0 or opt-in
 {: .label .label-yellow }

-> Default behavior for existing apps. Set `config.use_new_acts_as = true` to upgrade! Legacy API will be removed in 2.0.
+> Set `config.use_new_acts_as = false` to stay with this API until it will be removed in 2.0.
 {: .note }

 ```ruby
@@ -573,9 +571,9 @@ When using the Model registry (created by default by the generator), your chats
 ```ruby
 # String automatically resolves to Model record
 chat = Chat.create!(model: '{{ site.models.openai_standard }}')
-chat.model # => #<Model model_id: "gpt-4o", provider: "openai">
-chat.model.name # => "GPT-4"
-chat.model.context_window # => 128000
+chat.model # => #<Model model_id: "{{ site.models.openai_standard }}", provider: "openai">
+chat.model.name # => "GPT-5.4"
+chat.model.context_window # => 1050000
 chat.model.supports_vision # => true

 # Populate/refresh models from models.json (v1.13+)

docs/_advanced/upgrading.md

Lines changed: 60 additions & 1 deletion
@@ -1,7 +1,7 @@
 ---
 layout: default
 title: Upgrading
-nav_order: 6
+nav_order: 7
 description: Upgrade guides for changes in data formats
 redirect_from:
   - /upgrading-to-1-7
@@ -23,6 +23,57 @@ redirect_from:
 This guide focuses on upgrade-impacting changes: migrations, token semantics, deprecations, and compatibility notes. It is not a complete changelog. For every feature, fix, and patch note, see the [GitHub releases](https://github.com/crmne/ruby_llm/releases).
 {: .note }

+---
+# Upgrade to 2.0
+
+2.0 is currently in development
+{: .note }
+
+## How to Upgrade
+
+```bash
+# Run the upgrade generator
+bin/rails generate ruby_llm:upgrade_to_v2_0
+
+# Run migrations
+bin/rails db:migrate
+```
+
+The generator adds a JSON `citations` column to your messages table so [citations]({% link _core_features/citations.md %}) are persisted with each assistant message. The column is optional — without it, citations remain available on in-memory responses but aren't saved.
+
+## Providers and Protocols Split
+
+RubyLLM 2.0 separates providers (host, auth, catalog) from protocols (wire format). The public chat API is unchanged — `RubyLLM.chat`, `with_params`, `embed`, `paint`, and the Rails integration all work as before. Two things changed underneath:
+
+**OpenAI now defaults to the Responses API.** This unlocks reasoning models with tools and extended thinking together. To stay on Chat Completions:
+
+```ruby
+RubyLLM.configure do |config|
+  config.openai_protocol = :chat_completions
+end
+
+# or per chat
+RubyLLM.chat(model: 'gpt-5.4').with_protocol(:chat_completions)
+```
+
+If you pass Chat Completions–only params via `with_params` (like `response_format`), either switch those chats to `:chat_completions` or use the Responses API equivalents (`text: { format: ... }`).
+
+**Wire-format internals moved to `RubyLLM::Protocols`.** `RubyLLM::Providers::OpenAI::Chat` and sibling modules are now `RubyLLM::Protocols::ChatCompletions::Chat` and friends; Anthropic, Gemini, and Bedrock Converse internals moved the same way. Provider classes no longer inherit from each other (`Mistral < OpenAI` is gone) — a provider declares its protocols instead.
+
+If you maintain a provider gem, subclass a protocol for your dialect and declare it in a thin provider:
+
+```ruby
+class MyProvider < RubyLLM::Provider
+  class ChatCompletions < RubyLLM::Protocols::ChatCompletions
+    # your overrides
+  end
+
+  protocol :chat_completions, ChatCompletions
+end
+```
+
+For routing details and per-chat overrides, see [Choosing the Wire Protocol]({% link _core_features/chat.md %}#choosing-the-wire-protocol).
+
 ---
 # Upgrade to 1.15

@@ -189,6 +240,14 @@ Your existing 1.6 app continues working without any changes. You'll see a deprec
 !!! RubyLLM's legacy acts_as API is deprecated and will be removed in RubyLLM 2.0.0.
 ```

+You can silence or raise RubyLLM deprecations while upgrading:
+
+```ruby
+RubyLLM.configure do |config|
+  config.deprecation_behavior = :silence # or :raise
+end
+```
+
 ## What's New in 1.7

 Among other features, the DB-backed model registry replaces simple string fields with proper ActiveRecord associations. Additionally, the `acts_as` helpers have been redesigned with a more Rails-like API.
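
The `deprecation_behavior` setting added in the last hunk takes `:silence` or `:raise`. As a rough illustration of what those modes imply, here is a plain-Ruby sketch. It is not RubyLLM's implementation: `handle_deprecation` and `DeprecationError` are hypothetical names, and treating any other value as "print a warning" is an assumption based on the `!!!` warning text shown in the guide.

```ruby
require 'stringio'

# Hypothetical names for illustration; RubyLLM's internals may differ.
class DeprecationError < StandardError; end

def handle_deprecation(message, behavior: :warn, io: $stderr)
  case behavior
  when :silence then nil                               # drop the notice entirely
  when :raise   then raise DeprecationError, message   # fail fast, e.g. in CI
  else io.puts("!!! #{message}")                       # assumed default: warn
  end
end

out = StringIO.new
notice = "RubyLLM's legacy acts_as API is deprecated"

handle_deprecation(notice, behavior: :silence, io: out)
silenced_output = out.string.dup

handle_deprecation(notice, io: out)
warned_output = out.string.dup

raised = begin
  handle_deprecation(notice, behavior: :raise, io: out)
  false
rescue DeprecationError
  true
end
```

Raising is the stricter choice during an upgrade because it turns every remaining legacy call into a visible failure, while silencing suits a production process that cannot be migrated yet.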
