Releases: crmne/ruby_llm
1.15.0
RubyLLM 1.15: Image Editing + Cost Tracking + Less Glue Code 🖼️💸🛠️
RubyLLM 1.15 removes glue code around images, costs, tools, callbacks, and Rails persistence.
If Ruby can infer a tool signature, RubyLLM now infers it. If a provider reports usage, RubyLLM can turn it into cost. If Rails already has a blob, RubyLLM reuses it instead of downloading and uploading it again.
🖼️ Image Editing
Same method, same attachment shape: paint now paints from scratch or edits an existing image.
RubyLLM.paint can edit existing images with OpenAI's GPT Image models. Pass one or more source images with with:, add a mask: when you want to constrain the editable area, and use params: for provider-specific image options.
image = RubyLLM.paint(
  "Turn the logo green and keep the background transparent",
  model: "gpt-image-1",
  with: "logo.png"
)
with: accepts the same attachment sources RubyLLM supports elsewhere: local files, URLs, IO-like objects, and Active Storage attachments. Multiple source images work too:
image = RubyLLM.paint(
  "Combine these references into a postcard illustration",
  model: "gpt-image-1",
  with: ["person.png", "style-reference.png"]
)
Image responses now expose provider usage data, and GPT Image pricing is represented in the model registry so image input/output costs can be calculated with the same API shape used by chats and messages:
image.tokens.input
image.tokens.output
image.cost.input
image.cost.output
image.cost.total
💸 Conversation Costs + Normalized Tokens
Token counts answer "how many?" Cost helpers answer the next question: "how much?"
RubyLLM now has first-class cost helpers for token-priced conversation usage:
response = chat.ask("Summarize Ruby's object model.")
response.cost.total
chat.cost.total
agent.cost.total
A response can tell you its cost. A chat can tell you the running total. An agent can too. Images use the same shape.
Under the hood, RubyLLM::Cost uses normalized token buckets plus pricing from the model registry: standard input, billable output, cache reads, cache writes, and separately priced thinking/reasoning tokens when the model exposes a distinct reasoning-token price.
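The bucket-times-price arithmetic described above can be sketched in plain Ruby. This is only an illustration and does not use RubyLLM::Cost: the `Tokens` and `Pricing` structs, the `cost_for` helper, and the per-million prices are all hypothetical, chosen to give round numbers.

```ruby
# Hypothetical stand-ins for illustration; these are not RubyLLM classes.
Tokens  = Struct.new(:input, :output, :cache_read, :cache_write, keyword_init: true)
Pricing = Struct.new(:input, :output, :cache_read, :cache_write, keyword_init: true)

# Made-up prices, expressed in dollars per million tokens.
PRICING = Pricing.new(input: 2.0, output: 8.0, cache_read: 0.5, cache_write: 2.5)

# Multiply each token bucket by its own price and sum the results.
def cost_for(tokens, pricing)
  tokens.members.sum do |bucket|
    tokens[bucket].to_i * pricing[bucket].to_f / 1_000_000
  end
end

tokens = Tokens.new(input: 1_000, output: 500, cache_read: 4_000, cache_write: 0)
total = cost_for(tokens, PRICING)
puts format("$%.4f", total) # prints $0.0080
```

Keeping cache reads in their own bucket matters here: with these example prices, the 4,000 cached tokens cost the same as 1,000 standard input tokens.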
Prompt caching made token counts messy, so 1.15 separates the buckets before exposing them:
response.tokens.input # Standard input tokens
response.tokens.output # Billable output tokens
response.tokens.cache_read # Prompt cache reads
response.tokens.cache_write # Prompt cache writes
The top-level token helpers still work for backwards compatibility, but new code should prefer response.tokens.*.
For Rails users, persisted messages and chats expose the same helpers. No new migration is required if you already ran the v1.9 token migration; the new names use the existing cached_tokens and cache_creation_tokens columns.
🛠️ Simpler Tool Definitions
Simple tools no longer need duplicated parameter declarations. If the execute signature already says what arguments exist, RubyLLM can infer the flat schema:
class Weather < RubyLLM::Tool
  desc "Gets current weather for a location"
  def execute(latitude:, longitude:, units: "metric")
    # ...
  end
end
Required keywords become required string parameters. Optional keywords become optional string parameters. Explicit param declarations and the full params DSL still win when you need descriptions, non-string types, nested objects, arrays, enums, or full JSON Schema control.
There are also small ergonomics improvements:
- desc is now an alias for description
- param accepts description: as an alias for desc:
- the tool generator now emits desc
- tools with no keyword arguments now get an empty object schema
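To make the inference rule concrete, here is a minimal sketch of how a flat schema could be derived from a method signature using Ruby's own reflection (`Method#parameters`). The `infer_schema` helper and the stand-alone `Weather` class are hypothetical; RubyLLM's actual implementation may differ.

```ruby
# Hypothetical tool class; only the method signature matters here.
class Weather
  def execute(latitude:, longitude:, units: "metric")
    "#{latitude},#{longitude} in #{units}"
  end
end

# Build a flat JSON-Schema-like hash from the keyword arguments of #execute.
# Ruby reports required keywords as :keyreq and optional ones as :key.
def infer_schema(tool_class)
  params = tool_class.instance_method(:execute).parameters
  keywords = params.select { |type, _name| %i[keyreq key].include?(type) }

  {
    type: "object",
    properties: keywords.to_h { |_type, name| [name.to_s, { type: "string" }] },
    required: keywords.select { |type, _name| type == :keyreq }.map { |_type, name| name.to_s }
  }
end

schema = infer_schema(Weather)
p schema[:properties].keys # => ["latitude", "longitude", "units"]
p schema[:required]        # => ["latitude", "longitude"]
```

A method with no keyword arguments yields empty `properties` and `required` collections, which corresponds to the empty object schema mentioned in the list above.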
🔁 Additive Chat Callbacks
Callbacks now stack instead of replacing each other. Register five callbacks for the same event and all five run:
chat.before_message { ... }
chat.after_message { |message| ... }
chat.before_tool_call { |tool_call| ... }
chat.after_tool_result { |result| ... }
Unlike the legacy on_* callbacks, multiple before_* / after_* callbacks can be registered for the same event and they all run. The old callbacks still work, but they now log deprecation warnings and keep their existing replacing behavior. They will be removed in RubyLLM 2.0.
Rails persistence now uses the additive callbacks internally, so application callbacks can be layered on top without disturbing message persistence.
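The difference between additive and replacing callbacks can be sketched with a small registry. `CallbackRegistry` and its `add`, `replace`, and `run` methods are hypothetical names for illustration and do not reflect RubyLLM's internals.

```ruby
# Hypothetical minimal registry; RubyLLM's internals may differ.
class CallbackRegistry
  def initialize
    @callbacks = Hash.new { |hash, event| hash[event] = [] }
  end

  # Additive: every registered block is kept.
  def add(event, &block)
    @callbacks[event] << block
    self
  end

  # Replacing: mimics the legacy behavior where the last block wins.
  def replace(event, &block)
    @callbacks[event] = [block]
    self
  end

  # Invoke every callback registered for the event, in registration order.
  def run(event, *args)
    @callbacks[event].each { |callback| callback.call(*args) }
  end
end

log = []
registry = CallbackRegistry.new
registry.add(:after_message) { |message| log << "persist #{message}" }
registry.add(:after_message) { |message| log << "notify #{message}" }
registry.run(:after_message, "hello")
p log # => ["persist hello", "notify hello"]
```

With the additive form, a persistence callback and an application callback can coexist on the same event, which is the property the Rails integration relies on.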
🚂 Rails Fixes
Rails got the boring, important fixes: fewer load-order surprises, better persistence behavior, and less duplicate file handling.
Action Text Content
Messages backed by has_rich_text :content now use to_plain_text before being sent to the model. This prevents Action Text HTML from leaking into LLM messages and still works when the message has attachments.
ActiveRecord Eager Loading
The optional ActiveRecord integration is no longer part of the core gem eager-load path. The Railtie now explicitly loads the ActiveRecord support files only after ActiveRecord loads, which fixes standalone require "ruby_llm" + Zeitwerk eager-loading failures while preserving normal Rails behavior. CI now includes an eager-load guard across Rails appraisals.
Rails Association Conventions
The new acts_as API now follows Rails association inference more closely. Association names determine default foreign keys, while *_class options only change class names. The install generator now emits explicit foreign key options only when Rails would not infer the intended key.
Active Storage Persistence
Existing ActiveStorage::Blob, ActiveStorage::Attachment, has_one_attached, and has_many_attached records are reused directly instead of being downloaded and re-uploaded when passed through with:. The Rails docs now clarify that RubyLLM message records need an :attachments association, but your own app models can use any Active Storage attachment name.
🐛 Provider Fixes
Provider cleanup in this release is mostly about making edge cases boring.
Empty tool results are now handled consistently across Anthropic, Bedrock, and Gemini. When a tool returns no content, RubyLLM sends a small (no output) placeholder instead of provider-invalid empty content.
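A minimal sketch of the placeholder idea, assuming a hypothetical `tool_result_content` helper (not a RubyLLM method) that normalizes a tool's return value before it is sent to a provider:

```ruby
NO_OUTPUT_PLACEHOLDER = "(no output)"

# Hypothetical helper: substitute a placeholder when a tool returns nothing,
# so the provider never receives empty content.
def tool_result_content(result)
  text = result.to_s
  text.strip.empty? ? NO_OUTPUT_PLACEHOLDER : text
end

p tool_result_content(nil)    # => "(no output)"
p tool_result_content("")     # => "(no output)"
p tool_result_content("22 C") # => "22 C"
```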
Streaming and non-streaming token usage is also normalized across OpenAI, OpenRouter, Bedrock, and Gemini so cache reads/writes are separated from standard input tokens before cost calculations.
📚 Docs + SEO
The docs have been updated for image editing, tool signature inference, additive callbacks, normalized token semantics, cost helpers, Active Storage attachment names, Rails association conventions, and using an existing Chat record with an Agent.
The docs site also gained richer SEO and AI-visible metadata: JSON-LD injection for collection pages, llms.txt support through jekyll-ai-visible-content, an About page, cleaner collection dates, and a gemspec changelog link that now points to GitHub Releases.
RubyLLM::Tribunal has been added to the ecosystem page for LLM evaluation and testing in Ruby.
📦 Updated Model Registry
The model registry has been refreshed with the latest available models and pricing metadata. This update adds cache read/write pricing fields, reasoning output pricing, image pricing for GPT Image models, and new aliases including Claude Opus 4.7, DeepSeek V4 Flash/Pro, Gemini Embedding 2, Gemma 4, and GPT-5.5.
Installation
gem "ruby_llm", "1.15"
Upgrading from 1.14.x
bundle update ruby_llm
If you display or store token counts directly, read the 1.15 upgrade guide section. tokens.input now means standard input tokens; add tokens.cache_read and tokens.cache_write when you need total request-side input activity.
Merged PRs
- Add RubyLLM::Tribunal to ecosystem page by @Florian95 in #571
- Remove Code Duplication by @Aesthetikx in #732
- Feat: Add support to Action Text enabled content by @chagel in #365
- Fix ActiveRecord dependency check for Zeitwerk eager loading by @trevorturk in #504
- Document usage of existing Chat record with Agent by @sarrietav-dev in #693
New Contributors
- @Florian95 made their first contribution in #571
- @Aesthetikx made their first contribution in #732
- @chagel made their first contribution in #365
- @sarrietav-dev made their first contribution in #693
Full Changelog: 1.14.1...1.15.0
1.14.1
RubyLLM 1.14.1: Leaner Providers + ActiveStorage Fix 🔧🐛
A patch release that slims down provider code, fixes an ActiveStorage blob re-upload issue, and refreshes the model registry.
🏗️ Leaner Provider Model Fallbacks
Provider Capabilities modules no longer carry hundreds of lines of hardcoded pricing, context windows, and feature flags. These values now come from the model registry (models.json), and the Ruby fallback code has been trimmed to only the fields that are genuinely needed at the provider level (e.g. supports_tool_choice?). This removes ~400 net lines of code across Anthropic, OpenAI, Gemini, DeepSeek, and Perplexity, and means the registry is the single source of truth for model metadata.
For you, this means more accurate capabilities and pricing going forward.
🐛 Fix: ActiveStorage::Blob Re-upload
When an ActiveStorage::Blob was passed to ask or create_user_message via with:, it was downloaded and re-uploaded as a new blob. Existing blobs are now detected and reused directly, avoiding unnecessary storage churn. Fixes #665.
📦 Updated Model Registry
The model registry has been refreshed with the latest available models from all providers.
📝 Docs
Fixed a typo in the moderation guide ("Patters" → "Patterns").
Installation
gem "ruby_llm", "1.14.1"
Upgrading from 1.14.0
bundle update ruby_llm
Merged PRs
- Fix ActiveStorage::Blob re-upload if used in with: param by @bubiche in #683
- Update moderation.md by @artinboghosian in #703
New Contributors
- @bubiche made their first contribution in #683
- @artinboghosian made their first contribution in #703
Full Changelog: 1.14.0...1.14.1
1.14.0
RubyLLM 1.14: Tailwind Chat UI + Rails AI Generators + Config DSL 🎨🤖🛠️
This release overhauls the Rails experience.
RubyLLM 1.14 ships a complete Tailwind-powered chat UI, new Rails generators for agents/tools/schemas, a simplified configuration DSL where providers self-register their options, and a batch of bug fixes across logging, agents, associations, and dependency constraints.
🎨 Tailwind Chat UI
The Rails chat UI generator now produces a polished Tailwind-based interface out of the box. Run the generator and get a working chat app with message streaming, model selection, tool call display, and proper empty states — all styled with Tailwind CSS.
bin/rails generate ruby_llm:chat_ui
The generated views use role-aware partials (_user, _assistant, _system, _tool, _error) for clean message rendering, Turbo Stream templates for real-time updates, and broadcasts_to for simplified broadcasting.
🏗️ Rails AI Generators
New generators scaffold agents, tools, and schemas with a single command:
bin/rails generate ruby_llm:agent SupportAgent
bin/rails generate ruby_llm:tool WeatherTool
The install generator now creates conventional directories (app/agents, app/tools, app/schemas, app/prompts) with .gitkeep files. Tool partials follow a new naming convention for tool-specific rendering, and the generator produces matching specs.
⚙️ Simplified Configuration DSL
Provider configuration options are now self-registered by each provider using a declarative configuration_options method, replacing the monolithic attr_accessor list in Configuration. When a provider is registered, its options become attr_accessors on RubyLLM::Configuration automatically.
Each provider declares its own option keys following the <provider_slug>_<option> convention:
# In the provider class:
class DeepSeek < RubyLLM::Provider
  class << self
    def configuration_options
      %i[deepseek_api_key deepseek_api_base]
    end
  end
end
# These become available in configuration automatically:
RubyLLM.configure do |config|
  config.deepseek_api_key = ENV["DEEPSEEK_API_KEY"]
  config.deepseek_api_base = ENV["DEEPSEEK_API_BASE"]
end
This means third-party provider gems can register their own config keys without patching Configuration.
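The self-registration mechanism can be sketched in plain Ruby. The `Configuration`, `Provider`, and `AcmeProvider` classes below are simplified, hypothetical stand-ins, and `register_options` and `register` are assumed names; the real RubyLLM classes carry more behavior.

```ruby
# Hypothetical, simplified stand-ins; not RubyLLM's actual classes.
class Configuration
  # Turn each declared option into a reader/writer pair.
  def self.register_options(keys)
    attr_accessor(*keys)
  end
end

class Provider
  @registry = {}

  class << self
    attr_reader :registry

    # Providers override this to declare their own option keys.
    def configuration_options
      []
    end

    # Registering a provider also registers its configuration options.
    def register(slug, provider_class)
      Provider.registry[slug] = provider_class
      Configuration.register_options(provider_class.configuration_options)
    end
  end
end

# A made-up third-party provider following the <provider_slug>_<option> convention.
class AcmeProvider < Provider
  def self.configuration_options
    %i[acme_api_key acme_api_base]
  end
end

Provider.register(:acme, AcmeProvider)

config = Configuration.new
config.acme_api_key = "secret"
p config.acme_api_key # => "secret"
```

Because the accessors are defined at registration time, a provider gem only needs to call the registration hook; nothing in `Configuration` has to list its keys in advance.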
🐛 Fixes
Faraday Logging Memory Bloat
Faraday body logging no longer serializes large payloads (e.g. base64-encoded PDFs) when the log level is above DEBUG. This eliminates unnecessary memory allocations on every request. Fixes #562.
Gemspec Faraday Constraint Regression
Fixed an overly strict Faraday version constraint in the gemspec that broke compatibility for some users. Fixes #682.
Agent assume_model_exists Propagation
Agent class-level assume_model_exists configuration now correctly propagates to chat instances. Previously, setting it on the agent class had no effect.
Renamed Model Associations
Fixed incorrect foreign key references when using renamed model associations with acts_as helpers.
Eager Logger Interpolation
Fixed eager string interpolation in log statements that caused unnecessary object allocations even when logging was disabled.
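The difference between eager and lazy log messages can be shown with Ruby's standard Logger. This is a generic illustration of the technique, not the code changed in RubyLLM; the `expensive` lambda and `built` counter exist only to make the evaluation visible.

```ruby
require "logger"
require "stringio"

logger = Logger.new(StringIO.new)
logger.level = Logger::INFO # DEBUG messages are disabled

built = 0
expensive = lambda do
  built += 1
  "payload: #{'x' * 10_000}"
end

# Eager: the string is built even though DEBUG is disabled.
logger.debug("request #{expensive.call}")

# Lazy: the block is only evaluated if the DEBUG level is enabled.
logger.debug { "request #{expensive.call}" }

p built # => 1
```

Only the eager call paid for building the string; the block form skipped it entirely.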
Error Raised with String Argument
RubyLLM::Error.new("message") no longer raises a NoMethodError. Fixes #653.
MySQL/MariaDB Compatibility
Fixed JSON column default handling for MySQL/MariaDB users. Fixes #521.
File Attachments Across Ruby Versions
Stabilized file attachment handling to work consistently across different Ruby versions.
Model Type Classification
Fixed model type classification for models that support multiple output modalities (e.g. text + image).
VertexAI Registry Filtering
Slash-based model IDs are now filtered out of the Vertex AI registry, preventing invalid model entries.
📦 Updated Model Registry
Default models and the model registry have been refreshed with the latest available models.
Installation
gem "ruby_llm", "1.14"
Upgrading from 1.13.x
bundle update ruby_llm
Merged PRs
- Fix NoMethodError when Error is raised with a string argument by @cgmoore120 in #653
- Fix/562 faraday logging memory bloat by @sergiobayona in #661
- Fix/eager logger interpolation and dead options by @sergiobayona in #662
- Fix incorrect reference for renamed model associations by @jayelkaake in #668
- Fix agent not propagating assume_model_exists from class config by @jeffmcfadden in #680
New Contributors
- @cgmoore120 made their first contribution in #653
- @sergiobayona made their first contribution in #661
- @jayelkaake made their first contribution in #668
- @jeffmcfadden made their first contribution in #680
Full Changelog: 1.13.2...1.14.0
1.13.2
RubyLLM 1.13.2: Patch Fixes for Schema + Streaming 🐛🔧
A small patch release with three fixes.
🧩 Fix: Schema Names Are Always OpenAI-Compatible
Schema names now always produce a valid response_format.json_schema.name for OpenAI:
- namespaced names like MyApp::Schema are sanitized
- blank names now safely fall back to response
Fixes #654.
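A sketch of what such sanitization could look like. The `sanitize_schema_name` helper is hypothetical, and the rule it applies (letters, digits, underscores, and hyphens only, at most 64 characters) is an assumption about OpenAI's naming constraint; RubyLLM's exact rule may differ.

```ruby
# Assumption: valid names contain only letters, digits, underscores, and
# hyphens, and are at most 64 characters long.
def sanitize_schema_name(name)
  cleaned = name.to_s
                .gsub(/[^a-zA-Z0-9_-]/, "_") # replace disallowed characters
                .gsub(/_+/, "_")             # collapse runs of underscores
                .gsub(/\A_+|_+\z/, "")       # trim leading/trailing underscores
  cleaned = cleaned[0, 64]
  cleaned.empty? ? "response" : cleaned
end

p sanitize_schema_name("MyApp::Schema") # => "MyApp_Schema"
p sanitize_schema_name("")              # => "response"
p sanitize_schema_name(nil)             # => "response"
```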
🌊 Fix: Streaming Ignores Non-Hash SSE Payloads
Streaming handlers now skip non-Hash JSON payloads (like true) before calling provider chunk builders, preventing intermittent crashes in Anthropic streaming.
Fixes #656.
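The guard can be sketched as follows. `each_chunk` is a hypothetical handler, not RubyLLM's streaming code; it parses each payload and yields only JSON objects, so a downstream chunk builder never sees a bare `true` or an array.

```ruby
require "json"

# Hypothetical handler: parse each data payload and yield only JSON objects.
def each_chunk(raw_payloads)
  return enum_for(:each_chunk, raw_payloads) unless block_given?

  raw_payloads.each do |raw|
    parsed = begin
      JSON.parse(raw)
    rescue JSON::ParserError
      next # skip payloads that are not valid JSON
    end
    next unless parsed.is_a?(Hash) # skip scalars and arrays

    yield parsed
  end
end

payloads = ['{"type":"delta","text":"Hi"}', "true", "[1,2]", "not json"]
texts = each_chunk(payloads).map { |chunk| chunk["text"] }
p texts # => ["Hi"]
```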
🗓️ Fix: models.dev created_at Date Handling
Improved handling for missing models.dev dates when populating created_at metadata.
Installation
gem "ruby_llm", "1.13.2"
Upgrading from 1.13.1
bundle update ruby_llm
Merged PRs
- Fix missing models.dev date handling for created_at metadata by @afurm in #652
- [BUG] Fix schema name sanitization for OpenAI API compatibility by @alexey-hunter-io in #655
- Fix Anthropic streaming crash on non-hash SSE payloads by @crmne in #657
Full Changelog: 1.13.1...1.13.2
1.13.1
RubyLLM 1.13.1: Quick Fixes 🐛🔧
A small patch release with three fixes.
🧩 Fix: Schema + Tool Calls No Longer Crash
Using with_schema and with_tool together caused intermediate tool-call responses to be eagerly JSON-parsed, crashing on the next API call. RubyLLM now only parses the final response content. Fixes #649.
📊 Gemini: Cached Token Usage
Gemini responses now populate cached_tokens for both regular and streaming responses.
🪟 Fix: Binary Attachment Reads on Windows
Path-based attachments now use File.binread instead of File.read, preventing text-mode truncation of binary files on Windows.
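A small illustration of binary-safe reading with `File.binread`. The byte sequence is made up; it includes 0x1A, which text-mode reads on Windows can treat as end-of-file. On other platforms the byte count is the same either way, but `File.binread` still guarantees a binary-encoded string.

```ruby
require "tempfile"

# Made-up bytes resembling a binary file header, including 0x1A.
bytes = [0x89, 0x50, 0x4E, 0x47, 0x1A, 0x0A, 0x00, 0xFF].pack("C*")

data = Tempfile.create("attachment") do |file|
  file.binmode
  file.write(bytes)
  file.flush

  # Read the whole file in binary mode, with no newline or EOF translation.
  File.binread(file.path)
end

p data.bytesize                     # => 8
p data.encoding == Encoding::BINARY # => true
```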
Installation
gem "ruby_llm", "1.13.1"
Upgrading from 1.13.0
bundle update ruby_llm
Merged PRs
New Contributors
Full Changelog: 1.13.0...1.13.1
1.13.0
RubyLLM 1.13: Massive Amount of Fixes + A Ton of Merged PRs 🎉🤖🛠️
This is a big stabilization release.
RubyLLM 1.13.0 ships a very large set of reliability fixes and production-grade polish across tool calling, structured output, provider configuration, retries/error classification, Rails generators, and agent lifecycle behavior.
There are also many merged PRs from the community in this cycle.
Highlights
🛠️ Tool Calling: More Control + Better Real-World Failure Handling
RubyLLM now supports built-in tool control parameters and better edge-case handling.
Control tool behavior with two options:
- choice to control whether/how tools are called (:auto, :none, :required, or a specific tool)
- calls to control whether the model may return one or multiple tool calls in a single assistant response (:one / :many) (aka "parallel" tool calling)
- invalid kwargs and hallucinated/unavailable tool calls are now returned to the model as tool errors so the model can recover and try again (instead of raising app exceptions)
- fixed streaming tool-call nil-argument handling and assistant tool-call messages with nil content, so tool-call transcripts stay valid across turns
chat = RubyLLM.chat(model: "gpt-5-nano")
  .with_tools(WeatherTool, CalculatorTool, choice: :required, calls: :one)
response = chat.ask("Use tools to estimate commute time + cost")
puts response.content
Tool Choice (choice)
Use choice to control whether the model can call tools and which one it can call.
# Model decides whether to call tools
chat.with_tools(WeatherTool, CalculatorTool, choice: :auto)
# Model must call one of the provided tools
chat.with_tools(WeatherTool, CalculatorTool, choice: :required)
# Disable tool calls
chat.with_tools(WeatherTool, CalculatorTool, choice: :none)
# Force one specific tool
chat.with_tools(WeatherTool, CalculatorTool, choice: :weather_tool)
Valid values:
- :auto
- :required
- :none
- tool name symbol/string or ToolClass
"Parallel" Tool Calling control (calls)
Use calls to control how many tool calls the model may return in a single assistant response.
Providers usually call this parallel tool calling. We call it calls because "parallel" can be misleading: tools are not executed in parallel unless your tool executor itself is parallelized. calls describes response behavior directly.
# provider/model default behavior
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool)
# allow multiple tool calls in one assistant response
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool, calls: :many)
# allow one tool call in one assistant response
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool, calls: :one)
# equivalent:
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool, calls: 1)
Valid values:
- :many
- :one
- 1
If calls is not provided, RubyLLM uses provider/model defaults, usually equivalent to calls: :many.
Invalid tool kwargs now return explicit tool errors
class SignatureTool < RubyLLM::Tool
  def execute(questions:)
    questions
  end
end
result = SignatureTool.new.call({ "questions" => [], "isOther" => true })
puts result
# => { error: "Invalid tool arguments: unknown keyword: isOther" }
Hallucinated tool calls are handled gracefully
tool_results = []
chat = RubyLLM.chat.with_tool(WeatherTool)
  .on_tool_result { |result| tool_results << result }
# If the model tries to call a non-existent tool,
# RubyLLM reports a tool error and continues the conversation safely.
chat.ask("What tools do you support?")
p tool_results
# => [{ error: "Model tried to call unavailable tool `...`. Available tools: [\"weather\"]." }]
🧩 Structured Output: Expanded Coverage + Better Accuracy via Schema Names
Structured output support was expanded (including Bedrock + Anthropic), and a multi-turn structured-output regression was fixed.
class PersonSchema < RubyLLM::Schema
  string :name
  integer :age
end
chat = RubyLLM.chat(model: "claude-haiku-4-5", provider: :bedrock)
response = chat.with_schema(PersonSchema).ask("Generate a user profile")
puts response.content
# => {"name"=>"...", "age"=>...}
Schema naming also got better: RubyLLM now passes more meaningful schema names to providers, which helps the model better understand expected output structure.
# RubyLLM::Schema class names are now used as schema names automatically
class InvoiceSummarySchema < RubyLLM::Schema
  string :customer
  number :total
end
response = RubyLLM.chat.with_schema(InvoiceSummarySchema).ask("Summarize this invoice")
And manual schemas can now provide explicit names:
invoice_schema = {
  name: "InvoiceSummarySchema",
  schema: {
    type: "object",
    properties: {
      customer: { type: "string" },
      total: { type: "number" }
    },
    required: ["customer", "total"],
    additionalProperties: false
  }
}
response = RubyLLM.chat.with_schema(invoice_schema).ask("Summarize this invoice")
☁️ Provider Configuration Flexibility
This release adds multiple endpoint/base URL and credential options so teams can use self-hosted gateways, private routing, enterprise proxies, and compatible hosted services without patching providers.
RubyLLM.configure do |config|
  config.openrouter_api_base = ENV["OPENROUTER_API_BASE"]
  config.anthropic_api_base = ENV["ANTHROPIC_API_BASE"]
  config.deepseek_api_base = ENV["DEEPSEEK_API_BASE"]
  config.ollama_api_key = ENV["OLLAMA_API_KEY"]
end
Ollama API Key support
ollama_api_key support enables authenticated/remote Ollama endpoints (including Ollama Cloud-style setups) where auth headers are required.
☁️ Vertex AI: Service Account Key Support
Vertex AI auth support was improved to allow service account key usage without ADC regressions, plus scope handling fixes for GCE credentials.
RubyLLM.configure do |config|
  config.vertexai_project_id = ENV["GOOGLE_CLOUD_PROJECT"]
  config.vertexai_location = ENV["GOOGLE_CLOUD_LOCATION"]
  config.vertexai_service_account_key = ENV["VERTEXAI_SERVICE_ACCOUNT_KEY"] # optional JSON key
end
🔁 Error Handling and Retries
Error/retry behavior has been tightened for context-length and transient server cases:
- automatic retries were effectively not working for most LLM calls because POST requests were not being retried
- POST retries are now enabled
- context-length detection on HTTP 400
- improved classification for context-length 429 responses
- improved 504 classification
- retries are enabled by default (max_retries = 3) so check your configuration to confirm this matches your desired behavior
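Why excluding POST effectively disables retries for LLM calls can be sketched with a toy retry policy. Everything here is hypothetical: the status list, the method list, and the `retryable?` and `with_retries` helpers are illustrative and not RubyLLM's or Faraday's actual configuration.

```ruby
# Hypothetical policy for illustration; the status and method lists are assumptions.
RETRYABLE_STATUSES = [429, 500, 502, 503, 504].freeze
IDEMPOTENT_METHODS = %i[get head put delete options].freeze

def retryable?(http_method, status, retry_post: true)
  methods = retry_post ? IDEMPOTENT_METHODS + [:post] : IDEMPOTENT_METHODS
  methods.include?(http_method) && RETRYABLE_STATUSES.include?(status)
end

# Call the block until it returns a non-retryable status or the retry budget
# is used up. Returns the final status and the number of attempts made.
def with_retries(http_method, max_retries: 3, retry_post: true)
  attempts = 0
  loop do
    attempts += 1
    status = yield(attempts)
    retry_allowed = retryable?(http_method, status, retry_post: retry_post)
    return [status, attempts] unless retry_allowed && attempts <= max_retries
  end
end

# A fake POST endpoint that fails twice with 504 and then succeeds.
responses = [504, 504, 200]
with_post    = with_retries(:post) { |attempt| responses[attempt - 1] }
without_post = with_retries(:post, retry_post: false) { |attempt| responses[attempt - 1] }
p with_post    # => [200, 3]
p without_post # => [504, 1]
```

Since chat completions are sent as POST requests, a policy limited to idempotent methods gives up after the first failure, as the second result shows.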
begin
  RubyLLM.chat.ask("...")
rescue RubyLLM::ContextLengthExceededError
  # trim messages / reduce response size / retry
end
🤖 Agent + Rails Lifecycle Fixes
Agent and Rails-backed chat behavior received important fixes:
- runtime agent instructions now persist correctly across to_llm rebuilds
- missing prompts now raise RubyLLM::PromptNotFoundError
- Rails install flow now separates schema migration from model data loading (v1.13+)
- Rails docs now include fiber-safe ActiveRecord isolation guidance for async/fiber-heavy workloads (config.active_support.isolation_level = :fiber)
- generator and migration naming fixes (including acronym model classes)
- chat UI streaming preserves whitespace chunks correctly
Rails setup now looks like:
rails generate ruby_llm:install
rails db:migrate
rails ruby_llm:load_models
begin
  SupportAgent.new.ask("Help me with this request")
rescue RubyLLM::PromptNotFoundError => e
  puts e.message
end
Performance & DX Polishes
- lazy block-style debug logging to reduce allocations when debug logging is disabled
- configurable log_regexp_timeout
- rubocop/lint/test stability improvements
- model matrix/docs refreshes (including newer xAI model IDs and image-generation coverage updates)
- obsolete codecov gem removed
- docs and model listings refreshed
Installation
gem "ruby_llm", "1.13.0"
Upgrading from 1.12.x
bundle update ruby_llm
Merged PRs
- Fix POST retries and 504 retry classification by @crmne in #624
- Fix streaming to preserve whitespace chunks in chat UI template by @kryzhovnik in #636
- Fix migration class name for model names with acronyms (e.g. model:AIModel) by @Saidbek in #640
- Remove dependency on obsolete 'codecov' gem by @mvz in #625
- Detect context length exceeded errors on HTTP 400 responses by @plehoux in #642
- Use UTC for created_at in order to prevent diff noise when running models:update from a different timezone by @radanskoric in #631
- Adds opentelemetry-instrumentation-ruby_llm to the ecosystem by @clarissalimab in #599
- Add configurable Anthropic API base URL by @ericproulx in #589
- Add ollama_api_key support for remote Ollama endpoints by @geeksilva97 in #612
- Add Anthropic structured output support by @hiasinho in #608
- Add Bedrock structured output support by @llenodo in #619
- Add thought signature support for Google Gemini OpenAI compatibility by @ericproulx in #588
- Data loss in cleanup_orphaned_tool_results with custom associatio...
1.12.1
RubyLLM 1.12.1: Agent API Delegation + Rails add_message Persistence + Dependency Compatibility 🎉🤖🛠️
This is a focused patch release.
RubyLLM 1.12.1 tightens Agent behavior, fixes Rails chat persistence in add_message, and relaxes dependency constraints for better compatibility.
🤖 Agent API: Full Chat Delegation via Forwardable
Agents now delegate the full RubyLLM::Chat instance API to the wrapped chat object using Ruby’s Forwardable.
This also fixes the undefined method 'delegate' for class RubyLLM::Agent error when agents are used as plain Ruby objects (POROs) outside Rails.
Delegated methods now include core accessors and fluent config methods like:
- model, messages, tools, params, headers, schema
- ask, say, complete, add_message, reset_messages!
- with_model, with_tools, with_params, with_headers, with_schema, etc.
agent = WorkAssistant.new
agent.with_model("gpt-5-nano")
agent.add_message(role: :user, content: "Summarize this thread")
response = agent.complete
Rails: Chat#add_message Now Persists Properly
Rails-backed chats now persist messages correctly when using add_message (not just ask/legacy flows).
chat = Chat.find(params[:chat_id])
chat.add_message(role: :user, content: params[:content]) # now persisted
Also included in this fix:
- tool-call linkage persistence for added messages
- attachment/content persistence handling improvements
- create_user_message remains as a compatibility wrapper (legacy/deprecated path)
📎 Attachment Robustness for Rails Multipart Inputs
RubyLLM::Content now ignores blank/nil attachment placeholder entries (common in Rails multipart arrays), preventing noisy failures when attachments include empty values.
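A sketch of the filtering idea in plain Ruby. `compact_attachments` is a hypothetical helper, not the RubyLLM::Content implementation; it drops nil and blank-string placeholders and keeps everything else.

```ruby
# Hypothetical helper: drop nil and blank-string placeholders, keep the rest.
def compact_attachments(attachments)
  Array(attachments).reject do |item|
    item.nil? || (item.respond_to?(:strip) && item.strip.empty?)
  end
end

# Multipart form arrays often arrive with an empty-string entry.
p compact_attachments(["", nil, "report.pdf", "  "]) # => ["report.pdf"]
```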
📦 Dependency Compatibility Update
Dependency constraints were updated to reduce unnecessary pinning friction:
- ruby_llm-schema: ~> 0.2.1 → ~> 0
- marcel: ~> 1.0 → ~> 1
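What the relaxed constraint admits can be checked with RubyGems' own `Gem::Requirement`. The version numbers tested below are arbitrary examples, not actual ruby_llm-schema releases.

```ruby
require "rubygems"

old_req = Gem::Requirement.new("~> 0.2.1") # >= 0.2.1 and < 0.3
new_req = Gem::Requirement.new("~> 0")     # >= 0 and < 1

%w[0.2.5 0.3.0 1.0.0].each do |version|
  v = Gem::Version.new(version)
  puts "#{version}: old=#{old_req.satisfied_by?(v)} new=#{new_req.satisfied_by?(v)}"
end
# 0.2.5: old=true new=true
# 0.3.0: old=false new=true
# 1.0.0: old=false new=false
```

The looser requirement accepts any 0.x release, so a new minor version of the dependency no longer forces a RubyLLM release, while a 1.0 major bump is still excluded.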
Installation
gem "ruby_llm", "1.12.1"
Upgrading from 1.12.0
bundle update ruby_llm
Full Changelog: 1.12.0...1.12.1
1.12.0
RubyLLM 1.12: Agents + Full Cloud Provider Coverage + New instructions semantics and contributor guidelines 🎉🤖☁️
This is a big one.
RubyLLM 1.12 brings a new Agent interface and concludes cloud provider coverage:
- GCP coverage via Vertex AI (already supported)
- New: full AWS coverage via Bedrock Converse API
- New: full Azure coverage via Azure AI Foundry API
🤖 New Agent Interface
Agents are now a first-class way to define reusable AI behavior once and use it everywhere.
class WorkAssistant < RubyLLM::Agent
  chat_model Chat
  model "gpt-4.1-nano"
  instructions "You are a concise work assistant."
  tools TodoTool, GoogleDriveSearchTool
end
Use it directly:
response = WorkAssistant.new.ask("What should I work on today?")
Or with Rails-backed chats:
chat = WorkAssistant.create!(user: current_user)
WorkAssistant.find(chat.id).complete
Prompt conventions are built in (app/prompts/<agent_name>/instructions.txt.erb).
More on agents: https://rubyllm.com/agents
☁️ Bedrock Converse API: Full Bedrock Coverage
RubyLLM now uses Bedrock Converse API, which means every Bedrock chat model is supported through one consistent path.
chat = RubyLLM.chat(
  model: "anthropic.claude-haiku-4-5-20251001-v1:0",
  provider: :bedrock
)
response = chat.ask("Give me three ideas for reducing API latency.")
If it runs on Bedrock, RubyLLM can talk to it.
☁️ Azure Foundry API Support
RubyLLM now supports Azure AI Foundry, giving you broad model access on Azure with the same RubyLLM interface.
RubyLLM.configure do |config|
  config.azure_api_key = ENV["AZURE_API_KEY"]
  config.azure_api_base = ENV["AZURE_API_BASE"]
end
chat = RubyLLM.chat(model: "gpt-4.1", provider: :azure)
response = chat.ask("Summarize this architecture in one paragraph.")
Same API, Azure-wide model availability.
🧠 Instruction Semantics Improved
with_instructions behavior is now clearer:
- default call replaces the active system instruction
- append behavior is explicit
- instructions are always sent before other messages
chat.with_instructions("You are concise.")
chat.with_instructions("Use bullet points.", append: true)
🤝 Contributor + Provider Guidance Expanded
We clarified how contributions should flow so reviews are faster and less surprising.
What we ask now:
- Open an issue first and wait for maintainer feedback before coding new features.
- Keep PRs focused and reasonably sized.
- If you used AI tooling, you still own the code: understand every line before opening the PR.
Provider-specific direction is also clearer:
- Core providers have a high acceptance bar.
- For smaller or emerging providers, we usually prefer a community gem over adding it to RubyLLM core.
Net effect: less churn in review, clearer expectations up front.
📚 Docs & DX Polishes
A bunch of quality-of-life improvements shipped alongside core features:
- updated guides around agents and configuration
- docs UX improvements (copy page button, dark mode polish)
Installation
gem "ruby_llm", "1.12.0"
Upgrading from 1.11.x
bundle update ruby_llm
Full Changelog: 1.11.0...1.12.0
1.11.0
RubyLLM 1.11: xAI Provider & Grok Models 🚀🤖⚡
This release welcomes xAI as a first-class provider, brings Grok models into the registry, and polishes docs around configuration and thinking. Plug in your xAI API key and start chatting with Grok in seconds.
🚀 xAI Provider (Hello, Grok!)
Use xAI’s OpenAI-compatible API via a dedicated provider and jump straight into chat:
RubyLLM.configure do |config|
  config.xai_api_key = ENV["XAI_API_KEY"]
end
chat = RubyLLM.chat(model: "grok-4-fast-non-reasoning")
response = chat.ask("What's the fastest way to parse a CSV in Ruby?")
response.content
- xAI is now a first-class provider (:xai) with OpenAI-compatible endpoints under the hood.
- Grok models are included in the registry so you can pick by name without extra wiring.
- Streaming, tool calls, and structured output work the same way you already use with OpenAI-compatible providers.
Stream responses just like you’re used to:
chat = RubyLLM.chat(model: "grok-3-mini")
chat.ask("Summarize this PR in 3 bullets") do |chunk|
  print chunk.content
end
🧩 Model Registry Refresh
Model metadata and the public models list were refreshed to include Grok models and related updates.
📚 Docs Polishes
- Configuration docs now include xAI setup examples.
- The thinking guide got a tighter flow and clearer examples.
🛠️ Provider Fixes
- Resolved an OpenAI, Bedrock, and Anthropic error introduced by the new URI interface.
Installation
gem "ruby_llm", "1.11.0"
Upgrading from 1.10.x
bundle update ruby_llm
Merged PRs
- Add xAI Provider by @infinityrobot and @crmne in #373
Full Changelog: 1.10.0...1.11.0
1.10.0
RubyLLM 1.10: Extended Thinking, Persistent Thoughts & Streaming Fixes 🧠✨🚆
This release brings first-class extended thinking across providers, full Gemini 3 Pro/Flash thinking-signature support (chat + tools), a Rails upgrade path to persist it, and a tighter streaming pipeline. Plus official Ruby 4.0 support, safer model registry refreshes, a Vertex AI global endpoint fix, and a docs refresh.
🧠 Extended Thinking Everywhere
Tune reasoning depth and budget across providers with with_thinking, and get thinking output back when available:
chat = RubyLLM.chat(model: "claude-opus-4.5")
  .with_thinking(effort: :high, budget: 8000)
response = chat.ask("Prove it with numbers.")
response.thinking&.text
response.thinking&.signature
response.thinking_tokens
- response.thinking and chunk.thinking expose thinking content during normal and streaming requests.
- response.thinking_tokens and response.tokens.thinking track thinking token usage when providers report it.
- Gemini 3 Pro/Flash fully support thought signatures across chat and tool calls, so multi-step sessions stay consistent.
- Extended thinking quirks are now normalized across providers so you can tune one API and get predictable output.
Stream thinking and answer content side-by-side:
chat = RubyLLM.chat(model: "claude-opus-4.5")
  .with_thinking(effort: :medium)
chat.ask("Solve this step by step: What is 127 * 43?") do |chunk|
  print chunk.thinking&.text
  print chunk.content
end
- Streaming stays backward-compatible: existing apps can keep printing chunk.content, while richer UIs can also render chunk.thinking.
🧰 Rails + ActiveRecord Persistence
Thinking output can now be stored alongside messages (text, signature, and token usage), with an upgrade generator for existing apps:
rails generate ruby_llm:upgrade_to_v1_10
rails db:migrate
- Adds thinking_text, thinking_signature, and thinking_tokens to message tables.
- Adds thought_signature to tool calls for Gemini tool calling.
- Fixes a Rails streaming issue where the first tokens could be dropped.
📊 Unified Token Tracking
All token counts now live in response.tokens and message.tokens, including input, output, cached, cache creation, and thinking tokens.
✅ Official Ruby 4.0 Support
Ruby 4.0 is now officially supported in CI and dependencies.
🧩 Model Registry Updates
- Refreshing the registry no longer deletes models from providers you haven't configured.
🌍 Vertex AI Global Endpoint Fix
When vertexai_location is global, the API base now correctly resolves to:
https://aiplatform.googleapis.com/v1beta1
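The URL rule can be sketched as a small helper. `vertexai_api_base` is a hypothetical function, and the regional host pattern (`<location>-aiplatform.googleapis.com`) is an assumption for illustration; only the global case is stated in these notes.

```ruby
# Assumption: regional endpoints use the "<location>-aiplatform" host pattern;
# the global location uses the bare host with no location prefix.
def vertexai_api_base(location)
  host = if location.to_s == "global"
           "aiplatform.googleapis.com"
         else
           "#{location}-aiplatform.googleapis.com"
         end
  "https://#{host}/v1beta1"
end

puts vertexai_api_base("global")      # prints https://aiplatform.googleapis.com/v1beta1
puts vertexai_api_base("us-central1") # prints https://us-central1-aiplatform.googleapis.com/v1beta1
```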
📚 Docs Updates
- New extended thinking guide.
- Token usage docs include thinking tokens.
Installation
gem "ruby_llm", "1.10.0"
Upgrading from 1.9.x
bundle update ruby_llm
rails generate ruby_llm:upgrade_to_v1_10
rails db:migrate
Merged PRs
- Fix Vertex AI Global Endpoint URL Construction by @NielsKSchjoedt in #553
New Contributors
- @NielsKSchjoedt made their first contribution in #553
Full Changelog: 1.9.2...1.10.0