07 May 00:49

crmne

ff39289

1.15.0 Latest

RubyLLM 1.15: Image Editing + Cost Tracking + Less Glue Code 🖼️💸🛠️

RubyLLM 1.15 removes glue code around images, costs, tools, callbacks, and Rails persistence.

If Ruby can infer a tool signature, RubyLLM now infers it. If a provider reports usage, RubyLLM can turn it into cost. If Rails already has a blob, RubyLLM reuses it instead of downloading and uploading it again.

🖼️ Image Editing

Same method, same attachment shape: paint now paints from scratch or edits an existing image.

RubyLLM.paint can edit existing images with OpenAI's GPT Image models. Pass one or more source images with with:, add a mask: when you want to constrain the editable area, and use params: for provider-specific image options.

image = RubyLLM.paint(
  "Turn the logo green and keep the background transparent",
  model: "gpt-image-1",
  with: "logo.png"
)

with: accepts the same attachment sources RubyLLM supports elsewhere: local files, URLs, IO-like objects, and Active Storage attachments. Multiple source images work too:

image = RubyLLM.paint(
  "Combine these references into a postcard illustration",
  model: "gpt-image-1",
  with: ["person.png", "style-reference.png"]
)

Image responses now expose provider usage data, and GPT Image pricing is represented in the model registry so image input/output costs can be calculated with the same API shape used by chats and messages:

image.tokens.input
image.tokens.output

image.cost.input
image.cost.output
image.cost.total

Fixes #138 and #512.

💸 Conversation Costs + Normalized Tokens

Token counts answer "how many?" Cost helpers answer the next question: "how much?"

RubyLLM now has first-class cost helpers for token-priced conversation usage:

response = chat.ask("Summarize Ruby's object model.")

response.cost.total
chat.cost.total
agent.cost.total

A response can tell you its cost. A chat can tell you the running total. An agent can too. Images use the same shape.

Under the hood, RubyLLM::Cost uses normalized token buckets plus pricing from the model registry: standard input, billable output, cache reads, cache writes, and separately priced thinking/reasoning tokens when the model exposes a distinct reasoning-token price.

Prompt caching made token counts messy, so 1.15 separates the buckets before exposing them:

response.tokens.input       # Standard input tokens
response.tokens.output      # Billable output tokens
response.tokens.cache_read  # Prompt cache reads
response.tokens.cache_write # Prompt cache writes

The top-level token helpers still work for backwards compatibility, but new code should prefer response.tokens.*.
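As a rough illustration of how bucketed token counts can turn into a cost, here is a minimal plain-Ruby sketch. It does not use RubyLLM: the price table, the `sketch_cost` helper, and the per-million-token pricing unit are all assumptions made for this example, not values or APIs from the model registry.

```ruby
# Hypothetical prices in USD per one million tokens (not real registry data).
SKETCH_PRICES = {
  input: 3.00,
  output: 15.00,
  cache_read: 0.30,
  cache_write: 3.75
}.freeze

# Multiplies each token bucket by its own price and returns a per-bucket
# breakdown plus the total.
def sketch_cost(tokens, prices = SKETCH_PRICES)
  breakdown = tokens.to_h do |bucket, count|
    [bucket, count * prices.fetch(bucket) / 1_000_000.0]
  end
  breakdown.merge(total: breakdown.values.sum)
end

usage = { input: 1_000, output: 500, cache_read: 10_000, cache_write: 0 }
cost = sketch_cost(usage)
puts format("total: $%.4f", cost[:total])
```

Keeping the buckets separate matters because each one is priced differently; lumping cache reads into standard input would overstate the bill in this example.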

For Rails users, persisted messages and chats expose the same helpers. No new migration is required if you already ran the v1.9 token migration; the new names use the existing cached_tokens and cache_creation_tokens columns.

🛠️ Simpler Tool Definitions

Simple tools no longer need duplicated parameter declarations. If the execute signature already says what arguments exist, RubyLLM can infer the flat schema:

class Weather < RubyLLM::Tool
  desc "Gets current weather for a location"

  def execute(latitude:, longitude:, units: "metric")
    # ...
  end
end

Required keywords become required string parameters. Optional keywords become optional string parameters. Explicit param declarations and the full params DSL still win when you need descriptions, non-string types, nested objects, arrays, enums, or full JSON Schema control.
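The inference rule described above can be sketched in plain Ruby with `Method#parameters`. This is only an illustration of the idea, not RubyLLM's implementation: `infer_flat_schema` and `WeatherSketch` are hypothetical names, and the output hash shape is an assumption.

```ruby
# Builds a flat JSON-Schema-like hash from the keyword arguments of #execute.
def infer_flat_schema(tool_class)
  properties = {}
  required = []

  tool_class.instance_method(:execute).parameters.each do |kind, name|
    # Only keyword arguments are considered; positional, splat, and block
    # parameters are ignored in this sketch.
    next unless %i[keyreq key].include?(kind)

    properties[name.to_s] = { "type" => "string" }
    required << name.to_s if kind == :keyreq
  end

  { "type" => "object", "properties" => properties, "required" => required }
end

# Stand-in for a tool class; it does not inherit from RubyLLM::Tool.
class WeatherSketch
  def execute(latitude:, longitude:, units: "metric")
    "#{latitude},#{longitude} (#{units})"
  end
end

schema = infer_flat_schema(WeatherSketch)
puts schema["required"].inspect
puts schema["properties"].keys.inspect
```

A class whose `execute` takes no keyword arguments yields an empty `properties` hash here, which mirrors the "empty object schema" behavior mentioned in the release notes.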

There are also small ergonomics improvements:

desc is now an alias for description
param accepts description: as an alias for desc:
the tool generator now emits desc
tools with no keyword arguments now get an empty object schema

🔁 Additive Chat Callbacks

Callbacks now stack instead of replacing each other. Register five callbacks for the same event and all five run:

chat.before_message { ... }
chat.after_message { |message| ... }
chat.before_tool_call { |tool_call| ... }
chat.after_tool_result { |result| ... }

Unlike the legacy on_* callbacks, multiple before_* / after_* callbacks can be registered for the same event and they all run. The old callbacks still work, but they now log deprecation warnings and keep their existing replacing behavior. They will be removed in RubyLLM 2.0.

Rails persistence now uses the additive callbacks internally, so application callbacks can be layered on top without disturbing message persistence.
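To make the difference between additive and replacing callbacks concrete, here is a small plain-Ruby sketch. `CallbackSketch` and its `emit` method are hypothetical; they only show the pattern of appending blocks to a per-event list rather than overwriting a single slot.

```ruby
class CallbackSketch
  def initialize
    # One array of blocks per event name.
    @callbacks = Hash.new { |hash, event| hash[event] = [] }
  end

  # Appends the block instead of replacing a previously registered one.
  def after_message(&block)
    @callbacks[:after_message] << block
    self
  end

  # Runs every registered block for the event, in registration order.
  def emit(event, payload)
    @callbacks[event].each { |callback| callback.call(payload) }
  end
end

log = []
chat = CallbackSketch.new
chat.after_message { |message| log << "persist #{message}" }
chat.after_message { |message| log << "notify #{message}" }
chat.emit(:after_message, "hello")
puts log.inspect
```

With this shape, a framework-level callback (such as persistence) and an application-level callback can coexist on the same event.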

🚂 Rails Fixes

Rails got the boring, important fixes: fewer load-order surprises, better persistence behavior, and less duplicate file handling.

Action Text Content

Messages backed by has_rich_text :content now use to_plain_text before being sent to the model. This prevents Action Text HTML from leaking into LLM messages and still works when the message has attachments.

ActiveRecord Eager Loading

The optional ActiveRecord integration is no longer part of the core gem eager-load path. The Railtie now explicitly loads the ActiveRecord support files only after ActiveRecord loads, which fixes standalone require "ruby_llm" + Zeitwerk eager-loading failures while preserving normal Rails behavior. CI now includes an eager-load guard across Rails appraisals.

Rails Association Conventions

The new acts_as API now follows Rails association inference more closely. Association names determine default foreign keys, while *_class options only change class names. The install generator now emits explicit foreign key options only when Rails would not infer the intended key.

Active Storage Persistence

Existing ActiveStorage::Blob, ActiveStorage::Attachment, has_one_attached, and has_many_attached records are reused directly instead of being downloaded and re-uploaded when passed through with:. The Rails docs now clarify that RubyLLM message records need an :attachments association, but your own app models can use any Active Storage attachment name.

🐛 Provider Fixes

Provider cleanup in this release is mostly about making edge cases boring.

Empty tool results are now handled consistently across Anthropic, Bedrock, and Gemini. When a tool returns no content, RubyLLM sends a small (no output) placeholder instead of provider-invalid empty content.
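A minimal sketch of the placeholder idea, in plain Ruby. The `(no output)` text comes from the note above; the helper name and the exact rule for what counts as "empty" (nil, empty, or whitespace-only) are assumptions for illustration.

```ruby
EMPTY_TOOL_RESULT = "(no output)"

# Returns the tool result as text, substituting a placeholder when there is
# nothing to send.
def tool_result_content(value)
  text = value.to_s
  text.strip.empty? ? EMPTY_TOOL_RESULT : text
end

puts tool_result_content(nil)
puts tool_result_content("42")
```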

Streaming and non-streaming token usage is also normalized across OpenAI, OpenRouter, Bedrock, and Gemini so cache reads/writes are separated from standard input tokens before cost calculations.
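The normalization step can be pictured with a small plain-Ruby sketch. The raw payload keys below are hypothetical, and the sketch assumes a provider that reports a prompt-token total which already includes cached tokens; real provider payloads differ in shape and semantics.

```ruby
# Splits a hypothetical raw usage payload into separate buckets so that
# cache activity is not counted as standard input.
def normalize_usage(raw)
  cache_read = raw.fetch("cached_tokens", 0)
  cache_write = raw.fetch("cache_creation_tokens", 0)
  reported_input = raw.fetch("prompt_tokens", 0)

  {
    input: [reported_input - cache_read - cache_write, 0].max,
    output: raw.fetch("completion_tokens", 0),
    cache_read: cache_read,
    cache_write: cache_write
  }
end

raw = { "prompt_tokens" => 1_200, "completion_tokens" => 300, "cached_tokens" => 1_000 }
tokens = normalize_usage(raw)
total_request_input = tokens[:input] + tokens[:cache_read] + tokens[:cache_write]
puts tokens.inspect
puts total_request_input
```

Summing the three request-side buckets recovers the total input activity when that is the number you want to display.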

📚 Docs + SEO

The docs have been updated for image editing, tool signature inference, additive callbacks, normalized token semantics, cost helpers, Active Storage attachment names, Rails association conventions, and using an existing Chat record with an Agent.

The docs site also gained richer SEO and AI-visible metadata: JSON-LD injection for collection pages, llms.txt support through jekyll-ai-visible-content, an About page, cleaner collection dates, and a gemspec changelog link that now points to GitHub Releases.

RubyLLM::Tribunal has been added to the ecosystem page for LLM evaluation and testing in Ruby.

📦 Updated Model Registry

The model registry has been refreshed with the latest available models and pricing metadata. This update adds cache read/write pricing fields, reasoning output pricing, image pricing for GPT Image models, and new aliases including Claude Opus 4.7, DeepSeek V4 Flash/Pro, Gemini Embedding 2, Gemma 4, and GPT-5.5.

Installation

gem "ruby_llm", "1.15"

Upgrading from 1.14.x

bundle update ruby_llm

If you display or store token counts directly, read the 1.15 upgrade guide section. tokens.input now means standard input tokens; add tokens.cache_read and tokens.cache_write when you need total request-side input activity.

Merged PRs

Add RubyLLM::Tribunal to ecosystem page by @Florian95 in #571
Remove Code Duplication by @Aesthetikx in #732
Feat: Add support to Action Text enabled content by @chagel in #365
Fix ActiveRecord dependency check for Zeitwerk eager loading by @trevorturk in #504
Document usage of existing Chat record with Agent by @sarrietav-dev in #693

New Contributors

@Florian95 made their first contribution in #571
@Aesthetikx made their first contribution in #732
@chagel made their first contribution in #365
@sarrietav-dev made their first contribution in #693

Full Changelog: 1.14.1...1.15.0

Contributors

trevorturk, Florian95, and 3 other contributors

02 Apr 12:25

crmne

1.14.1

ac4d5d7

RubyLLM 1.14.1: Leaner Providers + ActiveStorage Fix 🔧🐛

A patch release that slims down provider code, fixes an ActiveStorage blob re-upload issue, and refreshes the model registry.

🏗️ Leaner Provider Model Fallbacks

Provider Capabilities modules no longer carry hundreds of lines of hardcoded pricing, context windows, and feature flags. These values now come from the model registry (models.json), and the Ruby fallback code has been trimmed to only the fields that are genuinely needed at the provider level (e.g. supports_tool_choice?). This removes ~400 net lines of code across Anthropic, OpenAI, Gemini, DeepSeek, and Perplexity, and means the registry is the single source of truth for model metadata.

For you, this means more accurate capabilities and pricing going forward.

🐛 Fix: ActiveStorage::Blob Re-upload

When an ActiveStorage::Blob was passed to ask or create_user_message via with:, it was downloaded and re-uploaded as a new blob. Existing blobs are now detected and reused directly, avoiding unnecessary storage churn. Fixes #665.

📦 Updated Model Registry

The model registry has been refreshed with the latest available models from all providers.

📝 Docs

Fixed a typo in the moderation guide ("Patters" → "Patterns").

Installation

gem "ruby_llm", "1.14.1"

Upgrading from 1.14.0

bundle update ruby_llm

Merged PRs

Fix ActiveStorage::Blob re-upload if used in with: param by @bubiche in #683
Update moderation.md by @artinboghosian in #703

New Contributors

@bubiche made their first contribution in #683
@artinboghosian made their first contribution in #703

Full Changelog: 1.14.0...1.14.1

Contributors

artinboghosian and bubiche

16 Mar 21:10

crmne

1.14.0

4034c05

RubyLLM 1.14: Tailwind Chat UI + Rails AI Generators + Config DSL 🎨🤖🛠️

This release overhauls the Rails experience.

RubyLLM 1.14 ships a complete Tailwind-powered chat UI, new Rails generators for agents/tools/schemas, a simplified configuration DSL where providers self-register their options, and a batch of bug fixes across logging, agents, associations, and dependency constraints.

🎨 Tailwind Chat UI

The Rails chat UI generator now produces a polished Tailwind-based interface out of the box. Run the generator and get a working chat app with message streaming, model selection, tool call display, and proper empty states — all styled with Tailwind CSS.

bin/rails generate ruby_llm:chat_ui

The generated views use role-aware partials (_user, _assistant, _system, _tool, _error) for clean message rendering, Turbo Stream templates for real-time updates, and broadcasts_to for simplified broadcasting.

🏗️ Rails AI Generators

New generators scaffold agents, tools, and schemas with a single command:

bin/rails generate ruby_llm:agent SupportAgent
bin/rails generate ruby_llm:tool WeatherTool

The install generator now creates conventional directories (app/agents, app/tools, app/schemas, app/prompts) with .gitkeep files. Tool partials follow a new naming convention for tool-specific rendering, and the generator produces matching specs.

⚙️ Simplified Configuration DSL

Provider configuration options are now self-registered by each provider using a declarative configuration_options method, replacing the monolithic attr_accessor list in Configuration. When a provider is registered, its options become attr_accessors on RubyLLM::Configuration automatically.

Each provider declares its own option keys following the <provider_slug>_<option> convention:

# In the provider class:
class DeepSeek < RubyLLM::Provider
  class << self
    def configuration_options
      %i[deepseek_api_key deepseek_api_base]
    end
  end
end

# These become available in configuration automatically:
RubyLLM.configure do |config|
  config.deepseek_api_key  = ENV["DEEPSEEK_API_KEY"]
  config.deepseek_api_base = ENV["DEEPSEEK_API_BASE"]
end

This means third-party provider gems can register their own config keys without patching Configuration.
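The self-registration pattern can be sketched without RubyLLM. Everything below (`SketchConfiguration`, `SketchProviders`, `AcmeProvider`, and the `acme_*` keys) is hypothetical; it only shows how declared option keys can become accessors on a shared configuration class at registration time.

```ruby
class SketchConfiguration
  # Defines a reader and writer for every option key a provider declares.
  def self.register_options(keys)
    keys.each { |key| attr_accessor key }
  end
end

module SketchProviders
  # Registering a provider also registers its configuration options.
  def self.register(provider_class)
    SketchConfiguration.register_options(provider_class.configuration_options)
  end
end

# Hypothetical third-party provider following the <provider_slug>_<option>
# naming convention.
class AcmeProvider
  def self.configuration_options
    %i[acme_api_key acme_api_base]
  end
end

SketchProviders.register(AcmeProvider)

config = SketchConfiguration.new
config.acme_api_key = "secret"
puts config.acme_api_key
```

The configuration class never has to know about `AcmeProvider` in advance, which is the property that lets external gems add keys without patching it.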

🐛 Fixes

Faraday Logging Memory Bloat

Faraday body logging no longer serializes large payloads (e.g. base64-encoded PDFs) when the log level is above DEBUG. This eliminates unnecessary memory allocations on every request. Fixes #562.

Gemspec Faraday Constraint Regression

Fixed an overly strict Faraday version constraint in the gemspec that broke compatibility for some users. Fixes #682.

Agent `assume_model_exists` Propagation

Agent class-level assume_model_exists configuration now correctly propagates to chat instances. Previously, setting it on the agent class had no effect.

Renamed Model Associations

Fixed incorrect foreign key references when using renamed model associations with acts_as helpers.

Eager Logger Interpolation

Fixed eager string interpolation in log statements that caused unnecessary object allocations even when logging was disabled.
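The difference between eager and lazy log messages can be seen with Ruby's standard `Logger`, which only evaluates a block when the message's level is enabled. The sketch below is independent of RubyLLM; `expensive_dump` is a hypothetical stand-in for serializing a large payload.

```ruby
require "logger"
require "stringio"

output = StringIO.new
logger = Logger.new(output)
logger.level = Logger::INFO # DEBUG messages are disabled

evaluations = 0
expensive_dump = lambda do
  evaluations += 1
  "x" * 1_000
end

# Eager: the string is built before Logger can decide to discard it.
logger.debug("payload: #{expensive_dump.call}")

# Lazy: the block is never called because DEBUG is disabled.
logger.debug { "payload: #{expensive_dump.call}" }

puts evaluations
```

Only the eager call pays for the interpolation here; neither call writes anything to the log.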

Error Raised with String Argument

RubyLLM::Error.new("message") no longer raises a NoMethodError. Fixes #653.

MySQL/MariaDB Compatibility

Fixed JSON column default handling for MySQL/MariaDB users. Fixes #521.

File Attachments Across Ruby Versions

Stabilized file attachment handling to work consistently across different Ruby versions.

Model Type Classification

Fixed model type classification for models that support multiple output modalities (e.g. text + image).

VertexAI Registry Filtering

Slash-based model IDs are now filtered out of the Vertex AI registry, preventing invalid model entries.

📦 Updated Model Registry

Default models and the model registry have been refreshed with the latest available models.

Installation

gem "ruby_llm", "1.14"

Upgrading from 1.13.x

bundle update ruby_llm

Merged PRs

Fix NoMethodError when Error is raised with a string argument by @cgmoore120 in #653
Fix/562 faraday logging memory bloat by @sergiobayona in #661
Fix/eager logger interpolation and dead options by @sergiobayona in #662
Fix incorrect reference for renamed model associations by @jayelkaake in #668
Fix agent not propagating assume_model_exists from class config by @jeffmcfadden in #680

New Contributors

@cgmoore120 made their first contribution in #653
@sergiobayona made their first contribution in #661
@jayelkaake made their first contribution in #668
@jeffmcfadden made their first contribution in #680

Full Changelog: 1.13.2...1.14.0

Contributors

jeffmcfadden, cgmoore120, and 2 other contributors

05 Mar 10:26

crmne

1.13.2

0950693

RubyLLM 1.13.2: Patch Fixes for Schema + Streaming 🐛🔧

A small patch release with three fixes.

🧩 Fix: Schema Names Are Always OpenAI-Compatible

Schema names now always produce a valid response_format.json_schema.name for OpenAI:

namespaced names like MyApp::Schema are sanitized
blank names now safely fall back to response

Fixes #654.
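One way such sanitization could work is sketched below. The `response` fallback comes from the notes above; the replacement rule (collapsing disallowed characters to underscores), the allowed character set, and the 64-character cap are assumptions about the OpenAI constraint, and `sanitize_schema_name` is a hypothetical helper, not RubyLLM's method.

```ruby
# Produces a name limited to letters, digits, underscores, and dashes,
# falling back to "response" when nothing usable remains.
def sanitize_schema_name(name)
  cleaned = name.to_s.gsub(/[^a-zA-Z0-9_-]+/, "_").gsub(/\A_+|_+\z/, "")
  cleaned = cleaned[0, 64]
  cleaned.empty? ? "response" : cleaned
end

puts sanitize_schema_name("MyApp::Schema")
puts sanitize_schema_name(nil)
```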

🌊 Fix: Streaming Ignores Non-Hash SSE Payloads

Streaming handlers now skip non-Hash JSON payloads (like true) before calling provider chunk builders, preventing intermittent crashes in Anthropic streaming.

Fixes #656.
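The guard described above amounts to a type check after JSON parsing. The sketch below uses only the standard `json` library; `parse_chunk_payload` and the sample event strings are hypothetical, and swallowing parse errors is an extra assumption of this example.

```ruby
require "json"

# Returns the parsed payload only when it is a Hash; scalars, arrays, and
# unparseable data are skipped by returning nil.
def parse_chunk_payload(data)
  payload = JSON.parse(data)
  payload.is_a?(Hash) ? payload : nil
rescue JSON::ParserError
  nil
end

events = ['{"type":"delta","text":"Hi"}', "true", "[1,2]", "not json"]
chunks = events.filter_map { |data| parse_chunk_payload(data) }
puts chunks.length
```

A bare `true` is valid JSON, so without the `is_a?(Hash)` check it would reach code that expects hash-style access.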

🗓️ Fix: models.dev `created_at` Date Handling

Improved handling for missing models.dev dates when populating created_at metadata.

Installation

gem "ruby_llm", "1.13.2"

Upgrading from 1.13.1

bundle update ruby_llm

Merged PRs

Fix missing models.dev date handling for created_at metadata by @afurm in #652
[BUG] Fix schema name sanitization for OpenAI API compatibility by @alexey-hunter-io in #655
Fix Anthropic streaming crash on non-hash SSE payloads by @crmne in #657

Full Changelog: 1.13.1...1.13.2

Contributors

crmne, afurm, and alexey-hunter-io

04 Mar 11:54

crmne

1.13.1

97c0546

RubyLLM 1.13.1: Quick Fixes 🐛🔧

A small patch release with three fixes.

🧩 Fix: Schema + Tool Calls No Longer Crash

Using with_schema and with_tool together caused intermediate tool-call responses to be eagerly JSON-parsed, crashing on the next API call. RubyLLM now only parses the final response content. Fixes #649.

📊 Gemini: Cached Token Usage

Gemini responses now populate cached_tokens for both regular and streaming responses.

🪟 Fix: Binary Attachment Reads on Windows

Path-based attachments now use File.binread instead of File.read, preventing text-mode truncation of binary files on Windows.
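A short plain-Ruby sketch of reading an attachment as raw bytes. The sample bytes are made up; they include a CR/LF pair and a 0x1A byte, the kinds of values that text-mode reads on Windows can translate or treat as end-of-file.

```ruby
require "tempfile"

# Made-up binary content with bytes that text-mode reads may mishandle.
bytes = "\x89PNG\r\n\x1A\n\x00\x01".b

read_back = Tempfile.create("attachment") do |file|
  file.binmode
  file.write(bytes)
  file.flush

  # binread opens the file in binary mode and returns a binary-encoded String.
  File.binread(file.path)
end

puts read_back.bytesize
puts read_back == bytes
```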

Installation

gem "ruby_llm", "1.13.1"

Upgrading from 1.13.0

bundle update ruby_llm

Merged PRs

Fix schema JSON parsing for intermediate tool-call responses by @trouni in #650

New Contributors

@trouni made their first contribution in #650

Full Changelog: 1.13.0...1.13.1

Contributors

trouni

03 Mar 14:18

crmne

1.13.0

363d3a7

RubyLLM 1.13: Massive Amount of Fixes + A Ton of Merged PRs 🎉🤖🛠️

This is a big stabilization release.

RubyLLM 1.13.0 ships a very large set of reliability fixes and production-grade polish across tool calling, structured output, provider configuration, retries/error classification, Rails generators, and agent lifecycle behavior.

There are also many merged PRs from the community in this cycle.

Highlights

🛠️ Tool Calling: More Control + Better Real-World Failure Handling

RubyLLM now supports built-in tool control parameters and better edge-case handling.

Control tool behavior with two options:

choice controls whether and how tools are called (:auto, :none, :required, or a specific tool)
calls controls whether the model may return one or multiple tool calls in a single assistant response (:one / :many), also known as "parallel" tool calling

Edge-case handling also improved:

invalid kwargs and hallucinated/unavailable tool calls are now returned to the model as tool errors so the model can recover and try again (instead of raising app exceptions)
streaming tool calls with nil arguments and assistant tool-call messages with nil content are now handled correctly, so tool-call transcripts stay valid across turns

chat = RubyLLM.chat(model: "gpt-5-nano")
  .with_tools(WeatherTool, CalculatorTool, choice: :required, calls: :one)

response = chat.ask("Use tools to estimate commute time + cost")
puts response.content

Tool Choice (`choice`)

Use choice to control whether the model can call tools and which one it can call.

# Model decides whether to call tools
chat.with_tools(WeatherTool, CalculatorTool, choice: :auto)

# Model must call one of the provided tools
chat.with_tools(WeatherTool, CalculatorTool, choice: :required)

# Disable tool calls
chat.with_tools(WeatherTool, CalculatorTool, choice: :none)

# Force one specific tool
chat.with_tools(WeatherTool, CalculatorTool, choice: :weather_tool)

Valid values:

:auto
:required
:none
tool name symbol/string or ToolClass

"Parallel" Tool Calling control (`calls`)

Use calls to control how many tool calls the model may return in a single assistant response.

Providers usually call this parallel tool calling. We call it calls because "parallel" can be misleading: tools are not executed in parallel unless your tool executor itself is parallelized. calls describes response behavior directly.

# provider/model default behavior
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool)

# allow multiple tool calls in one assistant response
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool, calls: :many)

# allow one tool call in one assistant response
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool, calls: :one)

# equivalent:
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool, calls: 1)

Valid values:

:many
:one
1

If calls is not provided, RubyLLM uses provider/model defaults, usually equivalent to calls: :many.
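The accepted values can be thought of as mapping onto a single on/off setting. The sketch below is a hypothetical normalizer, not RubyLLM code; the idea that the result is a boolean "allow multiple tool calls" flag, with nil meaning "use the provider default", is an assumption for illustration.

```ruby
# Maps the documented `calls` values to a boolean, or nil for "provider default".
def parallel_tool_calls_flag(calls)
  case calls
  when nil then nil
  when :many then true
  when :one, 1 then false
  else
    raise ArgumentError, "calls must be :many, :one, or 1 (got #{calls.inspect})"
  end
end

puts parallel_tool_calls_flag(:many).inspect
puts parallel_tool_calls_flag(1).inspect
```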

Invalid tool kwargs now return explicit tool errors

class SignatureTool < RubyLLM::Tool
  def execute(questions:)
    questions
  end
end

result = SignatureTool.new.call({ "questions" => [], "isOther" => true })
puts result
# => { error: "Invalid tool arguments: unknown keyword: isOther" }
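The underlying pattern, converting an argument mismatch into a value the model can read, can be sketched in plain Ruby. `SignatureSketch` and `call_tool_safely` are hypothetical; this version relies on Ruby's own `ArgumentError`, so the exact wording after the prefix depends on the Ruby version and may differ from the message shown above.

```ruby
class SignatureSketch
  def execute(questions:)
    questions
  end
end

# Calls the tool with symbolized keyword arguments and turns signature
# mismatches into an error hash instead of raising.
def call_tool_safely(tool, raw_args)
  kwargs = raw_args.transform_keys(&:to_sym)
  tool.execute(**kwargs)
rescue ArgumentError => e
  { error: "Invalid tool arguments: #{e.message}" }
end

ok = call_tool_safely(SignatureSketch.new, { "questions" => [] })
bad = call_tool_safely(SignatureSketch.new, { "questions" => [], "isOther" => true })
puts ok.inspect
puts bad[:error]
```

One limitation of this simple version is that an `ArgumentError` raised inside the tool body would be reported the same way as a signature mismatch.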

Hallucinated tool calls are handled gracefully

tool_results = []

chat = RubyLLM.chat.with_tool(WeatherTool)
  .on_tool_result { |result| tool_results << result }

# If the model tries to call a non-existent tool,
# RubyLLM reports a tool error and continues the conversation safely.
chat.ask("What tools do you support?")

p tool_results
# => [{ error: "Model tried to call unavailable tool `...`. Available tools: [\"weather\"]." }]
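A lookup-then-report step like the one described above might look as follows in plain Ruby. `dispatch_tool_call` and the lambda-based tool table are hypothetical; the error text imitates the format shown in the example output.

```ruby
# Looks up the requested tool by name and returns an error hash when the
# model asks for a tool that was never registered.
def dispatch_tool_call(tools, name, args = {})
  tool = tools[name]
  unless tool
    return { error: "Model tried to call unavailable tool `#{name}`. " \
                    "Available tools: #{tools.keys.inspect}." }
  end

  tool.call(args)
end

tools = { "weather" => ->(args) { "Sunny in #{args[:city]}" } }

puts dispatch_tool_call(tools, "weather", { city: "Berlin" })
puts dispatch_tool_call(tools, "stock_price")[:error]
```

Returning the list of available tools gives the model enough information to correct itself on the next turn.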

🧩 Structured Output: Expanded Coverage + Better Accuracy via Schema Names

Structured output support was expanded (including Bedrock + Anthropic), and a multi-turn structured-output regression was fixed.

class PersonSchema < RubyLLM::Schema
  string :name
  integer :age
end

chat = RubyLLM.chat(model: "claude-haiku-4-5", provider: :bedrock)
response = chat.with_schema(PersonSchema).ask("Generate a user profile")
puts response.content
# => {"name"=>"...", "age"=>...}

Schema naming also got better: RubyLLM now passes more meaningful schema names to providers, which helps the model better understand expected output structure.

# RubyLLM::Schema class names are now used as schema names automatically
class InvoiceSummarySchema < RubyLLM::Schema
  string :customer
  number :total
end

response = RubyLLM.chat.with_schema(InvoiceSummarySchema).ask("Summarize this invoice")

And manual schemas can now provide explicit names:

invoice_schema = {
  name: "InvoiceSummarySchema",
  schema: {
    type: "object",
    properties: {
      customer: { type: "string" },
      total: { type: "number" }
    },
    required: ["customer", "total"],
    additionalProperties: false
  }
}

response = RubyLLM.chat.with_schema(invoice_schema).ask("Summarize this invoice")

☁️ Provider Configuration Flexibility

This release adds multiple endpoint/base URL and credential options so teams can use self-hosted gateways, private routing, enterprise proxies, and compatible hosted services without patching providers.

RubyLLM.configure do |config|
  config.openrouter_api_base = ENV["OPENROUTER_API_BASE"]
  config.anthropic_api_base  = ENV["ANTHROPIC_API_BASE"]
  config.deepseek_api_base   = ENV["DEEPSEEK_API_BASE"]
  config.ollama_api_key      = ENV["OLLAMA_API_KEY"]
end

Ollama API Key support

ollama_api_key support enables authenticated/remote Ollama endpoints (including Ollama Cloud-style setups) where auth headers are required.

☁️ Vertex AI: Service Account Key Support

Vertex AI auth support was improved to allow service account key usage without ADC regressions, plus scope handling fixes for GCE credentials.

RubyLLM.configure do |config|
  config.vertexai_project_id          = ENV["GOOGLE_CLOUD_PROJECT"]
  config.vertexai_location            = ENV["GOOGLE_CLOUD_LOCATION"]
  config.vertexai_service_account_key = ENV["VERTEXAI_SERVICE_ACCOUNT_KEY"] # optional JSON key
end

🔁 Error Handling and Retries

Error/retry behavior has been tightened for context-length and transient server cases:

POST requests are now retried; previously, automatic retries were effectively not working for most LLM calls because POST requests were never retried
context-length errors are now detected on HTTP 400 responses
improved classification for context-length 429 responses
improved 504 classification
retries are enabled by default (max_retries = 3), so check your configuration to confirm this matches your desired behavior

begin
  RubyLLM.chat.ask("...")
rescue RubyLLM::ContextLengthExceededError
  # trim messages / reduce response size / retry
end
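For readers who want a mental model of bounded retries on transient failures, here is a generic plain-Ruby sketch. It is not how RubyLLM or its HTTP layer implements retries: `SketchTransientError`, `with_retries`, and the exponential delay are assumptions, and only the default of three retries echoes the note above.

```ruby
class SketchTransientError < StandardError; end

# Runs the block, retrying up to max_retries times on transient errors.
# The attempt number (starting at 1) is passed to the block.
def with_retries(max_retries: 3, base_delay: 0.0)
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue SketchTransientError
    raise if attempt > max_retries

    sleep(base_delay * (2**(attempt - 1))) # simple exponential backoff
    retry
  end
end

result = with_retries do |attempt|
  raise SketchTransientError, "HTTP 504" if attempt < 3

  "ok after #{attempt} attempts"
end
puts result
```

Non-transient failures, such as a context-length error, are not rescued here and propagate immediately, which matches the idea that they need a different response (trimming messages) rather than a retry.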

🤖 Agent + Rails Lifecycle Fixes

Agent and Rails-backed chat behavior received important fixes:

runtime agent instructions now persist correctly across to_llm rebuilds
missing prompts now raise RubyLLM::PromptNotFoundError
Rails install flow now separates schema migration from model data loading (v1.13+)
Rails docs now include fiber-safe ActiveRecord isolation guidance for async/fiber-heavy workloads (config.active_support.isolation_level = :fiber)
generator and migration naming fixes (including acronym model classes)
chat UI streaming preserves whitespace chunks correctly

Rails setup now looks like:

rails generate ruby_llm:install
rails db:migrate
rails ruby_llm:load_models

begin
  SupportAgent.new.ask("Help me with this request")
rescue RubyLLM::PromptNotFoundError => e
  puts e.message
end

Performance & DX Polishes

lazy block-style debug logging to reduce allocations when debug logging is disabled
configurable log_regexp_timeout
rubocop/lint/test stability improvements
model matrix/docs refreshes (including newer xAI model IDs and image-generation coverage updates)
obsolete codecov gem removed
docs and model listings refreshed

Installation

gem "ruby_llm", "1.13.0"

Upgrading from 1.12.x

bundle update ruby_llm

Merged PRs

Fix POST retries and 504 retry classification by @crmne in #624
Fix streaming to preserve whitespace chunks in chat UI template by @kryzhovnik in #636
Fix migration class name for model names with acronyms (e.g. model:AIModel) by @Saidbek in #640
Remove dependency on obsolete 'codecov' gem by @mvz in #625
Detect context length exceeded errors on HTTP 400 responses by @plehoux in #642
Use UTC for created_at in order to prevent diff noise when running models:update from a different timezone by @radanskoric in #631
Adds opentelemetry-instrumentation-ruby_llm to the ecosystem by @clarissalimab in #599
Add configurable Anthropic API base URL by @ericproulx in #589
Add ollama_api_key support for remote Ollama endpoints by @geeksilva97 in #612
Add Anthropic structured output support by @hiasinho in #608
Add Bedrock structured output support by @llenodo in #619
Add thought signature support for Google Gemini OpenAI compatibility by @ericproulx in #588
Data loss in cleanup_orphaned_tool_results with custom associatio...

Contributors

mvz, cpetersen, and 25 other contributors

19 Feb 16:40

crmne

1.12.1

f44384a

RubyLLM 1.12.1: Agent API Delegation + Rails `add_message` Persistence + Dependency Compatibility 🎉🤖🛠️

This is a focused patch release.

RubyLLM 1.12.1 tightens Agent behavior, fixes Rails chat persistence in add_message, and relaxes dependency constraints for better compatibility.

🤖 Agent API: Full `Chat` Delegation via `Forwardable`

Agents now delegate the full RubyLLM::Chat instance API to the wrapped chat object using Ruby’s Forwardable.

This also fixes the undefined method 'delegate' for class RubyLLM::Agent error raised when agents are used as plain old Ruby objects (POROs).

Delegated methods now include core accessors and fluent config methods like:

model, messages, tools, params, headers, schema
ask, say, complete, add_message, reset_messages!
with_model, with_tools, with_params, with_headers, with_schema, etc.

agent = WorkAssistant.new
agent.with_model("gpt-5-nano")
agent.add_message(role: :user, content: "Summarize this thread")
response = agent.complete
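The delegation approach can be illustrated with Ruby's standard `Forwardable` module, which works without ActiveSupport. `ChatSketch` and `AgentSketch` are hypothetical stand-ins with a deliberately tiny API; they are not the RubyLLM classes.

```ruby
require "forwardable"

class ChatSketch
  attr_reader :model, :messages

  def initialize
    @model = "default-model"
    @messages = []
  end

  def with_model(name)
    @model = name
    self
  end

  def add_message(role:, content:)
    @messages << { role: role, content: content }
    self
  end
end

class AgentSketch
  extend Forwardable

  # Forward selected methods to the wrapped chat instance.
  def_delegators :@chat, :model, :messages, :with_model, :add_message

  def initialize(chat = ChatSketch.new)
    @chat = chat
  end
end

agent = AgentSketch.new
agent.with_model("gpt-5-nano")
agent.add_message(role: :user, content: "Summarize this thread")
puts agent.model
puts agent.messages.length
```

In this sketch the fluent methods return the wrapped chat rather than the agent, which is a property of plain `def_delegators` and not a statement about RubyLLM's behavior.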

Rails: `Chat#add_message` Now Persists Properly

Rails-backed chats now persist messages correctly when using add_message (not just ask/legacy flows).

chat = Chat.find(params[:chat_id])
chat.add_message(role: :user, content: params[:content]) # now persisted

Also included in this fix:

tool-call linkage persistence for added messages
attachment/content persistence handling improvements
create_user_message remains as a compatibility wrapper (legacy/deprecated path)

📎 Attachment Robustness for Rails Multipart Inputs

RubyLLM::Content now ignores blank/nil attachment placeholder entries (common in Rails multipart arrays), preventing noisy failures when attachments include empty values.
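Filtering out placeholder entries is a small step; a plain-Ruby sketch follows. `compact_attachments` is a hypothetical helper, and treating whitespace-only strings as blank is an assumption of this example.

```ruby
# Drops nil and blank-string entries, accepting a single source or an array.
def compact_attachments(sources)
  Array(sources).reject do |source|
    source.nil? || (source.is_a?(String) && source.strip.empty?)
  end
end

submitted = ["", nil, "report.pdf", "  ", "chart.png"]
puts compact_attachments(submitted).inspect
```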

📦 Dependency Compatibility Update

Dependency constraints were updated to reduce unnecessary pinning friction:

ruby_llm-schema: ~> 0.2.1 → ~> 0
marcel: ~> 1.0 → ~> 1

Installation

gem "ruby_llm", "1.12.1"

Upgrading from 1.12.0

bundle update ruby_llm

Full Changelog: 1.12.0...1.12.1

14 Feb 12:04

crmne

1.12.0

6278a05

RubyLLM 1.12: Agents + Full Cloud Provider Coverage + New instructions semantics and contributor guidelines 🎉🤖☁️

This is a big one.

RubyLLM 1.12 brings a new Agent interface and concludes cloud provider coverage:

GCP coverage via Vertex AI (already supported)
New: full AWS coverage via Bedrock Converse API
New: full Azure coverage via Azure AI Foundry API

🤖 New Agent Interface

Agents are now a first-class way to define reusable AI behavior once and use it everywhere.

class WorkAssistant < RubyLLM::Agent
  chat_model Chat
  model "gpt-4.1-nano"
  instructions "You are a concise work assistant."
  tools TodoTool, GoogleDriveSearchTool
end

Use it directly:

response = WorkAssistant.new.ask("What should I work on today?")

Or with Rails-backed chats:

chat = WorkAssistant.create!(user: current_user)
WorkAssistant.find(chat.id).complete

Prompt conventions are built in (app/prompts/<agent_name>/instructions.txt.erb).

More on agents: https://rubyllm.com/agents

☁️ Bedrock Converse API: Full Bedrock Coverage

RubyLLM now uses Bedrock Converse API, which means every Bedrock chat model is supported through one consistent path.

chat = RubyLLM.chat(
  model: "anthropic.claude-haiku-4-5-20251001-v1:0",
  provider: :bedrock
)

response = chat.ask("Give me three ideas for reducing API latency.")

If it runs on Bedrock, RubyLLM can talk to it.

☁️ Azure AI Foundry API Support

RubyLLM now supports Azure AI Foundry, giving you broad model access on Azure with the same RubyLLM interface.

RubyLLM.configure do |config|
  config.azure_api_key = ENV["AZURE_API_KEY"]
  config.azure_api_base = ENV["AZURE_API_BASE"]
end

chat = RubyLLM.chat(model: "gpt-4.1", provider: :azure)
response = chat.ask("Summarize this architecture in one paragraph.")

Same API, Azure-wide model availability.

🧠 Instruction Semantics Improved

with_instructions behavior is now clearer:

default call replaces the active system instruction
append behavior is explicit
instructions are always sent before other messages

chat.with_instructions("You are concise.")
chat.with_instructions("Use bullet points.", append: true)
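The three rules above (replace by default, explicit append, instructions first) can be modeled in a few lines of plain Ruby. `InstructionSketch` is a hypothetical class that only mimics the described semantics; it is not RubyLLM's implementation.

```ruby
class InstructionSketch
  def initialize
    @instructions = []
    @conversation = []
  end

  # Replaces the active instructions unless append: true is given.
  def with_instructions(text, append: false)
    @instructions = append ? @instructions + [text] : [text]
    self
  end

  def add_message(role, content)
    @conversation << { role: role, content: content }
    self
  end

  # System instructions always come before the rest of the conversation.
  def messages
    @instructions.map { |text| { role: :system, content: text } } + @conversation
  end
end

chat = InstructionSketch.new
chat.add_message(:user, "Hi")
chat.with_instructions("You are verbose.")
chat.with_instructions("You are concise.")
chat.with_instructions("Use bullet points.", append: true)
puts chat.messages.map { |message| message[:content] }.inspect
```

The first instruction is discarded by the second call, the third is appended, and both surviving instructions precede the user message even though it was added first.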

🤝 Contributor + Provider Guidance Expanded

We clarified how contributions should flow so reviews are faster and less surprising.

What we ask now:

Open an issue first and wait for maintainer feedback before coding new features.
Keep PRs focused and reasonably sized.
If you used AI tooling, you still own the code: understand every line before opening the PR.

Provider-specific direction is also clearer:

Core providers have a high acceptance bar.
For smaller or emerging providers, we usually prefer a community gem over adding it to RubyLLM core.

Net effect: less churn in review, clearer expectations up front.

📚 Docs & DX Polishes

A bunch of quality-of-life improvements shipped alongside core features:

updated guides around agents and configuration
docs UX improvements (copy page button, dark mode polish)

Installation

gem "ruby_llm", "1.12.0"

Upgrading from 1.11.x

bundle update ruby_llm

Full Changelog: 1.11.0...1.12.0

16 Jan 17:47

crmne

1.11.0

1ad64c8

RubyLLM 1.11: xAI Provider & Grok Models 🚀🤖⚡

This release welcomes xAI as a first-class provider, brings Grok models into the registry, and polishes docs around configuration and thinking. Plug in your xAI API key and start chatting with Grok in seconds.

🚀 xAI Provider (Hello, Grok!)

Use xAI’s OpenAI-compatible API via a dedicated provider and jump straight into chat:

RubyLLM.configure do |config|
  config.xai_api_key = ENV["XAI_API_KEY"]
end

chat = RubyLLM.chat(model: "grok-4-fast-non-reasoning")
response = chat.ask("What's the fastest way to parse a CSV in Ruby?")
response.content

xAI is now a first-class provider (:xai) with OpenAI-compatible endpoints under the hood.
Grok models are included in the registry so you can pick by name without extra wiring.
Streaming, tool calls, and structured output work the same way you already use with OpenAI-compatible providers.

Stream responses just like you’re used to:

chat = RubyLLM.chat(model: "grok-3-mini")

chat.ask("Summarize this PR in 3 bullets") do |chunk|
  print chunk.content
end

🧩 Model Registry Refresh

Model metadata and the public models list were refreshed to include Grok models and related updates.

📚 Docs Polishes

Configuration docs now include xAI setup examples.
The thinking guide got a tighter flow and clearer examples.

🛠️ Provider Fixes

Resolved an OpenAI, Bedrock, and Anthropic error introduced by the new URI interface.

Installation

gem "ruby_llm", "1.11.0"

Upgrading from 1.10.x

bundle update ruby_llm

Merged PRs

Add xAI Provider by @infinityrobot and @crmne in #373

Full Changelog: 1.10.0...1.11.0

Contributors

crmne and infinityrobot

13 Jan 19:03

crmne

1.10.0

f3dfe31

RubyLLM 1.10: Extended Thinking, Persistent Thoughts & Streaming Fixes 🧠✨🚆

This release brings first-class extended thinking across providers, full Gemini 3 Pro/Flash thinking-signature support (chat + tools), a Rails upgrade path to persist it, and a tighter streaming pipeline. Plus official Ruby 4.0 support, safer model registry refreshes, a Vertex AI global endpoint fix, and a docs refresh.

🧠 Extended Thinking Everywhere

Tune reasoning depth and budget across providers with with_thinking, and get thinking output back when available:

chat = RubyLLM.chat(model: "claude-opus-4.5")
  .with_thinking(effort: :high, budget: 8000)

response = chat.ask("Prove it with numbers.")
response.thinking&.text
response.thinking&.signature
response.thinking_tokens

response.thinking and chunk.thinking expose thinking content during normal and streaming requests.
response.thinking_tokens and response.tokens.thinking track thinking token usage when providers report it.
Gemini 3 Pro/Flash fully support thought signatures across chat and tool calls, so multi-step sessions stay consistent.
Extended thinking quirks are now normalized across providers so you can tune one API and get predictable output.

Stream thinking and answer content side-by-side:

chat = RubyLLM.chat(model: "claude-opus-4.5")
  .with_thinking(effort: :medium)

chat.ask("Solve this step by step: What is 127 * 43?") do |chunk|
  print chunk.thinking&.text
  print chunk.content
end

Streaming stays backward-compatible: existing apps can keep printing chunk.content, while richer UIs can also render chunk.thinking.
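
To make the backward-compatibility point concrete, here is a minimal sketch that separates thinking from answer text while streaming. It runs offline: `FakeChunk` and `FakeThinking` are hypothetical stand-ins that only mimic the `chunk.content` and `chunk.thinking&.text` accessors shown above, and the chunk contents are invented for illustration.

```ruby
# Hypothetical stand-ins for streamed chunks. In a real app the chunks
# are yielded by chat.ask with a block, as in the example above.
FakeThinking = Struct.new(:text)
FakeChunk = Struct.new(:content, :thinking)

chunks = [
  FakeChunk.new(nil, FakeThinking.new("127 * 40 = 5080; ")),
  FakeChunk.new(nil, FakeThinking.new("127 * 3 = 381")),
  FakeChunk.new("127 * 43 = 5461", nil)
]

thinking_buffer = +""
answer_buffer = +""

chunks.each do |chunk|
  # Safe navigation yields nil when a chunk carries no thinking;
  # to_s turns that nil into an empty string before appending.
  thinking_buffer << (chunk.thinking&.text).to_s
  # Code that only reads chunk.content keeps working unchanged.
  answer_buffer << chunk.content.to_s
end

puts "Thinking: #{thinking_buffer}"
puts "Answer:   #{answer_buffer}"
```

Keeping two buffers lets a richer UI render the reasoning in a collapsible panel while the answer streams into the main view.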

🧰 Rails + ActiveRecord Persistence

Thinking output can now be stored alongside messages (text, signature, and token usage), with an upgrade generator for existing apps:

rails generate ruby_llm:upgrade_to_v1_10
rails db:migrate

Adds thinking_text, thinking_signature, and thinking_tokens to message tables.
Adds thought_signature to tool calls for Gemini tool calling.
Fixes a Rails streaming issue where the first tokens could be dropped.

📊 Unified Token Tracking

All token counts now live in response.tokens and message.tokens, including input, output, cached, cache creation, and thinking tokens.
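
The unified shape can be pictured as one value object with a bucket per token type. The Struct below is a stand-in for illustration only, not RubyLLM's own class. The field names `cached` and `cache_creation` are assumptions derived from the list above, while `input`, `output`, and `thinking` appear elsewhere in these notes.

```ruby
# Illustrative stand-in, not RubyLLM's class. Field names are partly assumed.
TokenUsage = Struct.new(:input, :output, :cached, :cache_creation, :thinking,
                        keyword_init: true) do
  # Providers may not report every bucket, so return only the ones present.
  def reported
    to_h.compact
  end
end

usage = TokenUsage.new(input: 120, output: 45, thinking: 300)

puts usage.thinking          # a reported bucket
puts usage.cached.inspect    # an unreported bucket stays nil
puts usage.reported.inspect  # only the buckets the provider returned
```

Treating unreported buckets as `nil` rather than `0` keeps "the provider reported zero" distinct from "the provider reported nothing".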

✅ Official Ruby 4.0 Support

Ruby 4.0 is now officially supported in CI and dependencies.

🧩 Model Registry Updates

Refreshing the registry no longer deletes models from providers you haven't configured.
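
One way to picture this behavior is a merge that replaces entries only for providers that were actually refreshed. The sketch below is an assumption about the general approach, not RubyLLM's registry code. `merge_registry`, the hash shape, and the sample data are all hypothetical.

```ruby
# Hypothetical sketch: models are plain hashes with :id and :provider keys.
def merge_registry(existing, fetched, configured_providers)
  # Keep every existing model whose provider was not part of this refresh.
  preserved = existing.reject { |model| configured_providers.include?(model[:provider]) }
  # Models from configured providers are replaced by the freshly fetched list.
  preserved + fetched
end

existing = [
  { id: "gpt-4o", provider: :openai },
  { id: "grok-3-mini", provider: :xai }
]
fetched = [
  { id: "gpt-4o", provider: :openai },
  { id: "gpt-4o-mini", provider: :openai }
]

# Only :openai is configured, so the :xai entry must survive the refresh.
merged = merge_registry(existing, fetched, [:openai])
puts merged.map { |model| model[:id] }.inspect
```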

🌍 Vertex AI Global Endpoint Fix

When vertexai_location is global, the API base now correctly resolves to:

https://aiplatform.googleapis.com/v1beta1
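
As a rough illustration of the rule above, the sketch below builds the base URL from a location string. The helper name `vertex_api_base` is hypothetical and this is not RubyLLM's actual implementation. The regional host pattern (`<location>-aiplatform.googleapis.com`) is assumed from Google's usual Vertex AI endpoint naming.

```ruby
# Hypothetical helper, not RubyLLM's internal code: resolves a Vertex AI
# API base URL from a location string.
def vertex_api_base(location)
  host = if location.to_s == "global"
           # The global endpoint has no location prefix in its hostname.
           "aiplatform.googleapis.com"
         else
           # Assumed regional pattern, e.g. us-central1-aiplatform.googleapis.com
           "#{location}-aiplatform.googleapis.com"
         end
  "https://#{host}/v1beta1"
end

puts vertex_api_base("global")
puts vertex_api_base("us-central1")
```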

📚 Docs Updates

New extended thinking guide.
Token usage docs include thinking tokens.

Installation

gem "ruby_llm", "1.10.0"

Upgrading from 1.9.x

bundle update ruby_llm
rails generate ruby_llm:upgrade_to_v1_10
rails db:migrate

Merged PRs

Fix Vertex AI Global Endpoint URL Construction by @NielsKSchjoedt in #553

New Contributors

@NielsKSchjoedt made their first contribution in #553

Full Changelog: 1.9.2...1.10.0

Contributors

NielsKSchjoedt
