crmne
diff --git a/.rubocop.yml b/.rubocop.yml
Lines changed: 9 additions & 5 deletions
diff --git a/README.md b/README.md
Lines changed: 7 additions & 4 deletions
diff --git a/bin/test b/bin/test
Lines changed: 10 additions & 0 deletions
diff --git a/docs/_advanced/agentic-workflows.md b/docs/_advanced/agentic-workflows.md
Lines changed: 57 additions & 1 deletion
diff --git a/docs/_advanced/instrumentation.md b/docs/_advanced/instrumentation.md
Lines changed: 99 additions & 0 deletions
diff --git a/docs/_advanced/models.md b/docs/_advanced/models.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/_advanced/rails.md b/docs/_advanced/rails.md
Lines changed: 12 additions & 14 deletions
diff --git a/docs/_advanced/upgrading.md b/docs/_advanced/upgrading.md
Lines changed: 60 additions & 1 deletion
@@ -14,15 +14,19 @@ AllCops:
   SuggestExtensions: false
 
 Metrics/ClassLength:
-  Enabled: false
+  Max: 550
 Metrics/AbcSize:
-  Enabled: false
+  Max: 80
 Metrics/CyclomaticComplexity:
-  Enabled: false
+  Max: 25
 Metrics/MethodLength:
-  Enabled: false
+  Max: 80
+  CountAsOne:
+    - array
+    - hash
+    - heredoc
 Metrics/ModuleLength:
-  Enabled: false
+  Max: 550
 Performance/CollectionLiteralInLoop:
   Exclude:
     - spec/**/*
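Note on `CountAsOne` in the hunk above: RuboCop folds each listed construct (here arrays, hashes, and heredocs) into a single line when it measures method length, so a method whose body is mostly one literal stays well under `Max`. A minimal illustration follows; the method and its contents are hypothetical and do not come from this repository.

```ruby
# Hypothetical method, for illustration only.
# With CountAsOne: [array, hash, heredoc], Metrics/MethodLength treats the
# whole hash literal as one line, so this body measures 2 lines instead of 7.
def default_request_options
  options = {
    timeout: 30,
    retries: 3,
    backoff: 0.5,
    stream: false
  }
  options.freeze
end
```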
 
@@ -5,9 +5,10 @@
   <img src="/docs/assets/images/logotype.svg" alt="RubyLLM" height="120" width="250">
 </picture>
 
-<strong>One *beautiful* Ruby API for GPT, Claude, Gemini, and more.</strong>
+<strong>One <em>delightful</em> Ruby framework for every major AI provider. Build AI agents, chatbots, RAG apps, and multimodal workflows in beautiful, expressive code.
+</strong>
 
-Battle tested at [<picture><source media="(prefers-color-scheme: dark)" srcset="https://chatwithwork.com/logotype-dark.svg"><img src="https://chatwithwork.com/logotype.svg" alt="Chat with Work" height="30" align="absmiddle"></picture>](https://chatwithwork.com) — *Your AI coworker*
+Battle tested at [<picture><source media="(prefers-color-scheme: dark)" srcset="https://chatwithwork.com/logotype-dark.svg"><img src="https://chatwithwork.com/logotype.svg" alt="Chat with Work" height="30" align="absmiddle"></picture>](https://chatwithwork.com) - *Fully private work AI*
 
 [![Gem Version](https://badge.fury.io/rb/ruby_llm.svg)](https://badge.fury.io/rb/ruby_llm)
 [![Ruby Style Guide](https://img.shields.io/badge/code_style-rubocop-brightgreen.svg)](https://github.com/rubocop/rubocop)
@@ -24,15 +25,15 @@ Battle tested at [<picture><source media="(prefers-color-scheme: dark)" srcset="
 
 Build chatbots, AI agents, RAG applications. Works with OpenAI, xAI, Anthropic, Google, AWS, local models, and any OpenAI-compatible API.
 
-## From zero to AI chat app in under two minutes
+## Build a working Ruby AI chat in two minutes
 
 https://github.com/user-attachments/assets/65422091-9338-47da-a303-92b918bd1345
 
 ## Why RubyLLM?
 
 Every AI provider ships their own bloated client. Different APIs. Different response formats. Different conventions. It's exhausting.
 
-RubyLLM gives you one beautiful API for all of them. Same interface whether you're using GPT, Claude, or your local Ollama. Just three dependencies: Faraday, Zeitwerk, and Marcel. That's it.
+RubyLLM gives you one beautiful framework for all of them. Same interface whether you're using GPT, Claude, or your local Ollama. Just three dependencies: Faraday, Zeitwerk, and Marcel. That's it.
 
 ## Show me the code
 
@@ -138,6 +139,8 @@ response = chat.with_schema(ProductSchema).ask "Analyze this product", with: "pr
 * **Async:** Fiber-based concurrency
 * **Model registry:** 800+ models with capability detection and pricing
 * **Extended thinking:** Control, view, and persist model deliberation
+* **Citations:** Normalized source citations from documents, search, and grounding
+* **Batches:** Provider-side batch processing at half price with `RubyLLM.batch`
 * **Providers:** OpenAI, xAI, Anthropic, Gemini, VertexAI, Bedrock, DeepSeek, Mistral, Ollama, OpenRouter, Perplexity, GPUStack, and any OpenAI-compatible API
 
 ## Installation
 
@@ -0,0 +1,10 @@
+#!/usr/bin/env bash
+set -euo pipefail
+IFS=$'\n\t'
+export RUBYOPT="${RUBYOPT:-} -w"
+
+if [ "$#" -eq 0 ]; then
+  exec bundle exec bin/rspec-queue
+else
+  exec bundle exec rspec "$@"
+fi
@@ -1,7 +1,7 @@
 ---
 layout: default
 title: Agentic Workflows
-nav_order: 5
+nav_order: 6
 description: Build workflow-oriented AI systems with plain Ruby orchestration, from routing and parallelization to RAG
 ---
 
@@ -108,6 +108,62 @@ workflow = ModelRouterWorkflow.new
 response = workflow.call("Write a Ruby function to parse JSON")
 ```
 
+### Handoff Workflow
+
+Routing picks an agent up front. A handoff goes further: it hands the *same ongoing conversation* to a different agent, so the specialist sees the full history and the user keeps one continuous thread. `Agent.find(chat_id)` is what makes this work. It loads a persisted chat and applies a different agent's instructions and tools at runtime, leaving the messages intact, so whichever agent takes over inherits the whole conversation.
+
+The simplest version is a single turn. Give a router an inline schema so it answers with the specialist to hand to, then let that specialist continue the persisted chat:
+
+```ruby
+class SupportRouter < RubyLLM::Agent
+  schema do
+    string :specialist, enum: %w[billing technical]
+  end
+  instructions "Pick the specialist best suited to the latest message."
+end
+
+SPECIALISTS = { "billing" => BillingAgent, "technical" => TechnicalAgent }
+
+def handle(chat_id, message)
+  specialist = SupportRouter.new.ask(message).content["specialist"]
+  SPECIALISTS.fetch(specialist).find(chat_id).ask(message)
+end
+```
+
+The router responds immediately with its choice (no tool round-trip), and the specialist takes over the same chat with full history. Outside Rails, pass the chat object with `SpecialistAgent.new(chat: chat)` instead of `find(chat_id)`.
+
+If you would rather the agent decide mid-conversation, give it a single handoff tool that returns the specialist to switch to. Drive the [loop]({% link _core_features/chat.md %}#driving-the-loop-yourself) a step at a time, and when a tool result names an agent, hand off:
+
+```ruby
+class Handoff < RubyLLM::Tool
+  description "Hand the conversation to the specialist who should answer next"
+  param :specialist, desc: "billing or technical"
+  def execute(specialist:) = specialist
+end
+
+class SupportRouter < RubyLLM::Agent
+  chat_model Chat
+  instructions "Call handoff with the specialist who should take over."
+  tools Handoff
+end
+
+AGENTS = { "billing" => BillingAgent, "technical" => TechnicalAgent }
+
+agent = SupportRouter.find(chat_id)
+agent.ask_later(message)
+
+until agent.complete?
+  agent.step
+  last = agent.messages.reload.last
+  specialist = AGENTS[last.content] if last.role == "tool"
+  agent = specialist.find(chat_id) if specialist
+end
+```
+
+The handoff tool just returns the name; the orchestrator owns the routing, watching each tool result and swapping agents when one names a specialist. Because the routing lives in the loop and not in any agent, this extends to a multi-router for free: give the specialists the same `Handoff` tool and they can route onward, with every hop handled the same way.
+
+(A tool cannot reconfigure the chat it runs inside, since its `execute` never receives the chat, so the switch belongs in the loop, not in the tool.)
+
 ### Parallel Workflow
 
 Use this pattern when independent analyses can run at the same time.
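
Note on the Handoff Workflow hunk above: the point that routing lives in the loop, while agents only name a successor, can be tried without RubyLLM. The sketch below uses toy stand-ins. `Message`, `ToyAgent`, and `run_handoff` are all hypothetical and are not part of the library; the code only mirrors the shape of the documented loop.

```ruby
# Toy stand-ins, NOT the RubyLLM API. A message is a role plus content, and a
# toy agent is a name plus a callable that produces the next message.
Message = Struct.new(:role, :content)

ToyAgent = Struct.new(:name, :responder) do
  # Append this agent's next message to the shared history.
  def step(history)
    history << responder.call(history)
  end
end

# The orchestrator owns the routing: it inspects each new message and swaps
# the active agent whenever a tool result names a known specialist.
def run_handoff(start_agent, agents, history, max_steps: 5)
  agent = start_agent
  max_steps.times do
    agent.step(history)
    last = history.last
    return history if last.role == "assistant" # a final answer ends the loop

    next_agent = last.role == "tool" ? agents[last.content] : nil
    agent = next_agent if next_agent
  end
  history
end

router = ToyAgent.new("router", ->(_history) { Message.new("tool", "billing") })
billing = ToyAgent.new(
  "billing",
  ->(history) { Message.new("assistant", "Billing here, with #{history.size} earlier messages in view.") }
)
technical = ToyAgent.new("technical", ->(_history) { Message.new("assistant", "Technical here.") })

agents = { "billing" => billing, "technical" => technical }
history = [Message.new("user", "Why was I charged twice?")]
run_handoff(router, agents, history)
```

Because the specialist appends to the same `history` array the router used, it sees every earlier message, which is the property the persisted chat provides in the real workflow.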
 
@@ -0,0 +1,99 @@
+---
+layout: default
+title: Instrumentation
+nav_order: 5
+description: Observe RubyLLM requests, chats, tool calls, embeddings, and model refreshes
+redirect_from:
+  - /guides/instrumentation
+---
+
+# {{ page.title }}
+{: .no_toc .d-inline-block }
+
+v1.16.0+
+{: .label .label-green }
+
+{{ page.description }}.
+{: .fs-6 .fw-300 }
+
+## Table of contents
+{: .no_toc .text-delta }
+
+1. TOC
+{:toc}
+
+---
+
+After reading this guide, you will know:
+
+*   How to subscribe to RubyLLM events in Rails.
+*   How to connect RubyLLM instrumentation outside Rails.
+*   Which events RubyLLM emits.
+*   Which payload fields may contain sensitive application data.
+
+## Rails
+
+Rails apps automatically emit RubyLLM events through `ActiveSupport::Notifications`. Subscribe to them the same way you would subscribe to Rails framework events:
+
+```ruby
+# config/initializers/ruby_llm_instrumentation.rb
+ActiveSupport::Notifications.subscribe('chat.ruby_llm') do |_name, _start, _finish, _id, payload|
+  Rails.logger.info(
+    provider: payload[:provider],
+    model: payload[:model],
+    input_tokens: payload[:input_tokens],
+    output_tokens: payload[:output_tokens]
+  )
+end
+```
+
+When an instrumented block raises, Rails adds the standard `:exception` and `:exception_object` payload keys.
+
+## Outside Rails
+
+Outside Rails, set `config.instrumenter` to any object that responds to `instrument(name, payload) { ... }`:
+
+```ruby
+class AppInstrumenter
+  def instrument(name, payload)
+    started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
+
+    result = yield if block_given?
+    result
+  rescue StandardError => error
+    payload = payload.merge(
+      exception: [error.class.name, error.message],
+      exception_object: error
+    )
+    raise
+  ensure
+    duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at
+    Observability.record(name, payload.merge(duration: duration))
+  end
+end
+
+RubyLLM.configure do |config|
+  config.instrumenter = AppInstrumenter.new
+end
+```
+
+You can also set `instrumenter` on a [context]({% link _getting_started/configuration.md %}#contexts-isolated-configurations) when you only want instrumentation around a specific operation.
+
+## Events
+
+RubyLLM emits these events:
+
+*   `request.ruby_llm` - HTTP request metadata such as provider, method, URL, and status
+*   `chat.ruby_llm` - chat completion metadata including model, provider, messages, response, and token usage
+*   `tool_call.ruby_llm` - tool name, arguments, and result
+*   `embedding.ruby_llm` - embedding model, input, result, token usage, and vector dimensions
+*   `image.ruby_llm` - image generation model, prompt, size, and result
+*   `moderation.ruby_llm` - moderation model, input, result, and flagged status
+*   `transcription.ruby_llm` - transcription model, language, result, and token usage
+*   `models.refresh.ruby_llm` - model registry refresh metadata
+
+## Payloads
+
+Payloads include the Ruby objects needed by observability adapters, but message content, tool arguments, and provider responses may be sensitive. Only export or log those fields when your application policy allows it.
+
+Non-Rails instrumenters control their own error payload behavior. If your instrumenter records exceptions, keep those payloads consistent with the rest of your observability stack.
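
Note on the Payloads section above: one way to honor the sensitivity warning outside Rails is to filter the payload inside the instrumenter before anything is recorded. The sketch below relies only on the `instrument(name, payload) { ... }` contract stated in the guide. The class name, the sink, and the list of sensitive keys are assumptions for illustration; check the payloads your RubyLLM version actually emits before relying on any key name.

```ruby
# Hypothetical instrumenter: records each event to a sink after dropping
# fields that may carry sensitive data. The key names are assumptions.
class RedactingInstrumenter
  DEFAULT_SENSITIVE_KEYS = %i[messages response arguments result input prompt].freeze

  # sink: any object that responds to call(name, payload)
  def initialize(sink, sensitive_keys: DEFAULT_SENSITIVE_KEYS)
    @sink = sink
    @sensitive_keys = sensitive_keys
  end

  def instrument(name, payload = {})
    started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield if block_given?
  ensure
    duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at
    # Filter after the block has run, in case fields were added during it.
    safe = payload.reject { |key, _value| @sensitive_keys.include?(key) }
    @sink.call(name, safe.merge(duration: duration))
  end
end

# Usage with an in-memory sink standing in for a real observability backend.
events = []
instrumenter = RedactingInstrumenter.new(->(name, payload) { events << [name, payload] })
answer = instrumenter.instrument("chat.ruby_llm", { model: "example-model", messages: ["secret"] }) { :done }
```

The block's return value passes through unchanged, and errors still propagate after the event is recorded, so the wrapper should not alter the behavior of the instrumented call.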
@@ -2,7 +2,7 @@
 layout: default
 title: Model Registry
 nav_order: 4
-description: Access hundreds of AI models from all major providers with one simple API
+description: Access hundreds of AI models from all major AI providers with one Ruby framework
 redirect_from:
   - /guides/models
 ---
 
@@ -285,14 +285,19 @@ RubyLLM.configure do |config|
   config.anthropic_api_key = ENV['ANTHROPIC_API_KEY']
   config.gemini_api_key = ENV['GEMINI_API_KEY']
 
-  # New apps: Use modern API (generator adds this)
-  config.use_new_acts_as = true
-
   # For custom Model class names (defaults to 'Model')
   # config.model_registry_class = 'AIModel'
 end
 ```
 
+### Instrumentation
+{: .d-inline-block }
+
+v1.16.0+
+{: .label .label-green }
+
+Rails apps automatically emit RubyLLM events through `ActiveSupport::Notifications`. See [Instrumentation]({% link _advanced/instrumentation.md %}) for events, payloads, and non-Rails instrumenters.
+
 ### Fiber-Safe ActiveRecord Connections for Async/Fiber Workloads
 {: .d-inline-block }
 
@@ -313,13 +318,6 @@ Why: Rails defaults to thread-based connection isolation. In fiber-heavy flows,
 
 ### Setting Up Models with `acts_as` Helpers
 
-> **New in v1.7.0:** Rails-like `acts_as` API with association names!
-> - **New apps**: Generator sets `config.use_new_acts_as = true` for modern API
-> - **Existing apps**: Continue using legacy API (with deprecation warning)
-> - **Migrate today**: Set `config.use_new_acts_as = true` to use the better API
-> - **Legacy API removed in 2.0**: The new API will become the only option
-{: .warning }
-
 Add RubyLLM capabilities to your models:
 
 #### With Model Registry (Default for new apps)
@@ -372,7 +370,7 @@ end
 Pre-1.7.0 or opt-in
 {: .label .label-yellow }
 
-> Default behavior for existing apps. Set `config.use_new_acts_as = true` to upgrade! Legacy API will be removed in 2.0.
+> Set `config.use_new_acts_as = false` to stay on this API until it is removed in 2.0.
 {: .note }
 
 ```ruby
@@ -573,9 +571,9 @@ When using the Model registry (created by default by the generator), your chats
 ```ruby
 # String automatically resolves to Model record
 chat = Chat.create!(model: '{{ site.models.openai_standard }}')
-chat.model # => #<Model model_id: "gpt-4o", provider: "openai">
-chat.model.name # => "GPT-4"
-chat.model.context_window # => 128000
+chat.model # => #<Model model_id: "{{ site.models.openai_standard }}", provider: "openai">
+chat.model.name # => "GPT-5.4"
+chat.model.context_window # => 1050000
 chat.model.supports_vision # => true
 
 # Populate/refresh models from models.json (v1.13+)
 
@@ -1,7 +1,7 @@
 ---
 layout: default
 title: Upgrading
-nav_order: 6
+nav_order: 7
 description: Upgrade guides for changes in data formats
 redirect_from:
   - /upgrading-to-1-7
@@ -23,6 +23,57 @@ redirect_from:
 This guide focuses on upgrade-impacting changes: migrations, token semantics, deprecations, and compatibility notes. It is not a complete changelog. For every feature, fix, and patch note, see the [GitHub releases](https://github.com/crmne/ruby_llm/releases).
 {: .note }
 
+---
+# Upgrade to 2.0
+
+2.0 is currently in development.
+{: .note }
+
+## How to Upgrade
+
+```bash
+# Run the upgrade generator
+bin/rails generate ruby_llm:upgrade_to_v2_0
+
+# Run migrations
+bin/rails db:migrate
+```
+
+The generator adds a JSON `citations` column to your messages table so [citations]({% link _core_features/citations.md %}) are persisted with each assistant message. The column is optional — without it, citations remain available on in-memory responses but aren't saved.
+
+## Providers and Protocols Split
+
+RubyLLM 2.0 separates providers (host, auth, catalog) from protocols (wire format). The public chat API is unchanged — `RubyLLM.chat`, `with_params`, `embed`, `paint`, and the Rails integration all work as before. Two things changed underneath:
+
+**OpenAI now defaults to the Responses API.** This unlocks reasoning models with tools and extended thinking together. To stay on Chat Completions:
+
+```ruby
+RubyLLM.configure do |config|
+  config.openai_protocol = :chat_completions
+end
+
+# or per chat
+RubyLLM.chat(model: 'gpt-5.4').with_protocol(:chat_completions)
+```
+
+If you pass Chat Completions–only params via `with_params` (like `response_format`), either switch those chats to `:chat_completions` or use the Responses API equivalents (`text: { format: ... }`).
+
+**Wire-format internals moved to `RubyLLM::Protocols`.** `RubyLLM::Providers::OpenAI::Chat` and sibling modules are now `RubyLLM::Protocols::ChatCompletions::Chat` and friends; Anthropic, Gemini, and Bedrock Converse internals moved the same way. Provider classes no longer inherit from each other (`Mistral < OpenAI` is gone) — a provider declares its protocols instead.
+
+If you maintain a provider gem, subclass a protocol for your dialect and declare it in a thin provider:
+
+```ruby
+class MyProvider < RubyLLM::Provider
+  class ChatCompletions < RubyLLM::Protocols::ChatCompletions
+    # your overrides
+  end
+
+  protocol :chat_completions, ChatCompletions
+end
+```
+
+For routing details and per-chat overrides, see [Choosing the Wire Protocol]({% link _core_features/chat.md %}#choosing-the-wire-protocol).
+
 ---
 # Upgrade to 1.15
 
@@ -189,6 +240,14 @@ Your existing 1.6 app continues working without any changes. You'll see a deprec
 !!! RubyLLM's legacy acts_as API is deprecated and will be removed in RubyLLM 2.0.0.
 ```
 
+You can silence or raise RubyLLM deprecations while upgrading:
+
+```ruby
+RubyLLM.configure do |config|
+  config.deprecation_behavior = :silence # or :raise
+end
+```
+
 ## What's New in 1.7
 
 Among other features, the DB-backed model registry replaces simple string fields with proper ActiveRecord associations. Additionally, the `acts_as` helpers have been redesigned with a more Rails-like API.