
Commit d1ae15a

dalton-cole and claude committed
Add Ollama Cloud provider
Ollama's hosted service (https://ollama.com) exposes OpenAI-compatible endpoints at /v1 with Bearer-token auth. Add :ollama_cloud as a dedicated provider inheriting from the existing Ollama provider so chat, streaming, media, and dynamic model listing all work unchanged, while correctly reporting as remote, requiring an API key, and defaulting api_base to https://ollama.com/v1.

Two class-level overrides are load-bearing:

- `slug` returns "ollama_cloud": the default "ollamacloud" would mismatch the :ollama_cloud registration symbol and break Model::Info#provider lookups.
- `assume_models_exist?` returns true: cloud models are dynamic and not in the static registry; the existing Ollama provider gets the same shortcut via its `local?` flag, for which OllamaCloud correctly returns false.

Models.dev already catalogs Ollama Cloud under the key "ollama-cloud", so the MODELS_DEV_PROVIDER_MAP entry wires its 37 models into the shared registry with full metadata (context_window, max_output_tokens, capabilities). Adding ollama_cloud_api_key to models.rake's configure_from_env lets the maintainer's next `rake models:update` populate the shipped models.json.

Verified live against the hosted API: /v1/models returns the expected OpenAI list shape, sync and streaming chat both work on gpt-oss:120b, ConfigurationError is raised when the key is missing, and `RubyLLM.models.refresh!` populates 38 entries (37 from models.dev + 1 additional from the live provider listing).

8 VCR cassettes recorded (4 basic chat, 1 streaming, 3 thinking). Two reasoning-model quirks match the existing ollama/qwen3 skip pattern: system-prompt replacement and streaming-vs-sync token count drift.

Resolves #740.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
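The slug mismatch described in the commit message can be illustrated with a small standalone sketch. It assumes the default slug is derived by downcasing the last segment of the class name, which is consistent with the "ollamacloud" value quoted above but is not copied from RubyLLM's source. `BaseProvider`, `OllamaCloudFixed`, `REGISTRY`, and `lookup` are hypothetical names used only for illustration.

```ruby
# Hypothetical stand-ins; these are not RubyLLM's actual classes.
class BaseProvider
  # Assumed default: last segment of the class name, downcased.
  def self.slug
    name.split('::').last.downcase
  end
end

# Relies on the assumed default, so the slug has no underscore.
class OllamaCloud < BaseProvider; end

# Overrides the slug so it matches the registration symbol.
class OllamaCloudFixed < BaseProvider
  def self.slug
    'ollama_cloud'
  end
end

# A registry keyed by symbol, as in a `register :ollama_cloud, ...` call.
REGISTRY = { ollama_cloud: OllamaCloudFixed }.freeze

# Resolve a provider class from a slug string; nil when nothing matches.
def lookup(slug)
  REGISTRY[slug.to_sym]
end
```

Under these assumptions, `lookup(OllamaCloud.slug)` finds nothing because `:ollamacloud` is not a registered key, while the overridden slug resolves to the registered class.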
1 parent 4371a1b commit d1ae15a

23 files changed

Lines changed: 858 additions & 2 deletions

.env.example

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ GPUSTACK_API_BASE=http://localhost:11444/v1
 GPUSTACK_API_KEY=$(op read "op://RubyLLM/GPUStack/credential")
 MISTRAL_API_KEY=$(op read "op://RubyLLM/Mistral/credential")
 OLLAMA_API_BASE=http://localhost:11434/v1
+OLLAMA_CLOUD_API_KEY=$(op read "op://RubyLLM/Ollama Cloud/credential")
 OPENAI_API_KEY=$(op read "op://RubyLLM/OpenAI/credential")
 OPENROUTER_API_KEY=$(op read "op://RubyLLM/OpenRouter/credential")
 PERPLEXITY_API_KEY=$(op read "op://RubyLLM/Perplexity/credential")

README.md

Lines changed: 1 addition & 1 deletion
@@ -140,7 +140,7 @@ response = chat.with_schema(ProductSchema).ask "Analyze this product", with: "pr
 * **Async:** Fiber-based concurrency
 * **Model registry:** 800+ models with capability detection and pricing
 * **Extended thinking:** Control, view, and persist model deliberation
-* **Providers:** OpenAI, xAI, Anthropic, Gemini, VertexAI, Bedrock, DeepSeek, Mistral, Ollama, OpenRouter, Perplexity, GPUStack, and any OpenAI-compatible API
+* **Providers:** OpenAI, xAI, Anthropic, Gemini, VertexAI, Bedrock, DeepSeek, Mistral, Ollama, Ollama Cloud, OpenRouter, Perplexity, GPUStack, and any OpenAI-compatible API
 
 ## Installation

docs/_getting_started/configuration.md

Lines changed: 30 additions & 0 deletions
@@ -83,6 +83,10 @@ RubyLLM.configure do |config|
   config.ollama_api_base = 'http://localhost:11434/v1'
   config.ollama_api_key = ENV['OLLAMA_API_KEY'] # Available in v1.13.0+ (optional for authenticated/remote Ollama endpoints)
 
+  # Ollama Cloud
+  config.ollama_cloud_api_key = ENV['OLLAMA_CLOUD_API_KEY'] # Required. Keys: https://ollama.com/settings/keys
+  config.ollama_cloud_api_base = ENV['OLLAMA_CLOUD_API_BASE'] # Optional. Defaults to https://ollama.com/v1
+
   # OpenAI
   config.openai_api_key = ENV['OPENAI_API_KEY']
   config.openai_api_base = ENV['OPENAI_API_BASE'] # Optional custom OpenAI-compatible endpoint
@@ -166,6 +170,28 @@ end
 
 By default, RubyLLM uses the 'developer' role (matching OpenAI's current API). Set `openai_use_system_role` to true for compatibility with servers that still expect 'system'.
 
+### Ollama Cloud
+
+Ollama's hosted service exposes OpenAI-compatible endpoints at `https://ollama.com/v1` with Bearer-token auth. Keys are issued at [ollama.com/settings/keys](https://ollama.com/settings/keys).
+
+```ruby
+RubyLLM.configure do |config|
+  config.ollama_cloud_api_key = ENV['OLLAMA_CLOUD_API_KEY']
+end
+
+chat = RubyLLM.chat(
+  model: 'gpt-oss:120b',
+  provider: :ollama_cloud,
+  assume_model_exists: true
+)
+chat.ask('Hello from the cloud')
+```
+
+Cloud-capable models include `gpt-oss:120b`, `gpt-oss:120b-cloud`, `qwen3-coder:480b-cloud`, and `deepseek-v3.1:671b-cloud`. Models are discovered dynamically via `/v1/models`; pass `assume_model_exists: true` until you run `RubyLLM.models.refresh!`.
+
+> Ollama Cloud is billed by subscription tier (Free / Pro $20/mo / Max $100/mo), not per-token, so `Message#input_tokens` and `Message#output_tokens` are reported but `Model::Info#pricing` will be empty. See [ollama.com/pricing](https://ollama.com/pricing) for current tiers.
+{: .note }
+
 ### Gemini API Versions
 {: .d-inline-block }
 
@@ -484,6 +510,10 @@ RubyLLM.configure do |config|
   config.ollama_api_base = String
   config.ollama_api_key = String # v1.13.0+
 
+  # Ollama Cloud
+  config.ollama_cloud_api_key = String
+  config.ollama_cloud_api_base = String
+
   # OpenAI
   config.openai_api_key = String
   config.openai_api_base = String
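The documentation added above describes an OpenAI-compatible `/v1` endpoint with Bearer-token auth and dynamic discovery via `/v1/models`. As a rough sketch of what such a request looks like without RubyLLM, the snippet below builds a model-listing request with Ruby's standard `net/http` but does not send it. `build_models_request` is a hypothetical helper, and `'example-key'` is a placeholder rather than a working credential.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: build (but do not send) a GET request for the
# model listing, using the default base URL quoted in the docs above.
def build_models_request(api_key, api_base = 'https://ollama.com/v1')
  # The trailing slash keeps URI.join from discarding the /v1 segment.
  uri = URI.join("#{api_base}/", 'models')
  request = Net::HTTP::Get.new(uri)
  request['Authorization'] = "Bearer #{api_key}"
  request
end

# Falls back to a placeholder when the environment variable is unset.
request = build_models_request(ENV.fetch('OLLAMA_CLOUD_API_KEY', 'example-key'))

# To actually send it (requires a valid key and network access):
#   uri = request.uri
#   Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
```

Building the request separately from sending it makes the URL and header easy to inspect before any credentials leave the machine.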

docs/index.md

Lines changed: 1 addition & 1 deletion
@@ -214,7 +214,7 @@ response = chat.with_schema(ProductSchema).ask "Analyze this product", with: "pr
 * **Async:** Fiber-based concurrency
 * **Model registry:** 800+ models with capability detection and pricing
 * **Extended thinking:** Control, view, and persist model deliberation
-* **Providers:** OpenAI, xAI, Anthropic, Gemini, VertexAI, Bedrock, DeepSeek, Mistral, Ollama, OpenRouter, Perplexity, GPUStack, and any OpenAI-compatible API
+* **Providers:** OpenAI, xAI, Anthropic, Gemini, VertexAI, Bedrock, DeepSeek, Mistral, Ollama, Ollama Cloud, OpenRouter, Perplexity, GPUStack, and any OpenAI-compatible API
 
 ## Installation

lib/ruby_llm.rb

Lines changed: 1 addition & 0 deletions
@@ -101,6 +101,7 @@ def logger
 RubyLLM::Provider.register :gpustack, RubyLLM::Providers::GPUStack
 RubyLLM::Provider.register :mistral, RubyLLM::Providers::Mistral
 RubyLLM::Provider.register :ollama, RubyLLM::Providers::Ollama
+RubyLLM::Provider.register :ollama_cloud, RubyLLM::Providers::OllamaCloud
 RubyLLM::Provider.register :openai, RubyLLM::Providers::OpenAI
 RubyLLM::Provider.register :openrouter, RubyLLM::Providers::OpenRouter
 RubyLLM::Provider.register :perplexity, RubyLLM::Providers::Perplexity

lib/ruby_llm/models.rb

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ class Models
 'amazon-bedrock' => 'bedrock',
 'deepseek' => 'deepseek',
 'mistral' => 'mistral',
+'ollama-cloud' => 'ollama_cloud',
 'openrouter' => 'openrouter',
 'perplexity' => 'perplexity'
 }.freeze
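The map entry above translates the models.dev catalog key "ollama-cloud" into the provider slug "ollama_cloud". A minimal sketch of how such a map might be applied when building registry entries follows. `PROVIDER_MAP`, `registry_entries`, the catalog shape, and the `'unlisted-provider'` key are all assumptions made for illustration and do not reflect RubyLLM's actual internals.

```ruby
# Hypothetical subset of a catalog-key => provider-slug map.
PROVIDER_MAP = {
  'mistral' => 'mistral',
  'ollama-cloud' => 'ollama_cloud'
}.freeze

# Turn a { catalog_key => [model ids] } hash into flat registry entries,
# silently skipping catalog keys that have no mapped provider.
def registry_entries(catalog)
  catalog.flat_map do |catalog_key, model_ids|
    slug = PROVIDER_MAP[catalog_key]
    next [] unless slug

    model_ids.map { |id| { id: id, provider: slug } }
  end
end

# Assumed catalog shape, for illustration only.
catalog = {
  'ollama-cloud' => ['gpt-oss:120b'],
  'unlisted-provider' => ['some-model']
}
entries = registry_entries(catalog)
```

In this sketch a provider missing from the map contributes no entries at all, which is why a single added key is enough to bring a whole catalog section into the registry.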
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+# frozen_string_literal: true
+
+module RubyLLM
+  module Providers
+    # Ollama Cloud API integration.
+    class OllamaCloud < Ollama
+      def api_base
+        @config.ollama_cloud_api_base || 'https://ollama.com/v1'
+      end
+
+      def headers
+        { 'Authorization' => "Bearer #{@config.ollama_cloud_api_key}" }
+      end
+
+      class << self
+        def slug
+          'ollama_cloud'
+        end
+
+        def configuration_options
+          %i[ollama_cloud_api_base ollama_cloud_api_key]
+        end
+
+        def configuration_requirements
+          %i[ollama_cloud_api_key]
+        end
+
+        def local?
+          false
+        end
+
+        def assume_models_exist?
+          true
+        end
+
+        def capabilities
+          Ollama::Capabilities
+        end
+      end
+    end
+  end
+end
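The provider above falls back to a default base URL and treats the API key as required. The standalone sketch below mimics that behaviour outside RubyLLM so it can be run in isolation. `Config`, `CloudSettings`, and `MissingKeyError` are hypothetical stand-ins; the real provider relies on RubyLLM's own configuration object and, per the commit message, raises `ConfigurationError` when the key is missing.

```ruby
# Hypothetical stand-in for the configuration object.
Config = Struct.new(:ollama_cloud_api_base, :ollama_cloud_api_key, keyword_init: true)

# Hypothetical error class for this sketch only.
class MissingKeyError < StandardError; end

# Mirrors the api_base fallback and Authorization header shown in the diff.
class CloudSettings
  DEFAULT_API_BASE = 'https://ollama.com/v1'

  def initialize(config)
    # Treat nil and empty strings alike as "no key configured".
    raise MissingKeyError, 'ollama_cloud_api_key is required' if config.ollama_cloud_api_key.to_s.empty?

    @config = config
  end

  # Use the configured base when present, otherwise the hosted default.
  def api_base
    @config.ollama_cloud_api_base || DEFAULT_API_BASE
  end

  def headers
    { 'Authorization' => "Bearer #{@config.ollama_cloud_api_key}" }
  end
end

settings = CloudSettings.new(Config.new(ollama_cloud_api_key: 'example-key'))
settings.api_base # falls back to DEFAULT_API_BASE because no base was set
```

Checking for the key at construction time, as done here, surfaces a misconfiguration before any request is attempted; whether RubyLLM validates at the same point is not shown in this diff.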

lib/tasks/models.rake

Lines changed: 1 addition & 0 deletions
@@ -45,6 +45,7 @@ def configure_from_env
 config.deepseek_api_key = ENV.fetch('DEEPSEEK_API_KEY', nil)
 config.gemini_api_key = ENV.fetch('GEMINI_API_KEY', nil)
 config.mistral_api_key = ENV.fetch('MISTRAL_API_KEY', nil)
+config.ollama_cloud_api_key = ENV.fetch('OLLAMA_CLOUD_API_KEY', nil)
 config.openai_api_key = ENV.fetch('OPENAI_API_KEY', nil)
 config.openrouter_api_key = ENV.fetch('OPENROUTER_API_KEY', nil)
 config.perplexity_api_key = ENV.fetch('PERPLEXITY_API_KEY', nil)

spec/fixtures/vcr_cassettes/chat_basic_chat_functionality_ollama_cloud_gpt-oss_120b_can_handle_multi-turn_conversations.yml

Lines changed: 123 additions & 0 deletions
Generated file; diff not rendered by default.

spec/fixtures/vcr_cassettes/chat_basic_chat_functionality_ollama_cloud_gpt-oss_120b_can_have_a_basic_conversation.yml

Lines changed: 57 additions & 0 deletions
Generated file; diff not rendered by default.
