
Commit 3bec469

Add documentation for running with Ramalama local model serving in OCI Containers (aaif-goose#1973)
Signed-off-by: Adam Miller <admiller@redhat.com>
1 parent 4a93d42 commit 3bec469

1 file changed

Lines changed: 101 additions & 2 deletions

File tree

documentation/docs/getting-started/providers.md

@@ -28,6 +28,7 @@ Goose relies heavily on tool calling capabilities and currently works best with
| [GitHub Copilot](https://docs.github.com/en/copilot/using-github-copilot/ai-models) | Access to GitHub Copilot's chat models including gpt-4o, o1, o3-mini, and Claude models. Uses device code authentication flow for secure access. | Uses GitHub device code authentication flow (no API key needed) |
| [Groq](https://groq.com/) | High-performance inference hardware and tools for LLMs. | `GROQ_API_KEY` |
| [Ollama](https://ollama.com/) | Local model runner supporting Qwen, Llama, DeepSeek, and other open-source models. **Because this provider runs locally, you must first [download and run a model](/docs/getting-started/providers#local-llms-ollama).** | `OLLAMA_HOST` |
| [Ramalama](https://ramalama.ai/) | Local model serving using native [OCI](https://opencontainers.org/) container runtimes and [CNCF](https://www.cncf.io/) tools, with support for models as OCI artifacts. Ramalama's API is compatible with Ollama's, so it can be used with the Goose Ollama provider. Supports Qwen, Llama, DeepSeek, and other open-source models. **Because this provider runs locally, you must first [download and run a model](/docs/getting-started/providers#local-llms-ollama).** | `OLLAMA_HOST` |
| [OpenAI](https://platform.openai.com/api-keys) | Provides gpt-4o, o1, and other advanced language models. Also supports OpenAI-compatible endpoints (e.g., self-hosted LLaMA, vLLM, KServe). **o1-mini and o1-preview are not supported because Goose uses tool calling.** | `OPENAI_API_KEY`, `OPENAI_HOST` (optional), `OPENAI_ORGANIZATION` (optional), `OPENAI_PROJECT` (optional), `OPENAI_CUSTOM_HEADERS` (optional) |
| [OpenRouter](https://openrouter.ai/) | API gateway for unified access to various models with features like rate-limiting management. | `OPENROUTER_API_KEY` |
| [Snowflake](https://docs.snowflake.com/user-guide/snowflake-cortex/aisql#choosing-a-model) | Access the latest models using Snowflake Cortex services, including Claude models. **Requires a Snowflake account and programmatic access token (PAT)**. | `SNOWFLAKE_HOST`, `SNOWFLAKE_TOKEN` |
@@ -275,9 +276,11 @@ To set up Google Gemini with Goose, follow these steps:
</Tabs>

-### Local LLMs (Ollama)
+### Local LLMs (Ollama or Ramalama)

-Ollama provides local LLMs, which requires a bit more set up before you can use it with Goose.
+Ollama and Ramalama both provide local LLMs; each requires a bit more setup before you can use it with Goose.
+
+#### Ollama

1. [Download Ollama](https://ollama.com/download).
2. Run any [model supporting tool-calling](https://ollama.com/search?c=tools):
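For example, a tool-calling model such as `qwen2.5` can be pulled and started as shown below. This is a minimal sketch; `qwen2.5` is only an assumed example, and any model from the tool-calling list works:

```sh
# Pull (if needed) and run a tool-calling capable model locally with Ollama.
ollama run qwen2.5
```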
@@ -374,6 +377,102 @@ If you're running Ollama on a different server, you'll have to set `OLLAMA_HOST=
└ Configuration saved successfully
```

#### Ramalama

1. [Download Ramalama](https://github.com/containers/ramalama?tab=readme-ov-file#install).
2. Run any Ollama [model supporting tool-calling](https://ollama.com/search?c=tools) or [GGUF-format Hugging Face model](https://huggingface.co/search/full-text?q=%22tools+support%22+%2B+%22gguf%22&type=model):

:::warning Limited Support for models without tool calling
Goose extensively uses tool calling, so models without it (e.g. `DeepSeek-r1`) can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). As an alternative, you can use a [custom DeepSeek-r1 model](/docs/getting-started/providers#deepseek-r1) we've made specifically for Goose.
:::

Example:

```sh
# NOTE: the --runtime-args="--jinja" flag is required for Ramalama to work with the Goose Ollama provider.
ramalama serve --runtime-args="--jinja" ollama://qwen2.5
```
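Before moving on, you can optionally confirm that the server is answering. A minimal sketch, assuming Ramalama's default port of `8080` and that the backing runtime exposes an OpenAI-compatible `/v1/models` endpoint; adjust the host and port if you changed them:

```sh
# Optional smoke test: ask the local server which models it is serving.
# Assumes Ramalama's default port (8080) and an OpenAI-compatible API.
curl -s http://0.0.0.0:8080/v1/models
```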
3. In a separate terminal window, configure with Goose:

```sh
goose configure
```
4. Choose `Configure Providers`

```
┌ goose-configure

◆ What would you like to configure?
│ ● Configure Providers (Change provider or update credentials)
│ ○ Toggle Extensions
│ ○ Add Extension

```
5. Choose `Ollama` as the model provider, since Ramalama is API-compatible and can be used with the Goose Ollama provider

```
┌ goose-configure

◇ What would you like to configure?
│ Configure Providers

◆ Which model provider should we use?
│ ○ Anthropic
│ ○ Databricks
│ ○ Google Gemini
│ ○ Groq
│ ● Ollama (Local open source models)
│ ○ OpenAI
│ ○ OpenRouter

```
6. Enter the host where your model is running

:::info Endpoint
For the Ollama provider, if you don't provide a host, we set it to `localhost:11434`. When constructing the URL, we prepend `http://` if the scheme is not `http` or `https`. Since Ramalama serves on port 8080 by default, set `OLLAMA_HOST=http://0.0.0.0:8080`.
:::

```
┌ goose-configure

◇ What would you like to configure?
│ Configure Providers

◇ Which model provider should we use?
│ Ollama

◆ Provider Ollama requires OLLAMA_HOST, please enter a value
│ http://0.0.0.0:8080

```
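Alternatively, since the provider table above lists `OLLAMA_HOST` as the Ollama provider's variable, you can export it in your shell before launching Goose instead of typing it at the prompt. A minimal sketch, assuming the default Ramalama port:

```sh
# Point the Goose Ollama provider at the local Ramalama server (default port 8080).
export OLLAMA_HOST=http://0.0.0.0:8080
goose configure
```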
7. Enter the model you have running

```
┌ goose-configure

◇ What would you like to configure?
│ Configure Providers

◇ Which model provider should we use?
│ Ollama

◇ Provider Ollama requires OLLAMA_HOST, please enter a value
│ http://0.0.0.0:8080

◇ Enter a model from that provider:
│ qwen2.5

◇ Welcome! You're all set to explore and utilize my capabilities. Let's get started on solving your problems together!

└ Configuration saved successfully
```
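Once the configuration is saved, you can start a session against the local model with the Goose CLI:

```sh
# Start an interactive Goose session using the provider and model saved above.
goose session
```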
### DeepSeek-R1

Ollama provides open source LLMs, such as `DeepSeek-r1`, that you can install and run locally.
