Ollama Integration with AIDD

AIDD now supports Ollama as a provider through the ZRun CLI backend, allowing you to run AI coding agents locally without API costs.

What is Ollama?

Ollama is a local LLM runner that lets you run models like Llama 3.1, Qwen, and Code Llama on your own machine. It provides an OpenAI-compatible API that works seamlessly with AIDD's ZRun agent.
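To make the "OpenAI-compatible API" concrete, here is a minimal sketch of the kind of request a client could send to Ollama's `/v1/chat/completions` endpoint. The helper name `buildChatRequest` and the `ChatMessage` type are illustrative, not part of ZRun; the default base URL is the one listed under Configuration Options.

```typescript
// Illustrative message shape for an OpenAI-style chat request.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper: builds the URL and fetch options for a chat completion call.
// It performs no network I/O itself, so it can be inspected or logged safely.
function buildChatRequest(
  model: string,
  messages: ChatMessage[],
  baseUrl = "http://localhost:11434/v1",
) {
  return {
    // Trim trailing slashes so "…/v1/" and "…/v1" produce the same URL.
    url: baseUrl.replace(/\/+$/, "") + "/chat/completions",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages, stream: false }),
    },
  };
}
```

With a server running, the result can be passed straight to `fetch(url, init)`; no API key header is needed for a default local Ollama install.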

Prerequisites

Install Ollama: Follow the official guide at https://ollama.ai/download
Pull a Model: Choose a model and pull it with `ollama pull <model>`
Start Ollama: Run `ollama serve` to start the server

Recommended Models

| Model | Size | RAM Required | Tool Support | Best For |
|---|---|---|---|---|
| `gpt-oss:20b` | 13GB | 16GB | ✅ | Default; passed the aidd quiz benchmark |
| `llama3.1:70b` | 40GB | 64GB | ✅ | Complex tasks (untested against aidd quiz) |
| `qwen2.5:32b` | 19GB | 32GB | ✅ | Code generation |
| `codellama:34b` | 19GB | 32GB | ✅ | Code-specific |
| `deepseek-coder-v2:16b` | 9.2GB | 16GB | ✅ | Programming |
| `llama3.1:8b` | 4.7GB | 8GB | ⚠️ | Lower-RAM fallback; tool-call reliability is hit-or-miss |

Why gpt-oss:20b is the default: in the 2026-04-19 codebase comprehension quiz run (benchmarks/fixtures/quiz/), gpt-oss:20b was the only installed local model that answered all four questions correctly. The other models tested (llama3.2:latest, qwen3.5:9b, and qwen2.5-7b-cline) all failed at the tool-call or response-writing layer. If you prefer the older llama3.1:8b default, pin it explicitly in `providers.ollama.model` or pass `--model llama3.1:8b` per run.
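As a quick way to apply the table above, the sketch below filters the recommended models by available RAM. The rows are copied from the table; the `ModelRow` type and `modelsThatFit` helper are illustrative names, not part of AIDD or ZRun.

```typescript
// One row of the Recommended Models table (only the columns needed here).
interface ModelRow {
  name: string;
  ramGb: number; // "RAM Required" column
  reliableTools: boolean; // true for ✅, false for ⚠️
}

const RECOMMENDED: ModelRow[] = [
  { name: "gpt-oss:20b", ramGb: 16, reliableTools: true },
  { name: "llama3.1:70b", ramGb: 64, reliableTools: true },
  { name: "qwen2.5:32b", ramGb: 32, reliableTools: true },
  { name: "codellama:34b", ramGb: 32, reliableTools: true },
  { name: "deepseek-coder-v2:16b", ramGb: 16, reliableTools: true },
  { name: "llama3.1:8b", ramGb: 8, reliableTools: false },
];

// Hypothetical helper: names of the models whose RAM requirement fits the machine.
// By default, models with unreliable tool calling are excluded.
function modelsThatFit(availableRamGb: number, requireTools = true): string[] {
  return RECOMMENDED
    .filter((m) => m.ramGb <= availableRamGb)
    .filter((m) => !requireTools || m.reliableTools)
    .map((m) => m.name);
}
```

For example, `modelsThatFit(16)` returns `["gpt-oss:20b", "deepseek-coder-v2:16b"]`, while `modelsThatFit(8, false)` returns only the `llama3.1:8b` fallback.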

Setup Instructions

1. Add Ollama to `zrun/config.json`

Both providers live in the same config file — add an ollama entry alongside your existing zhipu block (or copy zrun/config.json.example if starting fresh):

{
	"defaultProvider": "zhipu",
	"maxTurns": 500,
	"providers": {
		"ollama": {
			"model": "gpt-oss:20b"
		},
		"zhipu": {
			"apiKey": "your-z-ai-api-key",
			"model": "glm-5.1"
		}
	}
}

You never have to edit this file again to switch — pick a provider per run via CLI flag, env var, or model inference (see §3).

2. Verify Installation

Run the test script from the repo root:

bun zrun/test-ollama.ts

# or, if Ollama runs on a different host
bun zrun/test-ollama.ts --base-url http://192.168.1.100:11434

It probes /api/tags, lists installed models, and reports the OpenAI-compatible URL zrun will use at runtime. Exit code 0 means ready; non-zero means the server is unreachable or has no models pulled.
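The check described above can be approximated as follows. This is a sketch of the idea, not the contents of `zrun/test-ollama.ts`: the helper names are hypothetical, and the only external fact relied on is that Ollama's `GET /api/tags` returns a JSON object with a `models` array whose entries have a `name`.

```typescript
// The part of the /api/tags response this sketch reads.
interface TagsResponse {
  models?: { name: string }[];
}

// Hypothetical helper: map the Ollama server root to the OpenAI-compatible base URL.
function openAiBaseUrl(ollamaRoot: string): string {
  return ollamaRoot.replace(/\/+$/, "") + "/v1";
}

// Hypothetical helper: turn a parsed /api/tags body into a readiness summary.
function summarizeTags(ollamaRoot: string, body: TagsResponse) {
  const models = (body.models ?? []).map((m) => m.name).sort();
  return {
    ready: models.length > 0, // no pulled models means "not ready"
    models,
    openAiUrl: openAiBaseUrl(ollamaRoot),
  };
}

// Network wrapper: a rejected promise here corresponds to "server unreachable".
async function probe(ollamaRoot = "http://localhost:11434") {
  const res = await fetch(ollamaRoot.replace(/\/+$/, "") + "/api/tags");
  if (!res.ok) throw new Error(`Ollama responded with HTTP ${res.status}`);
  return summarizeTags(ollamaRoot, (await res.json()) as TagsResponse);
}
```

Keeping the parsing in `summarizeTags` separate from the network call makes the readiness logic testable without a running server.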

3. Use with AIDD

Four ways to pick Ollama for a run, in decreasing precedence:

# (a) Explicit --provider flag
./aidd.sh --cli zrun --project-dir ./myproject --provider ollama

# (b) ZRUN_PROVIDER env var (works through aidd.sh without any extra plumbing)
ZRUN_PROVIDER=ollama ./aidd.sh --cli zrun --project-dir ./myproject

# (c) Auto-inferred from --model — any known Ollama family name routes to ollama
./aidd.sh --cli zrun --project-dir ./myproject --model llama3.1:8b
./aidd.sh --cli zrun --project-dir ./myproject --model qwen2.5:32b

# (d) defaultProvider in config.json, once you're ready to make ollama sticky
#     ({ "defaultProvider": "ollama", ... })
./aidd.sh --cli zrun --project-dir ./myproject
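The precedence order above can be sketched as a single resolution function. This is an illustration of the rule, not ZRun's actual code: the function names are hypothetical, and the list of model-family prefixes is an assumption, since the families ZRun recognizes are not listed here.

```typescript
// The four inputs, in the order (a) through (d) described above.
interface ProviderInputs {
  providerFlag?: string; // (a) --provider
  envProvider?: string; // (b) ZRUN_PROVIDER
  modelFlag?: string; // (c) --model
  defaultProvider?: string; // (d) defaultProvider in config.json
}

// ASSUMPTION: illustrative prefixes only; the real list lives in ZRun.
const OLLAMA_FAMILY_PREFIXES = ["llama", "qwen", "codellama", "deepseek", "gpt-oss"];

// Returns "ollama" when the model name looks like a known Ollama family.
function inferProviderFromModel(model?: string): string | undefined {
  if (!model) return undefined;
  const lower = model.toLowerCase();
  return OLLAMA_FAMILY_PREFIXES.some((p) => lower.startsWith(p)) ? "ollama" : undefined;
}

// First non-empty source wins; "zhipu" is the documented default for defaultProvider.
function resolveProvider(inputs: ProviderInputs): string {
  return (
    inputs.providerFlag ||
    inputs.envProvider ||
    inferProviderFromModel(inputs.modelFlag) ||
    inputs.defaultProvider ||
    "zhipu"
  );
}
```

Using `||` rather than `??` treats an empty `ZRUN_PROVIDER=` as unset, which is usually what a shell user expects.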

Model Discovery

If you don't specify a model (no providers.ollama.model in config, no --model on the command line), ZRun will automatically:

Connect to your Ollama instance
List available models
Pick the provider's declared default (gpt-oss:20b) if it's installed; otherwise fall back to the alphabetically first installed model, so repeat runs are deterministic

You can see available models with:

curl http://localhost:11434/api/tags
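The selection rule described in this section is small enough to sketch directly. The function name `pickModel` is hypothetical and this is not ZRun's implementation; it only mirrors the documented behavior.

```typescript
// The provider's declared default, as stated above.
const DECLARED_DEFAULT = "gpt-oss:20b";

// Hypothetical helper mirroring the documented rule:
// 1. an explicit model (config or --model) always wins;
// 2. otherwise use the declared default if it is installed;
// 3. otherwise take the alphabetically first installed model.
// Returns undefined when nothing is installed.
function pickModel(installed: string[], explicit?: string): string | undefined {
  if (explicit) return explicit;
  if (installed.includes(DECLARED_DEFAULT)) return DECLARED_DEFAULT;
  // Copy before sorting so the caller's array is not mutated.
  return [...installed].sort()[0];
}
```

Sorting before taking the first entry is what makes the fallback deterministic: the result does not depend on the order in which the server happens to list its models.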

Performance Tips

Use GPU Acceleration: Install Ollama with GPU support for better performance
Choose Appropriate Model Size: Match model size to your available RAM
Adjust maxTurns: Reduce maxTurns in config for faster iteration
Use Smaller Contexts: Keep prompts focused to reduce token usage

Troubleshooting

"Ollama server is not running"

ollama serve

"No models found"

ollama pull gpt-oss:20b

Model doesn't respond to tool calls

Some models have limited function calling support. Try models marked with ✅ in the table above.

Performance is slow

Check whether the GPU is being used: while a model is loaded, `ollama ps` should list GPU in its PROCESSOR column
Consider a smaller model
Close other applications to free RAM

Configuration Options

Per-provider settings live under providers.ollama:

| Option | Required | Default | Description |
|---|---|---|---|
| `providers.ollama.baseUrl` | No | `http://localhost:11434/v1` | Ollama API endpoint |
| `providers.ollama.model` | No | `gpt-oss:20b` if installed, otherwise the alphabetically first installed model | Model to use (overridable per run via `--model`) |
| `providers.ollama.apiKey` | No | - | Not needed for Ollama |

Top-level settings that apply across all providers:

| Option | Required | Default | Description |
|---|---|---|---|
| `defaultProvider` | No | `"zhipu"` | Provider used when no `--provider` flag, `ZRUN_PROVIDER` variable, or `--model` hint is given |
| `maxTurns` | No | 500 | Maximum agent iterations |
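For readers who prefer types to tables, the options above can be written as a TypeScript shape plus a defaults step. The interface and function names are illustrative and are not taken from ZRun's source; the default values are the ones documented above.

```typescript
// Illustrative shape of zrun/config.json, limited to the documented options.
interface OllamaSettings {
  baseUrl?: string;
  model?: string;
  apiKey?: string; // not needed for Ollama
}

interface ZrunConfig {
  defaultProvider?: string;
  maxTurns?: number;
  providers?: {
    ollama?: OllamaSettings;
    zhipu?: { apiKey?: string; model?: string };
  };
}

// Hypothetical helper: fill in the documented defaults for anything left unset.
function withDefaults(cfg: ZrunConfig) {
  return {
    defaultProvider: cfg.defaultProvider ?? "zhipu",
    maxTurns: cfg.maxTurns ?? 500,
    ollama: {
      baseUrl: cfg.providers?.ollama?.baseUrl ?? "http://localhost:11434/v1",
      // Left undefined on purpose: an unset model is resolved by model discovery.
      model: cfg.providers?.ollama?.model,
    },
  };
}
```

A config file containing only `{}` would therefore resolve to the zhipu provider, 500 turns, and the local Ollama endpoint.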

Coexisting with Zhipu AI

There's nothing to migrate — both providers live in the same config file and you pick at runtime:

{
	"defaultProvider": "zhipu",
	"providers": {
		"ollama": { "model": "gpt-oss:20b" },
		"zhipu": { "apiKey": "your-z-ai-api-key", "model": "glm-5.1" }
	}
}

# Default (zhipu, per defaultProvider)
./aidd.sh --cli zrun --project-dir ./myproject

# Same machine, switch to ollama for one run
ZRUN_PROVIDER=ollama ./aidd.sh --cli zrun --project-dir ./myproject

# Or let --model decide
./aidd.sh --cli zrun --project-dir ./myproject --model gpt-oss:20b

If you want ollama to be the sticky default, flip defaultProvider to "ollama" — the zhipu block stays available for runs that pass --provider zhipu or ZRUN_PROVIDER=zhipu.

Security Considerations

When Ollama runs on the same machine as AIDD, your prompts and code never leave that machine
Ollama itself needs no API key and no external service
If your code is sensitive, make sure the Ollama port (11434 by default) is not exposed to untrusted networks
Model updates are manual: pull new versions when they become available
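One simple guard related to the points above is to check whether the configured `baseUrl` points at the local machine before sending sensitive code to it. The helper below is a sketch with a hypothetical name. It inspects only the client-side URL and does not tell you which network interface the Ollama server itself is listening on.

```typescript
// Hypothetical helper: true when the URL's host is a loopback address,
// i.e. requests stay on this machine.
function isLoopbackBaseUrl(baseUrl: string): boolean {
  const host = new URL(baseUrl).hostname;
  return (
    host === "localhost" ||
    host === "::1" ||
    host === "[::1]" || // URL.hostname keeps the brackets for IPv6 literals
    /^127(\.\d{1,3}){3}$/.test(host) // 127.0.0.0/8, matched as a dotted quad
  );
}
```

For instance, the default `http://localhost:11434/v1` passes the check, while a LAN address such as `http://192.168.1.100:11434/v1` does not, which is a cue to confirm that host is one you trust.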

Advanced Usage

Custom Ollama Server

Point at a remote Ollama host:

{
	"providers": {
		"ollama": {
			"baseUrl": "http://192.168.1.100:11434/v1",
			"model": "gpt-oss:20b"
		}
	}
}

Multiple Model Support

Swap models per task without editing config:

# Lightweight model for quick tasks (auto-infers ollama from the model name)
./aidd.sh --cli zrun --model qwen2.5:7b --max-iterations 5

# Heavier model for complex features
./aidd.sh --cli zrun --model llama3.1:70b --max-iterations 20

Limitations

Hardware Dependent: Performance depends on your CPU/GPU
Model Quality: Local models may be less capable than GPT-4/Claude
Token Limits: Smaller context windows than cloud models
Tool Support: Varies by model (check documentation)

Getting Help

Ollama documentation: https://github.com/ollama/ollama
Model list: https://ollama.ai/library
AIDD issues: https://github.com/NomadicDaddy/aidd/issues
