
Feat/ollama model #1322

Open
eliasubz wants to merge 3 commits into EvolvingLMMs-Lab:main from eliasubz:feat/ollama-model

Conversation

eliasubz commented May 5, 2026

Summary

  • Adds an ollama chat backend for local inference through Ollama's OpenAI-compatible /v1 API (see the sketch after this list).
  • Enables local text and vision generate_until evals with Ollama models.
  • Adds unit coverage for registration, initialization, base URL handling, and explicit loglikelihood non-support.
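
For orientation, the core mechanism is an OpenAI-style chat-completions client pointed at Ollama's local /v1 server. The sketch below is illustrative only, not the PR's code: the class shape, constructor arguments, and the simplified generate_until signature are assumptions (real lmms-eval backends consume Instance objects carrying stop sequences and images).

```python
from openai import OpenAI


class Ollama:
    """Sketch of a chat backend for Ollama's OpenAI-compatible /v1 API."""

    def __init__(self, model_version: str, base_url: str = "http://localhost:11434/v1"):
        self.model_version = model_version
        # Ollama ignores the API key, but the openai client requires one to be set.
        self.client = OpenAI(base_url=base_url, api_key="ollama")

    def generate_until(self, prompts: list[str], max_new_tokens: int = 256) -> list[str]:
        # Plain strings keep this sketch self-contained; the real backend
        # would unpack lmms-eval Instances (prompt, gen kwargs, images).
        outputs = []
        for prompt in prompts:
            resp = self.client.chat.completions.create(
                model=self.model_version,
                messages=[{"role": "user", "content": prompt}],
                max_tokens=max_new_tokens,
            )
            outputs.append(resp.choices[0].message.content)
        return outputs
```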

In scope

  • Adds lmms_eval/models/chat/ollama.py.
  • Registers "ollama": "Ollama" in AVAILABLE_CHAT_TEMPLATE_MODELS.
  • Adds test/models/test_ollama.py for backend registration and constructor behavior.
  • Supports generate_until through Ollama's OpenAI-compatible chat completions endpoint.
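
For illustration, the registry change itself is a single mapping entry; the surrounding dict here is a hypothetical reconstruction, since the module that defines AVAILABLE_CHAT_TEMPLATE_MODELS isn't quoted in this summary:

```python
# Hypothetical surrounding context; only the "ollama" entry is new in this PR.
AVAILABLE_CHAT_TEMPLATE_MODELS = {
    # ... existing chat backends ...
    "ollama": "Ollama",  # lets --model ollama resolve to the class in chat/ollama.py
}
```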

Out of scope

  • Does not add loglikelihood support; Ollama returns logprobs only for generated tokens, not the prompt/continuation likelihoods lmms-eval requires (see the sketch after this list).
  • Does not add video or audio support.
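
On the first point, a backend in this position typically fails fast rather than returning misleading numbers. Extending the sketch under Summary (the method name follows the harness convention; the exact message is illustrative):

```python
# Method on the Ollama class sketched above; `requests` would be lmms-eval Instances.
def loglikelihood(self, requests):
    # Ollama only reports logprobs for tokens it generates, so the
    # prompt/continuation scoring that likelihood tasks need is unavailable.
    raise NotImplementedError(
        "Ollama's API does not return prompt/continuation likelihoods; "
        "use generate_until tasks instead."
    )
```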

Validation

  • Ran various vision and text evals on models such as smollm2, llava, and moondream.
  • uv --cache-dir .\.uv-cache run --with pytest python -m pytest test/models/test_ollama.py -v | sample size: N=7 tests | key metrics: 7 passed | result: pass
  • uv --cache-dir .\.uv-cache run python -m lmms_eval --model ollama --model_args model_version=smollm2:135m --include_path C:\tmp\lmms_tasks --tasks gsm8k_ollama_1shot --limit 8 --batch_size 2 | sample size: N=8 | key metrics: flexible-extract exact_match=0.125, strict-match exact_match=0.000 | result: pass
  • uv --cache-dir .\.uv-cache run python -m lmms_eval --model ollama --model_args model_version=<vision-model> --tasks ok_vqa_val2014_lite --limit 1 --batch_size 1 | sample size: N=1 | key metrics: vision generate_until completed and produced metrics table | result: pass

Risk / Compatibility

  • Low risk: this adds a new model backend and does not change existing model or task behavior.
  • Reproducibility only requires a local Ollama install with the evaluated model pulled (e.g. ollama pull smollm2:135m).

Type of Change

  • Bug fix (non-breaking change)
  • New feature
  • New benchmark/task
  • New model integration
  • Breaking change
  • Documentation update
  • Refactoring (no functional changes)

eliasubz (Author) commented May 5, 2026

Hey, could someone review this PR? It just adds the Ollama interface to models/ to run local evals with text and images. I tested evals such as gsm8k with few-shot, vqa_val_lite, and mme. If there's anything I missed, just let me know!

kcz358 (Collaborator) left a comment


Hi, thank you for the contribution. It seems like this model just inherits the OpenAI model with some changes to the host or base URL, where everything can already be configured through the original OpenAI class itself? If this is just a self-hosted OpenAI-compatible server, I think you can just use the original OpenAI chat model instead of creating a new Ollama model.
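
Concretely, Ollama's /v1 endpoint works with the stock openai client given only a base-URL override; a minimal demonstration (the model name and prompt are illustrative, and Ollama ignores the API key although the client requires one):

```python
from openai import OpenAI

# Point the standard OpenAI client at a local Ollama server.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

resp = client.chat.completions.create(
    model="smollm2:135m",  # any locally pulled Ollama model
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
)
print(resp.choices[0].message.content)
```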

eliasubz (Author) commented May 6, 2026

Yes, that actually makes a lot of sense. The only reason for the PR I see now is discoverability: with an ollama model available, more people would know they can run local evals this way, because I don't think everyone is aware they can use the openai model with an Ollama self-host.
Regardless, do you know of any other needed contributions that don't have an open issue right now?
