Skip to content

Add MiniMax as LLM provider backend#501

Open
octo-patch wants to merge 1 commit into
explosion:mainfrom
octo-patch:feature/add-minimax-provider
Open

Add MiniMax as LLM provider backend#501
octo-patch wants to merge 1 commit into
explosion:mainfrom
octo-patch:feature/add-minimax-provider

Conversation

@octo-patch
Copy link
Copy Markdown

Summary

Adds MiniMax as a first-class REST LLM provider backend for spacy-llm, alongside OpenAI, Anthropic, Cohere, and PaLM.

MiniMax offers an OpenAI-compatible chat completions API with models like MiniMax-M2.5 (204K context) and MiniMax-M2.7 (1M context), making it a compelling alternative for structured NLP pipelines.

Changes

  • New provider module spacy_llm/models/rest/minimax/ with model, registry, and init files
  • Registry entry spacy.MiniMax.v1 for seamless spaCy config integration
  • Think-tag stripping for reasoning model output
  • Context length definitions for M2.5, M2.5-highspeed, M2.7, M2.7-highspeed
  • Test suite with unit tests (context lengths, endpoints) and integration tests (API response validation, error handling)
  • Usage example config for text classification with MiniMax
  • README updated to list MiniMax as a supported provider

Usage

Python

import spacy

nlp = spacy.blank("en")
nlp.add_pipe(
    "llm",
    config={
        "model": {
            "@llm_models": "spacy.MiniMax.v1",
            "name": "MiniMax-M2.5",
            "config": {"temperature": 0.0},
        },
        "task": {
            "@llm_tasks": "spacy.TextCat.v2",
            "labels": ["POSITIVE", "NEGATIVE"],
        },
    },
)
doc = nlp("This is a great product!")
print(doc.cats)

Config file

[components.llm.model]
@llm_models = "spacy.MiniMax.v1"
name = "MiniMax-M2.5"
config = {"temperature": 0.0}

Supported Models

Model Context Length
MiniMax-M2.7 1,048,576
MiniMax-M2.7-highspeed 1,048,576
MiniMax-M2.5 204,800
MiniMax-M2.5-highspeed 204,800

Test Plan

  • Unit tests pass: context length definitions, endpoint constants
  • Integration tests pass: API response validation with M2.5 and M2.5-highspeed
  • Error handling tests pass: unsupported model, invalid config
  • Registry integration verified: spacy.MiniMax.v1 resolves correctly
  • Existing tests unaffected

Add MiniMax as a first-class REST LLM provider alongside OpenAI, Anthropic,
Cohere, and PaLM. MiniMax uses an OpenAI-compatible chat completions API
with support for M2.5, M2.5-highspeed, M2.7, and M2.7-highspeed models.

- New `spacy_llm/models/rest/minimax/` provider module (model, registry, init)
- Registry entry `spacy.MiniMax.v1` for spaCy config integration
- Think-tag stripping for reasoning model output
- Temperature clamping and context length definitions
- Test suite with unit and integration tests
- Usage example config for text classification with MiniMax
- README updated to list MiniMax as supported provider
Copy link
Copy Markdown

@JiwaniZakir JiwaniZakir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In model.py, the error-handling path inside the __call__ loop has a type mismatch bug: when "error" in responses, return responses["error"] returns a flat List[str] (e.g., ["<json error>", "<json error>"]), but the declared return type is Iterable[Iterable[str]]. Callers expecting a nested structure will silently iterate over individual characters of the error string rather than getting a proper response list. This should be return [responses["error"]] or appended to all_api_responses before returning, consistent with the non-error path.

Additionally, _verify_auth in model.py verifies credentials by making a live API call with self([["test"]]), which consumes billable tokens on every cold start or config validation. The other providers in this repo (e.g., OpenAI's implementation) typically use a lightweight /v1/models endpoint for this. Since MiniMax reportedly lacks that endpoint, it may be worth at minimum documenting this side-effect in a comment, or skipping the verification entirely and relying on the first real call to surface auth errors.

Finally, the _request closure is redefined on every iteration of the outer for prompts_for_doc in prompts loop — moving it outside the loop (accepting prompts_for_doc as a parameter) would be cleaner and avoids the closure capturing a loop variable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants