
feat(server): add Anthropic Messages API endpoint (/v1/messages) #778

Open

carlushuang wants to merge 3 commits into main from carhuang/enable_anthropic_endp

Conversation

@carlushuang
Contributor

Summary

Add /v1/messages endpoint to ATOM's OpenAI server, enabling Claude Code and other Anthropic-compatible tools to use ATOM as a backend.

Depends on PR #775 (MiniMax M2.7 reasoning parser fix).

What it does

Translates between Anthropic Messages API format and ATOM's internal OpenAI format:

Claude Code CLI → /v1/messages (Anthropic format)
       ↓
serving_anthropic.py (format translation)
       ↓
ATOM engine (any model, e.g., MiniMax M2.7)
       ↓
GPU inference
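
For illustration, a minimal sketch of the request-side translation, with a hypothetical helper name (the real converters live in serving_anthropic.py and also handle tool_use/tool_result blocks):

def anthropic_to_openai_messages(req: dict) -> list[dict]:
    """Flatten an Anthropic Messages request into OpenAI chat messages."""
    messages = []
    # Anthropic carries the system prompt as a top-level field, either a
    # string or a list of text content blocks; OpenAI expects a system message.
    system = req.get("system")
    if isinstance(system, list):
        system = "".join(b["text"] for b in system if b.get("type") == "text")
    if system:
        messages.append({"role": "system", "content": system})
    for msg in req.get("messages", []):
        content = msg["content"]
        if isinstance(content, list):
            # This sketch keeps only text blocks; tool_use/tool_result
            # blocks need their own translation.
            content = "".join(b["text"] for b in content if b.get("type") == "text")
        messages.append({"role": msg["role"], "content": content})
    return messages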

Features

  • Non-streaming and streaming responses
  • Anthropic SSE event format: message_start, content_block_start, content_block_delta, content_block_stop, message_delta, message_stop (an example event stream is shown after this list)
  • Thinking/reasoning separation: <think> blocks → thinking content blocks (via ReasoningFilter)
  • System messages: string or content-block array
  • Tool definitions: Anthropic → OpenAI format translation
  • Tool use/result messages: bidirectional translation
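
A streaming response with a thinking block roughly follows the event order below (JSON payloads elided or abbreviated; exact bodies depend on the request and model output):

event: message_start
data: {"type": "message_start", "message": {...}}

event: content_block_start
data: {"type": "content_block_start", "index": 0, "content_block": {"type": "thinking", "thinking": ""}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "..."}}

event: content_block_stop
data: {"type": "content_block_stop", "index": 0}

event: content_block_start
data: {"type": "content_block_start", "index": 1, "content_block": {"type": "text", "text": ""}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 1, "delta": {"type": "text_delta", "text": "Hello"}}

event: content_block_stop
data: {"type": "content_block_stop", "index": 1}

event: message_delta
data: {"type": "message_delta", "delta": {"stop_reason": "end_turn"}, "usage": {...}}

event: message_stop
data: {"type": "message_stop"}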

New files

  • atom/entrypoints/openai/serving_anthropic.py — request/response schemas, format converters, SSE helpers
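
As one example of the converters, a sketch of the tool-definition translation (function name is hypothetical): Anthropic keeps the JSON schema under input_schema, while OpenAI nests it under function.parameters.

def anthropic_tool_to_openai(tool: dict) -> dict:
    """Translate one Anthropic tool definition to OpenAI function format."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": tool["input_schema"],
        },
    }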

Usage with Claude Code

# 1. Start ATOM
python -m atom.entrypoints.openai_server --model MiniMaxAI/MiniMax-M2.7 \
  --trust-remote-code --kv_cache_dtype fp8 -tp 2 --server-port 8000

# 2. Configure Claude Code (~/.claude/settings.json)
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:8000",
    "ANTHROPIC_AUTH_TOKEN": "dummy",
    "ANTHROPIC_MODEL": "MiniMax-M2.7",
    "DISABLE_PROMPT_CACHING": "1"
  }
}

# 3. Use Claude Code
claude
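
To smoke-test the endpoint without Claude Code, a direct request along these lines should work (values are illustrative; depending on the server's validation, headers like anthropic-version may not be required):

curl -s http://localhost:8000/v1/messages \
  -H "content-type: application/json" \
  -d '{
        "model": "MiniMax-M2.7",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Say hello world"}]
      }'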

Verified on

  • MiniMax M2.7 on MI355X (gfx950), TP=2, FP8 KV
  • Claude Code --print "Say hello world" → Hello, world!
  • Claude Code --print "Write is_prime function" → correct Python code
  • Streaming and non-streaming both work
  • Thinking content properly separated into thinking blocks

Test plan

  • Non-streaming /v1/messages returns correct Anthropic format
  • Streaming /v1/messages returns correct SSE events
  • Thinking/reasoning separated into thinking content blocks
  • Claude Code end-to-end: hello world, code generation, math (an SDK-based check is sketched after this list)
  • Tool calling (needs model with tool-call support)
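
The same checks can be scripted with the official Anthropic Python SDK pointed at the local server (model name and port follow the usage example above):

from anthropic import Anthropic

client = Anthropic(base_url="http://localhost:8000", api_key="dummy")

# Non-streaming: response.content is a list of content blocks
# (thinking and/or text).
response = client.messages.create(
    model="MiniMax-M2.7",
    max_tokens=256,
    messages=[{"role": "user", "content": "Say hello world"}],
)
print([block.type for block in response.content])

# Streaming: text_stream yields text deltas as they arrive.
with client.messages.stream(
    model="MiniMax-M2.7",
    max_tokens=256,
    messages=[{"role": "user", "content": "Say hello world"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)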

MiniMax M2.7's chat template injects <think> as part of the prompt, so
the model output contains only </think> (no <think> start tag). The
reasoning parser now splits at </think> even without a preceding <think>
in both non-streaming (separate_reasoning) and streaming (ReasoningFilter)
paths.
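
A minimal sketch of that split rule (simplified signature; the real logic lives in separate_reasoning and ReasoningFilter):

def split_reasoning(text: str) -> tuple[str, str]:
    """Split model output into (reasoning, content).

    Handles both "<think>...</think>answer" and the MiniMax M2.7 case,
    where the chat template already injected <think> so the output
    contains only the closing tag.
    """
    end = text.find("</think>")
    if end == -1:
        return "", text  # no reasoning to separate
    reasoning = text[:end]
    if reasoning.startswith("<think>"):
        reasoning = reasoning[len("<think>"):]
    return reasoning.strip(), text[end + len("</think>"):].lstrip()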
Replace MiniMax-M2.5 → M2.7 and M2.5-MXFP4 → M2.7-MXFP4 across all
benchmark and accuracy configs. Same architecture (MiniMaxM2ForCausalLM),
M2.7 has better-trained weights.

Updated accuracy baselines from M2.7 HF card: gsm8k=0.9181 (BF16),
MXFP4=0.9189. MXFP4 model: amd/MiniMax-M2.7-MXFP4 (Quark quantized).

Local perf verified on MI355X: M2.7 BF16 TP=2 matches M2.5 dashboard
numbers within noise (817 vs 808 tok/s at c=4, 4745 vs 4685 at c=64).
@carlushuang force-pushed the carhuang/enable_anthropic_endp branch 3 times, most recently from 581f897 to 4c104f9 on May 13, 2026 at 22:49
Enables Claude Code and other Anthropic-compatible tools to use ATOM
as a backend. Translates between Anthropic Messages format and ATOM's
internal OpenAI format.

Supports:
- Non-streaming and streaming responses
- System messages, multi-turn conversations
- Thinking/reasoning content separation (via ReasoningFilter)
- Anthropic SSE event format (message_start, content_block_delta, etc.)
- Tool definitions translation (Anthropic → OpenAI format)

Usage with Claude Code:
  ANTHROPIC_BASE_URL=http://localhost:8000 \
  ANTHROPIC_AUTH_TOKEN=dummy \
  ANTHROPIC_MODEL=MiniMax-M2.7 \
  claude
@carlushuang force-pushed the carhuang/enable_anthropic_endp branch from 4c104f9 to 298a7a8 on May 14, 2026 at 02:38
