
feat: add OpenAI /v1/completions adapter for vLLM gpt-oss-120b accuracy#308

Draft
arekay-nv wants to merge 1 commit into main from arekay/openai-completions-adapter

Conversation

@arekay-nv
Collaborator

Adds APIType.OPENAI_COMPLETIONS routing to /v1/completions, which accepts pre-tokenized token ID arrays and bypasses vLLM's chat template — required for gpt-oss-120b where the Harmony format must be applied client-side.

  • Add APIType.OPENAI_COMPLETIONS with default_route "/v1/completions"
  • Add TextCompletionRequest/Response/SSE msgspec types
  • Add OpenAITextCompletionsAdapter (mirrors SGLang adapter, reuses OpenAISSEAccumulator)
  • Register adapter and accumulator in endpoint_client/config.py
  • Rename gptoss → gptoss_sglang presets; add gptoss_vllm across aime25/gpqa/livecodebench
  • Update sglang_gptoss_120b_example.yaml to use gptoss_sglang presets
  • Update vllm_gptoss_120b_example.yaml to use openai_completions + gptoss_vllm presets
  • Add 18 unit tests covering adapter, SSE, preset existence, and APIType integration
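
To make the pre-tokenized flow concrete, here is a minimal sketch of the request body the new adapter would send to /v1/completions. The helper name and defaults are illustrative assumptions (the PR's actual types are msgspec structs); the key point is that `prompt` carries token IDs rather than text, so vLLM never applies its chat template and the Harmony format can be rendered client-side.

```python
import json

def build_completions_request(model: str, token_ids: list[int],
                              max_tokens: int = 16, stream: bool = False) -> bytes:
    """Hypothetical sketch of a /v1/completions payload with pre-tokenized input.

    The OpenAI completions schema accepts `prompt` as a string, a list of
    strings, or a list of token IDs; vLLM honors the token-ID form.
    """
    body = {
        "model": model,
        "prompt": token_ids,   # token IDs, not text: bypasses the server-side chat template
        "max_tokens": max_tokens,
        "stream": stream,
    }
    return json.dumps(body).encode("utf-8")

payload = build_completions_request("gpt-oss-120b", [200006, 17360, 200008], stream=True)
```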

fix: move lazy test imports to module level; fix decode_sse_message return type

  • Move all inline imports in test_completions_adapter.py to file-level
  • Add test for empty-text SSE choice path
  • Fix HttpRequestAdapter.decode_sse_message abstract annotation from str -> Any (SGLang and completions adapters both return SSEDelta structs, not str)
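
A rough sketch of the annotation fix, under the assumption that concrete adapters decode SSE chunks into a structured delta. The class and method names follow the PR description; the bodies and the `SSEDelta` shape are illustrative, not copied from the code.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any

@dataclass
class SSEDelta:
    """Illustrative structured delta; the real struct lives in the adapter code."""
    text: str

class HttpRequestAdapter(ABC):
    @abstractmethod
    def decode_sse_message(self, raw: bytes) -> Any:  # was annotated `-> str`
        """Concrete adapters may return structured deltas, not plain strings."""

class CompletionsAdapter(HttpRequestAdapter):
    def decode_sse_message(self, raw: bytes) -> SSEDelta:
        payload = json.loads(raw.removeprefix(b"data: "))
        # /v1/completions SSE chunks carry text under choices[0]["text"];
        # an empty or missing text field yields an empty delta.
        choices = payload.get("choices") or [{}]
        return SSEDelta(text=choices[0].get("text", ""))

delta = CompletionsAdapter().decode_sse_message(b'data: {"choices": [{"text": "hi"}]}')
```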

examples/04_GPTOSS120B_Example/Readme.md:

  • Replace stale chat-completions note with accurate openai_completions description
  • Update performance-only vLLM api_type reference from "openai" to "openai_completions"
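
The vLLM example config change described above might look roughly like the fragment below. The key names and layout are assumptions modeled on the bullet points, not copied from the PR's actual vllm_gptoss_120b_example.yaml:

```yaml
# vllm_gptoss_120b_example.yaml (hypothetical fragment)
endpoint:
  api_type: openai_completions   # routes requests to /v1/completions
benchmarks:
  - name: aime25
    preset: gptoss_vllm          # vLLM-specific preset, split from the old shared `gptoss`
```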

What does this PR do?

Type of change

  • Bug fix
  • New feature
  • Documentation update
  • Refactor/cleanup

Related issues

Testing

  • Tests added/updated
  • All tests pass locally
  • Manual testing completed

Checklist

  • Code follows project style
  • Pre-commit hooks pass
  • Documentation updated (if needed)

github-actions bot requested a review from nvzhihanj · May 9, 2026 11:38

github-actions Bot commented May 9, 2026

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@arekay-nv arekay-nv requested review from nv-alicheng and viraatc May 9, 2026 11:38

gemini-code-assist bot left a comment


Code Review

This pull request introduces a new openai_completions API type and adapter to support the OpenAI /v1/completions endpoint, enabling the use of pre-tokenized input with vLLM. This change allows users to bypass server-side chat templates, ensuring parity with SGLang results for specific models like gpt-oss-120b. The implementation includes the OpenAITextCompletionsAdapter, updated configuration templates, documentation, and new unit tests. I have no feedback to provide.

@arekay-nv arekay-nv requested a review from tianmu-li May 9, 2026 11:40
