feat(together-ai): update model YAMLs [bot] #919

Merged
harshiv-26 merged 3 commits into main from bot/update-together-ai-20260502-023135 on May 6, 2026

Conversation

@harshiv-26 (Collaborator) commented May 2, 2026:

Auto-generated by poc-agent for provider together-ai.


Note: Medium Risk

Primarily metadata changes, but updates to provisioning, context_window limits, and new deprecation dates could affect model selection and runtime routing if consumers rely on these fields.

Overview
Updates Together.ai provider model YAMLs with refreshed capabilities and lifecycle metadata.

Notable changes:
  • Reduced context_window for MiniMax-M2.5*.
  • Provisioning updates (e.g., MiniMax-M2.5 and DeepSeek-R1-Original to provisioned).
  • New features/modalities flags (e.g., system_messages, video/image inputs).
  • Added deprecationDate on several Qwen/DeepSeek models.
  • Expanded/normalized sources, plus a few status/thinking flag adjustments (e.g., imagen-4.0-preview to status: preview).
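
Since the risk note flags routing impact, here is a minimal sketch of how a consumer of these YAMLs might honor the updated lifecycle fields when selecting a model. The field names (context_window, deprecationDate, status) are taken from the summary above; the directory layout and helper names are hypothetical, not this repo's API.

Code sketch:
# A minimal sketch, assuming each model YAML carries the fields named in
# this PR summary (context_window, deprecationDate, status). The directory
# layout and selection logic are hypothetical, not part of this repo.
from datetime import date
from pathlib import Path

import yaml  # PyYAML


def load_models(models_dir: str) -> list[dict]:
    """Load every model YAML under models_dir into a list of dicts."""
    return [yaml.safe_load(p.read_text()) for p in Path(models_dir).glob("*.yaml")]


def selectable(model: dict, prompt_tokens: int, today: date | None = None) -> bool:
    """Return True if the model is safe to route to for a prompt of this size."""
    today = today or date.today()
    # Skip models whose deprecationDate (ISO date) has already passed.
    deprecation = model.get("deprecationDate")
    if deprecation and date.fromisoformat(str(deprecation)) <= today:
        return False
    # Respect the (possibly reduced) context_window limit.
    if prompt_tokens > model.get("context_window", 0):
        return False
    # Treat preview-status models as opt-in only.
    return model.get("status") != "preview"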

Reviewed by Cursor Bugbot for commit 438d118.

@github-actions bot (Contributor) commented May 2, 2026:

/test-models

@harshiv-26 (Collaborator, Author) commented:

Gateway test results

  • Total: 112
  • Passed: 52
  • Failed: 21
  • Validation failed: 27
  • Errored: 0
  • Skipped: 12
  • Success rate: 52.0% (52 passed of the 100 non-skipped)
| Provider | Model | Scenarios |
| --- | --- | --- |
| together-ai | MiniMaxAI/MiniMax-M2.5 | failure: reasoning:stream, tool-call, structured-output:stream, tool-call:stream, reasoning, params, structured-output, json-output:stream, json-output, params:stream |
| together-ai | MiniMaxAI/MiniMax-M2.5-FP4 | skipped: skip-check |
| together-ai | Qwen/Qwen3-4B-Instruct-2507 | skipped: skip-check |
| together-ai | Qwen/Qwen3-8B-Lora | skipped: skip-check |
| together-ai | Qwen/Qwen3-Coder-Next-FP8 | success: structured-output:stream, params, tool-call, params:stream, structured-output; validation_failure: tool-call:stream |
| together-ai | Qwen/Qwen3.5-35B-A3B | skipped: skip-check |
| together-ai | Qwen/Qwen3.5-9B | success: params:stream, parallel-tool-call, structured-output, params, tool-call, json-output, structured-output:stream, json-output:stream; validation_failure: parallel-tool-call:stream, tool-call:stream, reasoning:stream, reasoning |
| together-ai | Qwen/Qwen3.5-9B-FP8 | skipped: skip-check |
| together-ai | black-forest-labs/FLUX.2-max | skipped: skip-check |
| together-ai | deepseek-ai/DeepSeek-R1 | success: json-output, params:stream, params, structured-output, json-output:stream, structured-output:stream; validation_failure: tool-call:stream, tool-call, reasoning, reasoning:stream |
| together-ai | deepseek-ai/DeepSeek-R1-Original | failure: tool-call, json-output:stream, tool-call:stream, reasoning:stream, json-output, params:stream, params, reasoning |
| together-ai | deepseek-ai/DeepSeek-V3 | success: structured-output, tool-call, params:stream, json-output:stream, json-output, params; failure: structured-output:stream; validation_failure: tool-call:stream, reasoning, reasoning:stream |
| together-ai | deepseek-ai/DeepSeek-V3-0324 | skipped: skip-check |
| together-ai | deepseek-ai/DeepSeek-V3.1 | success: tool-call, params:stream, parallel-tool-call, params; failure: structured-output, structured-output:stream; validation_failure: parallel-tool-call:stream, tool-call:stream, reasoning, reasoning:stream |
| together-ai | google/gemini-3-pro-image | skipped: skip-check |
| together-ai | google/gemma-4-26B-A4B-it | skipped: skip-check |
| together-ai | google/imagen-4.0-preview | skipped: skip-check |
| together-ai | mistralai/Mistral-7B-Instruct-v0.2 | skipped: skip-check |
| together-ai | moonshotai/Kimi-K2.5 | success: json-output, structured-output:stream, json-output:stream, structured-output, parallel-tool-call, params, tool-call, params:stream; validation_failure: tool-call:stream, parallel-tool-call:stream, reasoning:stream, reasoning |
| together-ai | openai/gpt-image-1.5 | skipped: skip-check |
| together-ai | openai/gpt-oss-120b | success: json-output:stream, params:stream, params, tool-call, json-output, structured-output, structured-output:stream, parallel-tool-call; validation_failure: tool-call:stream, parallel-tool-call:stream, reasoning, reasoning:stream |
| together-ai | zai-org/GLM-5 | success: structured-output:stream, json-output:stream, json-output, structured-output, params, params:stream, tool-call; validation_failure: tool-call:stream, reasoning, reasoning:stream |

Failures (48)

together-ai/deepseek-ai/DeepSeek-R1 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpwhuvj8go/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1 — tool-call (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp6gzck3wn/snippet.py", line 46, in <module>
    raise Exception("VALIDATION FAILED: tool-call - no tool calls in response")
Exception: VALIDATION FAILED: tool-call - no tool calls in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=False,
)

_message = response.choices[0].message
if _message.tool_calls:
    for _tc in _message.tool_calls:
        print(f"Function: {_tc.function.name}")
        print(f"Arguments: {_tc.function.arguments}")
else:
    print(_message.content)

if not _message.tool_calls or len(_message.tool_calls) == 0:
    raise Exception("VALIDATION FAILED: tool-call - no tool calls in response")
print("VALIDATION: tool-call SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1 — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp6tvdp0c4/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1 — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpdp99owdi/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/moonshotai/Kimi-K2.5 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpcusomg8e/snippet.py", line 53, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/moonshotai-Kimi-K2.5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant with access to tools. You MUST strictly use the provided tools to answer. Never respond with plain text when a tool is available."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hi, how can I help you"},
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/moonshotai/Kimi-K2.5 — parallel-tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpocm_kfeh/snippet.py", line 54, in <module>
    raise Exception(
Exception: VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, got 0
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/moonshotai-Kimi-K2.5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant with access to tools. You MUST strictly call multiple tools in parallel whenever possible. Never call them sequentially."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hi, how can I help you"},
        {"role": "user", "content": "Use the get_weather tool to check the weather in London and Paris. You MUST make both tool calls strictly in parallel, not sequentially."},
    ],
    tools=tools,
    tool_choice="auto",
    parallel_tool_calls=True,
    stream=True,
)

_tool_call_indices = set()
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            for _tc in delta.tool_calls:
                _tool_call_indices.add(_tc.index)
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if len(_tool_call_indices) < 1:
    raise Exception(
        f"VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, "
        f"got {len(_tool_call_indices)}"
    )
print(f"\nNumber of parallel tool calls: {len(_tool_call_indices)}")
print("VALIDATION: parallel-tool-call stream SUCCESS")

together-ai/moonshotai/Kimi-K2.5 — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpx5cfwrnb/snippet.py", line 35, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/moonshotai-Kimi-K2.5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant. You MUST think step by step and show your reasoning. Never skip reasoning steps."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hi, how can I help you"},
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/moonshotai/Kimi-K2.5 — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpkyhca_go/snippet.py", line 43, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/moonshotai-Kimi-K2.5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant. You MUST think step by step and show your reasoning. Never skip reasoning steps."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hi, how can I help you"},
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/openai/gpt-oss-120b — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpbc78x32e/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/openai-gpt-oss-120b",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/openai/gpt-oss-120b — parallel-tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp0l2a9kfg/snippet.py", line 51, in <module>
    raise Exception(
Exception: VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, got 0
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/openai-gpt-oss-120b",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London and Paris. You MUST make both tool calls strictly in parallel, not sequentially."},
    ],
    tools=tools,
    tool_choice="auto",
    parallel_tool_calls=True,
    stream=True,
)

_tool_call_indices = set()
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            for _tc in delta.tool_calls:
                _tool_call_indices.add(_tc.index)
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if len(_tool_call_indices) < 1:
    raise Exception(
        f"VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, "
        f"got {len(_tool_call_indices)}"
    )
print(f"\nNumber of parallel tool calls: {len(_tool_call_indices)}")
print("VALIDATION: parallel-tool-call stream SUCCESS")

together-ai/openai/gpt-oss-120b — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp_xjei02o/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/openai-gpt-oss-120b",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/openai/gpt-oss-120b — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp6a5zt3fp/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/openai-gpt-oss-120b",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3.1 — parallel-tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpas_1wvof/snippet.py", line 51, in <module>
    raise Exception(
Exception: VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, got 0
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3.1",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London and Paris. You MUST make both tool calls strictly in parallel, not sequentially."},
    ],
    tools=tools,
    tool_choice="auto",
    parallel_tool_calls=True,
    stream=True,
)

_tool_call_indices = set()
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            for _tc in delta.tool_calls:
                _tool_call_indices.add(_tc.index)
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if len(_tool_call_indices) < 1:
    raise Exception(
        f"VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, "
        f"got {len(_tool_call_indices)}"
    )
print(f"\nNumber of parallel tool calls: {len(_tool_call_indices)}")
print("VALIDATION: parallel-tool-call stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3.1 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp0g3t_8mj/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3.1",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3.1 — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp9dph940t/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3.1",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3.1 — structured-output (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpuxizlu5s/snippet.py", line 38, in <module>
    _parsed = _json.loads(_content)
              ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 14343 column 1 (char 23105)
Code snippet
from openai import OpenAI
import json

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response_schema = json.loads('''{
  "title": "CalendarEvent",
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "date": { "type": "string" },
    "participants": {
      "type": "array",
      "items": { "type": "string" }
    }
  },
  "required": ["name", "date", "participants"],
  "additionalProperties": false
}''')

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3.1",
    messages=[
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday. Extract the event details as JSON."},
    ],
    response_format={"type": "json_schema", "json_schema": {"name": "CalendarEvent", "schema": response_schema}},
    stream=False,
)

import json as _json

_content = response.choices[0].message.content
print(_content)

if not _content:
    raise Exception("VALIDATION FAILED: structured-output - response content is empty")

_parsed = _json.loads(_content)

if "name" not in _parsed or "date" not in _parsed or "participants" not in _parsed:
    raise Exception("VALIDATION FAILED: structured-output - missing expected fields (name, date, participants)")

if not isinstance(_parsed.get("participants"), list):
    raise Exception("VALIDATION FAILED: structured-output - 'participants' is not a list, schema not enforced")

if set(_parsed.keys()) != {"name", "date", "participants"}:
    raise Exception(
        f"VALIDATION FAILED: structured-output - unexpected keys present: {set(_parsed.keys())}"
    )

print("VALIDATION: structured-output SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3.1 — structured-output:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpiyibeh8w/snippet.py", line 43, in <module>
    _parsed = _json.loads(_accumulated)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 14668 column 1 (char 23272)
Code snippet
from openai import OpenAI
import json

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response_schema = json.loads('''{
  "title": "CalendarEvent",
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "date": { "type": "string" },
    "participants": {
      "type": "array",
      "items": { "type": "string" }
    }
  },
  "required": ["name", "date", "participants"],
  "additionalProperties": false
}''')

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3.1",
    messages=[
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday. Extract the event details as JSON."},
    ],
    response_format={"type": "json_schema", "json_schema": {"name": "CalendarEvent", "schema": response_schema}},
    stream=True,
)

import json as _json

_accumulated = ""
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            _accumulated += delta.content
            print(delta.content, end="", flush=True)

if not _accumulated:
    raise Exception("VALIDATION FAILED: structured-output stream - no content received")

_parsed = _json.loads(_accumulated)

if "name" not in _parsed or "date" not in _parsed or "participants" not in _parsed:
    raise Exception("VALIDATION FAILED: structured-output stream - missing expected fields (name, date, participants)")

if not isinstance(_parsed.get("participants"), list):
    raise Exception("VALIDATION FAILED: structured-output stream - 'participants' is not a list, schema not enforced")

if set(_parsed.keys()) != {"name", "date", "participants"}:
    raise Exception(
        f"VALIDATION FAILED: structured-output stream - unexpected keys present: {set(_parsed.keys())}"
    )

print("\nVALIDATION: structured-output stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3.1 — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpv70tlbxg/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3.1",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/Qwen/Qwen3-Coder-Next-FP8 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp4mxi80f0/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/Qwen-Qwen3-Coder-Next-FP8",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/zai-org/GLM-5 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpsq1oea_q/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/zai-org-GLM-5",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/zai-org/GLM-5 — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpltikgy88/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/zai-org-GLM-5",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/zai-org/GLM-5 — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmprhydajfy/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/zai-org-GLM-5",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1-Original — tool-call (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpg1cwc8x5/snippet.py", line 27, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model deepseek-ai/DeepSeek-R1-Original. Please visit https://api.together.ai/models/deepseek-ai/DeepSeek-R1-Original to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model deepseek-ai/DeepSeek-R1-Original. Please visit https://api.together.ai/models/deepseek-ai/DeepSeek-R1-Original to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1-Original",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=False,
)

_message = response.choices[0].message
if _message.tool_calls:
    for _tc in _message.tool_calls:
        print(f"Function: {_tc.function.name}")
        print(f"Arguments: {_tc.function.arguments}")
else:
    print(_message.content)

if not _message.tool_calls or len(_message.tool_calls) == 0:
    raise Exception("VALIDATION FAILED: tool-call - no tool calls in response")
print("VALIDATION: tool-call SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1-Original — json-output:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp91d6skog/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model deepseek-ai/DeepSeek-R1-Original. Please visit https://api.together.ai/models/deepseek-ai/DeepSeek-R1-Original to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model deepseek-ai/DeepSeek-R1-Original. Please visit https://api.together.ai/models/deepseek-ai/DeepSeek-R1-Original to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1-Original",
    messages=[
        {"role": "user", "content": "List 3 colors with their hex codes in JSON."},
    ],
    response_format={"type": "json_object"},
    stream=True,
)

import json as _json

_accumulated = ""
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            _accumulated += delta.content
            print(delta.content, end="", flush=True)

if not _accumulated:
    raise Exception("VALIDATION FAILED: json-output stream - no content received")

_json.loads(_accumulated)
print("\nVALIDATION: json-output stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1-Original — tool-call:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp9_pe4t_8/snippet.py", line 27, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model deepseek-ai/DeepSeek-R1-Original. Please visit https://api.together.ai/models/deepseek-ai/DeepSeek-R1-Original to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model deepseek-ai/DeepSeek-R1-Original. Please visit https://api.together.ai/models/deepseek-ai/DeepSeek-R1-Original to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1-Original",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1-Original — reasoning:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp4rb_q81u/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model deepseek-ai/DeepSeek-R1-Original. Please visit https://api.together.ai/models/deepseek-ai/DeepSeek-R1-Original to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model deepseek-ai/DeepSeek-R1-Original. Please visit https://api.together.ai/models/deepseek-ai/DeepSeek-R1-Original to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1-Original",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1-Original — json-output (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp9ot5uttx/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model deepseek-ai/DeepSeek-R1-Original. Please visit https://api.together.ai/models/deepseek-ai/DeepSeek-R1-Original to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model deepseek-ai/DeepSeek-R1-Original. Please visit https://api.together.ai/models/deepseek-ai/DeepSeek-R1-Original to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1-Original",
    messages=[
        {"role": "user", "content": "List 3 colors with their hex codes in JSON."},
    ],
    response_format={"type": "json_object"},
    stream=False,
)

import json as _json

_content = response.choices[0].message.content
print(_content)

if not _content:
    raise Exception("VALIDATION FAILED: json-output - response content is empty")

_json.loads(_content)
print("VALIDATION: json-output SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1-Original — params:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmps38eonwz/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model deepseek-ai/DeepSeek-R1-Original. Please visit https://api.together.ai/models/deepseek-ai/DeepSeek-R1-Original to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model deepseek-ai/DeepSeek-R1-Original. Please visit https://api.together.ai/models/deepseek-ai/DeepSeek-R1-Original to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1-Original",
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    stream=True,
)

for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)

together-ai/deepseek-ai/DeepSeek-R1-Original — params (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp_xxtnidz/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model deepseek-ai/DeepSeek-R1-Original. Please visit https://api.together.ai/models/deepseek-ai/DeepSeek-R1-Original to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model deepseek-ai/DeepSeek-R1-Original. Please visit https://api.together.ai/models/deepseek-ai/DeepSeek-R1-Original to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1-Original",
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    stream=False,
)

print(response.choices[0].message.content)

together-ai/deepseek-ai/DeepSeek-R1-Original — reasoning (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpum8hp1l9/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model deepseek-ai/DeepSeek-R1-Original. Please visit https://api.together.ai/models/deepseek-ai/DeepSeek-R1-Original to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model deepseek-ai/DeepSeek-R1-Original. Please visit https://api.together.ai/models/deepseek-ai/DeepSeek-R1-Original to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1-Original",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpxle0f2gq/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3 — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmphuv8jbji/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3 — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp11nm8j00/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3 — structured-output:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp2l4flb78/snippet.py", line 43, in <module>
    _parsed = _json.loads(_accumulated)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 15097 column 1 (char 24110)
Code snippet
from openai import OpenAI
import json

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response_schema = json.loads('''{
  "title": "CalendarEvent",
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "date": { "type": "string" },
    "participants": {
      "type": "array",
      "items": { "type": "string" }
    }
  },
  "required": ["name", "date", "participants"],
  "additionalProperties": false
}''')

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3",
    messages=[
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday. Extract the event details as JSON."},
    ],
    response_format={"type": "json_schema", "json_schema": {"name": "CalendarEvent", "schema": response_schema}},
    stream=True,
)

import json as _json

_accumulated = ""
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            _accumulated += delta.content
            print(delta.content, end="", flush=True)

if not _accumulated:
    raise Exception("VALIDATION FAILED: structured-output stream - no content received")

_parsed = _json.loads(_accumulated)

if "name" not in _parsed or "date" not in _parsed or "participants" not in _parsed:
    raise Exception("VALIDATION FAILED: structured-output stream - missing expected fields (name, date, participants)")

if not isinstance(_parsed.get("participants"), list):
    raise Exception("VALIDATION FAILED: structured-output stream - 'participants' is not a list, schema not enforced")

if set(_parsed.keys()) != {"name", "date", "participants"}:
    raise Exception(
        f"VALIDATION FAILED: structured-output stream - unexpected keys present: {set(_parsed.keys())}"
    )

print("\nVALIDATION: structured-output stream SUCCESS")

together-ai/Qwen/Qwen3.5-9B — parallel-tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpb2xjgz8x/snippet.py", line 51, in <module>
    raise Exception(
Exception: VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, got 0
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/Qwen-Qwen3.5-9B",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London and Paris. You MUST make both tool calls strictly in parallel, not sequentially."},
    ],
    tools=tools,
    tool_choice="auto",
    parallel_tool_calls=True,
    stream=True,
)

_tool_call_indices = set()
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            for _tc in delta.tool_calls:
                _tool_call_indices.add(_tc.index)
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if len(_tool_call_indices) < 1:
    raise Exception(
        f"VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, "
        f"got {len(_tool_call_indices)}"
    )
print(f"\nNumber of parallel tool calls: {len(_tool_call_indices)}")
print("VALIDATION: parallel-tool-call stream SUCCESS")

together-ai/Qwen/Qwen3.5-9B — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpiff8sup3/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/Qwen-Qwen3.5-9B",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/Qwen/Qwen3.5-9B — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpvb1nvqnp/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/Qwen-Qwen3.5-9B",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/Qwen/Qwen3.5-9B — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpdklibatj/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/Qwen-Qwen3.5-9B",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/MiniMaxAI/MiniMax-M2.5 — reasoning:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp6i41xygy/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/MiniMaxAI-MiniMax-M2.5",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/MiniMaxAI/MiniMax-M2.5 — tool-call (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp3cu3zcig/snippet.py", line 27, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/MiniMaxAI-MiniMax-M2.5",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=False,
)

_message = response.choices[0].message
if _message.tool_calls:
    for _tc in _message.tool_calls:
        print(f"Function: {_tc.function.name}")
        print(f"Arguments: {_tc.function.arguments}")
else:
    print(_message.content)

if not _message.tool_calls or len(_message.tool_calls) == 0:
    raise Exception("VALIDATION FAILED: tool-call - no tool calls in response")
print("VALIDATION: tool-call SUCCESS")

together-ai/MiniMaxAI/MiniMax-M2.5 — structured-output:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp146xk_y5/snippet.py", line 21, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI
import json

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response_schema = json.loads('''{
  "title": "CalendarEvent",
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "date": { "type": "string" },
    "participants": {
      "type": "array",
      "items": { "type": "string" }
    }
  },
  "required": ["name", "date", "participants"],
  "additionalProperties": false
}''')

response = client.chat.completions.create(
    model="test-v2-together-ai/MiniMaxAI-MiniMax-M2.5",
    messages=[
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday. Extract the event details as JSON."},
    ],
    response_format={"type": "json_schema", "json_schema": {"name": "CalendarEvent", "schema": response_schema}},
    stream=True,
)

import json as _json

_accumulated = ""
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            _accumulated += delta.content
            print(delta.content, end="", flush=True)

if not _accumulated:
    raise Exception("VALIDATION FAILED: structured-output stream - no content received")

_parsed = _json.loads(_accumulated)

if "name" not in _parsed or "date" not in _parsed or "participants" not in _parsed:
    raise Exception("VALIDATION FAILED: structured-output stream - missing expected fields (name, date, participants)")

if not isinstance(_parsed.get("participants"), list):
    raise Exception("VALIDATION FAILED: structured-output stream - 'participants' is not a list, schema not enforced")

if set(_parsed.keys()) != {"name", "date", "participants"}:
    raise Exception(
        f"VALIDATION FAILED: structured-output stream - unexpected keys present: {set(_parsed.keys())}"
    )

print("\nVALIDATION: structured-output stream SUCCESS")

together-ai/MiniMaxAI/MiniMax-M2.5 — tool-call:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpo03ae9gj/snippet.py", line 27, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/MiniMaxAI-MiniMax-M2.5",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/MiniMaxAI/MiniMax-M2.5 — reasoning (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp5jxgsy_b/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/MiniMaxAI-MiniMax-M2.5",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/MiniMaxAI/MiniMax-M2.5 — params (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp6gy0n6dn/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/MiniMaxAI-MiniMax-M2.5",
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    stream=False,
)

print(response.choices[0].message.content)

together-ai/MiniMaxAI/MiniMax-M2.5 — structured-output (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpc2akin3h/snippet.py", line 21, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI
import json

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response_schema = json.loads('''{
  "title": "CalendarEvent",
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "date": { "type": "string" },
    "participants": {
      "type": "array",
      "items": { "type": "string" }
    }
  },
  "required": ["name", "date", "participants"],
  "additionalProperties": false
}''')

response = client.chat.completions.create(
    model="test-v2-together-ai/MiniMaxAI-MiniMax-M2.5",
    messages=[
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday. Extract the event details as JSON."},
    ],
    response_format={"type": "json_schema", "json_schema": {"name": "CalendarEvent", "schema": response_schema}},
    stream=False,
)

import json as _json

_content = response.choices[0].message.content
print(_content)

if not _content:
    raise Exception("VALIDATION FAILED: structured-output - response content is empty")

_parsed = _json.loads(_content)

if "name" not in _parsed or "date" not in _parsed or "participants" not in _parsed:
    raise Exception("VALIDATION FAILED: structured-output - missing expected fields (name, date, participants)")

if not isinstance(_parsed.get("participants"), list):
    raise Exception("VALIDATION FAILED: structured-output - 'participants' is not a list, schema not enforced")

if set(_parsed.keys()) != {"name", "date", "participants"}:
    raise Exception(
        f"VALIDATION FAILED: structured-output - unexpected keys present: {set(_parsed.keys())}"
    )

print("VALIDATION: structured-output SUCCESS")

together-ai/MiniMaxAI/MiniMax-M2.5 — json-output:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpbl45d6hq/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/MiniMaxAI-MiniMax-M2.5",
    messages=[
        {"role": "user", "content": "List 3 colors with their hex codes in JSON."},
    ],
    response_format={"type": "json_object"},
    stream=True,
)

import json as _json

_accumulated = ""
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            _accumulated += delta.content
            print(delta.content, end="", flush=True)

if not _accumulated:
    raise Exception("VALIDATION FAILED: json-output stream - no content received")

_json.loads(_accumulated)
print("\nVALIDATION: json-output stream SUCCESS")

together-ai/MiniMaxAI/MiniMax-M2.5 — json-output (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp5cmk7zs3/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/MiniMaxAI-MiniMax-M2.5",
    messages=[
        {"role": "user", "content": "List 3 colors with their hex codes in JSON."},
    ],
    response_format={"type": "json_object"},
    stream=False,
)

import json as _json

_content = response.choices[0].message.content
print(_content)

if not _content:
    raise Exception("VALIDATION FAILED: json-output - response content is empty")

_json.loads(_content)
print("VALIDATION: json-output SUCCESS")

together-ai/MiniMaxAI/MiniMax-M2.5 — params:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpsm_kb0g_/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model MiniMaxAI/MiniMax-M2.5. Please visit https://api.together.ai/models/MiniMaxAI/MiniMax-M2.5 to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/MiniMaxAI-MiniMax-M2.5",
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    stream=True,
)

for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
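
Both MiniMax-M2.5 failures above share one root cause: the gateway still routed the model as serverless, while Together.ai now requires a dedicated endpoint for it (hence the skip in the re-run below once provisioning was corrected). A minimal sketch of how a caller could detect and skip this class of error, assuming the gateway surfaces Together.ai's message verbatim as in the tracebacks above:

from openai import OpenAI, BadRequestError

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

try:
    response = client.chat.completions.create(
        model="test-v2-together-ai/MiniMaxAI-MiniMax-M2.5",
        messages=[{"role": "user", "content": "ping"}],
    )
    print(response.choices[0].message.content)
except BadRequestError as exc:
    # Together.ai returns HTTP 400 with this marker when a model needs a
    # dedicated endpoint instead of serverless access.
    if "non-serverless model" in str(exc):
        print("Model requires a dedicated endpoint; skipping serverless scenarios.")
    else:
        raise
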
Skipped (12)

together-ai/black-forest-labs/FLUX.2-max — skip-check (skipped)

Skip reason:

unsupported mode 'image'

together-ai/deepseek-ai/DeepSeek-V3-0324 — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/google/gemini-3-pro-image — skip-check (skipped)

Skip reason:

unsupported mode 'image'

together-ai/google/gemma-4-26B-A4B-it — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/google/imagen-4.0-preview — skip-check (skipped)

Skip reason:

unsupported mode 'image'

together-ai/MiniMaxAI/MiniMax-M2.5-FP4 — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/mistralai/Mistral-7B-Instruct-v0.2 — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/openai/gpt-image-1.5 — skip-check (skipped)

Skip reason:

unsupported mode 'image'

together-ai/Qwen/Qwen3-4B-Instruct-2507 — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/Qwen/Qwen3-8B-Lora — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/Qwen/Qwen3.5-35B-A3B — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/Qwen/Qwen3.5-9B-FP8 — skip-check (skipped)

Skip reason:

Provisioned model

@github-actions
Contributor

github-actions Bot commented May 2, 2026

/test-models

@harshiv-26
Collaborator Author

Gateway test results

  • Total: 96
  • Passed: 54
  • Failed: 1
  • Validation failed: 27
  • Errored: 0
  • Skipped: 14
  • Success rate: 65.85%
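
For reference, the success rate appears to be computed as passed / (total - skipped) rather than passed / total; a quick check of that assumed formula against both runs:

total, passed, skipped = 96, 54, 14
print(round(100 * passed / (total - skipped), 2))  # 65.85, matching this run
print(round(100 * 52 / (112 - 12), 2))             # 52.0, matching the first run
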
Provider Model Scenarios
together-ai MiniMaxAI/MiniMax-M2.5 skipped: skip-check
together-ai MiniMaxAI/MiniMax-M2.5-FP4 skipped: skip-check
together-ai Qwen/Qwen3-4B-Instruct-2507 skipped: skip-check
together-ai Qwen/Qwen3-8B-Lora skipped: skip-check
together-ai Qwen/Qwen3-Coder-Next-FP8 success: structured-output, params, params:stream, structured-output:stream, tool-call

validation_failure: tool-call:stream
together-ai Qwen/Qwen3.5-35B-A3B skipped: skip-check
together-ai Qwen/Qwen3.5-9B success: params:stream, params, tool-call, parallel-tool-call, structured-output:stream, json-output:stream, structured-output, json-output

validation_failure: tool-call:stream, parallel-tool-call:stream, reasoning, reasoning:stream
together-ai Qwen/Qwen3.5-9B-FP8 skipped: skip-check
together-ai black-forest-labs/FLUX.2-max skipped: skip-check
together-ai deepseek-ai/DeepSeek-R1 success: json-output:stream, params:stream, structured-output:stream, structured-output, json-output, params

validation_failure: tool-call:stream, tool-call, reasoning, reasoning:stream
together-ai deepseek-ai/DeepSeek-R1-Original skipped: skip-check
together-ai deepseek-ai/DeepSeek-V3 success: structured-output:stream, params, json-output:stream, json-output, params:stream, structured-output, tool-call

validation_failure: tool-call:stream, reasoning, reasoning:stream
together-ai deepseek-ai/DeepSeek-V3-0324 skipped: skip-check
together-ai deepseek-ai/DeepSeek-V3.1 success: structured-output, params:stream, tool-call, structured-output:stream, parallel-tool-call, params

validation_failure: parallel-tool-call:stream, tool-call:stream, reasoning:stream, reasoning
together-ai google/gemini-3-pro-image skipped: skip-check
together-ai google/gemma-4-26B-A4B-it skipped: skip-check
together-ai google/imagen-4.0-preview skipped: skip-check
together-ai mistralai/Mistral-7B-Instruct-v0.2 skipped: skip-check
together-ai moonshotai/Kimi-K2.5 success: json-output:stream, structured-output:stream, structured-output, params:stream, tool-call, json-output, parallel-tool-call, params

validation_failure: tool-call:stream, parallel-tool-call:stream, reasoning, reasoning:stream
together-ai openai/gpt-image-1.5 skipped: skip-check
together-ai openai/gpt-oss-120b success: params:stream, params, structured-output, tool-call, json-output, structured-output:stream, parallel-tool-call

failure: json-output:stream

validation_failure: tool-call:stream, parallel-tool-call:stream, reasoning, reasoning:stream
together-ai zai-org/GLM-5 success: structured-output, structured-output:stream, tool-call, json-output:stream, params, params:stream, json-output

validation_failure: tool-call:stream, reasoning, reasoning:stream
Failures (28)

together-ai/deepseek-ai/DeepSeek-V3 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpids_txj3/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3 — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpsahgboee/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3 — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpm6em0jw7/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3.1 — parallel-tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp894vpk9b/snippet.py", line 51, in <module>
    raise Exception(
Exception: VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, got 0
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3.1",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London and Paris. You MUST make both tool calls strictly in parallel, not sequentially."},
    ],
    tools=tools,
    tool_choice="auto",
    parallel_tool_calls=True,
    stream=True,
)

_tool_call_indices = set()
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            for _tc in delta.tool_calls:
                _tool_call_indices.add(_tc.index)
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if len(_tool_call_indices) < 1:
    raise Exception(
        f"VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, "
        f"got {len(_tool_call_indices)}"
    )
print(f"\nNumber of parallel tool calls: {len(_tool_call_indices)}")
print("VALIDATION: parallel-tool-call stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3.1 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmppvvyn6ub/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3.1",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3.1 — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp83s8yyzv/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3.1",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3.1 — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmphwcax1yx/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3.1",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp58mdo9ox/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1 — tool-call (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp7eddr9p1/snippet.py", line 46, in <module>
    raise Exception("VALIDATION FAILED: tool-call - no tool calls in response")
Exception: VALIDATION FAILED: tool-call - no tool calls in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=False,
)

_message = response.choices[0].message
if _message.tool_calls:
    for _tc in _message.tool_calls:
        print(f"Function: {_tc.function.name}")
        print(f"Arguments: {_tc.function.arguments}")
else:
    print(_message.content)

if not _message.tool_calls or len(_message.tool_calls) == 0:
    raise Exception("VALIDATION FAILED: tool-call - no tool calls in response")
print("VALIDATION: tool-call SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1 — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpngpdo2xm/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1 — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpsft05cvq/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/Qwen/Qwen3-Coder-Next-FP8 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpaeh4bm6y/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/Qwen-Qwen3-Coder-Next-FP8",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/openai/gpt-oss-120b — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmps913s79i/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/openai-gpt-oss-120b",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/openai/gpt-oss-120b — parallel-tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp49k96e_7/snippet.py", line 51, in <module>
    raise Exception(
Exception: VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, got 0
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/openai-gpt-oss-120b",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London and Paris. You MUST make both tool calls strictly in parallel, not sequentially."},
    ],
    tools=tools,
    tool_choice="auto",
    parallel_tool_calls=True,
    stream=True,
)

_tool_call_indices = set()
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            for _tc in delta.tool_calls:
                _tool_call_indices.add(_tc.index)
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if len(_tool_call_indices) < 1:
    raise Exception(
        f"VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, "
        f"got {len(_tool_call_indices)}"
    )
print(f"\nNumber of parallel tool calls: {len(_tool_call_indices)}")
print("VALIDATION: parallel-tool-call stream SUCCESS")

together-ai/openai/gpt-oss-120b — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp2_4tn2cz/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/openai-gpt-oss-120b",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/openai/gpt-oss-120b — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp8mafh7kx/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/openai-gpt-oss-120b",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/openai/gpt-oss-120b — json-output:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpve50z4tw/snippet.py", line 27, in <module>
    _json.loads(_accumulated)
  File "/usr/local/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 2 (char 1)
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/openai-gpt-oss-120b",
    messages=[
        {"role": "user", "content": "List 3 colors with their hex codes in JSON."},
    ],
    response_format={"type": "json_object"},
    stream=True,
)

import json as _json

_accumulated = ""
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            _accumulated += delta.content
            print(delta.content, end="", flush=True)

if not _accumulated:
    raise Exception("VALIDATION FAILED: json-output stream - no content received")

_json.loads(_accumulated)
print("\nVALIDATION: json-output stream SUCCESS")

together-ai/zai-org/GLM-5 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmphmkcjdoe/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/zai-org-GLM-5",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/zai-org/GLM-5 — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpfv1rf20d/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/zai-org-GLM-5",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/zai-org/GLM-5 — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpfmaffhj4/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/zai-org-GLM-5",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/moonshotai/Kimi-K2.5 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpbp0s7ces/snippet.py", line 53, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/moonshotai-Kimi-K2.5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant with access to tools. You MUST strictly use the provided tools to answer. Never respond with plain text when a tool is available."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hi, how can I help you"},
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/moonshotai/Kimi-K2.5 — parallel-tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp1wx2cm5i/snippet.py", line 54, in <module>
    raise Exception(
Exception: VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, got 0
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/moonshotai-Kimi-K2.5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant with access to tools. You MUST strictly call multiple tools in parallel whenever possible. Never call them sequentially."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hi, how can I help you"},
        {"role": "user", "content": "Use the get_weather tool to check the weather in London and Paris. You MUST make both tool calls strictly in parallel, not sequentially."},
    ],
    tools=tools,
    tool_choice="auto",
    parallel_tool_calls=True,
    stream=True,
)

_tool_call_indices = set()
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            for _tc in delta.tool_calls:
                _tool_call_indices.add(_tc.index)
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if len(_tool_call_indices) < 1:
    raise Exception(
        f"VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, "
        f"got {len(_tool_call_indices)}"
    )
print(f"\nNumber of parallel tool calls: {len(_tool_call_indices)}")
print("VALIDATION: parallel-tool-call stream SUCCESS")

together-ai/moonshotai/Kimi-K2.5 — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp6390ev0a/snippet.py", line 43, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/moonshotai-Kimi-K2.5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant. You MUST think step by step and show your reasoning. Never skip reasoning steps."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hi, how can I help you"},
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/moonshotai/Kimi-K2.5 — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp39jyv2el/snippet.py", line 35, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/moonshotai-Kimi-K2.5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant. You MUST think step by step and show your reasoning. Never skip reasoning steps."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hi, how can I help you"},
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/Qwen/Qwen3.5-9B — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpjb1qtn5h/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/Qwen-Qwen3.5-9B",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/Qwen/Qwen3.5-9B — parallel-tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmped08thys/snippet.py", line 51, in <module>
    raise Exception(
Exception: VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, got 0
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/Qwen-Qwen3.5-9B",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London and Paris. You MUST make both tool calls strictly in parallel, not sequentially."},
    ],
    tools=tools,
    tool_choice="auto",
    parallel_tool_calls=True,
    stream=True,
)

_tool_call_indices = set()
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            for _tc in delta.tool_calls:
                _tool_call_indices.add(_tc.index)
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if len(_tool_call_indices) < 1:
    raise Exception(
        f"VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, "
        f"got {len(_tool_call_indices)}"
    )
print(f"\nNumber of parallel tool calls: {len(_tool_call_indices)}")
print("VALIDATION: parallel-tool-call stream SUCCESS")

together-ai/Qwen/Qwen3.5-9B — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp6p2ve9s_/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/Qwen-Qwen3.5-9B",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/Qwen/Qwen3.5-9B — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpeq08kux8/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/Qwen-Qwen3.5-9B",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")
Skipped (14)

together-ai/black-forest-labs/FLUX.2-max — skip-check (skipped)

Skip reason:

unsupported mode 'image'

together-ai/deepseek-ai/DeepSeek-R1-Original — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/deepseek-ai/DeepSeek-V3-0324 — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/google/gemini-3-pro-image — skip-check (skipped)

Skip reason:

unsupported mode 'image'

together-ai/google/gemma-4-26B-A4B-it — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/google/imagen-4.0-preview — skip-check (skipped)

Skip reason:

unsupported mode 'image'

together-ai/MiniMaxAI/MiniMax-M2.5 — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/MiniMaxAI/MiniMax-M2.5-FP4 — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/mistralai/Mistral-7B-Instruct-v0.2 — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/openai/gpt-image-1.5 — skip-check (skipped)

Skip reason:

unsupported mode 'image'

together-ai/Qwen/Qwen3-4B-Instruct-2507 — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/Qwen/Qwen3-8B-Lora — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/Qwen/Qwen3.5-35B-A3B — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/Qwen/Qwen3.5-9B-FP8 — skip-check (skipped)

Skip reason:

Provisioned model

Comment thread: providers/together-ai/moonshotai/Kimi-K2.5.yaml
@harshiv-26 harshiv-26 enabled auto-merge (squash) May 6, 2026 08:02
@github-actions
Contributor

github-actions Bot commented May 6, 2026

/test-models

@harshiv-26 harshiv-26 merged commit fbb4d85 into main May 6, 2026
8 checks passed
@harshiv-26 harshiv-26 deleted the bot/update-together-ai-20260502-023135 branch May 6, 2026 08:03
@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.


Reviewed by Cursor Bugbot for commit 438d118. Configure here.

  mode: chat
  model: MiniMaxAI/MiniMax-M2.5
- provisioning: serverless
+ provisioning: provisioned

MiniMax availability is misclassified

Medium Severity

MiniMaxAI/MiniMax-M2.5 is still listed as a serverless Together model with a larger context window, but this marks it provisioned and lowers context_window. Consumers may hide the serverless model or reject valid prompts.
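
To make the impact concrete, here is a minimal sketch of consumer-side model selection. The field names mirror the YAML hunk above; the loader, list shape, and filtering policy are assumptions for illustration, not code from this repo.

def select_model(models: list[dict], prompt_tokens: int) -> dict | None:
    """Pick the first serverless model whose context window fits the prompt.

    With MiniMax-M2.5 flipped to `provisioning: provisioned` and a smaller
    context_window, a filter like this silently hides the model or rejects
    prompts the real serverless deployment could still serve.
    """
    for m in models:
        if m.get("provisioning") != "serverless":
            continue  # provisioned entries are excluded from shared routing
        if m.get("context_window", 0) >= prompt_tokens:
            return m
    return None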

Additional Locations (1)


region: "*"
features:
- function_calling
- json_output

Original R1 capabilities are overstated

Medium Severity

DeepSeek-R1-Original did not gain native function_calling or json_output; those capabilities belong to newer R1 variants. This metadata can make clients send unsupported tool or JSON-mode requests to the original model.
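
A hedged sketch of why this matters on the client side: request construction commonly gates optional capabilities on the YAML's features list. The helper below is illustrative (the metadata shape mirrors the hunk above; the function itself is not from this repo), and with function_calling/json_output wrongly advertised it would attach tools and JSON mode to a model that cannot honor them.

def build_request_kwargs(model_meta: dict, tools: list[dict]) -> dict:
    """Attach tools / JSON mode only when the model YAML advertises them."""
    features = set(model_meta.get("features", []))
    kwargs: dict = {}
    if "function_calling" in features:
        kwargs["tools"] = tools
        kwargs["tool_choice"] = "auto"
    if "json_output" in features:
        kwargs["response_format"] = {"type": "json_object"}
    return kwargs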



@harshiv-26
Collaborator Author

Gateway test results

  • Total: 96
  • Passed: 54
  • Failed: 1
  • Validation failed: 27
  • Errored: 0
  • Skipped: 14
  • Success rate: 65.85%
Provider Model Scenarios
together-ai MiniMaxAI/MiniMax-M2.5 skipped: skip-check
together-ai MiniMaxAI/MiniMax-M2.5-FP4 skipped: skip-check
together-ai Qwen/Qwen3-4B-Instruct-2507 skipped: skip-check
together-ai Qwen/Qwen3-8B-Lora skipped: skip-check
together-ai Qwen/Qwen3-Coder-Next-FP8 success: tool-call, structured-output:stream, params, structured-output, params:stream

validation_failure: tool-call:stream
together-ai Qwen/Qwen3.5-35B-A3B skipped: skip-check
together-ai Qwen/Qwen3.5-9B success: tool-call, params, parallel-tool-call, params:stream, structured-output:stream, json-output, structured-output, json-output:stream

validation_failure: tool-call:stream, parallel-tool-call:stream, reasoning, reasoning:stream
together-ai Qwen/Qwen3.5-9B-FP8 skipped: skip-check
together-ai black-forest-labs/FLUX.2-max skipped: skip-check
together-ai deepseek-ai/DeepSeek-R1 success: json-output:stream, json-output, params:stream, structured-output:stream, structured-output, params

validation_failure: tool-call, tool-call:stream, reasoning:stream, reasoning
together-ai deepseek-ai/DeepSeek-R1-Original skipped: skip-check
together-ai deepseek-ai/DeepSeek-V3 success: structured-output, tool-call, params:stream, params, json-output:stream, json-output

failure: structured-output:stream

validation_failure: tool-call:stream, reasoning, reasoning:stream
together-ai deepseek-ai/DeepSeek-V3-0324 skipped: skip-check
together-ai deepseek-ai/DeepSeek-V3.1 success: tool-call, parallel-tool-call, params:stream, params, structured-output:stream, structured-output

validation_failure: parallel-tool-call:stream, tool-call:stream, reasoning, reasoning:stream
together-ai google/gemini-3-pro-image skipped: skip-check
together-ai google/gemma-4-26B-A4B-it skipped: skip-check
together-ai google/imagen-4.0-preview skipped: skip-check
together-ai mistralai/Mistral-7B-Instruct-v0.2 skipped: skip-check
together-ai moonshotai/Kimi-K2.5 success: structured-output:stream, parallel-tool-call, structured-output, json-output:stream, params, json-output, params:stream, tool-call

validation_failure: parallel-tool-call:stream, reasoning, tool-call:stream, reasoning:stream
together-ai openai/gpt-image-1.5 skipped: skip-check
together-ai openai/gpt-oss-120b success: structured-output:stream, params:stream, tool-call, params, parallel-tool-call, structured-output, json-output:stream, json-output

validation_failure: tool-call:stream, parallel-tool-call:stream, reasoning, reasoning:stream
together-ai zai-org/GLM-5 success: tool-call, structured-output, structured-output:stream, json-output:stream, json-output, params, params:stream

validation_failure: tool-call:stream, reasoning, reasoning:stream
Failures (28)

together-ai/deepseek-ai/DeepSeek-V3 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp7xetoyxc/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3 — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmppqu4c5n1/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3 — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpfypryskh/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3 — structured-output:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmppw1neab5/snippet.py", line 43, in <module>
    _parsed = _json.loads(_accumulated)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 14311 column 52 (char 23071)
Code snippet
from openai import OpenAI
import json

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response_schema = json.loads('''{
  "title": "CalendarEvent",
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "date": { "type": "string" },
    "participants": {
      "type": "array",
      "items": { "type": "string" }
    }
  },
  "required": ["name", "date", "participants"],
  "additionalProperties": false
}''')

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3",
    messages=[
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday. Extract the event details as JSON."},
    ],
    response_format={"type": "json_schema", "json_schema": {"name": "CalendarEvent", "schema": response_schema}},
    stream=True,
)

import json as _json

_accumulated = ""
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            _accumulated += delta.content
            print(delta.content, end="", flush=True)

if not _accumulated:
    raise Exception("VALIDATION FAILED: structured-output stream - no content received")

_parsed = _json.loads(_accumulated)

if "name" not in _parsed or "date" not in _parsed or "participants" not in _parsed:
    raise Exception("VALIDATION FAILED: structured-output stream - missing expected fields (name, date, participants)")

if not isinstance(_parsed.get("participants"), list):
    raise Exception("VALIDATION FAILED: structured-output stream - 'participants' is not a list, schema not enforced")

if set(_parsed.keys()) != {"name", "date", "participants"}:
    raise Exception(
        f"VALIDATION FAILED: structured-output stream - unexpected keys present: {set(_parsed.keys())}"
    )

print("\nVALIDATION: structured-output stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3.1 — parallel-tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp01ei_geo/snippet.py", line 51, in <module>
    raise Exception(
Exception: VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, got 0
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3.1",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London and Paris. You MUST make both tool calls strictly in parallel, not sequentially."},
    ],
    tools=tools,
    tool_choice="auto",
    parallel_tool_calls=True,
    stream=True,
)

_tool_call_indices = set()
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            for _tc in delta.tool_calls:
                _tool_call_indices.add(_tc.index)
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if len(_tool_call_indices) < 1:
    raise Exception(
        f"VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, "
        f"got {len(_tool_call_indices)}"
    )
print(f"\nNumber of parallel tool calls: {len(_tool_call_indices)}")
print("VALIDATION: parallel-tool-call stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3.1 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpofm8whtt/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3.1",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3.1 — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpw3bei4x9/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3.1",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/deepseek-ai/DeepSeek-V3.1 — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpwot9c544/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-V3.1",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/openai/gpt-oss-120b — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp_9aaun2q/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/openai-gpt-oss-120b",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/openai/gpt-oss-120b — parallel-tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp5hzlp9da/snippet.py", line 51, in <module>
    raise Exception(
Exception: VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, got 0
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/openai-gpt-oss-120b",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London and Paris. You MUST make both tool calls strictly in parallel, not sequentially."},
    ],
    tools=tools,
    tool_choice="auto",
    parallel_tool_calls=True,
    stream=True,
)

_tool_call_indices = set()
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            for _tc in delta.tool_calls:
                _tool_call_indices.add(_tc.index)
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if len(_tool_call_indices) < 1:
    raise Exception(
        f"VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, "
        f"got {len(_tool_call_indices)}"
    )
print(f"\nNumber of parallel tool calls: {len(_tool_call_indices)}")
print("VALIDATION: parallel-tool-call stream SUCCESS")

together-ai/openai/gpt-oss-120b — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp715njnkh/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/openai-gpt-oss-120b",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/openai/gpt-oss-120b — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpk4ltirz1/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/openai-gpt-oss-120b",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/Qwen/Qwen3-Coder-Next-FP8 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpup7avf0b/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/Qwen-Qwen3-Coder-Next-FP8",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/moonshotai/Kimi-K2.5 — parallel-tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp3bwap026/snippet.py", line 54, in <module>
    raise Exception(
Exception: VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, got 0
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/moonshotai-Kimi-K2.5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant with access to tools. You MUST strictly call multiple tools in parallel whenever possible. Never call them sequentially."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hi, how can I help you"},
        {"role": "user", "content": "Use the get_weather tool to check the weather in London and Paris. You MUST make both tool calls strictly in parallel, not sequentially."},
    ],
    tools=tools,
    tool_choice="auto",
    parallel_tool_calls=True,
    stream=True,
)

_tool_call_indices = set()
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            for _tc in delta.tool_calls:
                _tool_call_indices.add(_tc.index)
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if len(_tool_call_indices) < 1:
    raise Exception(
        f"VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, "
        f"got {len(_tool_call_indices)}"
    )
print(f"\nNumber of parallel tool calls: {len(_tool_call_indices)}")
print("VALIDATION: parallel-tool-call stream SUCCESS")

together-ai/moonshotai/Kimi-K2.5 — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpn0_jtx8f/snippet.py", line 43, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/moonshotai-Kimi-K2.5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant. You MUST think step by step and show your reasoning. Never skip reasoning steps."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hi, how can I help you"},
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/moonshotai/Kimi-K2.5 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpaguukiid/snippet.py", line 53, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/moonshotai-Kimi-K2.5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant with access to tools. You MUST strictly use the provided tools to answer. Never respond with plain text when a tool is available."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hi, how can I help you"},
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/moonshotai/Kimi-K2.5 — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpiqkd2y9u/snippet.py", line 35, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/moonshotai-Kimi-K2.5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant. You MUST think step by step and show your reasoning. Never skip reasoning steps."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hi, how can I help you"},
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/zai-org/GLM-5 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpaowxyxc6/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/zai-org-GLM-5",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/zai-org/GLM-5 — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpan3s6j30/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/zai-org-GLM-5",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/zai-org/GLM-5 — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpu3eu0qx4/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/zai-org-GLM-5",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1 — tool-call (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpwww9wr6x/snippet.py", line 46, in <module>
    raise Exception("VALIDATION FAILED: tool-call - no tool calls in response")
Exception: VALIDATION FAILED: tool-call - no tool calls in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=False,
)

_message = response.choices[0].message
if _message.tool_calls:
    for _tc in _message.tool_calls:
        print(f"Function: {_tc.function.name}")
        print(f"Arguments: {_tc.function.arguments}")
else:
    print(_message.content)

if not _message.tool_calls or len(_message.tool_calls) == 0:
    raise Exception("VALIDATION FAILED: tool-call - no tool calls in response")
print("VALIDATION: tool-call SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1 — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmph4r787qm/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1 — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpdfoxds9j/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/deepseek-ai/DeepSeek-R1 — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpouw8g8kt/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/deepseek-ai-DeepSeek-R1",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/Qwen/Qwen3.5-9B — tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp94q443gm/snippet.py", line 50, in <module>
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
Exception: VALIDATION FAILED: tool-call stream - no tool calls received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/Qwen-Qwen3.5-9B",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")

together-ai/Qwen/Qwen3.5-9B — parallel-tool-call:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpiav7ymig/snippet.py", line 51, in <module>
    raise Exception(
Exception: VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, got 0
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/Qwen-Qwen3.5-9B",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London and Paris. You MUST make both tool calls strictly in parallel, not sequentially."},
    ],
    tools=tools,
    tool_choice="auto",
    parallel_tool_calls=True,
    stream=True,
)

_tool_call_indices = set()
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            for _tc in delta.tool_calls:
                _tool_call_indices.add(_tc.index)
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if len(_tool_call_indices) < 1:
    raise Exception(
        f"VALIDATION FAILED: parallel-tool-call stream - expected at least 1 tool call, "
        f"got {len(_tool_call_indices)}"
    )
print(f"\nNumber of parallel tool calls: {len(_tool_call_indices)}")
print("VALIDATION: parallel-tool-call stream SUCCESS")

together-ai/Qwen/Qwen3.5-9B — reasoning (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpmrb3h7bz/snippet.py", line 40, in <module>
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
Exception: VALIDATION FAILED: reasoning - no reasoning information in response
Code snippet:
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/Qwen-Qwen3.5-9B",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/Qwen/Qwen3.5-9B — reasoning:stream (validation_failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp36evp2cy/snippet.py", line 32, in <module>
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
Exception: VALIDATION FAILED: reasoning stream - no reasoning information in stream
Code snippet:
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/Qwen-Qwen3.5-9B",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")
Skipped (14)

together-ai/black-forest-labs/FLUX.2-max — skip-check (skipped)

Skip reason:

unsupported mode 'image'

together-ai/deepseek-ai/DeepSeek-R1-Original — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/deepseek-ai/DeepSeek-V3-0324 — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/google/gemini-3-pro-image — skip-check (skipped)

Skip reason:

unsupported mode 'image'

together-ai/google/gemma-4-26B-A4B-it — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/google/imagen-4.0-preview — skip-check (skipped)

Skip reason:

unsupported mode 'image'

together-ai/MiniMaxAI/MiniMax-M2.5 — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/MiniMaxAI/MiniMax-M2.5-FP4 — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/mistralai/Mistral-7B-Instruct-v0.2 — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/openai/gpt-image-1.5 — skip-check (skipped)

Skip reason:

unsupported mode 'image'

together-ai/Qwen/Qwen3-4B-Instruct-2507 — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/Qwen/Qwen3-8B-Lora — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/Qwen/Qwen3.5-35B-A3B — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/Qwen/Qwen3.5-9B-FP8 — skip-check (skipped)

Skip reason:

Provisioned model
