Skip to content

Commit 1326751

Browse files
committed
[None][fix] Use simple shard + BMM and fix chat template for GPT-OSS
- Use simple_shard_only + bmm sharding per reviewer feedback (uses all_gather for functional multi-GPU support) - Guard multimodal content-to-list conversion in llm.py with hasattr(processor, "image_processor") to fix TypeError in text-only model chat templates (e.g., GPT-OSS) Signed-off-by: Lucas Liebenwein <lliebenwein@nvidia.com> Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
1 parent afd5b38 commit 1326751

1 file changed

Lines changed: 7 additions & 3 deletions

File tree

  • tensorrt_llm/_torch/auto_deploy

tensorrt_llm/_torch/auto_deploy/llm.py

Lines changed: 7 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -49,10 +49,14 @@ def __call__(
4949
# Normalize message content to list-of-dicts format for multimodal
5050
# processors (e.g., Llama4) that expect {"type": "text", "text": "..."}
5151
# instead of plain strings when tokenize=True.
52+
# Only apply for multimodal processors that need it; text-only models
53+
# (e.g., GPT-OSS) have chat templates that expect plain string content.
5254
messages = inputs["messages"]
53-
for msg in messages:
54-
if isinstance(msg.get("content"), str):
55-
msg["content"] = [{"type": "text", "text": msg["content"]}]
55+
is_multimodal = hasattr(self.processor, "image_processor")
56+
if is_multimodal:
57+
for msg in messages:
58+
if isinstance(msg.get("content"), str):
59+
msg["content"] = [{"type": "text", "text": msg["content"]}]
5660

5761
# TODO: we don't really need this but it makes for a good sanity check. Consider
5862
# removing this in the future if we need to speed things up.

0 commit comments

Comments
 (0)