Issue: ValueError in interns1_vit.py when serving intern-s1 model
Environment
- OS: Ubuntu 22.04
- CUDA: 12.8.1
- Python: 3.11
- PyTorch: 2.8.0
- vLLM: 0.11.0
- Transformers: 4.57.1
Command
vllm serve intern-s1 \
--port 8008 \
--host 0.0.0.0 \
--served-model-name intern-s1 \
--trust-remote-code \
--gpu-memory-utilization 0.9 \
--tensor-parallel-size 8
Error
File "/usr/local/lib/python3.11/site-packages/vllm/model_executor/models/interns1_vit.py", line 221, in forward
(Worker_TP0 pid=1145) ERROR 10-24 16:40:22 [multiproc_executor.py:671] B_, N_, H_, D_ = q.shape
(Worker_TP0 pid=1145) ERROR 10-24 16:40:22 [multiproc_executor.py:671] ^^^^^^^^^^^^^^
(Worker_TP0 pid=1145) ERROR 10-24 16:40:22 [multiproc_executor.py:671] ValueError: not enough values to unpack (expected 4, got 3)
Description
Serving the intern-s1 model with vLLM fails with a ValueError raised in interns1_vit.py. At line 221, inside forward, the code unpacks q.shape into four values (B_, N_, H_, D_), but q has only three dimensions.
This points to a tensor shape mismatch in the model's vision transformer implementation: the attention code expects q to be 4-D, while the tensor that reaches it during the forward pass is 3-D.
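To make the failure mode concrete, here is a minimal sketch in plain Python that reproduces the same unpacking error on a shape tuple. It is not vLLM's code: the helper names `unpack_qkv_shape` and `split_heads`, the sizes, and the head count are all hypothetical, and I have not verified that a missing head split is what actually happens inside interns1_vit.py.

```python
# Hypothetical stand-ins for the shape handling around the failing line.
# None of these names or sizes come from vLLM or the intern-s1 config.

def unpack_qkv_shape(shape):
    """Mimic `B_, N_, H_, D_ = q.shape` on a plain tuple of dimensions."""
    B_, N_, H_, D_ = shape
    return B_, N_, H_, D_


def split_heads(shape, num_heads):
    """Map (batch, tokens, embed) to (batch, tokens, heads, head_dim)."""
    batch, tokens, embed = shape
    if embed % num_heads != 0:
        raise ValueError("embed dim must be divisible by num_heads")
    return (batch, tokens, num_heads, embed // num_heads)


# Illustrative 3-D shape (assumed values): heads are still fused into embed.
q_shape_3d = (1, 1025, 1024)

try:
    unpack_qkv_shape(q_shape_3d)
except ValueError as exc:
    # Same message as in the traceback above.
    error_message = str(exc)
    print(error_message)

# After splitting the embed dimension into heads, the unpack succeeds.
q_shape_4d = split_heads(q_shape_3d, num_heads=16)
print(unpack_qkv_shape(q_shape_4d))
```

If the real q tensor arrives as (batch, tokens, embed) rather than (batch, tokens, heads, head_dim), the traceback above is what one would expect to see, but confirming that requires inspecting the actual shapes in the worker.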
Expected Behavior
The model should load and serve successfully without shape unpacking errors.