Commit a1777fd

[https://nvbugs/5941242][fix] Fix SigLIP test failure (#12717)

Signed-off-by: tijyojwad <1127155+tijyojwad@users.noreply.github.com>

1 parent 5b475d7 · commit a1777fd

2 files changed: 12 additions & 4 deletions

tests/integration/test_lists/waives.txt (0 additions & 1 deletion)

@@ -290,7 +290,6 @@ unittest/_torch/modules/test_fused_moe.py::test_fused_moe_triton_mxfp4[False-Fal
 full:RTXPro6000D/accuracy/test_llm_api_pytorch.py::TestGPTOSS::test_eagle3_4gpus[v2_kv_cache-cutlass-one_model-overlap_scheduler] SKIP (https://nvbugs/5945047)
 accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_nvfp4_4gpus[moe_backend=CUTLASS-mtp_nextn=0-pp4-fp8kv=False-attention_dp=False-cuda_graph=False-overlap_scheduler=False-low_precision_combine=False-torch_compile=False] SKIP (https://nvbugs/5945081)
 full:RTXPro6000D/accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_nvfp4_4gpus[moe_backend=CUTLASS-mtp_nextn=0-ep4-fp8kv=True-attention_dp=True-cuda_graph=True-overlap_scheduler=True-low_precision_combine=False-torch_compile=False] SKIP (https://nvbugs/5948435)
-unittest/_torch/modeling -k "modeling_siglip" SKIP (https://nvbugs/5941242)
 accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_bfloat16[mtp_nextn=2-attention_dp=True-cuda_graph=True-overlap_scheduler=True-torch_compile=False-enable_chunked_prefill=False-v2_kv_cache=False] SKIP (https://nvbugs/5955765)
 accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_no_kv_cache_reuse[quant_dtype=none-mtp_nextn=2-fp8kv=False-attention_dp=True-cuda_graph=True-overlap_scheduler=True] SKIP (https://nvbugs/5955773)
 accuracy/test_llm_api_pytorch.py::TestDeepSeekV32::test_fp8_blockscale[baseline_mtp1] SKIP (https://nvbugs/5955792)

tests/unittest/_torch/modeling/test_modeling_siglip.py (12 additions & 3 deletions)

@@ -135,10 +135,19 @@ def test_siglip_vision_allclose_to_hf(self, scenario: Scenario):
             attn_metadata=attn_metadata,
         )

-        # Compare all hidden states
+        # Compare all hidden states.
+        # TRT-LLM applies post_layernorm to the last encoder hidden state
+        # (matching production usage), so the last element must be compared
+        # against HF's post_layernormed last_hidden_state rather than the
+        # raw hidden_states[-1].
+        num_states = len(tllm_outputs)
+        for i in range(num_states):
+            tllm_hs = tllm_outputs[i]
+            if i < num_states - 1:
+                hf_hs = hf_outputs.hidden_states[i]
+            else:
+                hf_hs = hf_outputs.last_hidden_state

-        for i, (hf_hs, tllm_hs) in enumerate(
-                zip(hf_outputs.hidden_states, tllm_outputs)):
             self.assertEqual(hf_hs.shape, tllm_hs.shape,
                              f"Shape mismatch for hidden state {i}")