Commit 450ef8d

hsuan-lun-chiang authored and ecnal-cienet committed
test: skip NNX int8 parameter-only checkpoint generation for GPU dot product test
NNX int8 parameter-only generation requires a convert-on-load setup, which causes a ValueError since the fp32 training checkpoint lacks the AqtDotGeneral state that the target int8 model expects. This aligns the GPU/dot-product test with the existing skip in the TPU/autoselected test variant.
1 parent 3527394 commit 450ef8d

1 file changed

Lines changed: 14 additions & 1 deletion

tests/integration/generate_param_only_checkpoint_test.py

@@ -127,7 +127,20 @@ def test_param_ckpt_generation_with_autoselected_attention(quantization, capsys)
 @pytest.mark.external_serving
 @pytest.mark.integration_test
 @pytest.mark.gpu_only
-@pytest.mark.parametrize("quantization", [(""), ("int8")])
+@pytest.mark.parametrize(
+    "quantization",
+    [
+        (""),
+        pytest.param(
+            "int8",
+            marks=pytest.mark.skip(
+                reason="NNX int8 param-only generation is a convert-on-load case (the fp32 training "
+                "checkpoint has no AqtDotGeneral state the int8 model expects); tracked as a follow-up "
+                "alongside layerwise_quantization."
+            ),
+        ),
+    ],
+)
 def test_param_ckpt_generation_with_dot_product(quantization, capsys):
   """Tests the parameter-only checkpoint generation and decode flow on GPU with dot product attention."""
   os.environ["NVTE_FUSED_ATTN"] = "1"  # Enable fused attention
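
The change above relies on pytest's ability to attach a mark to a single parameter case instead of the whole test. A minimal, self-contained sketch of that pattern follows; the names `QUANTIZATION_CASES` and `test_generation_sketch` are hypothetical and are not part of the repository, and the test body is a placeholder rather than the real checkpoint-generation flow.

```python
import pytest

# Hypothetical case list: the unquantized case runs normally, while the
# "int8" case carries its own skip mark and is reported as skipped.
QUANTIZATION_CASES = [
    "",
    pytest.param(
        "int8",
        marks=pytest.mark.skip(reason="int8 case is skipped in this sketch"),
    ),
]


@pytest.mark.parametrize("quantization", QUANTIZATION_CASES)
def test_generation_sketch(quantization):
  # Placeholder body: only the unskipped case ever reaches this line.
  assert quantization in ("", "int8")
```

Running pytest with `-rs` should list the skipped `int8` case together with its reason, while the empty-string case still executes. Attaching the mark to the parameter rather than to the function keeps coverage for the unquantized path.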
