hipMalloc (unified) failed: an illegal memory access was encountered. #6

@ckuethe

=== INFERENCE PIPELINE DIAGNOSTICS ===
Loading model: models--mlx-community--Granite-4.0-H-Tiny-4bit-DWQ/snapshots/a892ded1552d6d4089fa644bbff6ccbc54dddc67
Model loaded.

--- TEST 1: Basic GPU ops ---
[DIAG] matmul(ones, 2*ones) expect=8	   shape=(4,4) min=8.000000 max=8.000000 mean=8.000000 |mean|=8.000000
[VALS] matmul result: [8.0000, 8.0000, 8.0000, 8.0000, 8.0000, 8.0000, 8.0000, 8.0000]
[hipBLASLt] first call
[hipBLASLt] M=4 N=4 K=4 ta=0 tb=0 lda=4 ldb=4 ldc=4
[DIAG] bf16 matmul expect=8		   shape=(4,4) min=8.000000 max=8.000000 mean=8.000000 |mean|=8.000000

--- TEST 2: quantized_matmul vs dequant ---
[DIAG] q_proj weights not found (w=0 s=0 b=0)

--- TEST 3: RMS Norm ---
[DIAG] rms_norm([1,2,3,4])		   shape=(1,1,4) min=0.365148 max=1.460593 mean=0.912871 |mean|=0.912871
[VALS] rms_norm([1,2,3,4]) expect≈[.365,.730,1.095,1.461]: [0.3651, 0.7303, 1.0954, 1.4606]
[DIAG] rms_norm(rand bf16 4096)		   shape=(1,3,4096) min=-3.625000 max=3.531250 mean=0.007225 |mean|=0.797776

--- TEST 4: RoPE ---
[DIAG] rope(ones, off=0)		   shape=(1,1,1,128) min=1.000000 max=1.000000 mean=1.000000 |mean|=1.000000
[VALS] rope(ones, off=0): [1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000]
[DIAG] rope(ones, off=100)		   shape=(1,1,1,128) min=-1.414062 max=1.406250 mean=0.586235 |mean|=0.953492
[VALS] rope(ones, off=100): [1.3672, 1.3438, -1.3672, -1.3516, 0.7305, -1.3828, -1.4062, -0.9219, 1.3594, -1.1719, 1.3750, -1.1094, -0.5898, 1.2109, 1.1406, -0.0040, -0.9805, -1.3906, -1.3516, -1.0781]

--- TEST 5: Full forward pass ---
[DIAG] logits(token=1)			   shape=(1,1,100352) min=-56.392059 max=106.358070 mean=-12.956909 |mean|=13.429077
[VALS] logits(token=1): [29.7169, 106.3581, 29.6625, 30.8870, 25.5035, 26.7016, 63.0159, 45.6126, 40.3120, 39.8485, 26.7892, 48.8852, 48.5656, 48.9910, 41.7294, 24.7933, 34.7946, 31.9800, 27.5472, 24.7337, 22.3976, 17.8242, 17.8524, 17.3758, 16.2437, 42.6530, 32.3579, 28.8633, 26.7248, 28.7879]
[DIAG] Top-10:
  token=100257 logit=27.1867
  token=100260 logit=19.6822
  token=100259 logit=17.4482
  token=100258 logit=5.2841
  token=99703 logit=4.3919
  token=99519 logit=0.1561
  token=99362 logit=-1.0377
  token=99783 logit=-2.8212
  token=99542 logit=-2.9113
  token=99809 logit=-3.0971
[DIAG] logits(step2)			   shape=(1,1,100352) min=-27.977913 max=23.408852 mean=-4.440776 |mean|=5.364185

--- TEST 6: dequantize() sanity ---
[DIAG] dequant([0..7],s=1,b=0)		   shape=(1,8) min=0.000000 max=7.000000 mean=3.500000 |mean|=3.500000
[VALS] dequant expect=[0,1,2,3,4,5,6,7]: [0.0000, 1.0000, 2.0000, 3.0000, 4.0000, 5.0000, 6.0000, 7.0000]

--- TEST 6b: Warmup pass ---
[DIAG] warmup logits			   shape=(1,1,100352) min=-30.554613 max=24.993921 mean=-4.758521 |mean|=5.718008
[DIAG] Warmup complete

--- TEST 7: Token-level generation trace ---
[DIAG] encode("What is 2+2?") = [3923, 374, 220, 17, 10, 17, 30] (7 tokens)
[DIAG] Token-by-token decode:
  token 3923 -> "What"
  token 374 -> " is"
  token 220 -> " "
  token 17 -> "2"
  token 10 -> "+"
  token 17 -> "2"
  token 30 -> "?"
[DIAG] Chat template tokens (15): [100264, 882, 100265, 3923, 374, 220, 17, 10, 17, 30, 100257, 198, 100264, 78191, 100265]
[DIAG] Chat template decoded: "<|start_of_role|>user<|end_of_role|>What is 2+2?<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>"
[DIAG] prefill logits			   shape=(1,15,100352) min=-50.140812 max=92.124290 mean=-1.907799 |mean|=7.610769
[DIAG] Generating 20 tokens (argmax):
  step=0 token=93909 text="-ves"
  step=1 token=6549 text="125"
  step=2 token=6549 text="125"
  step=3 token=6549 text="125"
  step=4 token=6549 text="125"
  step=5 token=6549 text="125"
  step=6 token=6549 text="125"
  step=7 token=6549 text="125"
  step=8 token=6549 text="125"
  step=9 token=6549 text="125"
  step=10 token=6549 text="125"
  step=11 token=6549 text="125"
  step=12 token=6549 text="125"
  step=13 token=6549 text="125"
  step=14 token=6549 text="125"
  step=15 token=6549 text="125"
  step=16 token=6549 text="125"
  step=17 token=6549 text="125"
  step=18 token=6549 text="125"
  step=19 token=6549 text="125"
[DIAG] Full output (argmax): "-ves125125125125125125125125125125125125125125125125125125125"

[DIAG] Generating 20 tokens (categorical T=0.7):
  step=0 token=89232 text=".Disclaimer"
  step=1 token=6549 text="125"
  step=2 token=6549 text="125"
  step=3 token=6549 text="125"
  step=4 token=6549 text="125"
  step=5 token=6549 text="125"
  step=6 token=6549 text="125"
  step=7 token=6549 text="125"
  step=8 token=6549 text="125"
  step=9 token=6549 text="125"
  step=10 token=6549 text="125"
  step=11 token=6549 text="125"
  step=12 token=6549 text="125"
  step=13 token=6549 text="125"
  step=14 token=6549 text="125"
  step=15 token=6549 text="125"
  step=16 token=6549 text="125"
  step=17 token=6549 text="125"
  step=18 token=6549 text="125"
  step=19 token=6549 text="125"
[DIAG] Full output (categorical): ".Disclaimer125125125125125125125125125125125125125125125125125125125"

[DIAG] Testing via generate_text (chat.cpp path):
  token=89232 text=".Disclaimer"
  token=6549 text="125"
  token=0 text="!"
  token=0 text="!"
  token=75948 text=" exporters"
  token=0 text="!"
  token=0 text="!"
  token=0 text="!"
  token=93548 text=".optString"
  token=0 text="!"
  token=0 text="!"
  token=0 text="!"
  token=44206 text="ITT"
  token=0 text="!"
  token=0 text="!"
  token=0 text="!"
  token=44206 text="ITT"
  token=0 text="!"
  token=0 text="!"
  token=0 text="!"
[DIAG] generate_text output: ".Disclaimer125!! exporters!!!.optString!!!ITT!!!ITT!!!"
[DIAG] Prompt:	   15 tokens, 38.1461 tokens/s, 0.393225s
Generation: 20 tokens, 26.9525 tokens/s, 0.742047s

--- TEST 8: random::categorical ---
[DIAG] categorical([..., 10, ...]) = 2 (expect 2)
[DIAG] categorical([..., 10, ...]) = 2 (expect 2)
[DIAG] categorical([..., 10, ...]) = 2 (expect 2)
[DIAG] categorical([..., 10, ...]) = 2 (expect 2)
[DIAG] categorical([..., 10, ...]) = 2 (expect 2)
[DIAG] categorical(peak@17, V=151936) = 17 (expect 17)
[DIAG] categorical(peak@17, V=151936) = 17 (expect 17)
[DIAG] categorical(peak@17, V=151936) = 17 (expect 17)
[DIAG] Testing categorical with real model logits...
terminate called after throwing an instance of 'std::runtime_error'
  what():  hipMalloc (unified) failed: an illegal memory access was encountered.
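
Note on the final error: HIP reports kernel faults asynchronously, so the `hipMalloc` call named in the exception is probably only where an earlier illegal access surfaced, not where it happened. Below is a minimal sketch for narrowing that down, assuming the ROCm runtime honors the `AMD_SERIALIZE_KERNEL` and `AMD_SERIALIZE_COPY` environment variables as described in the HIP debugging guide. The helper names `serialized_env` and `rerun_serialized` are hypothetical, and the command to run is whatever binary produced the log above (passed on the command line, since its path is not given here).

```python
import os
import subprocess
import sys


def serialized_env(base=None):
    """Return a copy of the environment with HIP kernel and copy serialization on.

    Assumption: the installed ROCm runtime reads AMD_SERIALIZE_KERNEL and
    AMD_SERIALIZE_COPY; the value "3" requests waiting both before and after
    each enqueue.
    """
    env = dict(os.environ if base is None else base)
    env["AMD_SERIALIZE_KERNEL"] = "3"
    env["AMD_SERIALIZE_COPY"] = "3"
    return env


def rerun_serialized(cmd):
    """Run cmd (an argv list) serialized; return (exit code, last stderr line)."""
    proc = subprocess.run(cmd, env=serialized_env(), capture_output=True, text=True)
    lines = proc.stderr.strip().splitlines()
    return proc.returncode, (lines[-1] if lines else "")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python rerun.py <diagnostic binary> [args...]
    code, last = rerun_serialized(sys.argv[1:])
    print(f"exit={code} last stderr line: {last}")
```

With serialization on, the fault should be reported closer to the kernel that caused it, at the cost of a much slower run. If the failure still appears only at `hipMalloc`, that at least rules out a simple ordering effect.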

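Separately, TEST 5 looks internally inconsistent. The `[DIAG]` line reports max=106.358070 and `[VALS]` shows 106.3581 at index 1, yet the Top-10 list tops out at logit=27.1867 for token 100257. Whatever produces the Top-10 does not seem to agree with the raw logits. Here is a small sketch of that consistency check, in plain Python on a flat list of floats. `top_k` and `check_top_k` are hypothetical helpers, not functions from the diagnostic tool.

```python
import heapq


def top_k(logits, k=10):
    """Return [(token_id, logit)] for the k largest logits, highest first."""
    return heapq.nlargest(k, enumerate(logits), key=lambda pair: pair[1])


def check_top_k(logits, reported, tol=1e-3):
    """True if the first reported (token_id, logit) entry is the true maximum."""
    if not reported:
        return False
    true_id, true_val = top_k(logits, 1)[0]
    rep_id, rep_val = reported[0]
    return rep_id == true_id and abs(rep_val - true_val) <= tol


# First four logits(token=1) values copied from the [VALS] line in TEST 5.
sample = [29.7169, 106.3581, 29.6625, 30.8870]
print(top_k(sample, 1))
print(check_top_k(sample, [(100257, 27.1867)]))
```

On this four-value prefix the true top entry is index 1, so the reported head of the Top-10 list fails the check. Running the same comparison on the full 100352-entry vector would show whether the sort itself or the data it reads is off.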