[Issue]: [rocBLASLt] gfx1201 GPU causes lookup of wrong Tensile file (gfx1200.dat) — model load fails with SIGKILL — Ollama v0.7.22.1 on AMD Radeon AI PRO R9700 #7192

@manand77

Problem Description

Environment

OS:
NAME="Ubuntu"
VERSION="24.04.4 LTS (Noble Numbat)"
CPU:
model name : AMD Ryzen 7 9800X3D 8-Core Processor
GPU:
Name: AMD Ryzen 7 9800X3D 8-Core Processor
Marketing Name: AMD Ryzen 7 9800X3D 8-Core Processor
Name: gfx1201
Marketing Name: AMD Radeon AI PRO R9700
Name: amdgcn-amd-amdhsa--gfx1201
Name: amdgcn-amd-amdhsa--gfx12-generic
Name: gfx1201
Marketing Name: AMD Radeon Graphics
Name: amdgcn-amd-amdhsa--gfx1201
Name: amdgcn-amd-amdhsa--gfx12-generic

| Field | Value |
| --- | --- |
| GPU | AMD Radeon AI PRO R9700 (Navi 48, gfx1201, 32GB VRAM) |
| Ollama Version | v0.7.22.1 (native install at /usr/local/bin/ollama) |
| Ollama ROCm Backend | /usr/local/lib/ollama/rocm/libggml-hip.so (bundled, not system ROCm) |
| System ROCm | 7.2.1 at /opt/rocm-7.2.1 (not used by Ollama) |
| OS | Ubuntu 24.04.4 LTS (Noble) |
| Kernel | 6.17.0-23-generic (x86_64, PREEMPT_DYNAMIC) |
| CPU | AMD Ryzen 7 9800X3D |
| HSA_OVERRIDE_GFX_VERSION | 12.0.1 (set in systemd env; see notes) |
| ROCR_VISIBLE_DEVICES | 0 (pinned to discrete R9700; iGPU excluded) |
| Multiple ROCm installs? | No. Single system install at /opt/rocm → /opt/rocm-7.2.1 |

Note: Two gfx1201 devices appear in rocminfo — the discrete R9700 (device 0, 32GB) and an integrated "AMD Radeon Graphics" (device 1, ~15.7GiB shared RAM). Stack is pinned to device 0.
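
To double-check which agents report a gfx target, the filtered rocminfo output above can be parsed with a short script. This is a minimal sketch, not part of ROCm: `list_gpu_agents` is a hypothetical helper, and it assumes the "Name:" / "Marketing Name:" line layout shown in the Environment section.

```python
import re


def list_gpu_agents(rocminfo_text):
    """Return (gfx_name, marketing_name) pairs found in rocminfo-style text.

    Assumes each agent prints a "Name:" line followed by a
    "Marketing Name:" line, as in the output pasted in this issue.
    """
    agents = []
    pending_name = None
    for raw in rocminfo_text.splitlines():
        line = raw.strip()
        if line.startswith("Marketing Name:"):
            marketing = line.split(":", 1)[1].strip()
            # Keep only agents whose Name is a gfx target (skips the CPU agent).
            if pending_name and re.fullmatch(r"gfx[0-9a-f]+", pending_name):
                agents.append((pending_name, marketing))
            pending_name = None
        elif line.startswith("Name:"):
            pending_name = line.split(":", 1)[1].strip()
    return agents
```

On the output in this report, the helper yields two gfx1201 entries, one for the R9700 and one for "AMD Radeon Graphics".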

Error (100% reproducible on every model load)

rocblaslt error: Cannot read "TensileLibrary_lazy_gfx1200.dat": No such file or directory
rocblaslt error: Could not load "TensileLibrary_lazy_gfx1200.dat"

Critical Observation: Target Mismatch

The GPU is gfx1201 (RDNA 4 / Navi 48), but rocBLASLt attempts to load TensileLibrary_lazy_gfx1200.dat — a gfx1200 target file. This indicates either:

  • HSA_OVERRIDE_GFX_VERSION=12.0.1 is causing a misidentification (12.0.1 → gfx1200 instead of gfx1201), OR
  • The Tensile target resolution logic maps gfx1201 → gfx1200 as a fallback, and neither file exists in Ollama's bundled ROCm
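
To make the first hypothesis concrete, here is a minimal sketch of how a version string would translate to a target name, assuming the override follows the usual major.minor.stepping convention (for example, 10.3.0 is commonly used for gfx1030). `gfx_target_from_override` is a hypothetical helper written for illustration; it does not reproduce what the ROCm runtime or rocBLASLt does internally.

```python
def gfx_target_from_override(version):
    """Map an override string such as "12.0.1" to a gfx target name.

    Assumption: the string is major.minor.stepping, with the major part
    in decimal and the minor and stepping parts as single hex digits.
    """
    major, minor, stepping = (int(part) for part in version.split("."))
    return f"gfx{major}{minor:x}{stepping:x}"


print(gfx_target_from_override("12.0.1"))  # gfx1201
print(gfx_target_from_override("12.0.0"))  # gfx1200
```

If that convention holds, 12.0.1 names gfx1201 and only 12.0.0 would name gfx1200. That would make the second hypothesis more likely, but it needs confirmation from maintainers.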

Impact: Complete Model Load Failure

This is a fatal error, not a warning. Every model load times out after 2 minutes and the runner is then killed with SIGKILL:

time=...  level=ERROR  msg="Load failed"  error="model failed to load, this may be due to resource
          limitations or an internal error, check ollama server logs for details"
time=...  level=ERROR  msg="llama runner terminated"  error="signal: killed"
[GIN] 500 | 2m0s | POST "/api/chat"

Full Log Sequence (one representative occurrence)

ollama[2408]: load_backend: loaded ROCm backend from /usr/local/lib/ollama/rocm/libggml-hip.so
ollama[2408]: rocblaslt error: Cannot read "TensileLibrary_lazy_gfx1200.dat": No such file or directory
ollama[2408]: rocblaslt error: Could not load "TensileLibrary_lazy_gfx1200.dat"
ollama[2408]: msg=load request="{Operation:fit ... FlashAttention:Enabled ... GPULayers:41 ...}"
ollama[2408]: msg="do load request" error="Post \"http://127.0.0.1:XXXXX/load\": context canceled"
ollama[2408]: msg="Load failed" error="model failed to load..."
ollama[2408]: msg="llama runner terminated" error="signal: killed"
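
For triage across many log lines, the requested Tensile file and its target can be pulled out of the rocblaslt errors with a small script. This is a sketch that assumes the exact error wording shown in the log above; `missing_tensile_targets` is a hypothetical helper name.

```python
import re

# Matches lines like:
#   rocblaslt error: Cannot read "TensileLibrary_lazy_gfx1200.dat": No such file or directory
TENSILE_ERROR = re.compile(
    r'rocblaslt error: Cannot read "(TensileLibrary_\w*?(gfx[0-9a-f]+)\.dat)"'
)


def missing_tensile_targets(log_lines):
    """Return (file_name, gfx_target) for each 'Cannot read' rocblaslt error."""
    found = []
    for line in log_lines:
        match = TENSILE_ERROR.search(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found


def mismatched_targets(log_lines, expected_target):
    """Return the requested targets that differ from the GPU's expected target."""
    return [t for _, t in missing_tensile_targets(log_lines) if t != expected_target]
```

Feeding it the log sequence above with `expected_target="gfx1201"` flags gfx1200 as the mismatched target.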

Operating System

Ubuntu 24.04.4 LTS

CPU

AMD Ryzen 7 9800X3D

GPU

AMD Radeon AI PRO R9700

ROCm Version

ROCm 7.2.1

ROCm Component

No response

Steps to Reproduce

  1. Install Ollama v0.7.22.1 natively on Ubuntu 24.04.4 with AMD Radeon AI PRO R9700 (gfx1201)
  2. Set HSA_OVERRIDE_GFX_VERSION=12.0.1 and ROCR_VISIBLE_DEVICES=0 in the Ollama systemd service
  3. Attempt to run any model: ollama run qwen2.5:14b
  4. Observe: rocBLASLt looks for TensileLibrary_lazy_gfx1200.dat (not a gfx1201 file), fails to find it, and the model load times out after 2 minutes, ending with SIGKILL

Expected Behavior

rocBLASLt resolves the correct Tensile kernel file for the gfx1201 target and model inference completes successfully.

Workaround Status

Setting HSA_OVERRIDE_GFX_VERSION=12.0.1 was attempted as a workaround per community guidance, but a ROCm maintainer confirmed that this environment variable is intended for debugging only. It may also be contributing to the gfx1200 target mismatch rather than resolving it.

Questions for Maintainers

  1. Should HSA_OVERRIDE_GFX_VERSION=12.0.1 resolve to gfx1200 or gfx1201?
  2. Is TensileLibrary_lazy_gfx1200.dat or TensileLibrary_lazy_gfx1201.dat (or both) expected for RDNA 4 / Navi 48?
  3. What is the correct path to verify the Tensile .dat files shipped with Ollama's bundled ROCm?
  4. Does ROCm 7.12+ or a newer Ollama release resolve this?
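
Regarding question 3, until the correct path is confirmed, the bundled directory can be searched recursively for Tensile files. This is a minimal sketch: `tensile_files_by_target` is a hypothetical helper, and the subdirectory that holds the .dat files in Ollama's bundle is unknown to me, which is why the search is recursive from a root the caller supplies.

```python
from collections import defaultdict
from pathlib import Path
import re

GFX_IN_NAME = re.compile(r"gfx[0-9a-f]+")


def tensile_files_by_target(root):
    """Group TensileLibrary*.dat files under `root` by the gfx target in their name."""
    by_target = defaultdict(list)
    for path in sorted(Path(root).rglob("TensileLibrary*.dat")):
        match = GFX_IN_NAME.search(path.name)
        if match:
            by_target[match.group(0)].append(str(path))
    return dict(by_target)


# Example call (root taken from the backend path in this issue):
#   for target, files in tensile_files_by_target("/usr/local/lib/ollama/rocm").items():
#       print(target, len(files))
```

If neither gfx1200 nor gfx1201 shows up in the result, that would support the observation that both files are absent from the bundle.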

References

  • hipblaslt runtime triage checklist reviewed before filing
  • Filed per suggestion from ROCm maintainer in public discussion thread

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response
