fix(rocm): add gfx1151 support and expose AMDGPU_TARGETS build-arg#9410

Merged
mudler merged 2 commits into mudler:master from keithmattix:fix/rocm-amdgpu-targets on Apr 18, 2026

@keithmattix
Contributor

Summary

Adds gfx1151 (AMD Strix Halo / Ryzen AI MAX) to the default AMDGPU_TARGETS list and fixes a bug where custom GPU architecture targets passed via CMAKE_ARGS were silently overridden.

Fixes #9374

Problem

When building the llama-cpp backend for a non-default GPU architecture (e.g. gfx1151), the documented approach of passing -DAMDGPU_TARGETS=gfx1151 through CMAKE_ARGS does not work:

docker buildx build -f backend/Dockerfile.llama-cpp \
    --build-arg BUILD_TYPE=hipblas \
    --build-arg CMAKE_ARGS="-DAMDGPU_TARGETS=gfx1151" \
    ...

The Makefile's hipblas block (line 37) appends its own -DAMDGPU_TARGETS=$(AMDGPU_TARGETS) using the default arch list after the user's CMAKE_ARGS value. Since CMake uses the last -D value for a given variable, the user's target is silently overridden.
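The override can be illustrated with a small shell sketch that models "the last -D for a variable wins". The function name and the placeholder default list (gfxAAAA,gfxBBBB) are assumptions made up for illustration; they are not the Makefile's real contents.

```shell
#!/bin/sh
# Hypothetical helper: print the value a "last -D wins" parser would keep
# for AMDGPU_TARGETS, given a list of CMake-style arguments.
effective_amdgpu_targets() {
    result=""
    for arg in "$@"; do
        case "$arg" in
            # Each later occurrence replaces the earlier one.
            -DAMDGPU_TARGETS=*) result="${arg#-DAMDGPU_TARGETS=}" ;;
        esac
    done
    printf '%s\n' "$result"
}

USER_CMAKE_ARGS="-DAMDGPU_TARGETS=gfx1151"
DEFAULT_TARGETS="gfxAAAA,gfxBBBB"   # placeholder, NOT the real default list

# Same order as described above: user args first, the Makefile's flag appended.
# USER_CMAKE_ARGS is intentionally unquoted so it splits into separate arguments.
effective_amdgpu_targets $USER_CMAKE_ARGS "-DAMDGPU_TARGETS=$DEFAULT_TARGETS"
```

With this ordering the function prints the placeholder default list rather than gfx1151, which mirrors why the user's target never reached the build.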

At runtime, this causes the gRPC backend to crash with a general protection fault (GPF) in libamdhip64.so:

hip_fatbin.cpp:687: No compatible code objects found for: gfx1151
traps: grpcpp_sync_ser[...] general protection fault ... in libamdhip64.so.7.2.70201

This was initially misdiagnosed as a gRPC/HIP threading conflict (#9374), but AMD_LOG_LEVEL=4 tracing revealed the binary simply didn't contain code objects for the target GPU.
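For anyone repeating this diagnosis, a captured log can be scanned for the telltale message. The sketch below is a hypothetical helper, not part of this PR; it assumes the HIP runtime's stderr was redirected to a file, for example by running the backend with AMD_LOG_LEVEL=4 set, and it builds a sample log from the error line quoted above so it can run on its own.

```shell
#!/bin/sh
# Hypothetical helper: list the GPU architectures for which the HIP runtime
# reported that the binary contains no compatible code object.
missing_code_object_archs() {
    sed -n 's/.*No compatible code objects found for: *\([A-Za-z0-9_:+-]*\).*/\1/p' "$1" | sort -u
}

# Sample log built from the error line quoted in this PR. In practice the
# file would come from the backend's stderr captured with AMD_LOG_LEVEL=4.
sample_log=$(mktemp)
printf '%s\n' 'hip_fatbin.cpp:687: No compatible code objects found for: gfx1151' > "$sample_log"

missing_code_object_archs "$sample_log"
```

Any architecture printed here is one the binary was not compiled for, which points at the build's target list rather than at a threading problem.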

Fix

  1. Add gfx1151 to the default AMDGPU_TARGETS in backend/cpp/llama-cpp/Makefile. ROCm 7.2.1 ships Tensile libraries for gfx1151, so it should be included in default builds.

  2. Expose AMDGPU_TARGETS as an ARG/ENV in Dockerfile.llama-cpp so users can override the target list correctly:

docker buildx build -f backend/Dockerfile.llama-cpp \
    --build-arg BUILD_TYPE=hipblas \
    --build-arg AMDGPU_TARGETS=gfx1151 \
    ...

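Because the Dockerfile exports AMDGPU_TARGETS as an environment variable, Make sees the variable as already defined, so the Makefile's ?= conditional assignment leaves the user's value in place instead of applying the default list.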
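The interaction between an environment variable and ?= can be reproduced in isolation. This is a minimal sketch assuming GNU Make is installed; the throwaway Makefile, the helper function names, and the placeholder default list (gfxAAAA,gfxBBBB) are made up for the demonstration and do not reflect the real backend Makefile.

```shell
#!/bin/sh
# Write a throwaway Makefile that uses the same ?= pattern.
demo_mk=$(mktemp)
printf 'AMDGPU_TARGETS ?= gfxAAAA,gfxBBBB\nshow:\n\t@echo $(AMDGPU_TARGETS)\n' > "$demo_mk"

# No variable in the environment: ?= applies the placeholder default.
targets_with_default() {
    ( unset AMDGPU_TARGETS; make -s -f "$demo_mk" show )
}

# Variable present in the environment (as the Dockerfile ENV would provide):
# ?= is skipped and the caller's value is kept.
targets_with_env() {
    AMDGPU_TARGETS="$1" make -s -f "$demo_mk" show
}

targets_with_default
targets_with_env gfx1151
```

The first call prints the placeholder default and the second prints gfx1151, which is the behavior the new build-arg relies on.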

Testing

Tested on AMD Ryzen AI MAX+ 395 (gfx1151), ROCm 7.2.1, Ubuntu 24.04, kernel 6.17.0-1017-oem:

| Scenario | Result |
|---|---|
| Before fix: CMAKE_ARGS="-DAMDGPU_TARGETS=gfx1151" | GPF: binary missing gfx1151 code objects |
| After fix: --build-arg AMDGPU_TARGETS=gfx1151 | PASS: model loads, inference works |
| Stock llama.cpp (same machine) | PASS: baseline control |

Signed commits

Yes, I signed my commits.

Add gfx1151 (AMD Strix Halo / Ryzen AI MAX) to the default AMDGPU_TARGETS
list in the llama-cpp backend Makefile. ROCm 7.2.1 ships with gfx1151
Tensile libraries, so this architecture should be included in default builds.

Also expose AMDGPU_TARGETS as an ARG/ENV in Dockerfile.llama-cpp so that
users building for non-default GPU architectures can override the target
list via --build-arg AMDGPU_TARGETS=<arch>. Previously, passing
-DAMDGPU_TARGETS=<arch> through CMAKE_ARGS was silently overridden by
the Makefile's own append of the default target list.

Fixes mudler#9374

Signed-off-by: Keith Mattix <keithmattix2@gmail.com>
@mudler mudler enabled auto-merge (squash) April 18, 2026 07:21
@mudler mudler disabled auto-merge April 18, 2026 18:39
@mudler mudler merged commit 8839a71 into mudler:master Apr 18, 2026
44 of 46 checks passed

Development

Successfully merging this pull request may close these issues.

gRPC backend crashes with GPF on AMD gfx1151 (Strix Halo APU) while stock llama.cpp works
