fix(rocm): add gfx1151 support and expose AMDGPU_TARGETS build-arg #9410
Merged
mudler merged 2 commits into mudler:master, Apr 18, 2026
Add gfx1151 (AMD Strix Halo / Ryzen AI MAX) to the default AMDGPU_TARGETS list in the llama-cpp backend Makefile. ROCm 7.2.1 ships with gfx1151 Tensile libraries, so this architecture should be included in default builds.

Also expose AMDGPU_TARGETS as an ARG/ENV in Dockerfile.llama-cpp so that users building for non-default GPU architectures can override the target list via --build-arg AMDGPU_TARGETS=<arch>. Previously, passing -DAMDGPU_TARGETS=<arch> through CMAKE_ARGS was silently overridden by the Makefile's own append of the default target list.

Fixes mudler#9374

Signed-off-by: Keith Mattix <keithmattix2@gmail.com>
Force-pushed from cae17a1 to fc899ce
mudler approved these changes on Apr 18, 2026
Summary
Adds gfx1151 (AMD Strix Halo / Ryzen AI MAX) to the default AMDGPU_TARGETS list and fixes a bug where custom GPU architecture targets passed via CMAKE_ARGS were silently overridden.

Fixes #9374
Problem
When building the llama-cpp backend for a non-default GPU architecture (e.g. gfx1151), the documented approach of passing -DAMDGPU_TARGETS=gfx1151 through CMAKE_ARGS does not work:

    docker buildx build -f backend/Dockerfile.llama-cpp \
      --build-arg BUILD_TYPE=hipblas \
      --build-arg CMAKE_ARGS="-DAMDGPU_TARGETS=gfx1151" \
      ...

The Makefile's hipblas block (line 37) appends its own -DAMDGPU_TARGETS=$(AMDGPU_TARGETS), built from the default arch list, after the user's CMAKE_ARGS value. Since CMake uses the last -D value for a given variable, the user's target is silently overridden.

At runtime, this causes the gRPC backend to crash with a GPF in libamdhip64.so. This was initially misdiagnosed as a gRPC/HIP threading conflict (#9374), but AMD_LOG_LEVEL=4 tracing revealed the binary simply didn't contain code objects for the target GPU.

Fix
Add gfx1151 to the default AMDGPU_TARGETS in backend/cpp/llama-cpp/Makefile. ROCm 7.2.1 ships Tensile libraries for gfx1151, so it should be included in default builds.

Expose AMDGPU_TARGETS as an ARG/ENV in Dockerfile.llama-cpp so users can override the target list correctly:

    docker buildx build -f backend/Dockerfile.llama-cpp \
      --build-arg BUILD_TYPE=hipblas \
      --build-arg AMDGPU_TARGETS=gfx1151 \
      ...

This sets the Make variable AMDGPU_TARGETS before the Makefile's ?= conditional assignment, so it takes precedence.

Testing
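The precedence behaviour described in the Fix section can be checked without a GPU or a ROCm install. The following is a minimal sketch, not part of this PR: it writes a toy Makefile that mimics the `?=` default plus the append to CMAKE_ARGS, then runs GNU Make twice with different environments. The temporary file, the `all` target, and the two-entry default list are placeholders chosen for illustration; the real Makefile's default list is longer.

```shell
#!/bin/sh
# Toy reproduction of the Makefile pattern discussed in this PR.
# Assumes GNU Make is installed; nothing here touches the real LocalAI Makefile.
set -eu

# Start from a clean slate so the caller's environment does not leak in.
unset AMDGPU_TARGETS CMAKE_ARGS

mk="$(mktemp)"
trap 'rm -f "$mk"' EXIT

cat > "$mk" <<'EOF'
# Placeholder default list (illustrative only).
AMDGPU_TARGETS ?= gfx1100,gfx1151
# Mimics the hipblas block: the flag is appended after any user-supplied CMAKE_ARGS.
CMAKE_ARGS += -DAMDGPU_TARGETS=$(AMDGPU_TARGETS)
# Print the final flag list while the Makefile is being parsed.
$(info $(CMAKE_ARGS))
all: ;
EOF

# Old approach: the user's -D flag arrives via CMAKE_ARGS, and the default is appended after it.
old="$(env CMAKE_ARGS='-DAMDGPU_TARGETS=gfx1151' make -s -f "$mk" | head -n 1)"

# New approach: AMDGPU_TARGETS comes from the environment, so ?= leaves it alone.
new="$(env AMDGPU_TARGETS=gfx1151 make -s -f "$mk" | head -n 1)"

echo "via CMAKE_ARGS:     $old"
echo "via AMDGPU_TARGETS: $new"
```

In the first run the flag list contains two -DAMDGPU_TARGETS entries with the default list last, which is the one CMake would honour. In the second run only the user's value appears, because Make treats a variable imported from the environment as already defined when it reaches `?=`.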
Tested on AMD Ryzen AI MAX+ 395 (gfx1151), ROCm 7.2.1, Ubuntu 24.04, kernel 6.17.0-1017-oem:

CMAKE_ARGS="-DAMDGPU_TARGETS=gfx1151"
--build-arg AMDGPU_TARGETS=gfx1151

Signed commits
Yes, I signed my commits.