ci: add Linux HIP (ROCm) backend build to release workflow #467
dev-miro26 wants to merge 8 commits into janhq:dev
Conversation
Add a new `build-hip-linux` job that builds llama-server with
`-DGGML_HIP=ON` inside the official `rocm/dev-ubuntu-22.04:6.2`
Docker container. The job targets popular AMD GPU architectures
(gfx906/908/90a/1030/1100/1101/1102) and uploads the packaged
binary as `llama-{version}-bin-linux-hip-x64.tar.gz` to each
GitHub release.
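A minimal sketch of what such a job could look like in `menlo-build.yml`; the step names, checkout/upload actions, and any flags beyond `-DGGML_HIP=ON` and the listed GPU targets are assumptions, not the PR's exact YAML:

```yaml
# Hypothetical sketch of the build-hip-linux job (not the PR's exact YAML).
build-hip-linux:
  runs-on: ubuntu-22.04
  container: rocm/dev-ubuntu-22.04:6.2
  steps:
    - uses: actions/checkout@v4
    - name: Build llama-server with HIP
      run: |
        cmake -S . -B build \
          -DGGML_HIP=ON \
          -DAMDGPU_TARGETS="gfx906;gfx908;gfx90a;gfx1030;gfx1100;gfx1101;gfx1102" \
          -DCMAKE_BUILD_TYPE=Release
        cmake --build build --config Release --target llama-server -j"$(nproc)"
    - name: Package
      run: tar -czf llama-bin-linux-hip-x64.tar.gz -C build/bin llama-server
```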
Update `create-checksum-file` to depend on the new HIP job and
include the HIP binary's SHA-512 checksum and size in checksum.yml,
so Jan's backend download verification works correctly.
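For illustration, a checksum entry of this shape can be produced with standard tools; the artifact name and the `checksum.yml` field names and layout below are assumptions, not the workflow's exact script:

```shell
# Hypothetical sketch: compute the SHA-512 and size of a release artifact and
# append a checksum.yml-style entry. Field names and layout are assumptions.
artifact="llama-bin-linux-hip-x64.tar.gz"
printf 'example artifact contents' > "$artifact"   # stand-in file for the demo
sha512=$(sha512sum "$artifact" | awk '{print $1}')
size=$(stat -c%s "$artifact")
printf 'linux-hip-x64:\n  sha512: %s\n  size: %s\n' "$sha512" "$size" >> checksum.yml
cat checksum.yml
```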
This enables Jan to auto-detect AMD GPU + ROCm and download the
appropriate llama.cpp backend without requiring users to select
it manually.
Made-with: Cursor
I'm just a bystander with a Strix Halo-based laptop. Any chance this change could be extended to also support gfx1151 (RDNA 3.5)? I could test it on my Asus Flow Z13 128GB model if it would help.
@geeksville |
Hi, @louis-jan |
cc @qnixsynapse |
Also @Minh141120 |
I think this won't work because it needs a runner with an AMD GPU. Correct me if I'm wrong, @Minh141120.
Offline compilation (i.e., without an AMD GPU) might be possible. But yes, @Minh141120 might know better.
Hi @dev-miro26, really appreciate the work on this 🙌. To help us validate the changes more confidently (especially across both Linux and Windows), would you mind sharing a short video showing a successful build and Jan running on both platforms?
@Minh141120 |
Really appreciate the effort here, especially given the hardware limitation. Since this introduces a new backend across both Windows and Linux, we'll need validation on real AMD hardware before merging to ensure it works reliably for users. Without test results from verified AMD hardware, we won't be able to proceed with the merge at this time. We'll keep the PR open and revisit it once those results are available.
Hi, @louis-jan |
Overview

Add HIP/ROCm backend builds (Linux and Windows) to the Menlo CI release pipeline so that Jan can distribute pre-built `llama-server` binaries for AMD GPU users.

What changed in `menlo-build.yml`:

- `build-hip-linux` — new standalone job that builds `llama-server` with `-DGGML_HIP=ON` inside the `rocm/dev-ubuntu-22.04:6.2` container, targeting gfx906/gfx908/gfx90a/gfx1030/gfx1100/gfx1101/gfx1102. Uploads `llama-{version}-bin-linux-hip-x64.tar.gz` to the GitHub release.
- `build-hip-windows` — new standalone job that installs the AMD HIP SDK (26.Q1) on a `windows-2022` runner, builds `llama-server` with HIP clang via the `Unix Makefiles` generator, bundles the required runtime DLLs (`libhipblas.dll`, `libhipblaslt.dll`, `rocblas.dll` + libraries), code-signs, and uploads `llama-{version}-bin-win-hip-x64.tar.gz`.
- `create-checksum-file` — updated to depend on both HIP jobs and conditionally include their SHA-512 checksums in `checksum.yml` only when the respective build succeeds.
- Both HIP jobs use `continue-on-error: true`, so a HIP build failure does not block the rest of the release (CUDA, Vulkan, and CPU artifacts are always published).

Additional information

- The Windows build follows the `windows-hip` job approach from `release.yml`: it uses `Unix Makefiles` + HIP clang instead of MSVC/Ninja, since HIP compilation on Windows requires the HIP SDK's clang compiler. The ROCm installation is cached via `actions/cache@v5`.
- This enables Jan to auto-detect AMD GPU + ROCm and download the appropriate `llama.cpp` backend without requiring users to select it manually.

GPU targets

- gfx906
- gfx908
- gfx90a
- gfx1030
- gfx1100
- gfx1101
- gfx1102
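The dependency and conditional-inclusion behavior described for `create-checksum-file` could look roughly like this; the job list, expressions, and step bodies here are assumptions for illustration, not the PR's exact YAML:

```yaml
# Hypothetical sketch of the updated create-checksum-file job (not exact YAML).
create-checksum-file:
  # plus the existing CPU/CUDA/Vulkan jobs it already depended on
  needs: [build-hip-linux, build-hip-windows]
  if: always()   # run even when an optional HIP build failed
  runs-on: ubuntu-latest
  steps:
    - name: Include Linux HIP checksum only on success
      if: needs.build-hip-linux.result == 'success'
      run: echo "append the linux-hip-x64 sha512/size entry to checksum.yml here"
    - name: Include Windows HIP checksum only on success
      if: needs.build-hip-windows.result == 'success'
      run: echo "append the win-hip-x64 sha512/size entry to checksum.yml here"
```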