
ci: add Linux HIP (ROCm) backend build to release workflow#467

Open
dev-miro26 wants to merge 8 commits into janhq:dev from dev-miro26:feat/hip-linux-backend

Conversation


@dev-miro26 dev-miro26 commented Mar 27, 2026

Overview

Add HIP/ROCm backend builds (Linux and Windows) to the Menlo CI release pipeline so that Jan can distribute pre-built llama-server binaries for AMD GPU users.

What changed in menlo-build.yml:

  • build-hip-linux — new standalone job that builds llama-server with -DGGML_HIP=ON inside the rocm/dev-ubuntu-22.04:6.2 container, targeting gfx906/gfx908/gfx90a/gfx1030/gfx1100/gfx1101/gfx1102. Uploads llama-{version}-bin-linux-hip-x64.tar.gz to the GitHub release.
  • build-hip-windows — new standalone job that installs the AMD HIP SDK (26.Q1) on a windows-2022 runner, builds llama-server with HIP clang via the Unix Makefiles generator, bundles the required runtime DLLs (libhipblas.dll, libhipblaslt.dll, rocblas.dll and its library files), code-signs the binaries, and uploads llama-{version}-bin-win-hip-x64.tar.gz.
  • create-checksum-file — updated to depend on both HIP jobs and conditionally include their SHA-512 checksums in checksum.yml only when the respective build succeeds.
  • Both HIP jobs use continue-on-error: true so a HIP build failure does not block the rest of the release (CUDA, Vulkan, CPU artifacts are always published).
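
Concretely, the Linux HIP job boils down to a handful of build commands. The sketch below is illustrative only: `-DGGML_HIP=ON`, the container image, and the gfx target list come from this PR, while the remaining flags and the `AMDGPU_TARGETS` spelling are assumptions based on upstream llama.cpp's HIP build instructions, not the actual menlo-build.yml contents:

```shell
# Runs inside the rocm/dev-ubuntu-22.04:6.2 container (per the PR description).
# GGML_HIP and the gfx list are from the PR; other flags are assumptions.
GPU_TARGETS="gfx906;gfx908;gfx90a;gfx1030;gfx1100;gfx1101;gfx1102"
cmake -B build \
  -DGGML_HIP=ON \
  -DAMDGPU_TARGETS="${GPU_TARGETS}" \
  -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release --target llama-server -j"$(nproc)"
# Package the binary under the release asset name used by the workflow.
tar -czf "llama-${VERSION}-bin-linux-hip-x64.tar.gz" -C build/bin llama-server
```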

Note: macOS is not included because AMD does not support ROCm/HIP on macOS.

Additional information

  • The Linux HIP job follows the same pattern as the existing CUDA matrix builds (checkout → replace Makefile → install tools → build → version file → package → checksum → upload release asset).
  • The Windows HIP job follows the upstream llama.cpp windows-hip job approach from release.yml: it uses Unix Makefiles + HIP clang instead of MSVC/Ninja, since HIP compilation on Windows requires the HIP SDK's clang compiler. ROCm installation is cached via actions/cache@v5.
  • This enables Jan to auto-detect AMD GPU + ROCm on a user's system and download the appropriate llama.cpp backend without requiring users to select it manually.
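
The auto-detection idea can be illustrated with a small sketch. Jan's real detection code is not part of this PR; the function name and backend identifiers below are invented for the example, and only `rocminfo` is a real ROCm tool:

```shell
# Hypothetical sketch: pick a backend suffix based on whether ROCm is present.
# `rocminfo` is the standard ROCm device-query tool; the rest is an assumption.
detect_backend() {
  if command -v rocminfo >/dev/null 2>&1 && rocminfo 2>/dev/null | grep -q 'gfx'; then
    echo "linux-hip-x64"    # AMD GPU + ROCm found: fetch the HIP build
  else
    echo "linux-x64"        # fall back to the plain CPU build
  fi
}
detect_backend
```

On a machine without ROCm installed this prints `linux-x64`, which is the safe fallback behavior the PR describes.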

GPU targets

| Architecture | Example GPUs |
| --- | --- |
| gfx906 | Radeon VII, MI50 |
| gfx908 | MI100 |
| gfx90a | MI210, MI250, MI250X |
| gfx1030 | RX 6800, RX 6900 XT |
| gfx1100 | RX 7900 XTX, RX 7900 XT |
| gfx1101 | RX 7800 XT, RX 7700 XT |
| gfx1102 | RX 7600 |

Add a new `build-hip-linux` job that builds llama-server with
`-DGGML_HIP=ON` inside the official `rocm/dev-ubuntu-22.04:6.2`
Docker container. The job targets popular AMD GPU architectures
(gfx906/908/90a/1030/1100/1101/1102) and uploads the packaged
binary as `llama-{version}-bin-linux-hip-x64.tar.gz` to each
GitHub release.

Update `create-checksum-file` to depend on the new HIP job and
include the HIP binary's SHA-512 checksum and size in checksum.yml,
so Jan's backend download verification works correctly.
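
As a rough illustration of that checksum step (the exact checksum.yml key names and layout are assumptions; only the SHA-512-plus-size requirement comes from the PR), with a dummy file standing in for the real tarball:

```shell
# Hypothetical sketch of the create-checksum-file step; key names are assumptions.
VERSION="b9999"                                    # placeholder release tag
FILE="llama-${VERSION}-bin-linux-hip-x64.tar.gz"
printf 'dummy artifact' > "$FILE"                  # stand-in for the real tarball
SUM=$(sha512sum "$FILE" | awk '{print $1}')        # SHA-512 digest of the asset
SIZE=$(stat -c%s "$FILE")                          # size in bytes
cat >> checksum.yml <<EOF
${FILE}:
  sha512: ${SUM}
  size: ${SIZE}
EOF
```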

This enables Jan to auto-detect AMD GPU + ROCm and download the
appropriate llama.cpp backend without requiring users to select
it manually.

Made-with: Cursor
@geeksville

I'm just a bystander with a Strix Halo-based laptop. Any chance this change could be extended to also support gfx1151 (RDNA 3.5)? I could test it on my Asus Flow Z13 128GB model if it would help.

@dev-miro26
Author

| Architecture | Generation | Example GPUs | Status |
| --- | --- | --- | --- |
| gfx908 | CDNA 1 | MI100 | existing |
| gfx90a | CDNA 2 | MI210, MI250, MI250X | existing |
| gfx942 | CDNA 3 | MI300X, MI300A | new |
| gfx1030 | RDNA 2 | RX 6800, RX 6900 XT | existing |
| gfx1100 | RDNA 3 | RX 7900 XTX, RX 7900 XT | existing |
| gfx1101 | RDNA 3 | RX 7800 XT, RX 7700 XT | existing |
| gfx1102 | RDNA 3 | RX 7600 | existing |
| gfx1150 | RDNA 3.5 | Strix Point APUs | new |
| gfx1151 | RDNA 3.5 | Strix Halo (e.g. Asus Flow Z13) | new |
| gfx1200 | RDNA 4 | RX 9070 XT | new |
| gfx1201 | RDNA 4 | RX 9070 | new |

@dev-miro26
Author

@geeksville
please check again.
Thanks

@dev-miro26
Author

Hi, @louis-jan
Could you please review this PR?

@louis-jan

cc @qnixsynapse

@louis-jan

Also @Minh141120

@louis-jan

I think this won't work because it needs a runner with an actual AMD GPU. CMIIW @Minh141120

@qnixsynapse

Offline compilation (i.e., without an AMD GPU) might be possible. But yes, @Minh141120 might know better.

@Minh141120
Member

Hi @dev-miro26, really appreciate the work on this 🙌.

To help us validate the changes across both Linux and Windows, would you mind sharing a short video showing a successful build and Jan running on both platforms? That would let us verify the new backend with more confidence.

@dev-miro26
Author

@Minh141120
I'm sorry. As I told @louis-jan before, I don't have an AMD device, so I'm looking for someone who does.
@Vanalite : Could you test this please?

@Minh141120
Member

Really appreciate the effort here, especially given the hardware limitation.

Since this introduces a new backend across both Windows and Linux, we’ll need validation on real AMD hardware before merging to ensure it works reliably for users.

Without proper test results, we won’t be able to proceed with the merge at this time. We’ll keep the PR open and revisit it once test results are available from verified AMD hardware.

@dev-miro26
Author

Hi, @louis-jan
How are you?
I think Vanalite can test this PR.
Right?
