ROCm 7 CI #1752

Merged
LostRuins merged 13 commits into LostRuins:concedo_experimental from henk717:concedo_experimental
Oct 4, 2025

@henk717
Collaborator

@henk717 henk717 commented Sep 22, 2025

I forgot we could use Docker, which eliminates the disk space and dependency issues and makes the CI work again.
However, compiling with ROCm 7 is currently broken. Only merge this after uvos's fix lands upstream.

It's basically as follows: AMD can't properly release drivers.
So while they fixed the 9000 series with the good speeds, they broke that for some older datacenter cards, which now can't compile.
uvos is planning to dynamically exclude those again if ROCm 7 is detected, so they will use a slower method.

Because our users primarily run consumer cards at home, it still makes sense for us to switch to ROCm 7.
Datacenters can self-compile with ROCm 6 should they want faster speeds.

Owner

@LostRuins LostRuins left a comment

alright i have no idea if it works, but let's give this a shot

@LostRuins
Owner

Only merge this after uvos's fix lands upstream.

tell me when

@henk717
Collaborator Author

henk717 commented Sep 23, 2025

Can be tracked here: ggml-org#16153

@LostRuins LostRuins force-pushed the concedo_experimental branch from 92df280 to f282362 Compare October 3, 2025 10:58
@LostRuins
Owner

should i merge this now? @henk717

@henk717
Collaborator Author

henk717 commented Oct 3, 2025

No, I am now also hitting disk space limits with the Docker method that worked a few weeks ago.
Unless you know of a way to clean up the host image prior to the Docker step, we can scrap this.

@henk717 henk717 marked this pull request as draft October 3, 2025 19:01
@henk717
Collaborator Author

henk717 commented Oct 3, 2025

Found a way to get enough space, but it's a WIP for now.

@henk717 henk717 marked this pull request as ready for review October 3, 2025 23:08
@henk717
Collaborator Author

henk717 commented Oct 3, 2025

Compiles again (I accidentally had it generate CUDA builds, and we now have enough space for both ROCm and CUDA, so it should be safe). I'm asking people in Discord whether this works well for them; I already confirmed it works well on Runpod.
Based on the feedback you see there, feel free to merge.

@LostRuins
Owner

Alright, merging.

@LostRuins LostRuins merged commit 118e589 into LostRuins:concedo_experimental Oct 4, 2025
LostRuins added a commit that referenced this pull request Oct 4, 2025
This reverts commit 9df2a02. (+1 squashed commits)

Squashed commits:

[2e96da6f0] Revert "ROCm 7 CI (#1752)"

This reverts commit 118e589.