[ROCm] Drop gfx900 (MI25) and gfx906 (MI50/MI60) support by nurmukhametov · Pull Request #847 · ROCm/xla

nurmukhametov · 2026-05-06T15:53:01Z

[Do not merge]

Motivation

For internal review.

Submission Checklist

Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

i-chaochen · 2026-05-07T09:44:27Z

-    static constexpr absl::string_view kList[] = {"gfx900", "gfx906"};
-    return !IsThisGfxInAnyList(kList);
-  }
+  bool fence_before_barrier() const { return true; }


I don't think we need this anymore. As for TF, please grep there to check. IIRC we shouldn't be using it there anymore either.

I have removed it.
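
For context, the check in the diff above can be sketched as follows. This is a minimal illustration, assuming a stand-in class (`RocmArchSketch`) and a stand-in list helper (`IsInList`) that are hypothetical and not XLA's actual `IsThisGfxInAnyList` implementation. It only shows why the predicate collapses to `true` once gfx900 and gfx906 are no longer supported targets.

```cpp
#include <algorithm>
#include <initializer_list>
#include <string_view>

// Hypothetical stand-in for the capability class touched by the diff.
class RocmArchSketch {
 public:
  explicit RocmArchSketch(std::string_view gfx) : gfx_(gfx) {}

  // Pre-PR shape: no fence on gfx900/gfx906, fence on everything else.
  bool fence_before_barrier_old() const {
    return !IsInList({"gfx900", "gfx906"});
  }

  // Post-PR shape from the diff: the excluded arches are gone, so the
  // answer is unconditionally true.
  bool fence_before_barrier_new() const { return true; }

 private:
  // Illustrative list lookup; the real helper is not shown in this thread.
  bool IsInList(std::initializer_list<std::string_view> list) const {
    return std::find(list.begin(), list.end(), gfx_) != list.end();
  }

  std::string_view gfx_;
};
```

For every architecture other than the two dropped ones, the old and new predicates agree, which is why the function could then be removed altogether.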

i-chaochen · 2026-05-07T09:48:51Z

-  desc.set_fpus_per_core(fpus_per_core(gcn_arch_name));
+  // Source:
+  // https://www.amd.com/content/dam/amd/en/documents/instinct-business-docs/white-papers/amd-cdna2-white-paper.pdf
+  desc.set_fpus_per_core(128);


Instead of hard-coding this, I think you could set it properly. @Eetusjo investigated this before: https://github.com/ROCm/frameworks-internal/issues/15846#issuecomment-4073834863

I have added a TODO here, because I think a proper fix is out of scope for this PR.

draganmladjenovic · 2026-05-08T12:43:17Z

@i-chaochen What bothers me is that TheRock seems to have gfx900 and gfx906 builds https://therock-hud-dev.amd.com/

draganmladjenovic

Until we figure out whether TheRock needs this.

i-chaochen · 2026-05-08T13:22:55Z

> @i-chaochen What bothers me is that TheRock seems to have gfx900 and gfx906 builds https://therock-hud-dev.amd.com/

No, but neither JAX nor we ever committed to providing gfx900 and gfx906 builds, and there are no gfx900 or gfx906 targets in the JAX/PyTorch builds either.

https://rocm.docs.amd.com/en/latest/compatibility/compatibility-matrix.html
https://github.com/ROCm/rocm-jax/blob/master/stack.py#L20
https://github.com/ROCm/TheRock/blob/main/.github/workflows/test_pytorch_wheels_full.yml

i-chaochen · 2026-05-11T14:39:05Z

Yes, thanks to @jamestangg for finding this table: https://github.com/ROCm/TheRock/blob/main/SUPPORTED_GPUS.md

Since they only need the build to pass, I don't think it is affected by our removing support here in XLA.

So there should be no problem upstreaming this drop?

@draganmladjenovic @nurmukhametov

nurmukhametov · 2026-05-12T09:07:49Z

> Since they only need the build to pass, I don't think it is affected by our removing support here in XLA.
>
> So there should be no problem upstreaming this drop?
>
> @draganmladjenovic @nurmukhametov

I am ready to upstream this. Any objections, @draganmladjenovic?

claude · 2026-05-13T08:45:05Z

+  // TODO(ROCm): replace this hardcoded value with a per-arch lookup table and
+  // populate scalar_unit_description / matrix_unit_description so the perf
+  // model picks the right FP32 path (vector vs. matrix).
+  desc.set_fpus_per_core(128);


nit (pre-existing, not introduced by this PR): The hardcoded 128 is correct for all remaining CDNA architectures (gfx908, gfx90a), but RDNA architectures (gfx1030, gfx1100, gfx1101, gfx1200, gfx1201, gfx1250) have 64 FP32 stream processors per CU. The old fpus_per_core() function also defaulted to 128 for everything except gfx906, so this preserves existing behavior — just noting that the TODO here is still load-bearing for RDNA correctness.
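
A per-arch lookup of the kind the TODO describes might look roughly like the sketch below. The struct, table, and function names (`FpusEntry`, `kFpusPerCore`, `FpusPerCoreFor`) are hypothetical and not part of XLA. The values are copied from the review comment above rather than independently verified, and the fallback of 128 mirrors the existing hardcoded behavior.

```cpp
#include <string_view>

// Hypothetical table entry: gfx name -> FP32 units per compute unit.
struct FpusEntry {
  std::string_view gfx;
  int fpus_per_core;
};

// Values taken from the review comment in this thread (assumption, unverified).
inline constexpr FpusEntry kFpusPerCore[] = {
    {"gfx908", 128},  {"gfx90a", 128},
    {"gfx1030", 64},  {"gfx1100", 64}, {"gfx1101", 64},
    {"gfx1200", 64},  {"gfx1201", 64}, {"gfx1250", 64},
};

// Fallback keeps the current behavior for arches missing from the table.
inline constexpr int kDefaultFpusPerCore = 128;

// Accepts a bare name ("gfx90a") or a name with feature suffixes
// ("gfx90a:sramecc+:xnack-"); only the part before the first ':' is matched.
constexpr int FpusPerCoreFor(std::string_view gcn_arch_name) {
  const std::string_view gfx =
      gcn_arch_name.substr(0, gcn_arch_name.find(':'));
  for (const FpusEntry& entry : kFpusPerCore) {
    if (entry.gfx == gfx) return entry.fpus_per_core;
  }
  return kDefaultFpusPerCore;
}
```

With such a helper, the call site in the diff could become `desc.set_fpus_per_core(FpusPerCoreFor(gcn_arch_name));`. Populating `scalar_unit_description` and `matrix_unit_description` would still be a separate step.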

claude · 2026-05-13T08:45:13Z

Claude Review Summary

Clean, well-scoped PR. Drops gfx900 (MI25) and gfx906 (MI50/MI60) support consistently across the codebase — supported versions list, fence_before_barrier() removal, fpus_per_core() inlining, and test updates are all complete with no orphaned references remaining. The removal aligns with AMD's current ROCm compatibility matrix.

One informational note left inline regarding pre-existing technical debt in the fpus_per_core hardcoding for RDNA architectures.

- Introduce `hlo_query::IsStandardAssociativeScan` to identify standard forward associative scans (single input/init, returning an `(output, carry)` tuple where the `carry` output is unused).
- Use this new helper in `ReduceWindowRewriter` and GPU `ScanRewriter` to simplify scan matching.
- In `ReduceWindowRewriter`, instead of skipping short scans, rewrite them directly into a single `kReduceWindow` operation.
- Remove the complex carry computation logic in `ReduceWindowRewriter`, as `IsStandardAssociativeScan` guarantees the carry output is dead.

PiperOrigin-RevId: 915338116
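
To illustrate why a short forward scan can be expressed as a single reduce-window, here is a minimal sketch for a 1-D inclusive prefix sum. It assumes a window of size n, stride 1, low padding of n - 1 with the init value 0, and no high padding. The function name is hypothetical, and the actual window and padding chosen by `ReduceWindowRewriter` are not shown in this thread.

```cpp
#include <vector>

// Inclusive prefix sum written as a reduce-window:
// window size n, stride 1, low padding n - 1 (padded with the init value 0).
std::vector<int> ReduceWindowInclusiveScan(const std::vector<int>& x) {
  const int n = static_cast<int>(x.size());
  std::vector<int> out(n, 0);
  for (int i = 0; i < n; ++i) {
    int acc = 0;  // init value of the reduction
    for (int w = 0; w < n; ++w) {
      // Position in the unpadded input covered by window slot w.
      const int src = i - (n - 1) + w;
      // Slots that fall into the low padding contribute the init value.
      if (src >= 0) acc += x[src];
    }
    out[i] = acc;
  }
  return out;
}
```

Output element i reduces exactly the inputs 0 through i, which matches a forward inclusive scan. No carry value is produced, which is consistent with the requirement that the carry output be unused.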

nurmukhametov requested review from draganmladjenovic and i-chaochen May 6, 2026 15:53

i-chaochen reviewed May 7, 2026

nurmukhametov requested a review from i-chaochen May 7, 2026 14:34

nurmukhametov added the claude-review label (Request a Claude AI code review for this PR) May 8, 2026

draganmladjenovic requested changes May 8, 2026

nurmukhametov added and removed the claude-review label (Request a Claude AI code review for this PR) May 13, 2026

claude Bot reviewed May 13, 2026

github-actions Bot removed the claude-review label (Request a Claude AI code review for this PR) May 13, 2026

KanishAnand and others added 6 commits May 14, 2026 03:34

Reverts eeb1650

52f078e

PiperOrigin-RevId: 915336621

Reverts 5dedd4a

05f7a79

PiperOrigin-RevId: 915338077

Automated Code Change

0d457c4

PiperOrigin-RevId: 915357414

Rebuild call graph after cloning computations in LayoutAssignment.

896cf3a

When LayoutAssignment clones computations for conditional branches, the internal call graph needs to be rebuilt to include the newly cloned computations. PiperOrigin-RevId: 915366372

[ROCm] Drop gfx900 (MI25) and gfx906 (MI50/MI60) support

df45abc

nurmukhametov force-pushed the anurmukh/remove-gfx900-gfx906 branch from d4689d4 to df45abc May 14, 2026 12:30

nurmukhametov wants to merge 6 commits into main from anurmukh/remove-gfx900-gfx906

7 participants
