Zach/mi300a fixes #1819
Merged
b47b1b0 to
cd2c84c
jeremylt
reviewed
May 6, 2025
jeremylt
reviewed
May 6, 2025
cd2c84c to
b5cfe2e
jeremylt
reviewed
May 6, 2025
b5cfe2e to
b46df0d
jeremylt
approved these changes
May 6, 2025
jeremylt
left a comment
Ok, as long as this does as expected on the MI300A machines, LGTM
Reworks the stream implementation for `/gpu/hip/gen` to avoid creating and destroying streams on every operator apply.
Updates hipBLAS calls to sync only the stream rather than the whole device. This matters on MI300A, since hipBLAS seems to use an async stream there, and it avoids a full device sync.
Also makes working vectors come from the `Vector` object delegate to avoid bad ref behavior.
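
The stream lifetime change described above can be illustrated with a minimal sketch. It is not the code from this PR: `MockRuntime`, `MockStream`, `ApplyPerCallStream`, and `CachedStreamOperator` are hypothetical names, and the runtime is a counting stand-in so the example runs without a GPU. In a real HIP backend the stand-ins would correspond to `hipStreamCreate`, `hipStreamDestroy`, `hipStreamSynchronize`, and `hipDeviceSynchronize`. Whether the actual backend synchronizes inside each apply is an assumption made here only to show the two kinds of sync side by side.

```cpp
#include <cassert>

// Hypothetical stand-in for a HIP stream handle (hipStream_t in real code).
struct MockStream {
  int id;
};

// Hypothetical stand-in for the HIP runtime. It only counts calls.
struct MockRuntime {
  int streams_created = 0;
  int streams_destroyed = 0;
  int stream_syncs = 0;
  int device_syncs = 0;

  // Stands in for hipStreamCreate.
  MockStream *CreateStream() { return new MockStream{streams_created++}; }
  // Stands in for hipStreamDestroy.
  void DestroyStream(MockStream *stream) {
    delete stream;
    streams_destroyed++;
  }
  // Stands in for hipStreamSynchronize: waits on one stream only.
  void SyncStream(MockStream *) { stream_syncs++; }
  // Stands in for hipDeviceSynchronize: waits on all work on the device.
  void SyncDevice() { device_syncs++; }
};

// Pattern being replaced (as assumed here): a stream is created and
// destroyed inside every apply, and completion uses a full device sync.
void ApplyPerCallStream(MockRuntime &runtime) {
  MockStream *stream = runtime.CreateStream();
  // ... kernels would be launched on `stream` here ...
  runtime.SyncDevice();
  runtime.DestroyStream(stream);
}

// Replacement pattern: the operator owns its stream, creates it lazily on
// the first apply, reuses it afterwards, and waits only on that stream.
class CachedStreamOperator {
 public:
  explicit CachedStreamOperator(MockRuntime &runtime) : runtime_(runtime) {}
  ~CachedStreamOperator() {
    if (stream_ != nullptr) runtime_.DestroyStream(stream_);
  }
  // The operator owns a raw stream handle, so copying is disabled.
  CachedStreamOperator(const CachedStreamOperator &) = delete;
  CachedStreamOperator &operator=(const CachedStreamOperator &) = delete;

  void Apply() {
    if (stream_ == nullptr) stream_ = runtime_.CreateStream();
    // ... kernels would be launched on `stream_` here ...
    runtime_.SyncStream(stream_);
  }

 private:
  MockRuntime &runtime_;
  MockStream *stream_ = nullptr;
};
```

With this sketch, five calls to `ApplyPerCallStream` create and destroy five streams and issue five device-wide syncs, while five calls to `CachedStreamOperator::Apply` create one stream, issue five stream-level syncs, and destroy the stream once when the operator goes out of scope.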