[ROCm] Streamline bazel targets for rocm libraries #801
Closed
draganmladjenovic wants to merge 4 commits into
b5ac783 to 22e1278
Imported from GitHub PR openxla#35211

Replace amd_comgr library with LLVM's native API to find NT_AMDGPU_METADATA note sections and extract the stack usage and register spill counts from there. Add detection for dynamic stack usage. Add VLOG(2) dumps for per-kernel stats as well as register counts.

Change the logic of discarding the module. The module is discarded only if the stack is used, i.e., either .private_segment_fixed_size is not zero or .uses_dynamic_stack is true. There are examples where there are SGPR spills, but they are saved to VGPRs and not to the stack.

Add tests in amdgpu_register_spilling_test.cc which cover cases where no spills, VGPR-only spills, SGPR-only spills, or dynamic stack usage occur. For that, the following LLVM IR inputs are added:
- amdgpu_no_spills.ll: Simple kernel with minimal register usage
- amdgpu_vgpr_spills.ll: High VGPR pressure with limited VGPRs (64)
- amdgpu_sgpr_spills.ll: High SGPR pressure with limited SGPRs (32)
- amdgpu_dynamic_stack.ll: Indirect function call requiring dynamic stack

Copybara import of the project:

--
b83efc6 by Aleksei Nurmukhametov <anurmukh@amd.com>:

[ROCm] Reimplement register spilling detection

Merging this change closes openxla#35211

COPYBARA_INTEGRATE_REVIEW=openxla#35211 from ROCm:anurmukh/redo-regspill-check-no-comgr b83efc6
PiperOrigin-RevId: 845742402
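The discard rule described in the commit message above can be sketched roughly as follows. This is only an illustration, not code from the PR: the `KernelStats` struct, its spill-count fields, and the `UsesStack` / `ShouldDiscardModule` helpers are hypothetical names, and the sketch assumes the per-kernel values have already been read out of the NT_AMDGPU_METADATA note.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical per-kernel summary. The first two fields mirror the metadata
// keys quoted in the commit message; the struct itself is not from the PR.
struct KernelStats {
  std::string name;
  uint64_t private_segment_fixed_size = 0;  // fixed stack size in bytes
  bool uses_dynamic_stack = false;          // e.g. indirect function calls
  uint32_t sgpr_spill_count = 0;            // informational only here
  uint32_t vgpr_spill_count = 0;            // informational only here
};

// A kernel "uses the stack" if it has a non-zero fixed private segment
// or is flagged as needing a dynamically sized stack.
bool UsesStack(const KernelStats& k) {
  return k.private_segment_fixed_size != 0 || k.uses_dynamic_stack;
}

// The module is discarded as soon as any kernel uses the stack.
// Spill counts alone do not trigger a discard, since SGPR spills may be
// saved to VGPRs rather than to the stack.
bool ShouldDiscardModule(const std::vector<KernelStats>& kernels) {
  for (const KernelStats& k : kernels) {
    if (UsesStack(k)) return true;
  }
  return false;
}
```

Under this sketch, a kernel that reports SGPR spills but has a zero `private_segment_fixed_size` and no dynamic stack would not cause the module to be discarded, which matches the SGPR-to-VGPR case mentioned in the commit message.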
22e1278 to 21d0991
Remove DsoLoader indirection and directly link to rocm libs