[GH 28654] Scratch pad reuse #28677
Closed
yuslepukhin wants to merge 5 commits into
This pull request focuses on improving the reliability and determinism of convolution kernel packing and its associated tests. The main changes ensure that memory used for kernel packing is always initialized to a zero state, preventing potential issues from uninitialized memory, and enhance test coverage to verify stability regardless of invocation order.
Reliability and determinism improvements:
In convolve_kleidiai.cpp, added a step to zero-initialize the lhs packed buffer before use, ensuring that any unused bytes do not contain stale heap data that could affect computation results.
Test coverage enhancements:
In test_conv2d.h, expanded the convolution pad-buffer growth test to execute both possible orders of small and large input channels, verifying that results remain stable regardless of invocation order.
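To illustrate why the zero-initialization matters when a scratch pad is reused, here is a minimal, self-contained sketch. It is not the code from convolve_kleidiai.cpp or test_conv2d.h: `ScratchPad`, `packed_sum`, `kBlock`, and `stable_in_both_orders` are hypothetical names, and the "kernel" is just a sum over a block-padded buffer. It assumes only that the packing step writes the real values but not the padding, while the consumer reads whole blocks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scratch pad that is reused across calls and only ever grows.
struct ScratchPad {
    std::vector<float> storage;

    // Returns a buffer of at least `count` floats. When `zero` is true, the
    // requested region is cleared so padding never carries stale values.
    float* acquire(std::size_t count, bool zero) {
        if (storage.size() < count) {
            storage.resize(count);
        }
        if (zero) {
            std::fill_n(storage.data(), count, 0.0f);
        }
        return storage.data();
    }
};

// Hypothetical block size: packed data is padded up to a multiple of this.
constexpr std::size_t kBlock = 4;

std::size_t padded_size(std::size_t channels) {
    return (channels + kBlock - 1) / kBlock * kBlock;
}

// Stand-in for a packed kernel: copies the input into a block-padded buffer
// and then reads every element of that buffer, padding included.
float packed_sum(ScratchPad& pad, const std::vector<float>& input, bool zero) {
    const std::size_t padded = padded_size(input.size());
    assert(padded >= input.size());
    float* packed = pad.acquire(padded, zero);
    std::copy(input.begin(), input.end(), packed);  // padding is not written
    float sum = 0.0f;
    for (std::size_t i = 0; i < padded; ++i) {
        sum += packed[i];
    }
    return sum;
}

// Runs small-then-large and large-then-small on separate scratch pads and
// reports whether each input produces the same result in both orders.
bool stable_in_both_orders(bool zero) {
    const std::vector<float> small_in(3, 1.0f);
    const std::vector<float> large_in(8, 1.0f);

    ScratchPad a;
    const float small_first = packed_sum(a, small_in, zero);
    const float large_second = packed_sum(a, large_in, zero);

    ScratchPad b;
    const float large_first = packed_sum(b, large_in, zero);
    const float small_second = packed_sum(b, small_in, zero);

    return small_first == small_second && large_first == large_second;
}
```

In this sketch, calling `stable_in_both_orders(false)` exposes the problem: after the large input has filled the buffer, the small input's padding slot still holds an old value, so the small result depends on what ran before it. With `zero` set to `true`, both orders agree, which is the property the expanded test is described as checking.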