Clean up some potential runtime bugs #19447
Summary:
1. Zero cache on allocation (lines 65-66): std::fill on cache_data_ and update_data_ after allocator_.allocate(). This eliminates uninitialized memory garbage that varies across devices.
2. Zero cache on reset (line 191): std::fill on cache_data_ in reset(). This ensures that the stale KV cache from a previous prompt is fully cleared, not just the position counters.
3. Zero padding in last prefill chunk (lines 618-621): when batch_len < input_len, fill the tail of the input buffer with zeros. This prevents stale tokens from a previous chunk from leaking through the embedding layer.

sa_runner.cpp
4. Call runner.reset() before each prompt in the multi-prompt loop, stdin prompt loop, and stdin tokens loop. This ensures that the KV cache, masks, and input_pos_ are fully reset between prompts.

Differential Revision: D104615993
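For reviewers, the four changes can be illustrated with a minimal, self-contained sketch. It is not the code in this PR: `ToyRunner`, `load_chunk`, and `run_prompts` are hypothetical names, the raw `new float[]` stands in for `allocator_.allocate()`, and the "forward pass" is a placeholder that only writes tokens into the cache. Only `cache_data_`, `update_data_`, `input_pos_`, and `reset()` are taken from the summary above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for the runner; not the actual class in this PR.
class ToyRunner {
 public:
  ToyRunner(std::size_t cache_len, std::size_t input_len)
      : cache_len_(cache_len),
        // new float[n] leaves contents indeterminate, like a raw allocator.
        cache_data_(new float[cache_len]),
        update_data_(new float[cache_len]),
        input_(input_len) {
    // (1) Zero both buffers right after allocation.
    std::fill(cache_data_.get(), cache_data_.get() + cache_len_, 0.0f);
    std::fill(update_data_.get(), update_data_.get() + cache_len_, 0.0f);
  }

  // (2) Clear the cache contents as well as the position counter.
  void reset() {
    std::fill(cache_data_.get(), cache_data_.get() + cache_len_, 0.0f);
    input_pos_ = 0;
  }

  // (3) Copy batch_len tokens, then zero the unused tail of the input buffer
  // so tokens from an earlier, longer chunk cannot leak through.
  void load_chunk(const std::int64_t* tokens, std::size_t batch_len) {
    assert(batch_len <= input_.size());
    assert(input_pos_ + batch_len <= cache_len_);
    std::copy(tokens, tokens + batch_len, input_.begin());
    std::fill(input_.begin() + static_cast<std::ptrdiff_t>(batch_len),
              input_.end(), 0);
    // Placeholder for the real forward pass: record tokens in the cache.
    for (std::size_t i = 0; i < batch_len; ++i) {
      cache_data_[input_pos_ + i] = static_cast<float>(tokens[i]);
    }
    input_pos_ += batch_len;
  }

  const float* cache() const { return cache_data_.get(); }
  const std::vector<std::int64_t>& input() const { return input_; }
  std::size_t input_pos() const { return input_pos_; }

 private:
  std::size_t cache_len_;
  std::unique_ptr<float[]> cache_data_;
  std::unique_ptr<float[]> update_data_;
  std::vector<std::int64_t> input_;
  std::size_t input_pos_ = 0;
};

// (4) Reset before every prompt so no state carries over between prompts.
// Each prompt is assumed to fit in a single chunk in this sketch.
inline void run_prompts(ToyRunner& runner,
                        const std::vector<std::vector<std::int64_t>>& prompts) {
  for (const auto& prompt : prompts) {
    runner.reset();
    runner.load_chunk(prompt.data(), prompt.size());
  }
}
```

In this sketch, skipping the `std::fill` in `load_chunk` would leave tokens from the previous chunk in the tail of `input_`, which is the stale-token leak described in item 3.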
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19447
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures, 1 Pending, 1 Unrelated Failure
As of commit 9e0d34d with merge base a49171d:
NEW FAILURES - The following jobs have failed:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@billmguo has exported this pull request. If you are a Meta employee, you can view the originating Diff in D104615993.
This PR needs a