Commit 8d632cf
fix: replace dual-stream unmask loop with single-stream 8x unroll
The dual-stream AVX-512 unmask loop used the same zmm0 mask vector for
both the front and back streams. After a misaligned prologue rotates
the mask, the back stream starts at a different offset % 4 than the
front stream, so its bytes are XORed with the wrong mask bytes. Replace
it with a single-stream, 8x-unrolled loop matching ws_mask's proven
pattern.
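The failure mode described above can be reproduced in portable scalar C. This is only a sketch under stated assumptions, not the project's AVX-512 code: the function names (`unmask_ref`, `rotate_key`, `xor_range`, `unmask_single`, `unmask_dual_shared`), the 4-byte word granularity, and the explicit `pro`/`back` offsets are all hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reference unmask: payload byte i is XORed with key[i % 4]. */
static void unmask_ref(uint8_t *buf, size_t len, const uint8_t key[4]) {
    for (size_t i = 0; i < len; i++)
        buf[i] ^= key[i % 4];
}

/* Rotate the key so that a stream starting at payload offset `off`
 * can apply out[0] to its first byte. */
static void rotate_key(const uint8_t key[4], size_t off, uint8_t out[4]) {
    for (size_t j = 0; j < 4; j++)
        out[j] = key[(off + j) % 4];
}

/* XOR buf[start..end) with `mask`, where mask[0] applies to buf[start]. */
static void xor_range(uint8_t *buf, size_t start, size_t end,
                      const uint8_t mask[4]) {
    for (size_t t = 0; start + t < end; t++)
        buf[start + t] ^= mask[t % 4];
}

/* Single stream (hypothetical scalar stand-in for the fixed loop):
 * byte-wise prologue of `pro` bytes, then 8 words per iteration using
 * a key rotated once by `pro`, then a byte-wise tail. */
static void unmask_single(uint8_t *buf, size_t len, const uint8_t key[4],
                          size_t pro) {
    uint8_t rk[4];
    uint32_t m;
    if (pro > len)
        pro = len;
    xor_range(buf, 0, pro, key);
    rotate_key(key, pro, rk);
    memcpy(&m, rk, 4);
    size_t i = pro;
    for (; i + 32 <= len; i += 32) {
        for (int u = 0; u < 8; u++) {          /* 8x unroll candidate */
            uint32_t w;
            memcpy(&w, buf + i + 4 * u, 4);
            w ^= m;
            memcpy(buf + i + 4 * u, &w, 4);
        }
    }
    /* i - pro is a multiple of 32, so rk is still in phase here. */
    xor_range(buf, i, len, rk);
}

/* Dual stream that shares one rotated mask (the buggy shape):
 * the back stream is only correct when (back - pro) % 4 == 0. */
static void unmask_dual_shared(uint8_t *buf, size_t len,
                               const uint8_t key[4],
                               size_t pro, size_t back) {
    uint8_t rk[4];
    assert(pro <= back && back <= len);
    xor_range(buf, 0, pro, key);
    rotate_key(key, pro, rk);
    xor_range(buf, pro, back, rk);   /* front stream: mask in phase */
    xor_range(buf, back, len, rk);   /* back stream: mask out of phase */
}
```

With a 100-byte buffer, `pro = 3`, and `back = 50`, the single-stream version matches `unmask_ref`, while the shared-mask dual-stream version does not, because `(50 - 3) % 4 != 0`. A dual-stream loop could also be repaired by rotating a second mask for the back stream; the commit instead removes the second stream altogether.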
Fixes: "512 KB unmask, misaligned buffer — NT prologue mask cycling"
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 1088461 commit 8d632cf
1 file changed
Lines changed: 16 additions & 26 deletions
@@ -620,49 +620,39 @@