Commit 10cb187
feat: symmetric turbo3 K support in TurboFlash + research conclusions

Added turbo3 K dequant path to TurboFlash kernel via function constant
FC_turbo_flash_p1_k_is_turbo3. Symmetric turbo3/turbo3 now dispatches
through TurboFlash instead of baseline FA.

Result: symmetric TurboFlash is neutral vs baseline FA (-0.7%).
This confirms the 56->145 tok/s gap to Eric's MLX-Swift is 100%
framework overhead (dispatch count, graph evaluation), not kernel-level.

Best config remains asymmetric q8_0-K/turbo3-V with TurboFlash V4
+ simd_shuffle WHT: 56.82 tok/s, 93% of q8_0, +1.5% over baseline.

Co-Authored-By: tturney@psyguard.ai
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 parent b0b8dde commit 10cb187
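
The routing described in the commit message can be sketched on the host side as follows. This is a minimal illustration, not the code from this commit: `KvType`, `AttnKernel`, `KernelChoice`, and `choose_kernel` are hypothetical names, and sending every other K/V combination to baseline FA is an assumption. Only the two TurboFlash cases (q8_0-K/turbo3-V and turbo3-K/turbo3-V) and the constant name `FC_turbo_flash_p1_k_is_turbo3` come from the message.

```cpp
// Hypothetical KV-cache element types; the real enum in the codebase may differ.
enum class KvType { Q8_0, Turbo3, Other };

// The two attention paths named in the commit message.
enum class AttnKernel { BaselineFA, TurboFlash };

struct KernelChoice {
    AttnKernel kernel;
    // Value that would be bound to the function constant
    // FC_turbo_flash_p1_k_is_turbo3 when the pipeline is specialized.
    bool k_is_turbo3;
};

// Assumed routing: TurboFlash requires turbo3 V and accepts either
// q8_0 K (asymmetric) or turbo3 K (symmetric). Anything else falls
// back to baseline FA (an assumption, not confirmed by the source).
KernelChoice choose_kernel(KvType k, KvType v) {
    const bool k_supported = (k == KvType::Q8_0 || k == KvType::Turbo3);
    if (v == KvType::Turbo3 && k_supported) {
        return {AttnKernel::TurboFlash, k == KvType::Turbo3};
    }
    return {AttnKernel::BaselineFA, false};
}
```

Selecting the K dequant path with a compile-time constant rather than a runtime branch means each specialization of the kernel contains only one dequant path, which is presumably why a function constant was used here.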
2 files changed
Lines changed: 55 additions & 19 deletions
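
The percentages quoted in the commit message imply reference throughputs that the message does not state directly. The helpers below are a small sketch for recovering them. The function names are hypothetical, and the derived values are only approximate because the quoted percentages are rounded.

```cpp
// Throughput of a reference configuration, given a measured throughput
// that is `ratio` times the reference (e.g. ratio 0.93 for "93% of q8_0").
double implied_reference(double measured_tps, double ratio) {
    return measured_tps / ratio;
}

// Speedup factor of `fast_tps` over `slow_tps` (e.g. 145 vs 56 tok/s).
double speedup(double fast_tps, double slow_tps) {
    return fast_tps / slow_tps;
}

// Figures taken from the commit message.
constexpr double kBestTps   = 56.82;  // asymmetric q8_0-K/turbo3-V
constexpr double kOfQ8      = 0.93;   // "93% of q8_0"
constexpr double kOverBase  = 1.015;  // "+1.5% over baseline"
constexpr double kMlxSwift  = 145.0;  // MLX-Swift figure
constexpr double kThisImpl  = 56.0;   // this implementation, rounded
```

With these inputs, `implied_reference(kBestTps, kOfQ8)` comes out at roughly 61 tok/s for q8_0, and `implied_reference(kBestTps, kOverBase)` at roughly 56 tok/s for the baseline. `speedup(kMlxSwift, kThisImpl)` is about 2.6x, so the framework gap is far larger than the kernel-level differences of around 1%.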
File 1 of 2: 13 additions, 9 deletions

| Hunk | Original lines | New lines | Added | Removed |
|---|---|---|---|---|
| 1 | 2709-2716 | 2709-2716 | 2 | 2 |
| 2 | 2947-2953 | 2947-2953 | 1 | 1 |
| 3 | 3013-3031 | 3013-3035 | 10 | 6 |

File 2 of 2: 42 additions, 10 deletions

| Hunk | Original lines | New lines | Added | Removed |
|---|---|---|---|---|
| 1 | 8587-8592 | 8587-8593 | 1 | 0 |
| 2 | 8666-8671 | 8667-8680 | 8 | 0 |
| 3 | 8697-8715 | 8706-8747 | 33 | 10 |