Commit a623a8c
Aegis-AI
feat(turboquant): implement turbo_encode_k/v CPU encode path
Adds the missing mlx::core::fast::turbo_encode_k() and turbo_encode_v()
functions that the C API stub was placeholding.
Algorithm (from turbo_quant.h):
turbo_encode_k: 3-bit PolarQuant (WHT rotation + Lloyd-Max centroids)
+ 1-bit QJL residual — K cache, 68 bytes/token
turbo_encode_v: 3-bit PolarQuant only — V cache, 50 bytes/token
Buffer layout per token:
K: indices[48] | qjl_signs[16] | norm_fp16[2] | rnorm_fp16[2]
V: indices[48] | norm_fp16[2]
Layout matches the Metal decompression path in sdpa_vector.h which
already implements turbo_dequant_k/v for on-the-fly decode during SDPA.
The encode path is CPU-side (eval + iterate), which is appropriate since
compression runs once per appended KV token, not in the hot forward pass.
Files changed:
fast.h — declare turbo_encode_k/v in namespace mlx::core::fast
fast.cpp — implement using turbo_quant.h primitives
mlx-c fast.cpp — replace runtime_error stub with real call1 parent a60235f commit a623a8c
3 files changed
Lines changed: 136 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
789 | 789 | | |
790 | 790 | | |
791 | 791 | | |
792 | | - | |
793 | | - | |
794 | | - | |
| 792 | + | |
| 793 | + | |
| 794 | + | |
| 795 | + | |
| 796 | + | |
| 797 | + | |
| 798 | + | |
| 799 | + | |
| 800 | + | |
| 801 | + | |
| 802 | + | |
| 803 | + | |
| 804 | + | |
| 805 | + | |
| 806 | + | |
| 807 | + | |
| 808 | + | |
| 809 | + | |
795 | 810 | | |
796 | 811 | | |
797 | 812 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
933 | 933 | | |
934 | 934 | | |
935 | 935 | | |
| 936 | + | |
936 | 937 | | |
937 | 938 | | |
938 | 939 | | |
939 | 940 | | |
940 | 941 | | |
941 | 942 | | |
| 943 | + | |
| 944 | + | |
| 945 | + | |
| 946 | + | |
| 947 | + | |
| 948 | + | |
| 949 | + | |
| 950 | + | |
| 951 | + | |
| 952 | + | |
| 953 | + | |
| 954 | + | |
| 955 | + | |
| 956 | + | |
| 957 | + | |
| 958 | + | |
| 959 | + | |
| 960 | + | |
| 961 | + | |
| 962 | + | |
| 963 | + | |
| 964 | + | |
| 965 | + | |
| 966 | + | |
| 967 | + | |
| 968 | + | |
| 969 | + | |
| 970 | + | |
| 971 | + | |
| 972 | + | |
| 973 | + | |
| 974 | + | |
| 975 | + | |
| 976 | + | |
| 977 | + | |
| 978 | + | |
| 979 | + | |
| 980 | + | |
| 981 | + | |
| 982 | + | |
| 983 | + | |
| 984 | + | |
| 985 | + | |
| 986 | + | |
| 987 | + | |
| 988 | + | |
| 989 | + | |
| 990 | + | |
| 991 | + | |
| 992 | + | |
| 993 | + | |
| 994 | + | |
| 995 | + | |
| 996 | + | |
| 997 | + | |
| 998 | + | |
| 999 | + | |
| 1000 | + | |
| 1001 | + | |
| 1002 | + | |
| 1003 | + | |
| 1004 | + | |
| 1005 | + | |
| 1006 | + | |
| 1007 | + | |
| 1008 | + | |
| 1009 | + | |
| 1010 | + | |
| 1011 | + | |
| 1012 | + | |
| 1013 | + | |
| 1014 | + | |
| 1015 | + | |
| 1016 | + | |
| 1017 | + | |
| 1018 | + | |
| 1019 | + | |
| 1020 | + | |
| 1021 | + | |
| 1022 | + | |
| 1023 | + | |
| 1024 | + | |
| 1025 | + | |
| 1026 | + | |
| 1027 | + | |
| 1028 | + | |
| 1029 | + | |
| 1030 | + | |
| 1031 | + | |
| 1032 | + | |
| 1033 | + | |
| 1034 | + | |
| 1035 | + | |
| 1036 | + | |
| 1037 | + | |
| 1038 | + | |
| 1039 | + | |
| 1040 | + | |
| 1041 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
100 | 100 | | |
101 | 101 | | |
102 | 102 | | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
103 | 121 | | |
0 commit comments