Commit c548f6f
Address remaining PR review comments in triton_fa.py

- quantize_p: raise NotImplementedError when any of q/k/v requires_grad
  since backward does not model the quantized P path (inference-only)
- b_start_loc_k: only synthesize dummy tensor in paged mode; raise
  ValueError in contiguous separate-KV path when b_start_loc_k is None;
  also validate that v_cache and block_table are provided alongside k_cache

Signed-off-by: Ye Yu <yeyu@nvidia.com>

1 parent 4248327 commit c548f6f
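
The diff body was not captured on this page, so the following is only a minimal sketch of the two checks the commit message describes, not the code in triton_fa.py. The helper names `check_quantize_p` and `resolve_b_start_loc_k`, the `separate_kv` flag, the `FakeTensor` stand-in, and the `DUMMY_START_LOC` placeholder are all hypothetical. It also assumes that a non-None `k_cache` is what selects paged mode.

```python
from dataclasses import dataclass
from typing import Any, Optional

# Hypothetical placeholder for the dummy tensor synthesized in paged mode.
DUMMY_START_LOC = (0,)


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a tensor; only requires_grad matters here."""
    requires_grad: bool = False


def check_quantize_p(q: Any, k: Any, v: Any, quantize_p: bool) -> None:
    """Reject quantize_p when any of q/k/v would need a backward pass."""
    if quantize_p and any(getattr(t, "requires_grad", False) for t in (q, k, v)):
        raise NotImplementedError(
            "quantize_p is inference-only: backward does not model "
            "the quantized P path"
        )


def resolve_b_start_loc_k(
    b_start_loc_k: Optional[Any],
    k_cache: Optional[Any] = None,
    v_cache: Optional[Any] = None,
    block_table: Optional[Any] = None,
    separate_kv: bool = False,  # hypothetical flag for the separate-KV path
) -> Optional[Any]:
    """Validate the KV arguments and return the b_start_loc_k to use."""
    if k_cache is not None:
        # Assumed paged mode: the companion arguments must come with k_cache.
        if v_cache is None or block_table is None:
            raise ValueError(
                "v_cache and block_table must be provided alongside k_cache"
            )
        # Only in paged mode is a dummy value synthesized when none is given.
        return b_start_loc_k if b_start_loc_k is not None else DUMMY_START_LOC
    if separate_kv and b_start_loc_k is None:
        # Contiguous separate-KV path: the caller must supply the offsets.
        raise ValueError(
            "b_start_loc_k is required in the contiguous separate-KV path"
        )
    return b_start_loc_k
```

Under these assumptions, `check_quantize_p(q, k, v, quantize_p=True)` passes silently for inference inputs and raises as soon as one input has `requires_grad=True`. `resolve_b_start_loc_k(None, separate_kv=True)` raises a ValueError rather than silently substituting a dummy.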
1 file changed, +15 -4 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 985-987 | 985-987 | unchanged context |
| 988-991 | | 4 lines removed (-) |
| | 988-997 | 10 lines added (+) |
| 992-994 | 998-1000 | unchanged context |
| 1012-1014 | 1018-1020 | unchanged context |
| | 1021-1025 | 5 lines added (+) |
| 1015-1017 | 1026-1028 | unchanged context |