Commit 9eed2dc
feat(loader): MLA metadata capture for deepseek2 (Phase 2.1)
Add tq_model_config_t fields for MLA: is_mla, kv_lora_rank,
qk_rope_head_dim, qk_nope_head_dim, v_head_dim. Loader detects
arch=deepseek2 with attn_kv_a_mqa + attn_kv_b tensors and reads
the GGUF metadata keys (attention.kv_lora_rank, attention.key_length,
attention.value_length, rope.dimension_count) to populate them.
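
A minimal sketch of the detection and population step described above, not the loader's actual code. Only the five config field names, the arch string, the tensor names and the metadata key names come from this commit. The two struct types, the function name and the validation checks are hypothetical, and the metadata is assumed to have been read into plain integers already. The split nope = key_length - rope follows the logged "key_length=192 (rope=64 + nope=128)".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the MLA part of tq_model_config_t.
 * Field names are from the commit message; the layout is assumed. */
typedef struct {
    bool     is_mla;
    uint32_t kv_lora_rank;
    uint32_t qk_rope_head_dim;
    uint32_t qk_nope_head_dim;
    uint32_t v_head_dim;
} mla_config_sketch_t;

/* Hypothetical container for values already pulled out of the GGUF file. */
typedef struct {
    const char *arch;          /* architecture string, e.g. "deepseek2" */
    bool        has_kv_a_mqa;  /* an attn_kv_a_mqa tensor was found */
    bool        has_kv_b;      /* an attn_kv_b tensor was found */
    uint32_t    kv_lora_rank;  /* attention.kv_lora_rank */
    uint32_t    key_length;    /* attention.key_length */
    uint32_t    value_length;  /* attention.value_length */
    uint32_t    rope_dim;      /* rope.dimension_count */
} mla_metadata_sketch_t;

/* Fills cfg and returns true only when the model looks like MLA.
 * On any failed check cfg is left zeroed (is_mla == false). */
bool mla_config_from_metadata(const mla_metadata_sketch_t *md,
                              mla_config_sketch_t *cfg)
{
    assert(md != NULL && cfg != NULL);
    memset(cfg, 0, sizeof *cfg);

    if (md->arch == NULL || strcmp(md->arch, "deepseek2") != 0)
        return false;
    if (!md->has_kv_a_mqa || !md->has_kv_b)
        return false;
    /* Assumed sanity checks: a zero rank or rope > key_length is malformed. */
    if (md->kv_lora_rank == 0 || md->rope_dim > md->key_length)
        return false;

    cfg->is_mla           = true;
    cfg->kv_lora_rank     = md->kv_lora_rank;
    cfg->qk_rope_head_dim = md->rope_dim;
    cfg->qk_nope_head_dim = md->key_length - md->rope_dim;
    cfg->v_head_dim       = md->value_length;
    return true;
}
```

With the values from the log line (kv_lora_rank=512, key_length=192, value_length=128, rope=64) this yields qk_nope_head_dim=128 and v_head_dim=128.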
Logs the architectural KV compression at load time:
    MLA — kv_lora_rank=512, key_length=192 (rope=64 + nope=128), v_head_dim=128 (KV cache compression 5120→576 = 8.9x vs standard)
This stacks with the 8x from turbo_kv_4b for ~71x total KV cache
compression (8.9 x 8 ≈ 71). That combined ratio is what should make
256K context fit in 16 GB once Phase 2.2+ lands the forward-pass MLA
decompression.
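
The arithmetic behind the 5120→576 and ~71x figures can be checked with a short sketch. The function names are hypothetical. The head count of 16 is an assumption inferred from 5120 = 16 x (192 + 128) and is not stated in the commit. Counts are per-token, per-layer element counts rather than bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Standard attention caches a full K and V vector for every head. */
uint32_t kv_elems_standard(uint32_t n_heads, uint32_t key_length,
                           uint32_t v_head_dim)
{
    return n_heads * (key_length + v_head_dim);
}

/* MLA caches one shared latent (kv_lora_rank) plus one shared RoPE key. */
uint32_t kv_elems_mla(uint32_t kv_lora_rank, uint32_t qk_rope_head_dim)
{
    return kv_lora_rank + qk_rope_head_dim;
}

/* Ratio of baseline to compressed element counts. */
double compression_ratio(uint32_t baseline, uint32_t compressed)
{
    assert(compressed > 0);
    return (double)baseline / (double)compressed;
}

/* Independent compression stages multiply. */
double stacked_ratio(double architectural, double quantization)
{
    return architectural * quantization;
}
```

Under the 16-head assumption, kv_elems_standard(16, 192, 128) gives 5120 and kv_elems_mla(512, 64) gives 576, a ratio of about 8.89. Multiplying by the 8x from turbo_kv_4b gives about 71.1.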
Forward pass still emits the loud Phase 1 warning. Phase 2.1 is
strictly metadata; weight pointers and attention compute are TBD.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 1c85bdc
2 files changed
Lines changed: 49 additions & 0 deletions
File 1: 19 lines added (new lines 77-95, inserted after original line 76).
File 2: 30 lines added (new lines 3050-3079, inserted after original line 3049).