[Model] Support MiniCPM-V 4.6 #22529
Signed-off-by: tc-mb <tianchi_cai@icloud.com>
if chkhsh == "1444df51289cfa8063b96f0e62b1125440111bc79a52003ea14b6eac7016fd5f":
    # ref: MiniCPM-V 4.6 (Qwen3.5 Flash based)
    res = "qwen35"
Don't add it directly like this; it needs to be added to convert_hf_to_gguf_update.py and run, which will add it here.
Since it's a duplicate hash, you also need to add it to pre_computed_hashes.
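A minimal sketch of what the reviewer is asking for, assuming the field layout of the existing pre_computed_hashes entries in convert_hf_to_gguf_update.py ("name", "tokt", "repo", "chkhsh"); the repo URL and tokenizer-type stand-in below are placeholders, not values from this PR:

```python
# Hedged sketch: a duplicate-hash entry registered in
# convert_hf_to_gguf_update.py's pre_computed_hashes list, so the
# update script emits the chkhsh branch instead of it being hand-edited.
pre_computed_hashes = [
    # ... existing entries ...
    {
        "name": "qwen35",  # pre-tokenizer name this PR maps the hash to
        "tokt": "BPE",     # stand-in for the script's TOKENIZER_TYPE enum
        "repo": "https://huggingface.co/openbmb/MiniCPM-V-4_6",  # assumed repo
        "chkhsh": "1444df51289cfa8063b96f0e62b1125440111bc79a52003ea14b6eac7016fd5f",
    },
]
```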
# GGUF tensor names mirror the C++ definitions in tools/mtmd/clip-impl.h:
#   TN_INSERT_MERGER_* -> v.insert_merger.*
#   TN_MERGER_*        -> merger.*
_VIT_MERGER_MAP = {
Not cool, use tensor_mapping.py like everyone else.
ngxson
left a comment
please change the naming to minicpmv2_6 everywhere for consistency
ggml_tensor * kq = ggml_mul_mat(ctx0, k, q);
kq = ggml_soft_max_ext(ctx0, kq, nullptr, kq_scale, 0.0f);
ggml_tensor * kqv = ggml_mul_mat(ctx0, v, kq);
cur = ggml_permute(ctx0, kqv, 0, 2, 1, 3);
cur = ggml_cont_2d(ctx0, cur, n_embd, n_patches);

cur = ggml_mul_mat(ctx0, model.insert_merger_attn_o_w, cur);
use build_attn to allow flash attention support
ggml_cgraph * build() override;
};

struct clip_graph_minicpmv_merger : clip_graph {
tbh the naming has been quite messy, better to just name the model by its version instead:
- struct clip_graph_minicpmv_merger : clip_graph {
+ struct clip_graph_minicpmv2_6 : clip_graph {
Okay, I accept the version-number naming convention.
However, this version is V4.6. Could we use clip_graph_minicpmv4_6 instead?
// MiniCPM-V 4.5 / 4.6 final merger (DownsampleMLP)
ggml_tensor * merger_pre_norm_w = nullptr;
ggml_tensor * merger_pre_norm_b = nullptr;
ggml_tensor * merger_mlp_up_w = nullptr;
ggml_tensor * merger_mlp_up_b = nullptr;
ggml_tensor * merger_mlp_down_w = nullptr;
ggml_tensor * merger_mlp_down_b = nullptr;
use the existing mm_ffn_* tensors instead
  tok_row_end_trail = false; // no trailing end-of-row token
  ov_img_first = true;
- } else if (minicpmv_version == 3 || minicpmv_version == 4 || minicpmv_version == 5 || minicpmv_version == 6 || minicpmv_version == 100045) {
+ } else if (minicpmv_version == 3 || minicpmv_version == 4 || minicpmv_version == 5 || minicpmv_version == 6 || minicpmv_version == 100045 || minicpmv_version == 46) {
the level of consistency is quite questionable here
better not to use minicpmv_version at all, and instead add a new projector type for each version
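The suggestion above can be sketched as follows; this is the general idea only, with illustrative names rather than the actual clip.cpp projector enum, and expressed in Python for brevity:

```python
# Hedged sketch: key behavior off an explicit per-version projector type
# instead of matching integer version codes against a long "or" chain.
from enum import Enum, auto

class ProjectorType(Enum):
    MINICPMV_2_6 = auto()  # illustrative names, not the real enum values
    MINICPMV_4_5 = auto()
    MINICPMV_4_6 = auto()

def uses_overview_image_first(proj: ProjectorType) -> bool:
    # Each projector type declares its own layout behavior explicitly,
    # so adding a new version means adding one enum member, not
    # extending every version-number comparison in the codebase.
    return proj in {ProjectorType.MINICPMV_4_5, ProjectorType.MINICPMV_4_6}
```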
This PR adds support for MiniCPM-V 4.6.
Conversion is handled through the convert_hf_to_gguf.py flow.