Commit 1e9d771

convert : force f16 or f32 on step3-vl conv weights (ggml-org#21646)
1 parent aa4695c commit 1e9d771

File tree

1 file changed (+2, −0 lines)

convert_hf_to_gguf.py

Lines changed: 2 additions & 0 deletions
@@ -4992,6 +4992,8 @@ def set_gguf_parameters(self):

     def tensor_force_quant(self, name, new_name, bid, n_dims):
         if ".position_embd." in new_name:
             return gguf.GGMLQuantizationType.F32
+        if ("mm.0." in new_name or "mm.1." in new_name) and new_name.endswith(".weight"):
+            return gguf.GGMLQuantizationType.F16 if self.ftype == gguf.LlamaFileType.MOSTLY_F16 else gguf.GGMLQuantizationType.F32
         return super().tensor_force_quant(name, new_name, bid, n_dims)

     def modify_tensors(self, data_torch: Tensor, name: str, bid: int | None) -> Iterable[tuple[str, Tensor]]:
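The override above pins the step3-vl multimodal-projector weights ("mm.0."/"mm.1.") to F16 or F32 instead of letting them be block-quantized. A minimal standalone sketch of that selection logic (here `QuantType`, `force_quant`, and `mostly_f16` are illustrative stand-ins; the real converter uses `gguf.GGMLQuantizationType`, `gguf.LlamaFileType`, and the `tensor_force_quant` hook shown in the diff):

```python
from enum import Enum
from typing import Optional


class QuantType(Enum):
    # Stand-in for gguf.GGMLQuantizationType
    F32 = 0
    F16 = 1


def force_quant(new_name: str, mostly_f16: bool) -> Optional[QuantType]:
    """Return a forced type for special tensors, or None to use the default path.

    mostly_f16 stands in for `self.ftype == gguf.LlamaFileType.MOSTLY_F16`.
    """
    # Position embeddings are always kept in full precision.
    if ".position_embd." in new_name:
        return QuantType.F32
    # Projector conv weights: follow the output file type, but never quantize.
    if ("mm.0." in new_name or "mm.1." in new_name) and new_name.endswith(".weight"):
        return QuantType.F16 if mostly_f16 else QuantType.F32
    # Everything else falls through to the superclass default.
    return None
```

For example, `force_quant("mm.0.weight", True)` yields `QuantType.F16`, while an ordinary tensor name returns `None` and keeps the default quantization.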

0 commit comments
