Eval bug: regression introduced in #20503 #21115

@EAddario

Description

Name and Version

Any version after #20503

Operating systems

Mac

GGML backends

Metal

Hardware

All hardware

Models

Possibly all models

Problem description & steps to reproduce

LLM_TN_IMPL::str() now includes a check that verifies whether a given tensor is defined in the current model architecture's model_tensors list.

If a model contains tensors that are not explicitly listed in the architecture definition, such as position_embd or token_types, the console is spammed with many messages like:

str: cannot properly format tensor name position_embd with suffix=weight bid=-1 xid=-1
str: cannot properly format tensor name token_types with suffix=weight bid=-1 xid=-1

To reproduce, simply quantize a model like: llama-quantize Qwen3.5-9B-F16.gguf Model-Q4_K.gguf Q4_K 12

First Bad Commit

#20503

Relevant log output

llama-quantize Qwen3.5-9B-F16.gguf Model-Q4_K.gguf Q4_K 12
main: build = 8563 (1f5d15e66)
main: built with AppleClang 17.0.0.17000604 for Darwin arm64
main: quantizing 'Qwen3.5-9B-F16.gguf' to 'Model-Q4_K.gguf' as Q4_K using 12 threads
llama_model_loader: loaded meta data with 41 key-value pairs and 427 tensors from Qwen3.5-9B-F16.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = qwen35
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Qwen3.5 9B
llama_model_loader: - kv   3:                           general.basename str              = Qwen3.5
llama_model_loader: - kv   4:                         general.size_label str              = 9B
llama_model_loader: - kv   5:                            general.license str              = apache-2.0
llama_model_loader: - kv   6:                       general.license.link str              = https://huggingface.co/Qwen/Qwen3.5-9...
llama_model_loader: - kv   7:                   general.base_model.count u32              = 1
llama_model_loader: - kv   8:                  general.base_model.0.name str              = Qwen3.5 9B Base
llama_model_loader: - kv   9:          general.base_model.0.organization str              = Qwen
llama_model_loader: - kv  10:              general.base_model.0.repo_url str              = https://huggingface.co/Qwen/Qwen3.5-9...
llama_model_loader: - kv  11:                               general.tags arr[str,1]       = ["image-text-to-text"]
llama_model_loader: - kv  12:                         qwen35.block_count u32              = 32
llama_model_loader: - kv  13:                      qwen35.context_length u32              = 262144
llama_model_loader: - kv  14:                    qwen35.embedding_length u32              = 4096
llama_model_loader: - kv  15:                 qwen35.feed_forward_length u32              = 12288
llama_model_loader: - kv  16:                qwen35.attention.head_count u32              = 16
llama_model_loader: - kv  17:             qwen35.attention.head_count_kv u32              = 4
llama_model_loader: - kv  18:             qwen35.rope.dimension_sections arr[i32,4]       = [11, 11, 10, 0]
llama_model_loader: - kv  19:                      qwen35.rope.freq_base f32              = 10000000.000000
llama_model_loader: - kv  20:    qwen35.attention.layer_norm_rms_epsilon f32              = 0.000001
llama_model_loader: - kv  21:                qwen35.attention.key_length u32              = 256
llama_model_loader: - kv  22:              qwen35.attention.value_length u32              = 256
llama_model_loader: - kv  23:                          general.file_type u32              = 1
llama_model_loader: - kv  24:                     qwen35.ssm.conv_kernel u32              = 4
llama_model_loader: - kv  25:                      qwen35.ssm.state_size u32              = 128
llama_model_loader: - kv  26:                     qwen35.ssm.group_count u32              = 16
llama_model_loader: - kv  27:                  qwen35.ssm.time_step_rank u32              = 32
llama_model_loader: - kv  28:                      qwen35.ssm.inner_size u32              = 4096
llama_model_loader: - kv  29:             qwen35.full_attention_interval u32              = 4
llama_model_loader: - kv  30:                qwen35.rope.dimension_count u32              = 64
llama_model_loader: - kv  31:               general.quantization_version u32              = 2
llama_model_loader: - kv  32:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  33:                         tokenizer.ggml.pre str              = qwen35
llama_model_loader: - kv  34:                      tokenizer.ggml.tokens arr[str,248320]  = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  35:                  tokenizer.ggml.token_type arr[i32,248320]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  36:                      tokenizer.ggml.merges arr[str,247587]  = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv  37:                tokenizer.ggml.eos_token_id u32              = 248046
llama_model_loader: - kv  38:            tokenizer.ggml.padding_token_id u32              = 248044
llama_model_loader: - kv  39:               tokenizer.ggml.add_bos_token bool             = false
llama_model_loader: - kv  40:                    tokenizer.chat_template str              = {%- set image_count = namespace(value...
llama_model_loader: - type  f32:  177 tensors
llama_model_loader: - type  f16:  250 tensors
str: cannot properly format tensor name position_embd with suffix=weight bid=-1 xid=-1
str: cannot properly format tensor name token_types with suffix=weight bid=-1 xid=-1
str: cannot properly format tensor name position_embd with suffix=weight bid=-1 xid=-1
str: cannot properly format tensor name token_types with suffix=weight bid=-1 xid=-1
str: cannot properly format tensor name position_embd with suffix=weight bid=-1 xid=-1
...
