Commit 75b31bb

Fix smollm2 alias to point at SmolLM2-135M (v2) instead of SmolLM-135M (v1) (#18859)
The original SmolLM2 PR (#9354) started as v1 support and was renamed to `smollm2` during review, but the repo ID and `rope_theta` were never updated to v2 values. The two checkpoints are genuinely different models (0/272 tensors match).

- `HUGGING_FACE_REPO_IDS["smollm2"]`: `HuggingFaceTB/SmolLM-135M` → `HuggingFaceTB/SmolLM2-135M`
- `examples/models/smollm2/135M_config.json`: `rope_theta` `10000.0` → `100000.0` (matches the [SmolLM2-135M HF config](https://huggingface.co/HuggingFaceTB/SmolLM2-135M/blob/main/config.json))

### Test plan

Data-only change (one string, one number). Verified that the values match the upstream HuggingFace SmolLM2-135M config.
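The "0/272 tensors match" figure can be reproduced with a comparison along these lines (a minimal sketch, not the exact script used; `count_matching_tensors` is a hypothetical helper, shown here over NumPy arrays standing in for checkpoint state dicts):

```python
import numpy as np

def count_matching_tensors(a: dict, b: dict) -> tuple[int, int]:
    """Count tensors present in both state dicts with identical shape and values.

    Returns (matches, shared) where `shared` is the number of tensor names
    common to both checkpoints.
    """
    shared = sorted(set(a) & set(b))
    matches = sum(
        1
        for name in shared
        if a[name].shape == b[name].shape and np.array_equal(a[name], b[name])
    )
    return matches, len(shared)
```

Run over the SmolLM-135M and SmolLM2-135M state dicts, a check like this reports 0 matches out of 272 shared tensors, confirming the two repos hold different weights rather than a renamed copy.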
1 parent edb8c98 commit 75b31bb

2 files changed

Lines changed: 2 additions & 2 deletions

examples/models/llama/export_llama_lib.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -123,7 +123,7 @@
     "qwen2_5_1_5b": "Qwen/Qwen2.5-1.5B",
     "qwen2_5_coder_32b": "Qwen/Qwen2.5-Coder-32B-Instruct",
     "phi_4_mini": "microsoft/Phi-4-mini-instruct",
-    "smollm2": "HuggingFaceTB/SmolLM-135M",
+    "smollm2": "HuggingFaceTB/SmolLM2-135M",
     "qwen3_0_6b": "Qwen/Qwen3-0.6B",
     "qwen3_1_7b": "Qwen/Qwen3-1.7B",
     "qwen3_4b": "Qwen/Qwen3-4B",
```

examples/models/smollm2/135M_config.json

Lines changed: 1 addition & 1 deletion
```diff
@@ -6,7 +6,7 @@
   "n_kv_heads": 3,
   "n_layers": 30,
   "norm_eps": 1e-05,
-  "rope_theta": 10000.0,
+  "rope_theta": 100000.0,
   "use_scaled_rope": false,
   "vocab_size": 49152,
   "use_hf_rope": false,
```
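Why `rope_theta` matters: in standard RoPE, each pair of head dimensions rotates at an inverse frequency of `theta ** (-2i / head_dim)`, so raising theta 10x slows the low-frequency rotations and is part of how SmolLM2 differs from v1. A minimal sketch of that formula (`head_dim = 64` is an assumption for illustration, not taken from the config shown above):

```python
def rope_inv_freq(theta: float, head_dim: int) -> list[float]:
    """Standard RoPE inverse frequencies: theta ** (-2i / head_dim) for each
    rotated dimension pair i."""
    return [theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]

# Old (v1) vs. new (v2) base: larger theta => slower low-frequency rotations.
old = rope_inv_freq(10_000.0, 64)
new = rope_inv_freq(100_000.0, 64)
```

Exporting with the v1 value of `10000.0` against v2 weights would apply positional rotations the model was never trained with, which is why the config fix accompanies the repo-ID fix.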
