Fix smollm2 alias to point at SmolLM2-135M (v2) instead of SmolLM-135M (v1) #18859
kirklandsign merged 2 commits into main
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18859
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 active SEV. If your PR is affected, please view it below.
✅ You can merge normally! (2 unrelated failures) As of commit f8eb537 with merge base fe71bd4. BROKEN TRUNK: the following jobs failed but were already failing on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Agent-Logs-Url: https://github.com/pytorch/executorch/sessions/bf2a52e4-8d29-4371-8a0e-b4c5cfe98be0
Co-authored-by: kirklandsign <107070759+kirklandsign@users.noreply.github.com>
Pull request overview
This PR corrects the smollm2 model aliasing in the LLaMA export tooling so it points to the SmolLM2 v2 checkpoint and uses the matching RoPE theta value, aligning ExecuTorch’s built-in smollm2 configuration with the upstream HuggingFace SmolLM2-135M settings.
Changes:
- Update `HUGGING_FACE_REPO_IDS["smollm2"]` to `HuggingFaceTB/SmolLM2-135M` (see the sketch below).
- Update the SmolLM2 135M example config's `rope_theta` to `100000.0`.
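For concreteness, a minimal sketch of the corrected mapping; everything except the `smollm2` entry is elided, and the surrounding layout of the dict is an assumption, not a quote of the file:

```python
# In examples/models/llama/export_llama_lib.py (other entries elided):
HUGGING_FACE_REPO_IDS = {
    # ...
    "smollm2": "HuggingFaceTB/SmolLM2-135M",  # was "HuggingFaceTB/SmolLM-135M" (v1)
    # ...
}
```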
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| examples/models/smollm2/135M_config.json | Updates RoPE theta to match SmolLM2-135M upstream config. |
| examples/models/llama/export_llama_lib.py | Fixes smollm2 HF repo ID used for auto-download/convert when no checkpoint is provided. |
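The PR itself doesn't show the download logic; as a hedged illustration only (not ExecuTorch's actual code path), the corrected repo ID could be resolved to a local checkpoint directory with `huggingface_hub`:

```python
from huggingface_hub import snapshot_download

# Hypothetical sketch: resolve the "smollm2" alias to a cached local copy
# of the upstream repo. ExecuTorch's real download/convert logic lives in
# export_llama_lib.py and may differ from this.
HUGGING_FACE_REPO_IDS = {"smollm2": "HuggingFaceTB/SmolLM2-135M"}

def resolve_checkpoint(model_name: str) -> str:
    repo_id = HUGGING_FACE_REPO_IDS[model_name]
    # Downloads (or reuses a cached copy of) the repo and returns its path.
    return snapshot_download(repo_id=repo_id)

print(resolve_checkpoint("smollm2"))
```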
@claude review
Claude finished @kirklandsign's task in 49s. Review of PR #18859
LGTM. This is a straightforward data-only fix and the changes are correct. I verified both values against the upstream HuggingFace configs. All other config values in `135M_config.json` already match upstream. No issues found.
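A quick way to reproduce that check, assuming network access and the `huggingface_hub` package (a verification sketch, not part of the PR):

```python
import json
from huggingface_hub import hf_hub_download

# Fetch the upstream SmolLM2-135M config and confirm the value this PR
# writes into examples/models/smollm2/135M_config.json.
cfg_path = hf_hub_download(repo_id="HuggingFaceTB/SmolLM2-135M", filename="config.json")
with open(cfg_path) as f:
    upstream = json.load(f)

assert upstream["rope_theta"] == 100000.0, upstream["rope_theta"]
print("rope_theta matches:", upstream["rope_theta"])
```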
Summary
The original SmolLM2 PR (#9354) started as v1 support and was renamed to `smollm2` during review, but the repo ID and `rope_theta` were never updated to v2 values. The two checkpoints are genuinely different models (0/272 tensors match).

- `HUGGING_FACE_REPO_IDS["smollm2"]`: `HuggingFaceTB/SmolLM-135M` → `HuggingFaceTB/SmolLM2-135M`
- `examples/models/smollm2/135M_config.json`: `rope_theta` `10000.0` → `100000.0` (matches the SmolLM2-135M HF config)

Test plan
Data-only change (one string, one number). Verified values match the upstream HuggingFace SmolLM2-135M config.
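As a rough sketch, the "0/272 tensors match" comparison from the summary could be reproduced along these lines; the single `model.safetensors` filename is an assumption about both repos' file layouts:

```python
import torch
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# Load both checkpoints and count tensors that are exactly equal.
# Assumes each repo ships one "model.safetensors" weight file.
v1 = load_file(hf_hub_download("HuggingFaceTB/SmolLM-135M", "model.safetensors"))
v2 = load_file(hf_hub_download("HuggingFaceTB/SmolLM2-135M", "model.safetensors"))

matches = sum(
    1
    for name, t1 in v1.items()
    if name in v2 and v2[name].shape == t1.shape and torch.equal(t1, v2[name])
)
print(f"{matches}/{len(v1)} tensors match")
```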