Component
Other
Problem Description
Feature Request: Support for Qwen3 0.6B
Description
The Qwen3 series from Alibaba has been released, and the 0.6B variant is an extremely compelling small model:
- Only ~0.6 billion parameters → very low memory footprint (~400–700 MB quantized)
- Strong performance on many benchmarks (often beats or matches older 1.5B–3B models)
- Excellent multilingual support (including Indic languages, which is great for users in India)
- Apache 2.0 license
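As a rough sanity check on the memory figure above, the weight size can be estimated as parameter count × bits per weight ÷ 8. The sketch below is only a back-of-the-envelope calculation: the bits-per-weight values are approximate assumptions for common GGUF quant types, the 0.6e9 parameter count is the nominal figure from the model name, and `estimateWeightsMB` is a hypothetical helper, not part of any SDK.

```typescript
// Approximate bits per weight for common GGUF quantization types.
// ASSUMPTION: ballpark values, not measurements of any published file.
const APPROX_BITS_PER_WEIGHT: Record<string, number> = {
  Q4_K_M: 4.85,
  Q5_K_M: 5.7,
  Q6_K: 6.56,
  Q8_0: 8.5,
};

// Estimate the size of the weights in MB (10^6 bytes) for a parameter count.
// Hypothetical helper for illustration only.
function estimateWeightsMB(paramCount: number, quant: string): number {
  const bits = APPROX_BITS_PER_WEIGHT[quant];
  if (bits === undefined) {
    throw new Error(`Unknown quantization type: ${quant}`);
  }
  return (paramCount * bits) / 8 / 1e6;
}

// Nominal parameter count taken from the model name (assumption).
const QWEN3_0_6B_PARAMS = 0.6e9;

for (const quant of Object.keys(APPROX_BITS_PER_WEIGHT)) {
  const mb = Math.round(estimateWeightsMB(QWEN3_0_6B_PARAMS, quant));
  console.log(`${quant}: ~${mb} MB (weights only)`);
}
```

Under these assumptions the weights alone come to roughly 360–640 MB depending on the quant; runtime overhead such as the KV cache comes on top, so actual usage on a device will be higher.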
It would be a fantastic addition to RunAnywhere's on-device lineup, sitting nicely between SmolLM2-360M / Qwen2.5-0.5B (faster but weaker) and Qwen2.5-1.5B / 3B (stronger but heavier).
Many users are already looking for the smallest possible Qwen3 variant that still gives good instruction-following and reasoning.
Why this model?
- Memory + speed sweet spot for mid-range Android/iOS devices
- Noticeably better coherence and instruction following than Qwen2.5-0.5B in many early user reports
- Would give developers more granularity in model choices (0.36B → 0.6B → 1B → 1.5B → 3B+)
- Complements the existing Qwen2.5 support nicely
Proposed Solution
Proposed implementation
- Add GGUF-quantized variants of Qwen3-0.6B-Instruct to the model registry / recommended list
- Most useful quants: Q4_K_M, Q5_K_M, Q6_K (maybe Q8_0 for quality testing)
- Add the correct chat template / tokenizer config if it differs from Qwen2.5
- Update the example code, model picker UI, and docs with something like:
await TextGeneration.loadModel(
  '/models/qwen3-0.6b-instruct-q5_K_M.gguf',
  'qwen3-0.6b-instruct'
);
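To make the registry idea a bit more concrete, here is a minimal sketch of how the proposed quants could be listed and picked by memory budget. Everything in it is an assumption for illustration: the `ModelVariant` shape, the `QWEN3_VARIANTS` list, the `pickVariant` helper, and the size numbers (placeholder estimates, not measured file sizes) are hypothetical and not RunAnywhere's actual registry format. Only the file naming pattern and the model id are taken from the example above.

```typescript
type Quant = "Q4_K_M" | "Q5_K_M" | "Q6_K" | "Q8_0";

// Hypothetical registry entry shape (not the real RunAnywhere schema).
interface ModelVariant {
  id: string;
  quant: Quant;
  path: string;
  approxSizeMB: number; // placeholder estimate, not a measured file size
}

const MODEL_ID = "qwen3-0.6b-instruct";

// Build a path following the naming used in the example above,
// e.g. Q5_K_M -> /models/qwen3-0.6b-instruct-q5_K_M.gguf
function variantPath(quant: Quant): string {
  return `/models/${MODEL_ID}-q${quant.slice(1)}.gguf`;
}

// Entries are ordered from smallest to largest.
const QWEN3_VARIANTS: ModelVariant[] = [
  { id: MODEL_ID, quant: "Q4_K_M", path: variantPath("Q4_K_M"), approxSizeMB: 364 },
  { id: MODEL_ID, quant: "Q5_K_M", path: variantPath("Q5_K_M"), approxSizeMB: 428 },
  { id: MODEL_ID, quant: "Q6_K", path: variantPath("Q6_K"), approxSizeMB: 492 },
  { id: MODEL_ID, quant: "Q8_0", path: variantPath("Q8_0"), approxSizeMB: 638 },
];

// Pick the largest variant that fits the budget, or undefined if none fits.
function pickVariant(budgetMB: number): ModelVariant | undefined {
  let best: ModelVariant | undefined;
  for (const variant of QWEN3_VARIANTS) {
    if (variant.approxSizeMB <= budgetMB) {
      best = variant;
    }
  }
  return best;
}

const choice = pickVariant(450);
if (choice) {
  console.log(`Would load ${choice.path} as '${choice.id}'`);
}
```

The selected `path` and `id` could then be passed to the `TextGeneration.loadModel` call shown above. A real picker would probably also need headroom for the KV cache and the app itself, which this sketch ignores.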
Alternatives Considered
No response
Additional Context
Hugging Face: https://huggingface.co/Qwen/Qwen3-0.6B-Instruct (or search "Qwen3 0.6B GGUF" for community quants)
Official repo: https://github.com/QwenLM/Qwen3 (announcement + technical report)