From 9deb1a7765976e44d1a42f2f4a46f605fe240cc4 Mon Sep 17 00:00:00 2001
From: WOOD_C <51071696+WOODchen7@users.noreply.github.com>
Date: Mon, 9 Feb 2026 12:58:39 +0800
Subject: [PATCH 1/6] Update latest news section in README.md

Update latest news section in README.md
---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 654785d4..68a3be36 100644
--- a/README.md
+++ b/README.md
@@ -17,6 +17,7 @@ A more accessible, comprehensive, and efficient toolkit for large model compress

 ## 📣Latest News
+- [26/02/09] We have released HY-1.8B-2Bit, 2bit on-device large language model,[[Huggingface]]([https://arxiv.org/abs/2601.07892](https://huggingface.co/AngelSlim/HY-1.8B-2Bit)).
 - [26/01/13] We have released v0.3. We support the training and deployment of Eagle3 for all-scale LLMs/VLMs/Audio models, as detailed in the [guidance documentation](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/index.html). And We released **Sherry**, the hardware-efficient 1.25 bit quantization algorithm [[Paper]](https://arxiv.org/abs/2601.07892) | [[Code]](https://github.com/Tencent/AngelSlim/tree/sherry/Sherry)🔥🔥🔥
 - [25/11/05] We have released v0.2. Quantization support for new models, such as `GLM-4.6`, `Qwen3-VL` and `Qwen3-Omni`, open-sources the Eagle3 speculative decoding training framework, and updates the Diffusion model quantization tools.
 - [25/09/30] We have released **SpecExit**, the reasoning early-exit algorithm: [[Paper]](http://arxiv.org/abs/2509.24248) | [[Docs]](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/spec_exit.html) | [[vLLM Code]](https://github.com/vllm-project/vllm/pull/27192)

From 6f60e60be5c3d66bb2a5057eb5c5d065a28be4bd Mon Sep 17 00:00:00 2001
From: WOOD_C <51071696+WOODchen7@users.noreply.github.com>
Date: Mon, 9 Feb 2026 15:36:31 +0800
Subject: [PATCH 2/6] Update README.md

---
 README.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 68a3be36..f656bab8 100644
--- a/README.md
+++ b/README.md
@@ -12,7 +12,12 @@ A more accessible, comprehensive, and efficient toolkit for large model compress

- 📖 Documentation   |   🤗 Hugging Face   |   🤖 ModelScope   |   💬 WeChat |   🫨 Discord
+ ✒️ TechnicalReport   |    📖 Documentation   |   🤗 Hugging Face   |   🤖 ModelScope
+
+

+
+

+ 💬 WeChat |   🫨 Discord

From eee0a0c394aecef6b1020f195da02f2d53d549ec Mon Sep 17 00:00:00 2001
From: WOOD_C <51071696+WOODchen7@users.noreply.github.com>
Date: Mon, 9 Feb 2026 15:37:43 +0800
Subject: [PATCH 3/6] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index f656bab8..fc6d55d7 100644
--- a/README.md
+++ b/README.md
@@ -12,7 +12,7 @@ A more accessible, comprehensive, and efficient toolkit for large model compress

- ✒️ TechnicalReport   |    📖 Documentation   |   🤗 Hugging Face   |   🤖 ModelScope
+ ✒️ TechnicalReport   |    📖 Documentation   |   🤗 Hugging Face   |   🤖 ModelScope

From be8d8ada1c52e302af3e146e5185b6984e1c1698 Mon Sep 17 00:00:00 2001
From: WOOD_C <51071696+WOODchen7@users.noreply.github.com>
Date: Mon, 9 Feb 2026 15:41:28 +0800
Subject: [PATCH 4/6] Update README_cn.md

---
 README_cn.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/README_cn.md b/README_cn.md
index 5b2cbf0f..3a843f03 100644
--- a/README_cn.md
+++ b/README_cn.md
@@ -12,11 +12,17 @@

- 📖 Documentation   |   🤗 Hugging Face   |   🤖 ModelScope   |   💬 WeChat (微信) |   🫨 Discord
+ ✒️ TechnicalReport   |    📖 Documentation   |   🤗 Hugging Face   |   🤖 ModelScope
+
+

+
+

+ 💬 WeChat |   🫨 Discord

 ## 📣最新进展
+- [26/02/09] 我们发布了 HY-1.8B-2Bit, 2比特端侧大模型, 模型可见[[Huggingface]]([https://arxiv.org/abs/2601.07892](https://huggingface.co/AngelSlim/HY-1.8B-2Bit)).
 - [26/01/13]我们发布V0.3版本, 支持了全模态场景的投机采样训练及部署,文档:[Eagle3 for LLM/VLM/Audio](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/index.html)。并且我们发布了 **Sherry** 新的硬件高效的1.25bit三值量化算法 [[论文]](https://arxiv.org/abs/2601.07892) | [[代码]](https://github.com/Tencent/AngelSlim/tree/sherry/Sherry)🔥🔥🔥
 - [25/11/05] 我们发布V0.2版本,支持了包括GLM-4.6/Qwen3-VL/Qwen3-Omni等更多模型的量化,开源投机采样Eagle3训练框架,更新Diffusion模型量化工具。
 - [25/09/30] 我们开源了思考早退新算法 **SpecExit** [[论文]](http://arxiv.org/abs/2509.24248) | [[文档]](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/spec_exit.html) | [[vLLM代码]](https://github.com/vllm-project/vllm/pull/27192)

From e7d8b522594ea7e2af0098def8a8a5e2ae9d0ce8 Mon Sep 17 00:00:00 2001
From: WOOD_C <51071696+WOODchen7@users.noreply.github.com>
Date: Mon, 9 Feb 2026 15:42:08 +0800
Subject: [PATCH 5/6] Fix Markdown link formatting in README

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index fc6d55d7..5132cf21 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,7 @@ A more accessible, comprehensive, and efficient toolkit for large model compress

 ## 📣Latest News
-- [26/02/09] We have released HY-1.8B-2Bit, 2bit on-device large language model,[[Huggingface]]([https://arxiv.org/abs/2601.07892](https://huggingface.co/AngelSlim/HY-1.8B-2Bit)).
+- [26/02/09] We have released HY-1.8B-2Bit, 2bit on-device large language model,[[Huggingface]](https://huggingface.co/AngelSlim/HY-1.8B-2Bit).
 - [26/01/13] We have released v0.3. We support the training and deployment of Eagle3 for all-scale LLMs/VLMs/Audio models, as detailed in the [guidance documentation](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/index.html). And We released **Sherry**, the hardware-efficient 1.25 bit quantization algorithm [[Paper]](https://arxiv.org/abs/2601.07892) | [[Code]](https://github.com/Tencent/AngelSlim/tree/sherry/Sherry)🔥🔥🔥
 - [25/11/05] We have released v0.2. Quantization support for new models, such as `GLM-4.6`, `Qwen3-VL` and `Qwen3-Omni`, open-sources the Eagle3 speculative decoding training framework, and updates the Diffusion model quantization tools.
 - [25/09/30] We have released **SpecExit**, the reasoning early-exit algorithm: [[Paper]](http://arxiv.org/abs/2509.24248) | [[Docs]](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/spec_exit.html) | [[vLLM Code]](https://github.com/vllm-project/vllm/pull/27192)

From 0c6d975667683ac5fa2d15656fc92480f9e570dd Mon Sep 17 00:00:00 2001
From: WOOD_C <51071696+WOODchen7@users.noreply.github.com>
Date: Mon, 9 Feb 2026 15:43:24 +0800
Subject: [PATCH 6/6] Fix markdown link formatting in README_cn.md

---
 README_cn.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README_cn.md b/README_cn.md
index 3a843f03..af88721b 100644
--- a/README_cn.md
+++ b/README_cn.md
@@ -22,7 +22,7 @@

 ## 📣最新进展
-- [26/02/09] 我们发布了 HY-1.8B-2Bit, 2比特端侧大模型, 模型可见[[Huggingface]]([https://arxiv.org/abs/2601.07892](https://huggingface.co/AngelSlim/HY-1.8B-2Bit)).
+- [26/02/09] 我们发布了 HY-1.8B-2Bit, 2比特端侧大模型, 模型可见[[Huggingface]](https://huggingface.co/AngelSlim/HY-1.8B-2Bit).
 - [26/01/13]我们发布V0.3版本, 支持了全模态场景的投机采样训练及部署,文档:[Eagle3 for LLM/VLM/Audio](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/eagle/index.html)。并且我们发布了 **Sherry** 新的硬件高效的1.25bit三值量化算法 [[论文]](https://arxiv.org/abs/2601.07892) | [[代码]](https://github.com/Tencent/AngelSlim/tree/sherry/Sherry)🔥🔥🔥
 - [25/11/05] 我们发布V0.2版本,支持了包括GLM-4.6/Qwen3-VL/Qwen3-Omni等更多模型的量化,开源投机采样Eagle3训练框架,更新Diffusion模型量化工具。
 - [25/09/30] 我们开源了思考早退新算法 **SpecExit** [[论文]](http://arxiv.org/abs/2509.24248) | [[文档]](https://angelslim.readthedocs.io/zh-cn/latest/features/speculative_decoding/spec_exit.html) | [[vLLM代码]](https://github.com/vllm-project/vllm/pull/27192)