Merged
3 changes: 2 additions & 1 deletion README.md
@@ -26,6 +26,7 @@ ______________________________________________________________________
<details open>
<summary><b>2026</b></summary>

- \[2026/04\] The LMDeploy project on PyPI has reached its storage quota, so pre-built wheels for new releases cannot be uploaded for the time being. You can download packages from the [GitHub Releases](https://github.com/InternLM/lmdeploy/releases) page or install from source instead. We will update this notice when wheel uploads to PyPI resume. Affected versions: >=0.12.2
- \[2026/02\] Support [Qwen3.5](https://huggingface.co/collections/Qwen/qwen35)
- \[2026/02\] Support [vllm-project/llm-compressor](https://github.com/vllm-project/llm-compressor) 4bit symmetric/asymmetric quantization. See [here](./docs/en/quantization/llm_compressor.md) for a detailed guide

@@ -228,7 +229,7 @@ Since v0.3.0, the default prebuilt package is compiled on **CUDA 12**. Starting
If you are using a GeForce RTX 50 series graphics card, please install the LMDeploy prebuilt package compiled with **CUDA 12.8** as follows:

```shell
-export LMDEPLOY_VERSION=0.12.2
+export LMDEPLOY_VERSION=0.12.3
export PYTHON_VERSION=312
pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu128-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu128
```
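The download URL in the command above is assembled by interpolating two environment variables into a fixed wheel-filename pattern. A minimal pure-Python sketch of that interpolation (no lmdeploy dependency; the variable values mirror the shell snippet):

```python
# Reconstruct the wheel URL used in the install command above.
# Wheel filenames follow PEP 427: name-version(+local)-pytag-abitag-platform.whl
LMDEPLOY_VERSION = "0.12.3"
PYTHON_VERSION = "312"  # CPython 3.12 -> tag cp312

url = (
    "https://github.com/InternLM/lmdeploy/releases/download/"
    f"v{LMDEPLOY_VERSION}/lmdeploy-{LMDEPLOY_VERSION}+cu128"
    f"-cp{PYTHON_VERSION}-cp{PYTHON_VERSION}-manylinux2014_x86_64.whl"
)
print(url)
```

Note the `+cu128` local-version suffix marking the CUDA 12.8 build; changing `PYTHON_VERSION` selects the matching CPython ABI tag.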
3 changes: 2 additions & 1 deletion README_zh-CN.md
@@ -26,6 +26,7 @@ ______________________________________________________________________
<details open>
<summary><b>2026</b></summary>

- \[2026/04\] The LMDeploy project on PyPI has reached its storage quota, so pre-built wheels for new releases cannot be uploaded for now. You can download packages from the [GitHub Releases](https://github.com/InternLM/lmdeploy/releases) page or install from source to use the latest version; we will post another notice when wheel uploads resume. Affected versions: >=0.12.2
- \[2026/02\] Support [Qwen3.5](https://huggingface.co/collections/Qwen/qwen35)
- \[2026/02\] Support [vllm-project/llm-compressor](https://github.com/vllm-project/llm-compressor) 4bit symmetric and asymmetric quantization. See [here](./docs/zh_cn/quantization/llm_compressor.md) for a detailed guide

@@ -230,7 +231,7 @@ pip install lmdeploy
If you are using a GeForce RTX 50 series graphics card, please install the LMDeploy prebuilt package compiled with **CUDA 12.8** as follows:

```shell
-export LMDEPLOY_VERSION=0.12.2
+export LMDEPLOY_VERSION=0.12.3
export PYTHON_VERSION=312
pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu128-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu128
```
2 changes: 1 addition & 1 deletion lmdeploy/version.py
@@ -1,6 +1,6 @@
# Copyright (c) OpenMMLab. All rights reserved.

-__version__ = '0.12.2'
+__version__ = '0.12.3'
short_version = __version__

