Choose the patch according to your development needs.
+If you are working on **sparse attention** or **ReRoPE** independently, applying only the corresponding patch is sufficient.
### Option 3: Install by pip
@@ -91,6 +118,7 @@ Download the pre-built `vllm/vllm-openai:v0.9.2` docker image and build unified-
export PLATFORM=cuda
pip install uc-manager
```
+> **Note:** If installing via `pip install`, you need to manually add the `config.yaml` file, similar to `unified-cache-management/examples/ucm_config_example.yaml`, because PyPI packages do not include YAML files.
The ReRoPE algorithm is not supported on Ascend at the moment.
+Only the standard UCM integration is applicable for vLLM-Ascend.
### Option 2: Install by pip
Install with pip, or find the pre-built wheels on [PyPI](https://pypi.org/project/uc-manager/).
```
export PLATFORM=ascend
pip install uc-manager
```
+> **Note:** If installing via `pip install`, you need to manually add the `config.yaml` file, similar to `unified-cache-management/examples/ucm_config_example.yaml`, because PyPI packages do not include YAML files.
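
One way to supply the missing file is to copy the example from a local checkout of the repository. The sketch below is only an illustration: the helper name `copy_ucm_config` is hypothetical and not part of UCM, and the destination path is an assumption, since the location where UCM expects `config.yaml` is not stated here.

```shell
# Hypothetical helper (not part of UCM): copy the example config to a
# destination of your choice, creating the parent directory if needed.
copy_ucm_config() {
    src="$1"    # path to ucm_config_example.yaml in a repository checkout
    dest="$2"   # assumed target path for config.yaml

    if [ ! -f "$src" ]; then
        echo "example config not found: $src" >&2
        return 1
    fi

    mkdir -p "$(dirname "$dest")"
    cp "$src" "$dest"
}
```

For instance, with a clone in the current directory, `copy_ucm_config unified-cache-management/examples/ucm_config_example.yaml ./config.yaml` would produce a starting `config.yaml` to edit for your deployment.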
### Option 3: Setup from docker
Download the pre-built `vllm-ascend` docker image and build unified-cache-management docker image by commands below:
@@ -39,6 +65,14 @@ Download the pre-built `vllm-ascend` docker image and build unified-cache-manage