
Commit 4a15b8b

feat: add VLLMRanker (#3174)
* feat: add VLLMRanker
* better links, better types
* readme
* Update integrations/vllm/src/haystack_integrations/components/rankers/vllm/ranker.py
* fix

Co-authored-by: bogdankostic <bogdankostic@web.de>
1 parent c6e1d42 commit 4a15b8b

11 files changed

Lines changed: 550 additions & 14 deletions


.github/workflows/vllm.yml

Lines changed: 24 additions & 6 deletions
@@ -31,6 +31,9 @@ env:
   FORCE_COLOR: "1"
   VLLM_MODEL: "Qwen/Qwen3-0.6B"
   VLLM_EMBEDDING_MODEL: "sentence-transformers/all-MiniLM-L6-v2"
+  VLLM_RANKER_MODEL: "BAAI/bge-reranker-base"
+  VLLM_TARGET_DEVICE: "cpu"
+  VLLM_CPU_KVCACHE_SPACE: "4"
   # we only test on Ubuntu to keep vLLM server running simple
   TEST_MATRIX_OS: '["ubuntu-latest"]'
   # vLLM is not compatible with Python 3.14. https://github.com/vllm-project/vllm/issues/34096
@@ -90,9 +93,6 @@ jobs:
           --torch-backend cpu

       - name: Start vLLM chat server
-        env:
-          VLLM_TARGET_DEVICE: "cpu"
-          VLLM_CPU_KVCACHE_SPACE: "4"
         run: |
           nohup hatch run -- vllm serve ${{ env.VLLM_MODEL }} \
             --port 8000 \
@@ -120,9 +120,6 @@ jobs:
           echo "vLLM chat server started successfully."

       - name: Start vLLM embedding server
-        env:
-          VLLM_TARGET_DEVICE: "cpu"
-          VLLM_CPU_KVCACHE_SPACE: "4"
         run: |
           nohup hatch run -- vllm serve ${{ env.VLLM_EMBEDDING_MODEL }} \
             --port 8001 \
@@ -144,6 +141,27 @@ jobs:

           echo "vLLM embedding server started successfully."

+      - name: Start vLLM ranker server
+        run: |
+          nohup hatch run -- vllm serve ${{ env.VLLM_RANKER_MODEL }} \
+            --port 8002 \
+            --enforce-eager \
+            --max-num-seqs 1 &
+
+          # Wait for the vLLM ranker server to be ready with a timeout of 300 seconds
+          timeout=300
+          while [ $timeout -gt 0 ] && ! curl -sSf http://localhost:8002/health > /dev/null 2>&1; do
+            echo "Waiting for vLLM ranker server to start..."
+            sleep 10
+            ((timeout-=10))
+          done
+
+          if [ $timeout -eq 0 ]; then
+            echo "Timed out waiting for vLLM ranker server to start."
+            exit 1
+          fi
+
+          echo "vLLM ranker server started successfully."
       - name: Lint
         if: matrix.python-version == '3.10' && runner.os == 'Linux'
         run: hatch run fmt-check && hatch run test:types
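The new ranker-server step polls the server's `/health` endpoint in a loop until it responds, failing the job if a 300-second budget is exhausted. The same pattern can be sketched in Python (the function name and parameters here are illustrative, not part of the workflow):

```python
# Illustrative Python sketch of the workflow's poll-until-healthy loop:
# probe a health endpoint until it answers or the timeout budget runs out.
import time
import urllib.error
import urllib.request


def wait_for_health(url: str, timeout: float = 300.0, interval: float = 10.0) -> bool:
    remaining = timeout
    while remaining > 0:
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True  # the server answered, so it is healthy
        except (urllib.error.URLError, OSError):
            time.sleep(interval)
            remaining -= interval
    return False  # budget spent without a successful probe
```

For the workflow's ranker server this would correspond to `wait_for_health("http://localhost:8002/health")`.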

README.md

Lines changed: 1 addition & 1 deletion
@@ -78,7 +78,7 @@ Please check out our [Contribution Guidelines](CONTRIBUTING.md) for all the deta
 | [togetherai-haystack](integrations/togetherai/) | Generator | [![PyPI - Version](https://img.shields.io/pypi/v/togetherai-haystack.svg)](https://pypi.org/project/togetherai-haystack) | [![Test / togetherai](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/togetherai.yml/badge.svg)](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/togetherai.yml) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-togetherai/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-togetherai/htmlcov/index.html) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-togetherai-combined/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-togetherai-combined/htmlcov/index.html) |
 | [unstructured-fileconverter-haystack](integrations/unstructured/) | File converter | [![PyPI - Version](https://img.shields.io/pypi/v/unstructured-fileconverter-haystack.svg)](https://pypi.org/project/unstructured-fileconverter-haystack) | [![Test / unstructured](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/unstructured.yml/badge.svg)](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/unstructured.yml) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-unstructured/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-unstructured/htmlcov/index.html) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-unstructured-combined/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-unstructured-combined/htmlcov/index.html) |
 | [valkey-haystack](integrations/valkey/) | Document Store | [![PyPI - Version](https://img.shields.io/pypi/v/valkey-haystack.svg)](https://pypi.org/project/valkey-haystack) | [![Test / valkey](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/valkey.yml/badge.svg)](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/valkey.yml) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-valkey/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-valkey/htmlcov/index.html) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-valkey-combined/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-valkey-combined/htmlcov/index.html) |
-| [vllm-haystack](integrations/vllm/) | Embedder, Generator | [![PyPI - Version](https://img.shields.io/pypi/v/vllm-haystack.svg)](https://pypi.org/project/vllm-haystack) | [![Test / vllm](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/vllm.yml/badge.svg)](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/vllm.yml) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-vllm/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-vllm/htmlcov/index.html) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-vllm-combined/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-vllm-combined/htmlcov/index.html) |
+| [vllm-haystack](integrations/vllm/) | Embedder, Generator, Ranker | [![PyPI - Version](https://img.shields.io/pypi/v/vllm-haystack.svg)](https://pypi.org/project/vllm-haystack) | [![Test / vllm](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/vllm.yml/badge.svg)](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/vllm.yml) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-vllm/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-vllm/htmlcov/index.html) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-vllm-combined/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-vllm-combined/htmlcov/index.html) |
 | [watsonx-haystack](integrations/watsonx/) | Embedder, Generator | [![PyPI - Version](https://img.shields.io/pypi/v/watsonx-haystack.svg?color=orange)](https://pypi.org/project/watsonx-haystack) | [![Test / watsonx](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/watsonx.yml/badge.svg)](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/watsonx.yml) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-watsonx/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-watsonx/htmlcov/index.html) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-watsonx-combined/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-watsonx-combined/htmlcov/index.html) |
 | [weave-haystack](integrations/weave/) | Tracer | [![PyPI - Version](https://img.shields.io/pypi/v/weave-haystack.svg)](https://pypi.org/project/weave-haystack) | [![Test / weave](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/weave.yml/badge.svg)](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/weave.yml) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-weave/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-weave/htmlcov/index.html) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-weave-combined/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-weave-combined/htmlcov/index.html) |
 | [weaviate-haystack](integrations/weaviate/) | Document Store | [![PyPI - Version](https://img.shields.io/pypi/v/weaviate-haystack.svg)](https://pypi.org/project/weaviate-haystack) | [![Test / weaviate](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/weaviate.yml/badge.svg)](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/weaviate.yml) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-weaviate/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-weaviate/htmlcov/index.html) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-weaviate-combined/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-weaviate-combined/htmlcov/index.html) |

integrations/vllm/README.md

Lines changed: 7 additions & 0 deletions
@@ -26,4 +26,11 @@ vLLM-metal does not support embedding models. On macOS, you can run the embeddin
 # embedders server (port 8001)
 docker run --rm -p 8001:8000 -e VLLM_CPU_OMP_THREADS_BIND=0-3 vllm/vllm-openai-cpu:latest \
   --model sentence-transformers/all-MiniLM-L6-v2 --enforce-eager
+```
+
+To run the ranker server, use the CPU Docker image:
+```bash
+# ranker server (port 8002)
+docker run --rm -p 8002:8000 -e VLLM_CPU_OMP_THREADS_BIND=0-3 vllm/vllm-openai-cpu:latest \
+  --model BAAI/bge-reranker-base --enforce-eager
 ```

integrations/vllm/pydoc/config_docusaurus.yml

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@ loaders:
       - haystack_integrations.components.generators.vllm.chat.chat_generator
       - haystack_integrations.components.embedders.vllm.text_embedder
       - haystack_integrations.components.embedders.vllm.document_embedder
+      - haystack_integrations.components.rankers.vllm.ranker
     search_path: [../src]
 processors:
   - type: filter

integrations/vllm/pyproject.toml

Lines changed: 1 addition & 2 deletions
@@ -57,7 +57,6 @@ dependencies = [
   "pytest-rerunfailures",
   "mypy",
   "pip",
-  "Pillow",
 ]

 [tool.hatch.envs.test.scripts]
@@ -66,7 +65,7 @@ integration = 'pytest -m "integration" {args:tests}'
 all = 'pytest {args:tests}'
 unit-cov-retry = 'pytest --cov=haystack_integrations --reruns 3 --reruns-delay 30 -x -m "not integration" {args:tests}'
 integration-cov-append-retry = 'pytest --cov=haystack_integrations --cov-append --reruns 3 --reruns-delay 30 -x -m "integration" {args:tests}'
-types = "mypy -p haystack_integrations.components.generators.vllm -p haystack_integrations.components.embedders.vllm -p haystack_integrations.common.vllm {args}"
+types = "mypy -p haystack_integrations.components.generators.vllm -p haystack_integrations.components.embedders.vllm -p haystack_integrations.components.rankers.vllm -p haystack_integrations.common.vllm {args}"

 [tool.mypy]
 install_types = true

integrations/vllm/src/haystack_integrations/components/embedders/vllm/document_embedder.py

Lines changed: 2 additions & 2 deletions
@@ -241,7 +241,7 @@ def _validate_documents(documents: list[Document]) -> None:
             raise TypeError(msg)

     @component.output_types(documents=list[Document], meta=dict[str, Any])
-    def run(self, documents: list[Document]) -> dict[str, Any]:
+    def run(self, documents: list[Document]) -> dict[str, list[Document] | dict[str, Any]]:
         """
         Embed a list of Documents.
@@ -267,7 +267,7 @@ def run(self, documents: list[Document]) -> dict[str, Any]:
         return {"documents": new_documents, "meta": meta}

     @component.output_types(documents=list[Document], meta=dict[str, Any])
-    async def run_async(self, documents: list[Document]) -> dict[str, Any]:
+    async def run_async(self, documents: list[Document]) -> dict[str, list[Document] | dict[str, Any]]:
         """
         Asynchronously embed a list of Documents.

integrations/vllm/src/haystack_integrations/components/embedders/vllm/text_embedder.py

Lines changed: 3 additions & 3 deletions
@@ -138,14 +138,14 @@ def _prepare_input(self, text: str) -> dict[str, Any]:
         return kwargs

     @staticmethod
-    def _prepare_output(response: CreateEmbeddingResponse) -> dict[str, Any]:
+    def _prepare_output(response: CreateEmbeddingResponse) -> dict[str, list[float] | dict[str, Any]]:
         return {
             "embedding": response.data[0].embedding,
             "meta": {"model": response.model, "usage": dict(response.usage)},
         }

     @component.output_types(embedding=list[float], meta=dict[str, Any])
-    def run(self, text: str) -> dict[str, Any]:
+    def run(self, text: str) -> dict[str, list[float] | dict[str, Any]]:
         """
         Embed a single string.
@@ -162,7 +162,7 @@ def run(self, text: str) -> dict[str, Any]:
         return self._prepare_output(response)

     @component.output_types(embedding=list[float], meta=dict[str, Any])
-    async def run_async(self, text: str) -> dict[str, Any]:
+    async def run_async(self, text: str) -> dict[str, list[float] | dict[str, Any]]:
         """
         Asynchronously embed a single string.

integrations/vllm/src/haystack_integrations/components/rankers/py.typed

Whitespace-only changes.
integrations/vllm/src/haystack_integrations/components/rankers/vllm/__init__.py

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+# SPDX-FileCopyrightText: 2022-present deepset GmbH <info@deepset.ai>
+#
+# SPDX-License-Identifier: Apache-2.0
+
+from .ranker import VLLMRanker
+
+__all__ = ["VLLMRanker"]
