With this integration, you can use models through Hugging Face APIs:
29
+
- [Serverless Inference API (Inference Providers)](https://huggingface.co/docs/inference-providers): access many models from different providers through a unified API.
30
+
- [Inference Endpoints](https://huggingface.co/inference-endpoints): deploy models on dedicated, fully managed infrastructure.
Haystack supports Hugging Face models in other ways too:
34
+
- [Hugging Face Transformers](https://haystack.deepset.ai/integrations/huggingface) for local models (LLMs, extractive QA, classification, NER)
35
+
- [Sentence Transformers](https://haystack.deepset.ai/integrations/sentence-transformers) for local embedding and ranking models
36
+
- [Optimum](https://haystack.deepset.ai/integrations/optimum) for high-performance inference with ONNX Runtime
37
+
38
+
## Installation
39
+
40
+
```bash
pip install huggingface-api-haystack
```
43
+
44
+
## Usage
45
+
46
+
Unless you are using a self-hosted TGI/TEI server, set your Hugging Face token as the `HF_API_TOKEN` or `HF_TOKEN` environment variable.
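As a rough illustration of the token lookup described above, the sketch below checks both environment variables in turn. The helper name `resolve_hf_token` and the precedence (`HF_API_TOKEN` before `HF_TOKEN`) are assumptions made for this example, not part of the integration's API.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first non-empty Hugging Face token found, or None.

    Hypothetical helper: the lookup order is an assumption for illustration.
    """
    # Fall back to the real process environment when no mapping is given.
    env = os.environ if env is None else env
    for name in ("HF_API_TOKEN", "HF_TOKEN"):
        value = env.get(name)
        if value:
            return value
    return None


if __name__ == "__main__":
    # Report whether a token is available in the current environment.
    print(resolve_hf_token() is not None)
```

A check like this can be useful before building a pipeline, so that a missing token surfaces early rather than at the first API call.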
47
+
48
+
### Components
49
+
50
+
This integration provides several components to interact with Hugging Face APIs:
51
+
- [`HuggingFaceAPIChatGenerator`](https://docs.haystack.deepset.ai/docs/huggingfaceapichatgenerator): chat generation with LLMs.
52
+
- [`HuggingFaceAPITextEmbedder`](https://docs.haystack.deepset.ai/docs/huggingfaceapitextembedder): creates an embedding for text (used in query/RAG pipelines).
53
+
- [`HuggingFaceAPIDocumentEmbedder`](https://docs.haystack.deepset.ai/docs/huggingfaceapidocumentembedder): enriches documents with embeddings (used in indexing pipelines).
54
+
- [`HuggingFaceTEIRanker`](https://docs.haystack.deepset.ai/docs/huggingfaceteiranker): ranks documents based on their similarity to the query, using a TEI endpoint.
55
+
56
+
### Chat Generation
57
+
58
+
Use [`HuggingFaceAPIChatGenerator`](https://docs.haystack.deepset.ai/docs/huggingfaceapichatgenerator) with the Serverless Inference API (Inference Providers):
59
+
60
+
```python
61
+
from haystack.dataclasses import ChatMessage
62
+
from haystack_integrations.components.generators.huggingface_api import HuggingFaceAPIChatGenerator
result = generator.run("What's Natural Language Processing? Be brief.")
70
+
print(result)
71
+
```
72
+
73
+
To use a dedicated Inference Endpoint or a self-hosted TGI server, pass its URL instead:
74
+
75
+
```python
generator = HuggingFaceAPIChatGenerator(
    api_type="inference_endpoints",  # or "text_generation_inference" for self-hosted TGI
    api_params={"url": "<your-endpoint-url>"},
)
```
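The choice between a model name and a URL can be captured in a small helper. This is a minimal sketch, not part of the integration: `build_generator_kwargs` is a hypothetical function, and the `"serverless_inference_api"` identifier and its `model` parameter are assumed here to be what the Inference Providers option expects.

```python
from typing import Any, Dict

# api_type values that take a URL, as named in this document.
URL_BASED_API_TYPES = {"inference_endpoints", "text_generation_inference"}


def build_generator_kwargs(api_type: str, target: str) -> Dict[str, Any]:
    """Hypothetical helper: build constructor kwargs for a chat generator.

    `target` is a URL for endpoint-style APIs and a model name otherwise.
    """
    if api_type in URL_BASED_API_TYPES:
        return {"api_type": api_type, "api_params": {"url": target}}
    # Assumption: Inference Providers are selected by model name.
    if api_type == "serverless_inference_api":
        return {"api_type": api_type, "api_params": {"model": target}}
    raise ValueError(f"Unsupported api_type: {api_type!r}")


kwargs = build_generator_kwargs("text_generation_inference", "http://localhost:8080")
```

The resulting dictionary could then be unpacked into the constructor, for example `HuggingFaceAPIChatGenerator(**kwargs)`.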
81
+
82
+
### Embedding Models
83
+
84
+
To create semantic embeddings for documents, use [`HuggingFaceAPIDocumentEmbedder`](https://docs.haystack.deepset.ai/docs/huggingfaceapidocumentembedder) in your indexing pipeline. For generating embeddings for queries, use [`HuggingFaceAPITextEmbedder`](https://docs.haystack.deepset.ai/docs/huggingfaceapitextembedder).
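To make the split between document and query embeddings concrete, here is a minimal, library-free sketch of how the two kinds of vectors are typically compared at query time. The vectors are made-up toy values standing in for embedder output, and `cosine_similarity` and `rank_by_similarity` are hypothetical helpers, not Haystack functions.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat a zero vector as having no similarity to anything.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_embedding: Sequence[float],
    document_embeddings: Sequence[Sequence[float]],
) -> List[int]:
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_embedding, d) for d in document_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy vectors: document embeddings are computed once at indexing time,
# the query embedding is computed for each incoming query.
doc_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query_embedding = [0.9, 0.1]
ranking = rank_by_similarity(query_embedding, doc_embeddings)
```

In a real pipeline the document store and retriever perform this comparison; the sketch only shows why both embedders should use the same model, so that the vectors live in the same space.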
85
+
86
+
```python
87
+
from haystack_integrations.components.embedders.huggingface_api import HuggingFaceAPITextEmbedder
Both embedders also work with a self-hosted TEI server:
99
+
100
+
```python
text_embedder = HuggingFaceAPITextEmbedder(
    api_type="text_embeddings_inference",
    api_params={"url": "http://localhost:8080"},
)
```
106
+
107
+
### Ranking Models
108
+
109
+
Use [`HuggingFaceTEIRanker`](https://docs.haystack.deepset.ai/docs/huggingfaceteiranker) to rank documents with a reranking model served by a TEI endpoint:
110
+
111
+
```python
112
+
from haystack import Document
113
+
from haystack_integrations.components.rankers.huggingface_api import HuggingFaceTEIRanker
You can use models on [Hugging Face](https://huggingface.co/) in your Haystack pipelines with [Generators](https://docs.haystack.deepset.ai/docs/generators), [Embedders](https://docs.haystack.deepset.ai/docs/embedders), [Rankers](https://docs.haystack.deepset.ai/docs/rankers) and [Readers](https://docs.haystack.deepset.ai/docs/readers)!
28
+
[Transformers](https://huggingface.co/docs/transformers/index) is Hugging Face's library for state-of-the-art machine learning models. With this integration, you can run models from the [Hugging Face Hub](https://huggingface.co/models) **locally**, on your own machine, in your Haystack pipelines.
29
29
30
-
### Installation
30
+
Haystack supports Hugging Face models in other ways too:
31
+
- [Sentence Transformers](https://haystack.deepset.ai/integrations/sentence-transformers) for local embedding and ranking models
32
+
- [Hugging Face API](https://haystack.deepset.ai/integrations/huggingface-api) to call models via Inference Providers, Inference Endpoints, or self-hosted TGI/TEI
33
+
- [Optimum](https://haystack.deepset.ai/integrations/optimum) for high-performance inference with ONNX Runtime
You can use models on Hugging Face in various ways:
39
-
40
-
#### Embedding Models
41
+
## Usage
41
42
42
-
You can leverage embedding models from Hugging Face through four components: [SentenceTransformersTextEmbedder](https://docs.haystack.deepset.ai/docs/sentencetransformerstextembedder), [SentenceTransformersDocumentEmbedder](https://docs.haystack.deepset.ai/docs/sentencetransformersdocumentembedder), [HuggingFaceAPITextEmbedder](https://docs.haystack.deepset.ai/docs/huggingfaceapitextembedder) and [HuggingFaceAPIDocumentEmbedder](https://docs.haystack.deepset.ai/docs/huggingfaceapidocumentembedder).
43
+
### Components
43
44
44
-
To create semantic embeddings for documents, use a Document Embedder in your indexing pipeline. For generating embeddings for queries, use a Text Embedder.
45
+
Haystack provides several components that run Transformers models locally:
46
+
- [`HuggingFaceLocalChatGenerator`](https://docs.haystack.deepset.ai/docs/huggingfacelocalchatgenerator): chat generation with local LLMs.
47
+
- [`ExtractiveReader`](https://docs.haystack.deepset.ai/docs/extractivereader): extracts answers from documents using question answering models.
48
+
- [`TransformersTextRouter`](https://docs.haystack.deepset.ai/docs/transformerstextrouter) and [`TransformersZeroShotTextRouter`](https://docs.haystack.deepset.ai/docs/transformerszeroshottextrouter): route text to different pipeline branches based on classification.
49
+
- [`TransformersZeroShotDocumentClassifier`](https://docs.haystack.deepset.ai/docs/transformerszeroshotdocumentclassifier): classifies documents with zero-shot classification models.
50
+
- [`NamedEntityExtractor`](https://docs.haystack.deepset.ai/docs/namedentityextractor): annotates named entities in documents (with the `hugging_face` backend).
45
51
46
-
Depending on the hosting option (local Sentence Transformers model, Serverless Inference API, Inference Endpoints, or self-hosted Text Embeddings Inference), select the suitable Hugging Face Embedder component and initialize it with the model name.
52
+
### Chat Generation
47
53
48
-
Below is the example indexing pipeline with `InMemoryDocumentStore`, `DocumentWriter` and `SentenceTransformersDocumentEmbedder`:
54
+
Use [`HuggingFaceLocalChatGenerator`](https://docs.haystack.deepset.ai/docs/huggingfacelocalchatgenerator) to run a chat model locally:
49
55
50
56
```python
51
-
from haystack import Document
52
-
from haystack import Pipeline
53
-
from haystack.document_stores.in_memory import InMemoryDocumentStore
54
-
from haystack.components.embedders import SentenceTransformersDocumentEmbedder
55
-
from haystack.components.writers import DocumentWriter
You can leverage text generation models from Hugging Face through three components: [HuggingFaceLocalGenerator](https://docs.haystack.deepset.ai/docs/huggingfacelocalgenerator), [HuggingFaceAPIGenerator](https://docs.haystack.deepset.ai/docs/huggingfaceapigenerator) and [HuggingFaceAPIChatGenerator](https://docs.haystack.deepset.ai/docs/huggingfaceapichatgenerator).
75
-
76
-
Depending on the model type (chat or text completion) and hosting option (local Transformer model, Serverless Inference API, Inference Endpoints, or self-hosted Text Generation Inference), select the suitable Hugging Face Generator component and initialize it with the model name.
77
-
78
-
Below is the example query pipeline that uses `HuggingFaceH4/zephyr-7b-beta` hosted on Serverless Inference API with `HuggingFaceAPIGenerator`:
79
-
80
-
```python
81
-
from haystack import Pipeline
82
-
from haystack.utils import Secret
83
-
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
84
-
from haystack.components.builders.prompt_builder import PromptBuilder
85
-
from haystack.components.generators import HuggingFaceAPIGenerator
86
-
87
-
template = """
88
-
Given the following information, answer the question.
89
-
90
-
Context:
91
-
{% for document in documents %}
92
-
{{ document.text }}
93
-
{% endfor %}
94
-
95
-
Question: What's the official language of {{ country }}?
messages = [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
63
+
print(generator.run(messages))
114
64
```
115
65
116
-
#### Ranker Models
117
-
118
-
To use cross encoder models on Hugging Face, initialize a `SentenceTransformersRanker` with the model name. You can then use this `SentenceTransformersRanker` to sort documents based on their relevancy to the query.
66
+
### Extractive Question Answering
119
67
120
-
Below is the example of document retrieval pipeline with `InMemoryBM25Retriever` and `SentenceTransformersRanker`:
121
-
122
-
```python
123
-
from haystack import Document, Pipeline
124
-
from haystack.document_stores.in_memory import InMemoryDocumentStore
125
-
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
126
-
from haystack.components.rankers import TransformersSimilarityRanker
To use question answering models on Hugging Face, initialize an `ExtractiveReader` with the model name. You can then use this `ExtractiveReader` to extract answers from the relevant context.
150
-
151
-
Below is the example of extractive question answering pipeline with `InMemoryBM25Retriever` and `ExtractiveReader`:
68
+
Use [`ExtractiveReader`](https://docs.haystack.deepset.ai/docs/extractivereader) to extract answers from the relevant context:
152
69
153
70
```python
154
71
from haystack import Document, Pipeline
@@ -163,16 +80,55 @@ docs = [Document(content="Paris is the capital of France."),
Use [`TransformersZeroShotDocumentClassifier`](https://docs.haystack.deepset.ai/docs/transformerszeroshotdocumentclassifier) to classify documents with labels of your choice, without fine-tuning:
99
+
100
+
```python
101
+
from haystack import Document
102
+
from haystack.components.classifiers import TransformersZeroShotDocumentClassifier
103
+
104
+
documents = [Document(content="Today was a nice day!"),