Commit b7985d6

chore: deprecate transformers components + update docs (#11570)
1 parent 2d1f229 commit b7985d6

27 files changed

Lines changed: 856 additions & 42 deletions

docs-website/docs/pipeline-components/classifiers/transformerszeroshotdocumentclassifier.mdx

Lines changed: 17 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -17,9 +17,9 @@ Classifies the documents based on the provided labels and adds them to their met
1717
| **Mandatory init variables** | `model`: The name or path of a Hugging Face model for zero shot document classification <br /> <br />`labels`: The set of possible class labels to classify each document into, for example, [`positive`, `negative`]. The labels depend on the selected model. |
1818
| **Mandatory run variables** | `documents`: A list of documents to classify |
1919
| **Output variables** | `documents`: A list of processed documents with an added `classification` metadata field |
20-
| **API reference** | [Classifiers](/reference/classifiers-api) |
21-
| **GitHub link** | https://github.com/deepset-ai/haystack/blob/main/haystack/components/classifiers/zero_shot_document_classifier.py |
22-
| **Package name** | `haystack-ai` |
20+
| **API reference** | [Transformers](/reference/integrations-transformers) |
21+
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/transformers |
22+
| **Package name** | `transformers-haystack` |
2323

2424
</div>
2525

@@ -42,11 +42,19 @@ Available models for the task of zero-shot-classification are:
4242

4343
## Usage
4444

45+
Install the `transformers-haystack` package to use the `TransformersZeroShotDocumentClassifier`:
46+
47+
```shell
48+
pip install transformers-haystack
49+
```
50+
4551
### On its own
4652

4753
```python
4854
from haystack import Document
49-
from haystack.components.classifiers import TransformersZeroShotDocumentClassifier
55+
from haystack_integrations.components.classifiers.transformers import (
56+
TransformersZeroShotDocumentClassifier,
57+
)
5058

5159
documents = [
5260
Document(id="0", content="Cats don't get teeth cavities."),
@@ -71,7 +79,9 @@ from haystack import Document
7179
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
7280
from haystack.document_stores.in_memory import InMemoryDocumentStore
7381
from haystack.core.pipeline import Pipeline
74-
from haystack.components.classifiers import TransformersZeroShotDocumentClassifier
82+
from haystack_integrations.components.classifiers.transformers import (
83+
TransformersZeroShotDocumentClassifier,
84+
)
7585

7686
documents = [
7787
Document(id="0", content="Today was a nice day!"),
@@ -88,8 +98,8 @@ document_classifier = TransformersZeroShotDocumentClassifier(
8898
document_store.write_documents(documents)
8999

90100
pipeline = Pipeline()
91-
pipeline.add_component(retriever, name="retriever")
92-
pipeline.add_component(document_classifier, name="document_classifier")
101+
pipeline.add_component(name="retriever", instance=retriever)
102+
pipeline.add_component(name="document_classifier", instance=document_classifier)
93103
pipeline.connect("retriever", "document_classifier")
94104

95105
queries = ["How was your day today?", "How was your day yesterday?"]

docs-website/docs/pipeline-components/extractors/namedentityextractor.mdx

Lines changed: 6 additions & 0 deletions
@@ -9,6 +9,12 @@ description: "This component extracts predefined entities out of a piece of text
99

1010
This component extracts predefined entities out of a piece of text and writes them into documents’ meta field.
1111

12+
:::warning[Deprecated]
13+
14+
`NamedEntityExtractor` is deprecated and will be removed in Haystack 3.0. It has moved to the `transformers-haystack` package and was renamed to `TransformersNamedEntityExtractor`. See [TransformersNamedEntityExtractor](transformersnamedentityextractor.mdx) for the updated documentation.
15+
16+
:::
17+
1218
<div className="key-value-table">
1319

1420
| | |
Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
1+
---
2+
title: "TransformersNamedEntityExtractor"
3+
id: transformersnamedentityextractor
4+
slug: "/transformersnamedentityextractor"
5+
description: "This component extracts predefined entities out of a piece of text and writes them into documents’ meta field."
6+
---
7+
8+
# TransformersNamedEntityExtractor
9+
10+
This component extracts predefined entities out of a piece of text and writes them into documents’ meta field.
11+
12+
<div className="key-value-table">
13+
14+
| | |
15+
| --- | --- |
16+
| **Most common position in a pipeline** | After the [PreProcessor](../preprocessors.mdx) in an indexing pipeline or after a [Retriever](../retrievers.mdx) in a query pipeline |
17+
| **Mandatory init variables** | `model`: Name or path of the model to use |
18+
| **Mandatory run variables** | `documents`: A list of documents |
19+
| **Output variables** | `documents`: A list of documents |
20+
| **API reference** | [Transformers](/reference/integrations-transformers) |
21+
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/transformers |
22+
| **Package name** | `transformers-haystack` |
23+
24+
</div>
25+
26+
## Overview
27+
28+
`TransformersNamedEntityExtractor` looks for entities, which are spans in the text. The extractor automatically recognizes and groups them depending on their class, such as people's names, organizations, locations, and other types. The exact classes are determined by the model that you initialize the component with.
29+
30+
`TransformersNamedEntityExtractor` takes a list of documents as input and returns a list of the same documents with their `meta` data enriched with `NamedEntityAnnotations`. A `NamedEntityAnnotation` consists of the type of the entity, the start and end of the span, and a score calculated by the model, for example: `NamedEntityAnnotation(entity='PER', start=11, end=16, score=0.9)`.
31+
32+
When the `TransformersNamedEntityExtractor` is initialized, you need to set a `model`. Optionally, you can set `pipeline_kwargs`, which are then passed on to the Hugging Face pipeline. You can additionally set the `device` that is used to run the component.
33+
34+
Authentication with a Hugging Face API token is only required to access private or gated models. You can pass the token at initialization with `token`, or set the `HF_API_TOKEN` or `HF_TOKEN` environment variable.
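
The token options described above can be sketched with the standard library alone. This is only an illustration of one plausible precedence (explicit `token` first, then `HF_API_TOKEN`, then `HF_TOKEN`), not the component's actual code: `resolve_hf_token` is a hypothetical helper, and the relative order of the two environment variables is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(
    explicit_token: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Hypothetical helper: return the first available Hugging Face token, or None."""
    env = os.environ if environ is None else environ
    if explicit_token:
        return explicit_token
    # Assumed order: HF_API_TOKEN is checked before HF_TOKEN.
    for name in ("HF_API_TOKEN", "HF_TOKEN"):
        value = env.get(name)
        if value:
            return value
    # No token found: acceptable for public models, which need no authentication.
    return None


print(resolve_hf_token(environ={"HF_TOKEN": "hf_example"}))
```

Returning `None` rather than raising mirrors the statement that a token is only needed for private or gated models.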
35+
36+
## Usage
37+
38+
Install the `transformers-haystack` package to use the `TransformersNamedEntityExtractor`:
39+
40+
```shell
41+
pip install transformers-haystack
42+
```
43+
44+
The component works with any Hugging Face model that supports token classification or NER.
45+
46+
`TransformersNamedEntityExtractor` accepts a list of `Document` objects as its input. The extractor annotates the raw text in each document and stores the annotations in that document's `meta` dictionary under the `named_entities` key.
47+
48+
```python
49+
from haystack.dataclasses import Document
50+
from haystack_integrations.components.extractors.transformers import (
51+
TransformersNamedEntityExtractor,
52+
)
53+
54+
extractor = TransformersNamedEntityExtractor(model="dslim/bert-base-NER")
55+
56+
documents = [
57+
Document(content="My name is Clara and I live in Berkeley, California."),
58+
Document(content="I'm Merlin, the happy pig!"),
59+
Document(content="New York State is home to the Empire State Building."),
60+
]
61+
62+
extractor.run(documents)
63+
print(documents)
64+
```
65+
66+
Here is the example result:
67+
68+
```python
69+
[Document(id=aec840d1b6c85609f4f16c3e222a5a25fd8c4c53bd981a40c1268ab9c72cee10, content: 'My name is Clara and I live in Berkeley, California.', meta: {'named_entities': [NamedEntityAnnotation(entity='PER', start=11, end=16, score=0.99641764), NamedEntityAnnotation(entity='LOC', start=31, end=39, score=0.996198), NamedEntityAnnotation(entity='LOC', start=41, end=51, score=0.9990196)]}),
70+
Document(id=98f1dc5d0ccd9d9950cd191d1076db0f7af40c401dd7608f11c90cb3fc38c0c2, content: 'I'm Merlin, the happy pig!', meta: {'named_entities': [NamedEntityAnnotation(entity='PER', start=4, end=10, score=0.99054915)]}),
71+
Document(id=44948ea0eec018b33aceaaedde4616eb9e93ce075e0090ec1613fc145f84b4a9, content: 'New York State is home to the Empire State Building.', meta: {'named_entities': [NamedEntityAnnotation(entity='LOC', start=0, end=14, score=0.9989541), NamedEntityAnnotation(entity='LOC', start=30, end=51, score=0.95746297)]})]
72+
```
73+
74+
### Get stored annotations
75+
76+
This component includes the `get_stored_annotations` helper class method that allows you to retrieve the annotations stored in a `Document` transparently:
77+
78+
```python
79+
from haystack.dataclasses import Document
80+
from haystack_integrations.components.extractors.transformers import (
81+
TransformersNamedEntityExtractor,
82+
)
83+
84+
extractor = TransformersNamedEntityExtractor(model="dslim/bert-base-NER")
85+
86+
documents = [
87+
Document(content="My name is Clara and I live in Berkeley, California."),
88+
Document(content="I'm Merlin, the happy pig!"),
89+
Document(content="New York State is home to the Empire State Building."),
90+
]
91+
92+
extractor.run(documents)
93+
94+
annotations = [
95+
TransformersNamedEntityExtractor.get_stored_annotations(doc) for doc in documents
96+
]
97+
print(annotations)
98+
99+
# If a Document doesn't contain any annotations, this returns None.
100+
new_doc = Document(content="In one of many possible worlds...")
101+
assert TransformersNamedEntityExtractor.get_stored_annotations(new_doc) is None
102+
```
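
To make the `start` and `end` fields of the annotations above more concrete, here is a minimal sketch that slices the document text with them. It runs without Haystack installed: `Annotation` is a stand-in dataclass with the four fields named in the overview, not the real `NamedEntityAnnotation`, and the values are copied from the example result for the first document. Treating `end` as exclusive is an assumption that happens to match those values.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Stand-in with the four documented fields; not the real NamedEntityAnnotation."""

    entity: str
    start: int
    end: int
    score: float


content = "My name is Clara and I live in Berkeley, California."

# Values copied from the example result for the first document.
annotations = [
    Annotation(entity="PER", start=11, end=16, score=0.99641764),
    Annotation(entity="LOC", start=31, end=39, score=0.996198),
    Annotation(entity="LOC", start=41, end=51, score=0.9990196),
]

# Slice the document text with each span to recover the entity's surface form.
entities = [(a.entity, content[a.start:a.end]) for a in annotations]
print(entities)
```

With the real component, the same slicing would presumably apply to the objects returned by `get_stored_annotations`, using each document's `content`.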

docs-website/docs/pipeline-components/generators/huggingfacelocalchatgenerator.mdx

Lines changed: 6 additions & 0 deletions
@@ -9,6 +9,12 @@ description: "Provides an interface for chat completion using a Hugging Face mod
99

1010
Provides an interface for chat completion using a Hugging Face model that runs locally.
1111

12+
:::warning[Deprecated]
13+
14+
`HuggingFaceLocalChatGenerator` is deprecated and will be removed in Haystack 3.0. It has moved to the `transformers-haystack` package and was renamed to `TransformersChatGenerator`. See [TransformersChatGenerator](transformerschatgenerator.mdx) for the updated documentation.
15+
16+
:::
17+
1218
<div className="key-value-table">
1319

1420
| | |
Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
1+
---
2+
title: "TransformersChatGenerator"
3+
id: transformerschatgenerator
4+
slug: "/transformerschatgenerator"
5+
description: "Provides an interface for chat completion using a Hugging Face model that runs locally."
6+
---
7+
8+
# TransformersChatGenerator
9+
10+
Provides an interface for chat completion using a Hugging Face model that runs locally.
11+
12+
<div className="key-value-table">
13+
14+
| | |
15+
| --- | --- |
16+
| **Most common position in a pipeline** | After a [ChatPromptBuilder](../builders/chatpromptbuilder.mdx) |
17+
| **Mandatory init variables** | None |
18+
| **Mandatory run variables** | `messages`: Either a list of [`ChatMessage`](../../concepts/data-classes/chatmessage.mdx) objects representing the chat, or a plain string |
19+
| **Output variables** | `replies`: A list of [`ChatMessage`](../../concepts/data-classes/chatmessage.mdx) objects generated by the LLM |
20+
| **API reference** | [Transformers](/reference/integrations-transformers) |
21+
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/transformers |
22+
| **Package name** | `transformers-haystack` |
23+
24+
</div>
25+
26+
## Overview
27+
28+
Keep in mind that if LLMs run locally, you may need a powerful machine to run them. This depends strongly on the model you select and its parameter count.
29+
30+
If a string is passed to `messages`, it is converted into a list containing a single `ChatMessage` with the `user` role.
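
The string handling described above can be pictured with a small, dependency-free sketch. `Message` is a stand-in for `ChatMessage` with only a role and text, and `normalize_messages` is a hypothetical helper; neither is the component's actual implementation.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Message:
    """Stand-in for ChatMessage with only a role and text; not the Haystack class."""

    role: str
    text: str


def normalize_messages(messages: Union[str, List[Message]]) -> List[Message]:
    """Hypothetical helper: wrap a plain string as a single user message."""
    if isinstance(messages, str):
        return [Message(role="user", text=messages)]
    # A list is passed through unchanged (copied so the caller's list is not shared).
    return list(messages)


print(normalize_messages("What's Natural Language Processing? Be brief."))
```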
31+
32+
Authentication with a Hugging Face API token is only required to access private or gated models. You can pass the token at initialization with `token`, or set the `HF_API_TOKEN` or `HF_TOKEN` environment variable:
33+
34+
```python
35+
generator = TransformersChatGenerator(
36+
token=Secret.from_token("<your-api-key>"),
37+
)
38+
```
39+
40+
### Streaming
41+
42+
This Generator supports [streaming](guides-to-generators/choosing-the-right-generator.mdx#streaming-support) tokens from the LLM directly to the output. To enable it, pass a function to the `streaming_callback` init parameter.
43+
44+
## Usage
45+
46+
Install the `transformers-haystack` package to use the `TransformersChatGenerator`:
47+
48+
```shell
49+
pip install transformers-haystack
50+
```
51+
52+
### On its own
53+
54+
```python
55+
from haystack_integrations.components.generators.transformers import (
56+
TransformersChatGenerator,
57+
)
58+
from haystack.dataclasses import ChatMessage
59+
60+
generator = TransformersChatGenerator(model="Qwen/Qwen3-0.6B")
61+
messages = [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
62+
print(generator.run(messages))
63+
```
64+
65+
### In a Pipeline
66+
67+
```python
68+
from haystack import Pipeline
69+
from haystack.components.builders import ChatPromptBuilder
70+
from haystack_integrations.components.generators.transformers import (
71+
TransformersChatGenerator,
72+
)
73+
from haystack.dataclasses import ChatMessage
74+
from haystack.utils import Secret
75+
76+
prompt_builder = ChatPromptBuilder()
77+
llm = TransformersChatGenerator(
78+
model="Qwen/Qwen3-0.6B",
79+
token=Secret.from_env_var("HF_API_TOKEN"),
80+
)
81+
82+
pipe = Pipeline()
83+
pipe.add_component("prompt_builder", prompt_builder)
84+
pipe.add_component("llm", llm)
85+
pipe.connect("prompt_builder.prompt", "llm.messages")
86+
location = "Berlin"
87+
messages = [
88+
ChatMessage.from_system(
89+
"Always respond in German even if some input data is in other languages.",
90+
),
91+
ChatMessage.from_user("Tell me about {{location}}"),
92+
]
93+
pipe.run(
94+
data={
95+
"prompt_builder": {
96+
"template_variables": {"location": location},
97+
"template": messages,
98+
},
99+
},
100+
)
101+
```
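
The Streaming section above says to pass a function to `streaming_callback`. As a rough sketch of what such a function might look like, the example below builds a callback that prints each piece of text as it arrives and also collects it. `Chunk` is a stand-in that assumes only a `content` string on the streamed object, and the loop merely simulates a generator calling the callback; no model is loaded.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Chunk:
    """Stand-in for a streamed chunk; only a `content` string is assumed."""

    content: str


def make_collector(sink: List[str]) -> Callable[[Chunk], None]:
    """Build a callback that prints each chunk as it arrives and records it."""

    def on_chunk(chunk: Chunk) -> None:
        print(chunk.content, end="", flush=True)
        sink.append(chunk.content)

    return on_chunk


received: List[str] = []
callback = make_collector(received)

# Simulate the generator invoking the callback once per generated piece of text.
for piece in ["Natural ", "Language ", "Processing"]:
    callback(Chunk(content=piece))
print()

full_text = "".join(received)
```

With the real component, a function of this shape would be passed as `streaming_callback=callback` at initialization.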

docs-website/docs/pipeline-components/readers/extractivereader.mdx

Lines changed: 6 additions & 0 deletions
@@ -9,6 +9,12 @@ description: "Use this component in extractive question answering pipelines base
99

1010
Use this component in extractive question answering pipelines based on a query and a list of documents.
1111

12+
:::warning[Deprecated]
13+
14+
`ExtractiveReader` is deprecated and will be removed in Haystack 3.0. It has moved to the `transformers-haystack` package and was renamed to `TransformersExtractiveReader`. See [TransformersExtractiveReader](transformersextractivereader.mdx) for the updated documentation.
15+
16+
:::
17+
1218
<div className="key-value-table">
1319

1420
| | |
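
The three deprecation notices in this commit each pair an old class name with a new one. A minimal sketch of how a codebase could be checked for the old names follows; `find_deprecated` is a hypothetical helper written for illustration, not part of Haystack, and the sample source string is made up.

```python
import re
from typing import Dict, List, Tuple

# Old name -> new name, as stated in the deprecation notices in this commit.
RENAMED: Dict[str, str] = {
    "NamedEntityExtractor": "TransformersNamedEntityExtractor",
    "HuggingFaceLocalChatGenerator": "TransformersChatGenerator",
    "ExtractiveReader": "TransformersExtractiveReader",
}


def find_deprecated(source: str) -> List[Tuple[int, str, str]]:
    """Hypothetical helper: return (line number, old name, new name) for each hit."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for old, new in RENAMED.items():
            # \b keeps "TransformersExtractiveReader" from matching "ExtractiveReader".
            if re.search(rf"\b{re.escape(old)}\b", line):
                hits.append((lineno, old, new))
    return hits


sample = (
    "from haystack.components.readers import ExtractiveReader\n"
    "reader = ExtractiveReader()\n"
    "ner = TransformersNamedEntityExtractor(model='dslim/bert-base-NER')\n"
)
hits = find_deprecated(sample)
for lineno, old, new in hits:
    print(f"line {lineno}: {old} -> {new}")
```

The word-boundary match matters because every new name contains its old name as a suffix, so a plain substring search would flag already-migrated code.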
