6 files changed: +10 -34 lines changed
@@ -198,6 +198,11 @@ integration:pgvector:
       - any-glob-to-any-file: "integrations/pgvector/**/*"
       - any-glob-to-any-file: ".github/workflows/pgvector.yml"
 
+integration:presidio:
+  - changed-files:
+      - any-glob-to-any-file: "integrations/presidio/**/*"
+      - any-glob-to-any-file: ".github/workflows/presidio.yml"
+
 integration:pinecone:
   - changed-files:
       - any-glob-to-any-file: "integrations/pinecone/**/*"
@@ -42,6 +42,7 @@
   - "Test / optimum"
   - "Test / paddleocr"
   - "Test / pgvector"
+  - "Test / presidio"
   - "Test / pinecone"
   - "Test / pyversity"
   - "Test / qdrant"
@@ -29,7 +29,7 @@
       fail-fast: false
       matrix:
         os: [ubuntu-latest]
-        python-version: ["3.10", "3.13"]
+        python-version: ["3.10", "3.14"]
 
     steps:
       - uses: actions/checkout@v4
@@ -65,6 +65,7 @@ Please check out our [Contribution Guidelines](CONTRIBUTING.md) for all the deta
 | [opensearch-haystack](integrations/opensearch/) | Document Store | [](https://pypi.org/project/opensearch-haystack) | [](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/opensearch.yml) | [](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-opensearch/htmlcov/index.html) | [](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-opensearch-combined/htmlcov/index.html) |
 | [optimum-haystack](integrations/optimum/) | Embedder | [](https://pypi.org/project/optimum-haystack) | [](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/optimum.yml) | [](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-optimum/htmlcov/index.html) | [](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-optimum-combined/htmlcov/index.html) |
 | [paddleocr-haystack](integrations/paddleocr/) | Converter | [](https://pypi.org/project/paddleocr-haystack) | [](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/paddleocr.yml) | [](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-paddleocr/htmlcov/index.html) | [](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-paddleocr-combined/htmlcov/index.html) |
+| [presidio-haystack](integrations/presidio/) | Preprocessor | [![PyPI - Version](https://img.shields.io/pypi/v/presidio-haystack.svg)](https://pypi.org/project/presidio-haystack) | [![Test / presidio](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/presidio.yml/badge.svg)](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/presidio.yml) | | |
 | [pinecone-haystack](integrations/pinecone/) | Document Store | [](https://pypi.org/project/pinecone-haystack) | [](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/pinecone.yml) | [](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-pinecone/htmlcov/index.html) | [](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-pinecone-combined/htmlcov/index.html) |
 | [pgvector-haystack](integrations/pgvector/) | Document Store | [](https://pypi.org/project/pgvector-haystack) | [](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/pgvector.yml) | [](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-pgvector/htmlcov/index.html) | [](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-pgvector-combined/htmlcov/index.html) |
 | [pyversity-haystack](integrations/pyversity/) | Ranker | [![PyPI - Version](https://img.shields.io/pypi/v/pyversity-haystack.svg)](https://pypi.org/project/pyversity-haystack) | [![Test / pyversity](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/pyversity.yml/badge.svg)](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/pyversity.yml) | [![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/python-coverage-comment-action-data-pyversity/endpoint.json&label=)](https://htmlpreview.github.io/?https://github.com/deepset-ai/haystack-core-integrations/blob/python-coverage-comment-action-data-pyversity/htmlcov/index.html) | |
@@ -3,39 +3,7 @@
 [![PyPI - Version](https://img.shields.io/pypi/v/presidio-haystack.svg)](https://pypi.org/project/presidio-haystack)
 [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/presidio-haystack.svg)](https://pypi.org/project/presidio-haystack)
 
-Haystack integration for [Microsoft Presidio](https://microsoft.github.io/presidio/) — PII detection and anonymization.
-
----
-
-## Installation
-
-```bash
-pip install presidio-haystack
-```
-
-You also need to download the spaCy model used by Presidio:
-
-```bash
-python -m spacy download en_core_web_lg
-```
-
-## Components
-
-- **PresidioDocumentCleaner** — anonymizes PII in `list[Document]`
-- **PresidioTextCleaner** — anonymizes PII in `list[str]` (useful for query sanitization)
-- **PresidioEntityExtractor** — detects PII entities and stores them in Document metadata
-
-## Usage
-
-```python
-from haystack import Document
-from haystack_integrations.components.preprocessors.presidio import PresidioDocumentCleaner
-
-cleaner = PresidioDocumentCleaner()
-result = cleaner.run(documents=[Document(content="My name is John, email: john@example.com")])
-print(result["documents"][0].content)
-# My name is <PERSON>, email: <EMAIL_ADDRESS>
-```
+- [Changelog](https://github.com/deepset-ai/haystack-core-integrations/blob/main/integrations/presidio/CHANGELOG.md)
 
 ---
 
@@ -19,6 +19,7 @@ classifiers = [
   "Programming Language :: Python :: 3.11",
   "Programming Language :: Python :: 3.12",
   "Programming Language :: Python :: 3.13",
+  "Programming Language :: Python :: 3.14",
   "Programming Language :: Python :: Implementation :: CPython",
   "Programming Language :: Python :: Implementation :: PyPy",
 ]