Add New `SemanticResolver` component

**Is your feature request related to a problem? Please describe.**
When generating your own knowledge graph with LLMs, the `LLMMetadataExtractor` component can be used to encode entities and relationships in the metadata of Haystack `Document` objects. `LLMMetadataExtractor` supports `ChatGenerator`s that can use structured output to constrain the graph ontology. However, when building a knowledge graph it is common not to have a predefined set of possible entities, especially when working with large amounts of data.

This means the generated metadata for `document_one` and `document_two` can look like this:
```python
document_one.meta = {'entities': [{'entity': 'deepset', 'entity_type': 'company'}]}
document_two.meta = {'entities': [{'entity': 'deepset GmbH', 'entity_type': 'company'}]}
``` 
Both entities refer to the same concept, yet they are often ingested into a graph as separate nodes.

**Describe the solution you'd like**
Implement a new `SemanticResolver` component that uses the Text Embeddings Inference (TEI) [similarity endpoint](https://huggingface.github.io/text-embeddings-inference/#/Text%20Embeddings%20Inference/similarity) together with a configurable threshold to decide whether two entities are similar enough to merge, and then merges them.
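
To make the proposal concrete, here is a rough sketch of the resolution logic, not a finished design. It is plain Python rather than a real Haystack component, and everything in it is an assumption for illustration: the names `tei_similarity` and `resolve_entities`, the default URL, the `0.9` threshold, the greedy "first seen name wins" merge strategy, and the restriction to entities of the same `entity_type`. The request body follows my understanding of the TEI `/similarity` schema (`source_sentence` plus `sentences`, returning a list of scores) and should be checked against the linked API docs.

```python
import json
from typing import Callable, Dict, List, Tuple
from urllib import request

# A similarity function takes one source string and a list of candidates
# and returns one score per candidate.
SimilarityFn = Callable[[str, List[str]], List[float]]


def tei_similarity(
    source: str,
    candidates: List[str],
    url: str = "http://localhost:8080/similarity",  # hypothetical local TEI server
) -> List[float]:
    """Score `source` against each candidate via a TEI /similarity endpoint."""
    payload = json.dumps(
        {"inputs": {"source_sentence": source, "sentences": candidates}}
    ).encode("utf-8")
    req = request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())


def resolve_entities(
    entities: List[Dict[str, str]],
    similarity_fn: SimilarityFn = tei_similarity,
    threshold: float = 0.9,  # illustrative default, not a recommendation
) -> Dict[Tuple[str, str], str]:
    """Map each (entity, entity_type) pair to a canonical entity name.

    Greedy strategy (an assumption): the first name seen for a type becomes
    canonical, and later names of the same type are merged into their
    best-scoring canonical name when that score reaches the threshold.
    """
    canonical_by_type: Dict[str, List[str]] = {}
    mapping: Dict[Tuple[str, str], str] = {}

    for ent in entities:
        name, etype = ent["entity"], ent["entity_type"]
        if (name, etype) in mapping:
            continue  # already resolved
        candidates = canonical_by_type.setdefault(etype, [])
        if candidates:
            scores = similarity_fn(name, candidates)
            best = max(range(len(scores)), key=scores.__getitem__)
            if scores[best] >= threshold:
                mapping[(name, etype)] = candidates[best]
                continue
        # No sufficiently similar canonical name: this one becomes canonical.
        candidates.append(name)
        mapping[(name, etype)] = name

    return mapping
```

With the metadata from the example above, the intent is that `("deepset GmbH", "company")` maps to `"deepset"`, provided the embedding model scores the pair at or above the threshold. Passing the similarity function as a parameter would also leave room for other backends, such as the `spaCy` alternative mentioned below.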

**Describe alternatives you've considered**
Using `spaCy` to determine similarity. However, this is much slower than TEI. A separate component that uses `spaCy` could still be created.


