Qdrant With all-MiniLM-L6-v2

This might be a stupid question. I have set `embedding_dim` to 384 to match the output dimension of the embedder `all-MiniLM-L6-v2`:

``` python
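from haystack import Pipeline
from haystack.components.converters import MarkdownToDocument
from haystack.components.embedders import SentenceTransformersDocumentEmbedder
from haystack.components.preprocessors import DocumentCleaner
from haystack.components.routers import FileTypeRouter
from haystack.components.writers import DocumentWriter
from haystack.utils import ComponentDevice
from haystack_integrations.document_stores.qdrant import QdrantDocumentStore

# `config` is loaded elsewhere; persist_path points at the local Qdrant storage directory
document_store = QdrantDocumentStore(
    path=config["document_store"]["persist_path"],
    recreate_index=True,
    return_embedding=True,
    wait_result_from_api=True,
    embedding_dim=384,  # output dimension of all-MiniLM-L6-v2
)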

# Define the filetype router (this will route the files to their appropriate converter)
# In this example, we only allow markdown files.
file_type_router = FileTypeRouter(mime_types=["text/markdown"])


# Define the converter used for .md -> Document
markdown_converter = MarkdownToDocument()


# Define the document cleaner, which removes extraneous material (empty lines, extra whitespace, etc.)
# You can change this behaviour by passing different parameters into DocumentCleaner()
document_cleaner = DocumentCleaner()

# Define the embedder. This is where the cleaned documents are converted to vectors/embeddings.
# These vectors will then be searched against when we submit our query, to find the most relevant pieces of text.
document_embedder = SentenceTransformersDocumentEmbedder(
    model="all-MiniLM-L6-v2",
    device=ComponentDevice.from_str("cuda"),
    local_files_only=True,
)
# GPU usage can be monitored with: nvidia-smi -l
# Define the document writer, this will actually write the vectors to the DB
document_writer = DocumentWriter(document_store)

# This is where the pipeline is actually created
# First we add the router, then the converter, cleaner, embedder, and finally the writer.
# Adding the components...
preprocessing_pipeline = Pipeline()
preprocessing_pipeline.add_component(instance=file_type_router, name="file_type_router")
preprocessing_pipeline.add_component(instance=markdown_converter, name="markdown_converter")
preprocessing_pipeline.add_component(instance=document_cleaner, name="document_cleaner")
preprocessing_pipeline.add_component(instance=document_embedder, name="document_embedder")
preprocessing_pipeline.add_component(instance=document_writer, name="document_writer")

# Connecting the components...
preprocessing_pipeline.connect("file_type_router.text/markdown", "markdown_converter.sources")
preprocessing_pipeline.connect("markdown_converter", "document_cleaner")
preprocessing_pipeline.connect("document_cleaner", "document_embedder")
preprocessing_pipeline.connect("document_embedder", "document_writer")
```

But I get the following error. I am pretty sure I am doing something stupid. Any help would be appreciated.

```
haystack.core.errors.PipelineRuntimeError: The following component failed to run:
Component name: 'document_writer'
Component type: 'DocumentWriter'
Error: could not broadcast input array from shape (384,) into shape (768,)
```
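
To rule out the embedder side, a check along these lines could confirm what length the vectors actually have before they reach the writer. This is only a sketch: `find_dim_mismatches`, `EXPECTED_DIM`, and the stand-in vectors are hypothetical names for illustration, not part of Haystack or Qdrant. In the real pipeline the lists would come from each document's `embedding`.

```python
from typing import List, Sequence, Tuple

# Assumed value: the same number passed as embedding_dim to the document store
EXPECTED_DIM = 384


def find_dim_mismatches(
    embeddings: Sequence[Sequence[float]], expected_dim: int
) -> List[Tuple[int, int]]:
    """Return (index, actual_length) for every vector whose length differs from expected_dim."""
    return [
        (index, len(vector))
        for index, vector in enumerate(embeddings)
        if len(vector) != expected_dim
    ]


# Stand-in vectors for illustration; the second one has the wrong size on purpose
fake_embeddings = [[0.0] * 384, [0.0] * 768, [0.0] * 384]
mismatches = find_dim_mismatches(fake_embeddings, EXPECTED_DIM)
print(mismatches)  # [(1, 768)]
```

If a check like this comes back empty for the real embeddings, the 768 in the error would have to come from somewhere other than the embedder, for example from what is already stored at `persist_path`, though that is only a guess.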

Qdrant With all-MiniLM-L6-v2 #2066
