| title | Document Writers |
|---|---|
| id | document-writers-api |
| description | Writes Documents to a DocumentStore. |
| slug | /document-writers-api |

Writes documents to a DocumentStore.

```python
from haystack import Document
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore

docs = [
    Document(content="Python is a popular programming language"),
]
doc_store = InMemoryDocumentStore()
writer = DocumentWriter(document_store=doc_store)
writer.run(docs)
```

```python
__init__(
    document_store: DocumentStore,
    policy: DuplicatePolicy = DuplicatePolicy.NONE,
) -> None
```

Create a DocumentWriter component.

Parameters:
- document_store (`DocumentStore`) – The instance of the document store where you want to store your documents.
- policy (`DuplicatePolicy`) – The policy to apply when a Document with the same ID already exists in the DocumentStore.
  - `DuplicatePolicy.NONE`: Default policy, relies on the DocumentStore settings.
  - `DuplicatePolicy.SKIP`: Skips documents with the same ID and doesn't write them to the DocumentStore.
  - `DuplicatePolicy.OVERWRITE`: Overwrites documents with the same ID.
  - `DuplicatePolicy.FAIL`: Raises an error if a Document with the same ID is already in the DocumentStore.

```python
to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

Returns:
- `dict[str, Any]` – Dictionary with serialized data.

```python
from_dict(data: dict[str, Any]) -> DocumentWriter
```

Deserializes the component from a dictionary.

Parameters:
- data (`dict[str, Any]`) – The dictionary to deserialize from.

Returns:
- `DocumentWriter` – The deserialized component.

Raises:
- `DeserializationError` – If the document store is not properly specified in the serialization data or its type cannot be imported.

```python
run(
    documents: list[Document], policy: DuplicatePolicy | None = None
) -> dict[str, int]
```

Run the DocumentWriter on the given input data.

Parameters:
- documents (`list[Document]`) – A list of documents to write to the document store.
- policy (`DuplicatePolicy | None`) – The policy to use when encountering duplicate documents.

Returns:
- `dict[str, int]` – Number of documents written to the document store.

Raises:
- `ValueError` – If the specified document store is not found.

```python
run_async(
    documents: list[Document], policy: DuplicatePolicy | None = None
) -> dict[str, int]
```

Asynchronously run the DocumentWriter on the given input data.

This is the asynchronous version of the `run` method. It has the same parameters and return values but can be used with `await` in async code.

Parameters:
- documents (`list[Document]`) – A list of documents to write to the document store.
- policy (`DuplicatePolicy | None`) – The policy to use when encountering duplicate documents.

Returns:
- `dict[str, int]` – Number of documents written to the document store.

Raises:
- `ValueError` – If the specified document store is not found.
- `TypeError` – If the specified document store does not implement `write_documents_async`.