Describe the bug
AnswerBuilder.run() permanently modifies the meta dict of the original input Document objects.
After calling run(), every input document gains a source_index key (and a referenced key when
reference_pattern is set) — even though the caller never asked for that.
Error message
No error is thrown — this is a silent correctness bug.
Expected behavior
doc.meta should remain unchanged after run() returns. The source_index key should only appear
on the copy of the document inside GeneratedAnswer.documents, not on the original input document.
To Reproduce
from haystack import Document
from haystack.components.builders import AnswerBuilder
doc = Document(content="Paris is the capital of France.", meta={"source": "wiki"})
builder = AnswerBuilder()
builder.run(query="Capital of France?", replies=["Paris."], documents=[doc])
print(doc.meta)
# {'source': 'wiki', 'source_index': 1} ← original was mutated
Additional context
Root cause: answer_builder.py line 207:
doc_meta: dict[str, Any] = doc.meta or {}
doc.meta or {} returns the SAME dict object when meta is non-empty (truthy). The next line
doc_meta["source_index"] = idx + 1 then mutates the original doc.meta directly.
dataclasses.replace(doc, meta=doc_meta) was clearly intended to avoid mutation, but doc_meta
is an alias of doc.meta rather than a copy, so the original is already modified before replace() runs.
Fix: doc_meta: dict[str, Any] = dict(doc.meta)
Note: chat_prompt_builder.py has an explicit comment, "use dataclasses.replace to avoid
in-place mutation", but answer_builder.py was missed. A similar bug was fixed in Document.from_dict
via PR #11330.
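The aliasing described above can be reproduced without Haystack installed. The sketch below is only an illustration: Doc is a hypothetical stand-in dataclass for Document, and annotate_aliasing / annotate_copying are made-up names that mirror the reported line and the proposed fix. The copying variant uses dict(doc.meta or {}) instead of dict(doc.meta); the extra "or {}" is an assumption added here in case meta could ever be None, and is not part of the fix proposed above.

```python
from dataclasses import dataclass, field, replace
from typing import Any


# Hypothetical stand-in for haystack's Document, reduced to two fields.
@dataclass
class Doc:
    content: str
    meta: dict[str, Any] = field(default_factory=dict)


def annotate_aliasing(doc: Doc, idx: int) -> Doc:
    # Mirrors the reported line: `or` returns the left operand itself
    # when it is truthy, so doc_meta IS doc.meta for a non-empty dict.
    doc_meta: dict[str, Any] = doc.meta or {}
    doc_meta["source_index"] = idx + 1  # writes into the caller's dict
    return replace(doc, meta=doc_meta)


def annotate_copying(doc: Doc, idx: int) -> Doc:
    # Shallow-copy first, so the write lands on a new dict.
    # The `or {}` guard is an assumption, not part of the proposed fix.
    doc_meta: dict[str, Any] = dict(doc.meta or {})
    doc_meta["source_index"] = idx + 1
    return replace(doc, meta=doc_meta)


original = Doc("Paris is the capital of France.", {"source": "wiki"})
annotate_aliasing(original, 0)
print(original.meta)  # the input now carries source_index

fresh = Doc("Paris is the capital of France.", {"source": "wiki"})
copied = annotate_copying(fresh, 0)
print(fresh.meta)   # the input is unchanged
print(copied.meta)  # only the returned copy carries source_index
```

dict(...) makes a shallow copy, which is enough here because only top-level keys are added; nested values inside meta would still be shared between the input and the copy.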
FAQ Check
System:
- OS: Windows 11
- GPU/CPU:
- Haystack version: main branch
- DocumentStore:
- Reader:
- Retriever: