Commit 16124d1

nihalnihalani and claude committed
fix(nodes): propagate strategy metadata to emitted chunk documents
Copy chunk_index, start_char, end_char, and total_chunks from the strategy's per-chunk metadata dict onto each emitted Doc's metadata so downstream nodes can locate chunks within the source document. Update the writeDocuments docstring to list all metadata keys.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
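To illustrate how a downstream node might use these keys, here is a minimal sketch. `ChunkMeta` and `locate_chunks` are hypothetical names that do not appear in the commit, and the sketch assumes that `start_char` and `end_char` are offsets into the original source string with `end_char` exclusive, which the commit does not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChunkMeta:
    # Hypothetical stand-in for the metadata carried by an emitted Doc.
    chunk_index: int
    start_char: int
    end_char: int
    total_chunks: int


def locate_chunks(source: str, metas: List[ChunkMeta]) -> List[str]:
    """Return the source span of each chunk, ordered by chunk_index."""
    if not metas:
        return []
    expected = metas[0].total_chunks
    if len(metas) != expected:
        raise ValueError(f'expected {expected} chunks, got {len(metas)}')
    ordered = sorted(metas, key=lambda m: m.chunk_index)
    # Assumption: end_char is exclusive, matching Python slice semantics.
    return [source[m.start_char:m.end_char] for m in ordered]


source = 'alpha beta gamma'
metas = [
    ChunkMeta(chunk_index=1, start_char=6, end_char=10, total_chunks=3),
    ChunkMeta(chunk_index=0, start_char=0, end_char=5, total_chunks=3),
    ChunkMeta(chunk_index=2, start_char=11, end_char=16, total_chunks=3),
]
spans = locate_chunks(source, metas)
```

Chunks may arrive out of order, so the sketch sorts by `chunk_index` and uses `total_chunks` only as a completeness check.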
1 parent 36ce8f8 commit 16124d1

1 file changed

Lines changed: 10 additions & 3 deletions

nodes/src/nodes/chunker/IInstance.py

@@ -48,9 +48,9 @@ def writeDocuments(self, documents: List[Doc]):
         """
         Chunk each incoming document and emit multiple documents (one per chunk).
 
-        Each emitted document gets metadata with chunk_index, parent_id, and
-        total_chunks so downstream nodes can reconstruct the original document
-        if needed.
+        Each emitted document gets metadata with chunkId, parentId, chunk_index,
+        start_char, end_char, and total_chunks so downstream nodes can
+        reconstruct the original document if needed.
         """
         if self.IGlobal.strategy is None:
             raise RuntimeError('Chunker strategy not initialized')
@@ -85,6 +85,13 @@ def writeDocuments(self, documents: List[Doc]):
                 chunk_doc.metadata.chunkId = self.chunkId
                 chunk_doc.metadata.parentId = parent_id
 
+                # Propagate strategy metadata (chunk_index, start_char, end_char)
+                strategy_meta = chunk_data.get('metadata', {})
+                chunk_doc.metadata.chunk_index = strategy_meta.get('chunk_index', 0)
+                chunk_doc.metadata.start_char = strategy_meta.get('start_char', 0)
+                chunk_doc.metadata.end_char = strategy_meta.get('end_char', 0)
+                chunk_doc.metadata.total_chunks = total_chunks
+
                 self.chunkId += 1
                 output_docs.append(chunk_doc)
 
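The propagation step in the second hunk can be exercised in isolation. The sketch below is not the node's code. `apply_strategy_meta` is a hypothetical helper, `SimpleNamespace` stands in for the Doc metadata object, and the `'text'` key in the sample chunk dicts is an assumption about the strategy's output shape.

```python
from types import SimpleNamespace
from typing import Any, Dict


def apply_strategy_meta(metadata: SimpleNamespace,
                        chunk_data: Dict[str, Any],
                        total_chunks: int) -> None:
    """Copy per-chunk strategy metadata onto a metadata object."""
    # Same lookups as the diff: a missing dict or key falls back to 0.
    strategy_meta = chunk_data.get('metadata', {})
    metadata.chunk_index = strategy_meta.get('chunk_index', 0)
    metadata.start_char = strategy_meta.get('start_char', 0)
    metadata.end_char = strategy_meta.get('end_char', 0)
    # total_chunks comes from the caller, not from the strategy dict.
    metadata.total_chunks = total_chunks


# A chunk whose strategy supplied full metadata.
full = SimpleNamespace()
apply_strategy_meta(
    full,
    {'text': 'beta', 'metadata': {'chunk_index': 1, 'start_char': 6, 'end_char': 10}},
    total_chunks=3,
)

# A chunk with no 'metadata' key at all: every offset defaults to 0.
bare = SimpleNamespace()
apply_strategy_meta(bare, {'text': 'beta'}, total_chunks=3)
```

With these defaults, a chunk whose strategy omitted its metadata gets the same `chunk_index` and `start_char` as a genuine first chunk. A consumer that needs to tell the two apart could treat `end_char == 0` as a sign that offsets were not supplied.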