Commit 6cd79a7

MPCoreDeveloper committed: batch readme guidelines
1 parent f7083d9 commit 6cd79a7

2 files changed: +49 -0 lines changed

docs/graphrag/GRAPHRAG_PROPOSAL_ANALYSIS.md

Lines changed: 13 additions & 0 deletions
@@ -221,6 +221,19 @@ if (column.Type == DataType.RowRef)

---

### L1 Storage: Bulk Edge Insert

LLM-based ingestion can generate large bursts of edges. To avoid per-edge WAL/B-Tree overhead,
use the existing batch insert APIs on the edge table:

- `Database.InsertBatch` / `InsertBatchAsync` for SQL-free batch ingestion.
- `ExecuteBatchSQL` for batched INSERT statements.

These paths execute a single storage transaction and apply index updates in bulk, so edge-ingestion
throughput is bounded by serialization rather than per-transaction overhead.
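
For the SQL path, a minimal sketch of how the batched INSERT statements might be assembled. `EdgeSql` and `BuildInsertStatements` are hypothetical helpers, not part of the database API. The sketch assumes the edge table uses the `SourceId`, `TargetId`, `Relationship` columns, that the engine accepts an INSERT with an explicit column list, and that `ExecuteBatchSQL` takes a sequence of SQL strings; none of these is confirmed by this commit. The table name is interpolated unescaped, so it must come from trusted code.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class EdgeSql
{
    // Hypothetical helper: renders one INSERT statement per edge so the
    // whole list can be submitted to the engine as a single batch.
    public static List<string> BuildInsertStatements(
        string table,
        IEnumerable<(long SourceId, long TargetId, string Relationship)> edges)
    {
        return edges
            .Select(e => string.Format(
                CultureInfo.InvariantCulture,
                "INSERT INTO {0} (SourceId, TargetId, Relationship) VALUES ({1}, {2}, '{3}')",
                table,
                e.SourceId,
                e.TargetId,
                // Double any single quote so the string literal stays well-formed.
                e.Relationship.Replace("'", "''")))
            .ToList();
    }
}
```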

---

### Phase 2: Graph Traversal Executor (3-4 weeks)

**Goal:** Execute queries like: `SELECT * FROM articles WHERE article_id IN (graph_traverse(start_id, 'references', 2))`

docs/graphrag/README.md

Lines changed: 36 additions & 0 deletions
@@ -70,6 +70,42 @@ SELECT GRAPH_TRAVERSE(1, 'nextId', 5, 1) -- DFS from node 1

---

## Bulk Edge Ingestion

GraphRAG ingestion workloads (LLM extraction) should batch edges to avoid per-edge WAL/B-Tree overhead.
Use the existing batch insert APIs on the edge table so the storage engine performs a single transaction
and index update sequence.

### Recommended API

```csharp
// Edge table schema: (SourceId, TargetId, Relationship)
var edges = new List<Dictionary<string, object>>
{
    new()
    {
        ["SourceId"] = 1L,
        ["TargetId"] = 2L,
        ["Relationship"] = "calls"
    },
    new()
    {
        ["SourceId"] = 1L,
        ["TargetId"] = 3L,
        ["Relationship"] = "uses"
    }
};

database.InsertBatch("GraphEdges", edges);
```

### Notes
- `InsertBatch` and `InsertBatchAsync` execute a single engine transaction.
- For SQL pipelines, prefer `ExecuteBatchSQL` with batched INSERT statements.
- Follow bulk inserts with `Flush()` and `ForceSave()` when persistence is required.
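
When an extraction run yields more edges than is comfortable to hold in one list, one option is to split the stream into fixed-size row batches and submit each batch separately. The sketch below shows only the batching step. `EdgeBatcher`, `ToBatches`, `extractedEdges`, and the batch size of 1000 are illustrative assumptions, not part of the database API, and `Enumerable.Chunk` requires .NET 6 or later. Each batch would then be its own engine transaction.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EdgeBatcher
{
    // Hypothetical helper: groups edges into lists of at most batchSize rows,
    // each row keyed by the edge table's column names.
    public static IEnumerable<List<Dictionary<string, object>>> ToBatches(
        IEnumerable<(long SourceId, long TargetId, string Relationship)> edges,
        int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        // Enumerable.Chunk yields arrays of at most batchSize items.
        return edges.Chunk(batchSize).Select(chunk => chunk
            .Select(e => new Dictionary<string, object>
            {
                ["SourceId"] = e.SourceId,
                ["TargetId"] = e.TargetId,
                ["Relationship"] = e.Relationship
            })
            .ToList());
    }
}

// Usage sketch, with `database` as in the Recommended API example and a
// hypothetical `extractedEdges` sequence:
//
// foreach (var batch in EdgeBatcher.ToBatches(extractedEdges, 1000))
//     database.InsertBatch("GraphEdges", batch);
```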

---

## Features

### ✅ Traversal Algorithms
