Skip to content

Commit da8953e

Browse files
committed
Refs: #73 Enhance performance and indexing for MongoDB persistence
- Added performance notes for per-stream read queries in Changelog.md - Introduced explain audit script for validating index usage in MongoDB - Implemented tests to ensure expected indexes are used for various queries - Added MongoDB snapshot indexes for improved snapshot retrieval - Refactored MongoPersistenceEngine to optimize query filters and sorting
1 parent d23021f commit da8953e

9 files changed

Lines changed: 784 additions & 29 deletions

File tree

Changelog.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@
44

55
- Added explicit support for net8.0, net9.0, net10.0.
66
- Updated NEventStore to 10.2.0
7+
- Performance: per-stream read queries require in-memory sort (CheckpointNumber not in index) [#73](https://github.com/NEventStore/NEventStore.Persistence.MongoDB/issues/73)
78

89
### Breaking Changes
910

docs/Performance-Investigation.md

Lines changed: 19 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -36,17 +36,8 @@ dotnet .\src\NEventStore.Persistence.MongoDB.Benchmark\bin\Release\net10.0\NEven
3636

3737
```powershell
3838
[Environment]::SetEnvironmentVariable('NEventStore.MongoDB', 'mongodb://localhost:50002/NEventStore', 'Process')
39-
dotnet build .\src\NEventStore.Persistence.MongoDB.Benchmark\NEventStore.Persistence.MongoDB.Benchmark.csproj -c Release
40-
dotnet .\src\NEventStore.Persistence.MongoDB.Benchmark\bin\Release\net8.0\NEventStore.Persistence.MongoDB.Benchmark.dll
4139
```
4240

43-
### Result
44-
45-
- BenchmarkDotNet now discovers `14` benchmark cases from one executable.
46-
- The benchmark entrypoint uses `BenchmarkSwitcher` and supports `--filter` routing.
47-
- Serializer registration is aligned with acceptance tests for `CSharpLegacy` GUID handling.
48-
- Baseline reports are now standardized on `net10.0` host runtime across all benchmark classes.
49-
5041
### Relevant evidence in the codebase
5142

5243
- Acceptance tests register the required MongoDB serializers in `src/NEventStore.Persistence.MongoDB.Tests/AcceptanceTestMongoPersistenceFactory.cs`.
@@ -55,8 +46,6 @@ dotnet .\src\NEventStore.Persistence.MongoDB.Benchmark\bin\Release\net8.0\NEvent
5546

5647
## Benchmark Suite Gaps
5748

58-
The benchmark suite now covers all previously identified harness-level gaps.
59-
6049
### Structural gaps
6150

6251
- No structural benchmark-entrypoint gaps remain for sync/async coverage currently implemented.
@@ -270,6 +259,23 @@ To avoid committing raw benchmark artifacts, the key baseline values are summari
270259
| Recycle-bin read slice | `ReadDeletedCommitsFromRecycleBinBucket` (`CommitsPerStream=1000`, `DeletedStreams=5`, `ActiveStreams=1`) | 108.028 ms |
271260
| Duplicate conflict slice | `DuplicateCommitIdPath` (`Iterations=100`) | 683.60 ms |
272261

262+
## Explain Audit Workflow
263+
264+
Use the explain audit script to validate the index usage of the persistence engine query shapes against a local MongoDB container:
265+
266+
```powershell
267+
.\scripts\explain-persistence-engine.ps1
268+
```
269+
270+
The script seeds a scratch database, recreates the same indexes defined by the engine, and runs `explain("executionStats")` for the main `Find` and equivalent delete/update filter shapes in `MongoPersistenceEngine`.
271+
272+
Current findings from the issue [#73](https://github.com/NEventStore/NEventStore.Persistence.MongoDB/issues/73) follow-up:
273+
274+
- The changed query shapes for stream reads, snapshot reads, snapshot deletes, and bucket-scoped stream-head reads all use the intended indexes.
275+
- Bucket checkpoint scans and duplicate-commit lookups also use the expected indexes.
276+
- The legacy date-based bucket reads and all-buckets checkpoint scans remain less ideal query shapes and should be reviewed separately if they become performance-sensitive.
277+
- Decision: do not add a new `CommitStamp`-oriented compound index for the obsolete `GetFrom(bucketId, DateTime)` and `GetFromTo(bucketId, DateTime, DateTime)` APIs. They are sync-only compatibility methods on an upstream obsolete contract, they are already documented for removal, and the preferred checkpoint-based APIs are the supported optimization target.
278+
273279
### After Snapshot Template
274280

275281
After implementing optimizations, run the same net10 baseline profile and fill this table using the same method/parameter rows selected from the "before" reports.
@@ -279,8 +285,8 @@ After implementing optimizations, run the same net10 baseline profile and fill t
279285
| Checkpoint generator write path (Always, 1000 commits) | 2,908.6 ms | | |
280286
| Global read (bucket-qualified, 1000/3) | 22.142 ms | | |
281287
| Global read (all buckets, 1000/3) | 67.011 ms | | |
282-
| Per-stream full read (10000 commits) | 148.004 ms | | |
283-
| Per-stream revision-window read (10000, window 1000) | 17.207 ms | | |
288+
| Per-stream full read (10000 commits) | 148.004 ms | 120.660 ms | -18.5% |
289+
| Per-stream revision-window read (10000, window 1000) | 17.207 ms | 14.732 ms | -14.4% |
284290
| Write path (sync, 10000 commits) | 10,489.8 ms | | |
285291
| Write path (async, 10000 commits) | 10,705.0 ms | | |
286292
| Global read (async, 10000 commits) | 120.443 ms | | |
Lines changed: 227 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,227 @@
1+
param(
2+
[string]$ContainerName = "nesci-mongo-1",
3+
[string]$DatabaseName = "issue73_explain",
4+
[switch]$KeepDatabase
5+
)
6+
7+
$dropDatabaseStatement = if ($KeepDatabase) {
8+
""
9+
}
10+
else {
11+
"db.dropDatabase();"
12+
}
13+
14+
$script = @'
15+
const dbName = "__DATABASE_NAME__";
16+
const db = db.getSiblingDB(dbName);
17+
__DROP_DATABASE__
18+
19+
const commits = db.getCollection("Commits");
20+
const streamHeads = db.getCollection("Streams");
21+
const snapshots = db.getCollection("Snapshots");
22+
23+
commits.createIndex({ BucketId: 1, _id: 1 }, { name: "GetFrom_Checkpoint_Index", unique: true });
24+
commits.createIndex({ BucketId: 1, StreamId: 1, StreamRevisionFrom: 1, StreamRevisionTo: 1 }, { name: "GetFrom_Index", unique: true });
25+
commits.createIndex({ BucketId: 1, StreamId: 1, CommitSequence: 1 }, { name: "LogicalKey_Index", unique: true });
26+
commits.createIndex({ CommitStamp: 1 }, { name: "CommitStamp_Index" });
27+
commits.createIndex({ BucketId: 1, StreamId: 1, CommitId: 1 }, { name: "CommitId_Index", unique: true });
28+
29+
snapshots.createIndex({ "_id.BucketId": 1, "_id.StreamId": 1, "_id.StreamRevision": -1 }, { name: "BucketStreamRevision_Index" });
30+
streamHeads.createIndex({ Unsnapshotted: 1 }, { name: "Unsnapshotted_Index" });
31+
streamHeads.createIndex({ "_id.BucketId": 1, Unsnapshotted: -1 }, { name: "BucketUnsnapshotted_Index" });
32+
33+
for (let i = 0; i < 20; i++) {
34+
const bucketId = i < 16 ? "default" : (i < 18 ? "other" : ":rb");
35+
const streamId = i < 8 ? "stream-1" : `stream-${i}`;
36+
const baseRevision = (i * 2) + 1;
37+
commits.insertOne({
38+
_id: i + 1,
39+
BucketId: bucketId,
40+
StreamId: streamId,
41+
StreamRevisionFrom: baseRevision,
42+
StreamRevisionTo: baseRevision + 1,
43+
CommitSequence: i + 1,
44+
CommitId: `commit-${i + 1}`,
45+
CommitStamp: new Date(Date.now() + i * 1000),
46+
Headers: {},
47+
Events: []
48+
});
49+
}
50+
51+
streamHeads.insertMany([
52+
{ _id: { BucketId: "default", StreamId: "stream-1" }, HeadRevision: 8, SnapshotRevision: 2, Unsnapshotted: 6 },
53+
{ _id: { BucketId: "default", StreamId: "stream-2" }, HeadRevision: 4, SnapshotRevision: 0, Unsnapshotted: 4 },
54+
{ _id: { BucketId: "other", StreamId: "stream-1" }, HeadRevision: 9, SnapshotRevision: 0, Unsnapshotted: 9 }
55+
]);
56+
57+
snapshots.insertMany([
58+
{ _id: { BucketId: "default", StreamId: "stream-1", StreamRevision: 1 }, Payload: "s1-r1" },
59+
{ _id: { BucketId: "default", StreamId: "stream-1", StreamRevision: 3 }, Payload: "s1-r3" },
60+
{ _id: { BucketId: "default", StreamId: "stream-1", StreamRevision: 5 }, Payload: "s1-r5" },
61+
{ _id: { BucketId: "default", StreamId: "stream-2", StreamRevision: 2 }, Payload: "s2-r2" },
62+
{ _id: { BucketId: "other", StreamId: "stream-1", StreamRevision: 4 }, Payload: "other-r4" }
63+
]);
64+
65+
function summarizePlan(explain) {
66+
const stages = [];
67+
const indexNames = [];
68+
69+
function walk(node) {
70+
if (!node || typeof node !== "object") return;
71+
if (node.stage) stages.push(node.stage);
72+
if (node.indexName) indexNames.push(node.indexName);
73+
for (const value of Object.values(node)) {
74+
if (Array.isArray(value)) {
75+
for (const item of value) walk(item);
76+
} else if (value && typeof value === "object") {
77+
walk(value);
78+
}
79+
}
80+
}
81+
82+
walk(explain.queryPlanner?.winningPlan);
83+
return {
84+
winningStages: [...new Set(stages)],
85+
indexes: [...new Set(indexNames)],
86+
totalKeysExamined: explain.executionStats?.totalKeysExamined,
87+
totalDocsExamined: explain.executionStats?.totalDocsExamined,
88+
nReturned: explain.executionStats?.nReturned
89+
};
90+
}
91+
92+
const results = {
93+
commitRangeRead: summarizePlan(
94+
commits.find({
95+
BucketId: "default",
96+
StreamId: "stream-1",
97+
StreamRevisionTo: { $gte: 3 },
98+
StreamRevisionFrom: { $lte: 6 }
99+
}).sort({ StreamRevisionFrom: 1 }).explain("executionStats")
100+
),
101+
bucketDateRead: summarizePlan(
102+
commits.find({
103+
BucketId: "default",
104+
CommitStamp: { $gte: new Date(Date.now() - 1000) }
105+
}).sort({ _id: 1 }).explain("executionStats")
106+
),
107+
bucketDateRangeRead: summarizePlan(
108+
commits.find({
109+
BucketId: "default",
110+
CommitStamp: {
111+
$gte: new Date(Date.now() - 1000),
112+
$lt: new Date(Date.now() + 60000)
113+
}
114+
}).sort({ _id: 1 }).explain("executionStats")
115+
),
116+
bucketCheckpointRead: summarizePlan(
117+
commits.find({
118+
BucketId: "default",
119+
_id: { $gt: 3 }
120+
}).sort({ _id: 1 }).explain("executionStats")
121+
),
122+
bucketCheckpointRangeRead: summarizePlan(
123+
commits.find({
124+
BucketId: "default",
125+
_id: { $gt: 3, $lte: 12 }
126+
}).sort({ _id: 1 }).explain("executionStats")
127+
),
128+
allBucketsCheckpointRead: summarizePlan(
129+
commits.find({
130+
BucketId: { $ne: ":rb" },
131+
_id: { $gt: 3 }
132+
}).sort({ _id: 1 }).explain("executionStats")
133+
),
134+
allBucketsCheckpointRangeRead: summarizePlan(
135+
commits.find({
136+
BucketId: { $ne: ":rb" },
137+
_id: { $gt: 3, $lte: 12 }
138+
}).sort({ _id: 1 }).explain("executionStats")
139+
),
140+
duplicateCommitLookup: summarizePlan(
141+
commits.find({
142+
BucketId: "default",
143+
StreamId: "stream-1",
144+
CommitId: "commit-1"
145+
}).explain("executionStats")
146+
),
147+
streamsToSnapshot: summarizePlan(
148+
streamHeads.find({
149+
"_id.BucketId": "default",
150+
Unsnapshotted: { $gte: 0 }
151+
}).sort({ Unsnapshotted: -1 }).explain("executionStats")
152+
),
153+
getSnapshot: summarizePlan(
154+
snapshots.find({
155+
"_id.BucketId": "default",
156+
"_id.StreamId": "stream-1",
157+
"_id.StreamRevision": { $lte: 6 }
158+
}).sort({ "_id.StreamRevision": -1 }).limit(1).explain("executionStats")
159+
),
160+
addSnapshotById: summarizePlan(
161+
snapshots.find({
162+
_id: { BucketId: "default", StreamId: "stream-1", StreamRevision: 3 }
163+
}).limit(1).explain("executionStats")
164+
),
165+
addSnapshotStreamHeadLookup: summarizePlan(
166+
streamHeads.find({
167+
_id: { BucketId: "default", StreamId: "stream-1" }
168+
}).limit(1).explain("executionStats")
169+
),
170+
purgeBucketCommitsFilter: summarizePlan(
171+
commits.find({
172+
BucketId: "default"
173+
}).explain("executionStats")
174+
),
175+
purgeBucketSnapshotsFilter: summarizePlan(
176+
snapshots.find({
177+
"_id.BucketId": "default"
178+
}).explain("executionStats")
179+
),
180+
purgeBucketStreamHeadsFilter: summarizePlan(
181+
streamHeads.find({
182+
"_id.BucketId": "default"
183+
}).explain("executionStats")
184+
),
185+
deleteStreamHeadById: summarizePlan(
186+
streamHeads.find({
187+
_id: { BucketId: "default", StreamId: "stream-1" }
188+
}).explain("executionStats")
189+
),
190+
deleteStreamSnapshotsFilter: summarizePlan(
191+
snapshots.find({
192+
"_id.BucketId": "default",
193+
"_id.StreamId": "stream-1"
194+
}).explain("executionStats")
195+
),
196+
deleteStreamCommitsFilter: summarizePlan(
197+
commits.find({
198+
BucketId: "default",
199+
StreamId: "stream-1"
200+
}).explain("executionStats")
201+
),
202+
updateStreamHeadById: summarizePlan(
203+
streamHeads.find({
204+
_id: { BucketId: "default", StreamId: "stream-1" }
205+
}).explain("executionStats")
206+
),
207+
lastCommittedCheckpoint: summarizePlan(
208+
commits.find({}).sort({ _id: -1 }).limit(1).explain("executionStats")
209+
),
210+
emptyRecycleBin: summarizePlan(
211+
commits.find({
212+
BucketId: ":rb",
213+
_id: { $lt: 20 }
214+
}).explain("executionStats")
215+
),
216+
getDeletedCommits: summarizePlan(
217+
commits.find({
218+
BucketId: ":rb"
219+
}).sort({ _id: 1 }).explain("executionStats")
220+
)
221+
};
222+
223+
print(JSON.stringify(results, null, 2));
224+
'@
225+
226+
$script = $script.Replace('__DATABASE_NAME__', $DatabaseName).Replace('__DROP_DATABASE__', $dropDatabaseStatement)
227+
$script | rtk docker exec -i $ContainerName mongosh --quiet

0 commit comments

Comments
 (0)