Commit aa3d238

Clean final benchmark documentation
1 parent 586e16d commit aa3d238

4 files changed

Lines changed: 29 additions & 202 deletions

README.md

Lines changed: 13 additions & 112 deletions
@@ -1319,117 +1319,18 @@ MARKDOWN_LD_KB_BENCHMARK_PROFILE=cpu dotnet run --project benchmarks/MarkdownLd.
 
 Benchmark reports are written to `artifacts/benchmarks/results` as Markdown, CSV, and full JSON. The reports are intentionally ignored by git because they depend on the local machine and current system load. PR validation and the dedicated workflow in `.github/workflows/benchmarks.yml` both run the complete BenchmarkDotNet suite and upload the `benchmarkdotnet-results` artifact. The benchmark config adds one `Default` job only when the command does not already pass `--job`, `--job=...`, or `-j`.
 
-The exported BenchmarkDotNet reports include the diagnostic columns that matter for this library:
+The full metric definitions, workload profiles, and current result tables are maintained in [Performance Benchmarks](docs/Features/PerformanceBenchmarks.md). The README keeps only the current headline numbers from the May 3, 2026 local BenchmarkDotNet 0.15.8 run on Apple M2 Pro with .NET 10.0.5:
 
-| Area | Report data | Used for |
-| --- | --- | --- |
-| Latency | `Mean`, `Error`, `StdDev`, `Ratio`, `RatioSD`; full JSON also keeps min, quartiles, max, percentiles, and raw measurements | compare retrieval paths under the same generated workload |
-| Allocation and GC | `Allocated`, `Alloc Ratio`, `Gen0`, `Gen1`, `Gen2` | find APIs that allocate enough to hurt repeated search calls |
-| Threading | `Completed Work Items`, `Lock Contentions` | identify SPARQL and federation paths that schedule work or contend on locks |
-| Repro metadata | runtime, JIT, platform, job, iteration counts, corpus profile, query scenario | keep local runs comparable without pretending they are machine-independent |
-| Optional profiles | EventPipe `cpu`, `gc`, or `jit` artifacts when `MARKDOWN_LD_KB_BENCHMARK_PROFILE` is set | inspect hot methods after a suspicious benchmark result |
-
-Benchmark workload profiles are named by shape instead of using unexplained document-count params:
-
-| Profile | Shape |
+| Area | Current local result |
 | --- | --- |
-| `ShortDocuments` | 250 compact runbook-like Markdown documents |
-| `LongDocuments` | 80 long recovery playbooks with repeated sections |
-| `LargeCorpus` | 1000 compact documents for scale, persistence, and build pressure |
-| `TokenizedMultilingual` | 250 multilingual/CJK/token-heavy documents |
-| `FederatedRunbooks` | 250 SPARQL/service/runbook documents for local federation paths |
-
-Latest local benchmark run, executed on May 3, 2026 with BenchmarkDotNet 0.15.8, .NET 10.0.5, Apple M2 Pro, exported these reports:
-
-| Suite | Job | Benchmarks executed | Export prefix |
-| --- | --- | ---: | --- |
-| Fuzzy edit distance | Default | 8 | `ManagedCode.MarkdownLd.Kb.Benchmarks.FuzzyEditDistanceBenchmarks-report` |
-| Graph build | Default | 4 | `ManagedCode.MarkdownLd.Kb.Benchmarks.GraphBuildBenchmarks-report` |
-| Graph search | Default | 54 | `ManagedCode.MarkdownLd.Kb.Benchmarks.GraphSearchBenchmarks-report` |
-| Tiktoken search | Default | 12 | `ManagedCode.MarkdownLd.Kb.Benchmarks.TiktokenSearchBenchmarks-report` |
-| Graph persistence | Default | 39 | `ManagedCode.MarkdownLd.Kb.Benchmarks.GraphPersistenceBenchmarks-report` |
-| Graph lifecycle | Default | 1 | `ManagedCode.MarkdownLd.Kb.Benchmarks.GraphLifecycleBenchmarks-report` |
-
-The full local pass executed 118 BenchmarkDotNet cases.
-
-Graph build:
-
-| Profile | Mean | StdDev | Allocated |
-| --- | ---: | ---: | ---: |
-| `ShortDocuments` | 9.462 ms | 0.0324 ms | 14.61 MB |
-| `LongDocuments` | 7.509 ms | 0.0127 ms | 14.35 MB |
-| `LargeCorpus` | 45.457 ms | 0.5488 ms | 57.74 MB |
-| `TokenizedMultilingual` | 12.206 ms | 0.2035 ms | 17.77 MB |
-
-Graph search exact-query mean time:
-
-| Profile | Ranked graph | BM25 | BM25 fuzzy | Focused | Schema SPARQL | Local federated |
-| --- | ---: | ---: | ---: | ---: | ---: | ---: |
-| `ShortDocuments` | 1.195 ms | 1.659 ms | 1.979 ms | 2.036 ms | 41.078 ms | 39.410 ms |
-| `LongDocuments` | 0.460 ms | 1.989 ms | 1.984 ms | 0.634 ms | 13.007 ms | 14.030 ms |
-| `FederatedRunbooks` | 1.317 ms | 2.022 ms | 2.041 ms | 2.244 ms | 41.528 ms | 44.219 ms |
-
-Graph search exact-query allocated memory per operation:
-
-| Profile | Ranked graph | BM25 | BM25 fuzzy | Focused | Schema SPARQL | Local federated |
-| --- | ---: | ---: | ---: | ---: | ---: | ---: |
-| `ShortDocuments` | 2.37 MB | 3.07 MB | 3.07 MB | 3.27 MB | 60.33 MB | 62.31 MB |
-| `LongDocuments` | 1.91 MB | 3.46 MB | 3.46 MB | 1.21 MB | 20.22 MB | 22.22 MB |
-| `FederatedRunbooks` | 2.54 MB | 3.52 MB | 3.52 MB | 3.48 MB | 61.10 MB | 62.65 MB |
-
-The `ShortDocuments` exact-query diagnostic slice shows the current hot paths:
-
-| Method | Mean | Allocated | Alloc ratio | Gen0 | Gen1 | Gen2 | Work items | Lock contentions |
-| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
-| Ranked graph | 1.195 ms | 2.37 MB | 1.00x | 296.8750 | 101.5625 | 0 | 0 | 0 |
-| BM25 | 1.659 ms | 3.07 MB | 1.29x | 384.7656 | 142.5781 | 0 | 0 | 0 |
-| BM25 fuzzy | 1.979 ms | 3.07 MB | 1.29x | 375.0000 | 125.0000 | 0 | 0 | 0 |
-| Focused | 2.036 ms | 3.27 MB | 1.38x | 406.2500 | 179.6875 | 0 | 0 | 0 |
-| Schema SPARQL | 41.078 ms | 60.33 MB | 25.43x | 8400.0000 | 1800.0000 | 400.0000 | 551 | 300.6000 |
-| Local federated | 39.410 ms | 62.31 MB | 26.27x | 8600.0000 | 1800.0000 | 400.0000 | 552 | 326.0000 |
-
-Allocation, GC, work-item, and lock-contention columns come directly from BenchmarkDotNet diagnosers. Treat ratios and relative pressure inside the same run as the useful signal; local numbers are diagnostics, not release-grade SLA measurements.
-
-Persistence and export on the `LargeCorpus` profile:
-
-| Method | Mean | StdDev | Allocated |
-| --- | ---: | ---: | ---: |
-| `CreateSnapshot` | 4.494 ms | 0.0045 ms | 5.18 MB |
-| `SerializeTurtle` | 9.249 ms | 0.0436 ms | 18.07 MB |
-| `SerializeJsonLd` | 12.371 ms | 0.0586 ms | 20.31 MB |
-| `ExportMermaidFlowchart` | 5.884 ms | 0.0899 ms | 7.15 MB |
-| `ExportDotGraph` | 6.039 ms | 0.0050 ms | 7.55 MB |
-| `SaveTurtleToFile` | 29.641 ms | 0.1868 ms | 34.74 MB |
-| `SaveJsonLdToFile` | 38.491 ms | 1.5349 ms | 37.02 MB |
-| `LoadTurtleFromFile` | 35.708 ms | 0.8051 ms | 28.10 MB |
-| `LoadJsonLdFromFile` | 90.663 ms | 2.9780 ms | 75.32 MB |
-
-Broad graph lifecycle:
-
-| Method | Mean | StdDev | Allocated | Gen0 | Gen1 | Gen2 | Work items |
-| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
-| `BuildSearchSaveLoadAndExport` | 55.35 ms | 3.571 ms | 54.44 MB | 6750.0000 | 2250.0000 | 750.0000 | 52.0000 |
-
-Tiktoken token-distance search:
-
-| Profile | Query | Exact | Fuzzy-corrected | Exact allocated | Fuzzy allocated |
-| --- | --- | ---: | ---: | ---: | ---: |
-| `LongDocuments` | Exact | 298.1 us | 300.2 us | 212.24 KB | 213.16 KB |
-| `LongDocuments` | Typo | 334.8 us | 391.5 us | 212.88 KB | 216.30 KB |
-| `LongDocuments` | NoMatch | 254.1 us | 257.1 us | 212.19 KB | 213.49 KB |
-| `TokenizedMultilingual` | Exact | 219.8 us | 221.4 us | 139.18 KB | 140.30 KB |
-| `TokenizedMultilingual` | Typo | 245.2 us | 267.6 us | 139.59 KB | 142.20 KB |
-| `TokenizedMultilingual` | NoMatch | 182.7 us | 183.1 us | 138.91 KB | 140.15 KB |
-
-Fuzzy edit-distance mean time:
-
-| Scenario | Bounded bit-vector/banded | Naive Levenshtein | Speedup vs naive | Bounded allocation | Naive allocation |
-| --- | ---: | ---: | ---: | ---: | ---: |
-| Short deletion | 6.726 ns | 94.380 ns | 14.03x | 0 B | 112 B |
-| Short substitution | 33.756 ns | 82.509 ns | 2.44x | 0 B | 112 B |
-| Long insertion | 21.894 ns | 8,244.786 ns | 376.58x | 0 B | 640 B |
-| Long no-match | 53.268 ns | 9,208.866 ns | 172.88x | 0 B | 672 B |
-
-This run reflects the allocation-focused search hot-path pass: BM25 now uses the shared allocation-aware tokenizer, direct scoring loops, and bounded top-N match retention; fuzzy edit distance uses stack-backed bit-vector masks for short residual tokens and pooled rows for the long-token fallback; and Tiktoken search keeps only bounded top-N candidates while TF-IDF weighting updates dictionary values without temporary key arrays.
-
-These numbers are local measurements, not a cross-machine performance contract. The README keeps compact slices only; [Performance Benchmarks](docs/Features/PerformanceBenchmarks.md) and the full Markdown, CSV, and JSON BenchmarkDotNet reports remain the source for detailed diagnostics.
+| Full suite | 118 BenchmarkDotNet cases using the `Default` job |
+| Graph build | `LargeCorpus` builds in 45.457 ms with 57.74 MB allocated |
+| Low-latency search | `ShortDocuments` exact ranked graph search is 1.195 ms / 2.37 MB; BM25 is 1.659 ms / 3.07 MB |
+| Typo-tolerant search | BM25 fuzzy stays opt-in; `ShortDocuments` exact fuzzy search is 1.979 ms / 3.07 MB |
+| RDF query paths | `ShortDocuments` exact schema SPARQL is 41.078 ms / 60.33 MB; local federated schema search is 39.410 ms / 62.31 MB |
+| Tiktoken search | `LongDocuments` exact token-distance search is 298.1 us / 212.24 KB; typo correction is 391.5 us / 216.30 KB |
+| Persistence | `LargeCorpus` Turtle file load is 35.708 ms / 28.10 MB; JSON-LD file load is 90.663 ms / 75.32 MB |
+| Lifecycle | Build/search/save/load/export is 55.35 ms / 54.44 MB |
+| Fuzzy edit distance | Long insertion is 376.58x faster than naive Levenshtein; long no-match is 172.88x faster, both with 0 B allocated |
+
+These numbers are local diagnostics, not a cross-machine performance contract. The full Markdown, CSV, and JSON BenchmarkDotNet reports remain the source for raw measurements.
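
Review note: the headline speedups kept in the new README table can be cross-checked against the means in the removed "Fuzzy edit-distance mean time" table. The following is a minimal Python sketch, not code from the repository; the names `MEANS_NS`, `speedup`, and `SPEEDUPS` are illustrative assumptions, and the inputs are the rounded means printed in the README rather than BenchmarkDotNet's raw measurements.

```python
# Mean times in nanoseconds, copied from the removed README table:
# scenario -> (bounded bit-vector/banded, naive Levenshtein)
MEANS_NS = {
    "Short deletion": (6.726, 94.380),
    "Short substitution": (33.756, 82.509),
    "Long insertion": (21.894, 8244.786),
    "Long no-match": (53.268, 9208.866),
}


def speedup(bounded_ns: float, naive_ns: float) -> float:
    """Return how many times faster the bounded path is than the naive one."""
    return naive_ns / bounded_ns


# Speedup per scenario, rounded to two decimals like the README column.
SPEEDUPS = {
    name: round(speedup(bounded, naive), 2)
    for name, (bounded, naive) in MEANS_NS.items()
}

if __name__ == "__main__":
    for name, ratio in SPEEDUPS.items():
        print(f"{name}: {ratio:.2f}x")
```

Recomputed this way, the long-insertion and long no-match ratios agree with the 376.58x and 172.88x figures retained in the README to two decimal places.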

branch-coverage-improvement.brainstorm.md

Lines changed: 0 additions & 41 deletions
This file was deleted.

branch-coverage-improvement.plan.md

Lines changed: 0 additions & 47 deletions
This file was deleted.
