Experiment 002b: insert_list_size Parameter Sweep

Date: 2026-02-11
Engineer: [Fill in your name]
Status: In Progress
Git Commit: 1398e9a

Hypothesis

There exists an optimal insert_list_size value at which recall plateaus: increasing it beyond this point wastes build time without improving recall.

Reasoning:

insert_list_size controls the size of the candidate pool during graph construction
Higher values = more exploration = better graph connectivity = higher recall
But returns diminish: recall plateaus once the graph is "well-connected enough"
libSQL uses 75; we currently use 100 (from Exp 001)
Hypothesis: Recall plateaus around 100-150 for 50k vectors @ 256D
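
To make the role of the candidate pool concrete, here is a minimal sketch of a greedy graph search with a bounded candidate list, in the general style of DiskANN-like indexes. It is an illustration under assumptions, not this project's implementation: the name `greedy_search`, the adjacency-dict graph layout, and the use of squared Euclidean distance are all hypothetical choices made for brevity.

```python
def sq_dist(a, b):
    """Squared Euclidean distance (assumed metric for this sketch)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def greedy_search(graph, vectors, entry, query, list_size):
    """Return (visited node ids, final candidate ids sorted closest first).

    graph: dict node_id -> list of neighbor ids (hypothetical layout)
    list_size: maximum candidates kept, playing the role of insert_list_size
    """
    candidates = [(sq_dist(vectors[entry], query), entry)]
    visited = set()
    while True:
        # Expand the closest candidate that has not been expanded yet.
        nxt = next((n for _, n in candidates if n not in visited), None)
        if nxt is None:
            break
        visited.add(nxt)
        known = {n for _, n in candidates}
        for nb in graph[nxt]:
            if nb not in known:
                candidates.append((sq_dist(vectors[nb], query), nb))
                known.add(nb)
        # Truncate the pool: a larger list_size keeps more candidates alive.
        candidates = sorted(candidates)[:list_size]
    return visited, [n for _, n in candidates]

# Toy 1-D example: five points on a line, each linked to its neighbors.
vectors = {i: [float(i)] for i in range(5)}
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
visited, pool = greedy_search(graph, vectors, entry=0, query=[4.0], list_size=2)
```

In a sketch like this, a larger `list_size` means more nodes are expanded per insert, which is the intuition behind both the expected recall gain and the expected build-time cost.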

Motivation

Problem: Build time is assumed to be directly proportional to insert_list_size. We need to find the minimum value that achieves the target recall (≥95%).

Why now: Exp 001 showed only a 2% build time improvement from 200→100, suggesting we may already be near optimal. We need to validate this across the full parameter range.

Success criteria:

Identify plateau point where recall stops improving
Validate insert_list_size=100 is optimal (or find better default)
Document recall vs build time tradeoff curve

Test Setup

Parameters Under Test

Parameter	Baseline	Test Values	Range Rationale
insert_list_size	100	[50, 75, 100, 150, 200]	libSQL=75, old default=200
dimensions	256	(fixed)	Representative
max_neighbors	32	(fixed, may change after Exp 003)	Current default
search_list	100	(fixed)	Consistent with insert

Dataset

Size: 50,000 vectors
Dimensions: 256
Metric: Cosine
Source: Synthetic (random, seed=42)
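
For readers who want to reproduce a comparable dataset outside the benchmark suite, the following is a rough sketch of seeded random vector generation and cosine similarity using only the standard library. The Gaussian distribution and the helper names `make_vectors` and `cosine_similarity` are assumptions; the generator actually used by the benchmark may differ.

```python
import math
import random

def make_vectors(count, dims, seed=42):
    """Generate `count` random vectors of length `dims`, reproducibly."""
    rng = random.Random(seed)
    # Gaussian components are an assumption, not taken from the benchmark.
    return [[rng.gauss(0.0, 1.0) for _ in range(dims)] for _ in range(count)]

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Small sample for illustration; the experiment uses 50,000 vectors at 256D.
sample = make_vectors(count=100, dims=256, seed=42)
```

Fixing the seed keeps the dataset identical across the five builds, so differences in recall can be attributed to insert_list_size rather than to the data.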

Hardware

CPU: AMD Ryzen 9 5950X (16 cores, 32 threads)
RAM: 62 GB
Disk: NVMe SSD (912GB capacity, 38% used)
OS: Ubuntu 24.04, Linux 6.17.0-14-generic

Comparison Baseline

Control: insert_list_size=100 (current default)
Baseline: 25k vectors @ insert_list=100 (from Exp 001):
- Build time: 432s
- Recall@10: 99.2%
- QPS: 82
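
As a reminder of what the Recall@10 figure measures, here is a small sketch of the usual definition for a single query. It assumes the standard interpretation (overlap between the returned top-k and the exact top-k); the benchmark suite may compute or average it differently, and the function name `recall_at_k` is hypothetical.

```python
def recall_at_k(returned_ids, true_ids, k=10):
    """Fraction of the true top-k neighbors present in the returned top-k.

    Assumes true_ids is non-empty and both lists are ordered best-first.
    """
    truth = set(true_ids[:k])
    hits = sum(1 for item in returned_ids[:k] if item in truth)
    return hits / len(truth)

# Nine of the ten true neighbors were returned.
exact = list(range(1, 11))
approx = [1, 2, 3, 4, 5, 6, 7, 8, 9, 11]
score = recall_at_k(approx, exact)
```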

Benchmark Profile

benchmarks/profiles/param-sweep-insert-list.json

Expected Results

insert_list	Build Time (s)	Recall@10 (%)	QPS	Notes
50	440 (−50%)	95-96%	90	Too low? Graph may fragment
75	660 (−25%)	98%	87	libSQL default, should be good
100 (base)	880	99%	85	Current default
150	1320 (+50%)	99.2%	82	Diminishing returns start
200	1760 (+100%)	99.5%	80	Plateau - marginal improvement

Key prediction: Recall plateaus between 100 and 150. The optimal value is likely in the 75-100 range.
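
The build-time column in the table above follows from a simple linear model anchored at the insert_list=100 prediction. The sketch below reproduces that arithmetic; the 880 s figure is a prediction from the table, not a measurement, and the function names are illustrative.

```python
BASE_INSERT_LIST = 100
BASE_BUILD_SECONDS = 880  # predicted build time at insert_list=100

def expected_build_seconds(insert_list):
    """Linear model: build time scales in proportion to insert_list_size."""
    return BASE_BUILD_SECONDS * insert_list / BASE_INSERT_LIST

def percent_vs_base(insert_list):
    """Signed percentage change in build time relative to the base value."""
    return 100.0 * (insert_list - BASE_INSERT_LIST) / BASE_INSERT_LIST

for value in [50, 75, 100, 150, 200]:
    print(value, expected_build_seconds(value), f"{percent_vs_base(value):+.0f}%")
```

If measured build times deviate strongly from this line, the proportionality assumption itself is what the experiment has refuted.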

Risks:

insert_list=50 may be too aggressive, causing recall <95%
I/O contention from parallel experiments may affect build times
Cache (from Exp 001) may mask build time differences

Execution

Commands Run

cd /home/mrm/src/sqlite-diskann-experiments/exp002b-insert-list
cd benchmarks
rm -rf datasets/synthetic/*.db
npm install --ignore-scripts  # Done already
# Fix symlink: Already done
npm run prepare
date && npm run bench -- --profile=profiles/param-sweep-insert-list.json 2>&1 | \
  tee ../experiments/experiment-002b-output.txt && date

Timeline

Start: [Fill in timestamp]
End: [Fill in timestamp]
Duration: [Expected: 25-35 minutes]

Actual Results

Raw Data

See experiments/experiment-002b-output.txt for full benchmark output.

[Paste results table from benchmark]

Key Metrics

insert_list	Build Time (s)	Recall@10 (%)	QPS	Δ from Expected
50	[X]	[X]%	[X]	[±N%]
75	[X]	[X]%	[X]	[±N%]
100 (base)	[X]	[X]%	[X]	[±N%]
150	[X]	[X]%	[X]	[±N%]
200	[X]	[X]%	[X]	[±N%]
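
The "Δ from Expected" column can be filled in with a signed percentage difference. A minimal sketch follows; the measured value in the example is hypothetical and serves only to show the arithmetic.

```python
def percent_delta(actual, expected):
    """Signed percentage difference of a measurement from its prediction."""
    return 100.0 * (actual - expected) / expected

# Hypothetical example: a measured 924 s against the predicted 880 s.
delta = percent_delta(924, 880)
print(f"{delta:+.1f}%")
```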

Recall Plateau Analysis

[Plot or describe where recall stops improving significantly]

Plateau point: insert_list=[X] (recall stops improving beyond this)
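
One way to make "stops improving" precise is to pick the smallest insert_list after which no step gains more than a chosen threshold. The sketch below applies that rule to the predicted recall values from Expected Results; the 0.5 percentage-point threshold and the function name `plateau_point` are assumptions, and 95.5 stands in for the predicted 95-96% range.

```python
def plateau_point(points, min_gain=0.5):
    """Return the smallest insert_list after which no step gains >= min_gain.

    points: list of (insert_list, recall_percent), sorted by insert_list.
    min_gain: threshold in percentage points (an assumed cutoff).
    """
    for i, (value, _) in enumerate(points):
        later_gains = [
            points[j + 1][1] - points[j][1] for j in range(i, len(points) - 1)
        ]
        if all(gain < min_gain for gain in later_gains):
            return value
    return points[-1][0]

# Predicted values only; replace with measured recall once available.
predicted = [(50, 95.5), (75, 98.0), (100, 99.0), (150, 99.2), (200, 99.5)]
print(plateau_point(predicted))
```

The result is sensitive to the threshold, so the chosen cutoff should be recorded alongside the plateau point.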

Build Time Efficiency

[Calculate recall improvement per second of build time]

insert_list	Recall/BuildTime Ratio (% per s)	Efficiency vs 100
50	[X]	[±N%]
75	[X]	[±N%]
100	[X]	baseline
150	[X]	[±N%]
200	[X]	[±N%]
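
The efficiency table can be computed mechanically once build time and recall are known. The sketch below shows the arithmetic using the predicted values from Expected Results as stand-ins; the function name and data layout are illustrative, and the numbers are not measurements.

```python
def efficiency_table(rows, base_key=100):
    """rows: dict insert_list -> (build_seconds, recall_percent).

    Returns dict insert_list -> (ratio, percent_vs_base), where ratio is
    recall percentage points per second of build time.
    """
    ratios = {k: recall / seconds for k, (seconds, recall) in rows.items()}
    base = ratios[base_key]
    return {k: (r, 100.0 * (r - base) / base) for k, r in ratios.items()}

# Predicted values, used only to show the calculation.
predicted = {
    50: (440, 95.5),
    75: (660, 98.0),
    100: (880, 99.0),
    150: (1320, 99.2),
    200: (1760, 99.5),
}
table = efficiency_table(predicted)
for key, (ratio, vs_base) in table.items():
    print(key, f"{ratio:.4f}", f"{vs_base:+.1f}%")
```

Because recall is bounded near 100% while build time keeps growing, this ratio will tend to favor small values; it is best read together with the target recall threshold rather than on its own.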

Anomalies

[Note anything unexpected]

Analysis

Hypothesis Validation

✅ Confirmed: [What matched predictions about plateau point]
❌ Refuted: [What didn't match]
❓ Unclear: [Ambiguous results]

Key Insights

Optimal value: [Best insert_list_size for recall vs build time tradeoff]
Plateau behavior: [At what point does recall stop improving?]
libSQL comparison: [Is their 75 value justified by our data?]

Confounding Factors

Parallel experiments (exp003, exp004) - I/O contention
Cache enabled (from Exp 001) - may mask I/O-based build time differences
Dataset size 50k vs baseline 25k - not a direct comparison
[Any other factors]

Conclusions

Summary

[2-3 sentences: What's the optimal insert_list_size? Should we change the default?]

Impact on Recommendations

Update defaults?
- If optimal != 100: Change DEFAULT_INSERT_LIST_SIZE in src/diskann_api.c:25
- Document reasoning in code comment
Update documentation:
- Update PARAMETERS.md with recall plateau data
- Add guidance: "Use insert_list=X for datasets <Nk, Y for >Nk"
User guidance:
- Fast build (lower recall): insert_list=50-75
- Balanced (recommended): insert_list=[optimal value]
- Maximum recall (slow): insert_list=150-200
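
The "balanced" recommendation can be derived from the selection rule stated in the Motivation: choose the smallest insert_list that still meets the target recall (≥95%). A minimal sketch, with a hypothetical function name and the predicted recall values standing in for real results:

```python
def smallest_meeting_target(results, target_recall=95.0):
    """Return the smallest insert_list whose recall meets the target, or None.

    results: dict insert_list -> recall_percent.
    """
    passing = [value for value, recall in results.items() if recall >= target_recall]
    return min(passing) if passing else None

# Predicted values only; 95.5 stands in for the predicted 95-96% range.
predicted = {50: 95.5, 75: 98.0, 100: 99.0, 150: 99.2, 200: 99.5}
print(smallest_meeting_target(predicted))
```

Returning None when nothing passes makes it explicit that the sweep range needs to be extended rather than silently picking the largest value.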

Limitations

Only tested synthetic data at one scale (50k)
Cache may be masking true I/O cost differences
Real embeddings may have different connectivity requirements

Follow-up Questions

Does optimal insert_list_size vary with dataset size? (test at 100k, 200k)
Does max_neighbors (Exp 003 result) affect optimal insert_list?
Can we predict optimal value analytically from dimensions and dataset size?

Next Steps

If optimal != 100: Update DEFAULT_INSERT_LIST_SIZE in src/diskann_api.c
Update PARAMETERS.md with plateau curve
Document recall vs build time tradeoff in README.md
Consider combined sweep: (max_neighbors, insert_list) grid search
Update experiments/README.md index
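
The combined sweep mentioned in the list above amounts to enumerating every (max_neighbors, insert_list) pair. A short sketch of that enumeration follows. The max_neighbors candidates are hypothetical (only 32 appears in this document), and the dict keys are illustrative rather than the benchmark profile's actual schema.

```python
from itertools import product

INSERT_LIST_VALUES = [50, 75, 100, 150, 200]  # values from this experiment
MAX_NEIGHBORS_VALUES = [16, 32, 64]           # hypothetical candidates

# One configuration per pair; key names are assumptions for illustration.
grid = [
    {"insert_list_size": il, "max_neighbors": mn}
    for mn, il in product(MAX_NEIGHBORS_VALUES, INSERT_LIST_VALUES)
]
print(len(grid))
```

The grid grows multiplicatively, so the per-build times from this experiment are worth checking before committing to a full sweep.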

Artifacts

Benchmark profile: benchmarks/profiles/param-sweep-insert-list.json
Raw output: experiments/experiment-002b-output.txt
Results JSON: benchmarks/results/results-*.json (5 files)

References

Prior experiment: experiment-001 (established insert_list=100 default)
libSQL default: insert_list=75 (source: libSQL codebase)
TPP: _todo/20260211-build-speed-optimization.md

Lessons for Future Experiments:

[After completion, note insights about parameter sweeps, identifying plateaus, etc.]
