Version: 0.7.0 Audience: Developers optimizing EdgeVec for specific workloads Prerequisites: Basic EdgeVec usage (see Tutorial)
- Overview
- Key Performance Metrics
- HNSW Parameters
- Memory vs Speed Tradeoffs
- Recommended Configurations
- Benchmarking Your Workload
- Advanced Tuning
EdgeVec uses the HNSW (Hierarchical Navigable Small World) algorithm for approximate nearest neighbor search. HNSW builds a multi-layer graph structure that enables logarithmic-time search complexity.
Performance can be tuned by adjusting:
- Graph connectivity (M, M0)
- Build quality (efConstruction)
- Search accuracy (efSearch)
Key tradeoffs:
- Higher accuracy = slower search + more memory
- Faster index build = lower quality graph
- More connections = better recall but more memory
Understanding these metrics helps you tune effectively:
| Metric | Description | How to Measure |
|---|---|---|
| Search Latency | Time to find k nearest neighbors | performance.now() around search() |
| Insert Latency | Time to add a vector | performance.now() around insert() |
| Recall@k | % of true neighbors found vs brute force | Compare to exact search |
| Memory Usage | RAM consumed by index | Browser DevTools / process.memoryUsage() |
| Build Time | Time to index N vectors | Time batch insert operations |
| Metric | Target | Typical Performance |
|---|---|---|
| Search (100k vectors, k=10) | <1ms | 0.23-0.57ms |
| Insert (single vector) | <2ms | 0.5-1.5ms |
| Recall@10 | >95% | 97-99% at default settings |
| Memory (100k, 768d, quantized) | <100MB | ~83MB |
What it does: Controls how many neighbors each node connects to in layers above layer 0.
Default: 16
const config = new EdgeVecConfig(768);
config.m = 16; // DefaultImpact:
| M Value | Memory | Recall | Search Speed | Build Speed |
|---|---|---|---|---|
| 8 | Low | ~93% | Fast | Fast |
| 16 | Medium | ~97% | Medium | Medium |
| 32 | High | ~99% | Slower | Slower |
| 64 | Very High | ~99.5% | Slowest | Slowest |
Guidelines:
- M = 8-12: Memory-constrained, acceptable accuracy loss
- M = 16: General purpose (recommended default)
- M = 24-32: High accuracy requirements
- M = 48-64: Maximum accuracy, memory not a concern
Memory formula (approximate):
Memory per vector ≈ dimensions × 4 bytes + M × 2 × 8 bytes
Example:
// Memory-optimized configuration
const config = new EdgeVecConfig(768);
config.m = 8; // ~15% less memory vs M=16
// Accuracy-optimized configuration
const config = new EdgeVecConfig(768);
config.m = 32; // ~99% recall, 2x memory vs M=16What it does: Controls the search width during index construction. Higher values build a higher-quality graph but take longer.
Default: 200
const config = new EdgeVecConfig(768);
config.ef_construction = 200; // DefaultImpact:
| efConstruction | Build Time | Graph Quality | Recall |
|---|---|---|---|
| 100 | Fast | Lower | ~95% |
| 200 | Medium | Good | ~97% |
| 400 | Slow | High | ~99% |
| 500+ | Very Slow | Maximum | ~99%+ |
Guidelines:
- efConstruction = 100: Rapid prototyping, frequent rebuilds
- efConstruction = 200: General purpose (recommended)
- efConstruction = 400: Production systems requiring high recall
- efConstruction ≥ 500: Diminishing returns, rarely needed
Rule of thumb: efConstruction should be at least 2× M for good graph quality.
Example:
// Fast build for development
const config = new EdgeVecConfig(768);
config.m = 16;
config.ef_construction = 100;
// High-quality build for production
const config = new EdgeVecConfig(768);
config.m = 16;
config.ef_construction = 400;What it does: Controls how many candidates are explored during search. Higher values improve recall at the cost of latency.
Default: 50
const config = new EdgeVecConfig(768);
config.ef_search = 50; // DefaultImpact:
| efSearch | Recall@10 | Search Latency | Notes |
|---|---|---|---|
| 10 | ~85% | Very Fast | Too low for most uses |
| 50 | ~97% | Fast | Good default |
| 100 | ~99% | Medium | High accuracy |
| 200 | ~99.5% | Slower | Near-exact |
| 500 | ~99.9% | Slow | Maximum recall |
Guidelines:
- efSearch must be ≥ k (the number of results requested)
- Start with efSearch = 50, increase if recall is insufficient
- For real-time applications, keep efSearch ≤ 100
- For batch processing, efSearch = 200-500 is acceptable
Example:
const config = new EdgeVecConfig(768);
config.ef_search = 50; // Set default
const index = new EdgeVec(config);
// Search uses the configured ef_search
const results = index.search(query, 10); // Uses ef_search=50Dynamic efSearch adjustment pattern:
// Real-time search: fast but lower recall
config.ef_search = 50;
// Re-ranking pass: higher recall
config.ef_search = 200;
const candidates = index.search(query, 100);
// Note: ef_search must be set before creating the index
// There is currently no way to change it per-queryWhat it does: Controls the maximum connections in layer 0 (the bottom, densest layer). Higher M0 improves recall but increases memory.
Default: 32 (typically 2× M)
const config = new EdgeVecConfig(768);
config.m0 = 32; // DefaultGuidelines:
- Usually set to 2× M
- Increasing M0 beyond 2×M has diminishing returns
- Only adjust if you've tuned M and ef values first
Example:
// High-recall configuration
const config = new EdgeVecConfig(768);
config.m = 24;
config.m0 = 48; // 2× M for consistencyMemory per vector ≈
(dimensions × 4 bytes) // Vector storage
+ (M × 2 × 4 bytes) // Upper layer neighbors
+ (M0 × 4 bytes) // Layer 0 neighbors
+ 16 bytes // Node metadata
Example for 768-dimensional vectors with M=16, M0=32:
Memory per vector ≈ 768×4 + 16×2×4 + 32×4 + 16 = 3,216 bytes
100k vectors ≈ 307 MB
| Goal | M | efConstruction | efSearch | Result |
|---|---|---|---|---|
| Minimum Memory | 8 | 100 | 30 | ~25% less memory, ~93% recall |
| Balanced | 16 | 200 | 50 | Default, ~97% recall |
| Maximum Recall | 32 | 400 | 200 | 2× memory, ~99.5% recall |
| Fastest Search | 12 | 200 | 30 | 20% faster search, ~95% recall |
Low-latency search for user-facing applications.
const config = new EdgeVecConfig(768); // sentence-transformers dimension
config.metric = 'cosine';
config.m = 16;
config.ef_construction = 200;
config.ef_search = 50;
// Expected performance:
// - Search: <0.5ms at 100k vectors
// - Recall: ~97%
// - Memory: ~3KB per vectorWhen recall matters more than speed.
const config = new EdgeVecConfig(384); // smaller embeddings
config.metric = 'cosine';
config.m = 32;
config.ef_construction = 400;
config.ef_search = 200;
// Expected performance:
// - Search: ~1-2ms at 100k vectors
// - Recall: ~99.5%
// - Memory: ~2.5KB per vectorMobile browser or IoT device with limited RAM.
const config = new EdgeVecConfig(128); // smaller dimension helps
config.metric = 'l2';
config.m = 8;
config.ef_construction = 100;
config.ef_search = 30;
// Expected performance:
// - Search: <0.2ms at 50k vectors
// - Recall: ~93%
// - Memory: <1KB per vectorOffline processing where latency doesn't matter.
const config = new EdgeVecConfig(1536); // OpenAI embeddings
config.metric = 'cosine';
config.m = 24;
config.ef_construction = 400;
config.ef_search = 500;
// Expected performance:
// - Search: ~5ms at 1M vectors
// - Recall: ~99.9%
// - Memory: ~5KB per vectorQuick iteration during development.
const config = new EdgeVecConfig(768);
config.metric = 'cosine';
config.m = 12;
config.ef_construction = 100; // Faster builds
config.ef_search = 50;
// Expected performance:
// - Faster index builds for iteration
// - Acceptable recall for testingimport init, { EdgeVec, EdgeVecConfig } from 'edgevec';
async function benchmark() {
await init();
const dimensions = 768;
const numVectors = 10000;
const numQueries = 100;
const k = 10;
// Create index
const config = new EdgeVecConfig(dimensions);
config.metric = 'cosine';
const index = new EdgeVec(config);
// Generate test data
const vectors = Array.from({ length: numVectors }, () =>
new Float32Array(dimensions).map(() => Math.random() - 0.5)
);
const queries = Array.from({ length: numQueries }, () =>
new Float32Array(dimensions).map(() => Math.random() - 0.5)
);
// Benchmark insertion
console.log(`Inserting ${numVectors} vectors...`);
const insertStart = performance.now();
const result = index.insertBatch(vectors);
const insertTime = performance.now() - insertStart;
console.log(`Insert time: ${insertTime.toFixed(1)}ms`);
console.log(`Insert rate: ${(numVectors / insertTime * 1000).toFixed(0)} vectors/sec`);
// Benchmark search
console.log(`\nSearching ${numQueries} queries (k=${k})...`);
const searchTimes = [];
for (const query of queries) {
const start = performance.now();
index.search(query, k);
searchTimes.push(performance.now() - start);
}
// Calculate statistics
searchTimes.sort((a, b) => a - b);
const mean = searchTimes.reduce((a, b) => a + b) / searchTimes.length;
const p50 = searchTimes[Math.floor(searchTimes.length * 0.5)];
const p95 = searchTimes[Math.floor(searchTimes.length * 0.95)];
const p99 = searchTimes[Math.floor(searchTimes.length * 0.99)];
console.log(`Search latency:`);
console.log(` Mean: ${mean.toFixed(3)}ms`);
console.log(` P50: ${p50.toFixed(3)}ms`);
console.log(` P95: ${p95.toFixed(3)}ms`);
console.log(` P99: ${p99.toFixed(3)}ms`);
}
benchmark().catch(console.error);// Compare EdgeVec results to brute-force search
function bruteForceSearch(vectors, query, k, metric = 'cosine') {
const distances = vectors.map((vec, idx) => ({
id: idx,
distance: calculateDistance(vec, query, metric)
}));
distances.sort((a, b) => a.distance - b.distance);
return distances.slice(0, k);
}
function calculateDistance(a, b, metric) {
if (metric === 'cosine') {
let dot = 0, normA = 0, normB = 0;
for (let i = 0; i < a.length; i++) {
dot += a[i] * b[i];
normA += a[i] * a[i];
normB += b[i] * b[i];
}
return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
// Add l2, dot implementations as needed
}
function calculateRecall(edgevecResults, bruteForceResults) {
const bfIds = new Set(bruteForceResults.map(r => r.id));
const hits = edgevecResults.filter(r => bfIds.has(r.id)).length;
return hits / bruteForceResults.length;
}
// Usage
const edgevecResults = index.search(query, k);
const exactResults = bruteForceSearch(vectors, query, k);
const recall = calculateRecall(edgevecResults, exactResults);
console.log(`Recall@${k}: ${(recall * 100).toFixed(1)}%`);Test different configurations to find optimal settings:
const configs = [
{ m: 8, ef_c: 100, ef_s: 30 },
{ m: 16, ef_c: 200, ef_s: 50 },
{ m: 16, ef_c: 200, ef_s: 100 },
{ m: 32, ef_c: 400, ef_s: 100 },
];
for (const cfg of configs) {
const config = new EdgeVecConfig(768);
config.m = cfg.m;
config.ef_construction = cfg.ef_c;
config.ef_search = cfg.ef_s;
// Run benchmark...
console.log(`M=${cfg.m}, efC=${cfg.ef_c}, efS=${cfg.ef_s}`);
console.log(` Latency: ${latency}ms, Recall: ${recall}%`);
}Consider rebuilding your index when:
- You change M or efConstruction
- Tombstone ratio exceeds 30-50% (use
compact()) - You've identified better parameters for your workload
Key metrics to track:
- Search latency percentiles (P50, P95, P99)
- Tombstone ratio (call
tombstoneRatio()) - Memory usage growth
- Recall (if you can sample and verify)
// Periodic health check
setInterval(() => {
const ratio = index.tombstoneRatio();
if (ratio > 0.3) {
console.warn(`High tombstone ratio: ${(ratio*100).toFixed(1)}%`);
}
}, 60000);| Parameter | Default | Low | Medium | High |
|---|---|---|---|---|
| M | 16 | 8 | 16 | 32-64 |
| efConstruction | 200 | 100 | 200 | 400-500 |
| efSearch | 50 | 30 | 50-100 | 200-500 |
| M0 | 32 | 16 | 32 | 64-128 |
Rules of thumb:
- M0 ≈ 2 × M
- efConstruction ≥ 2 × M
- efSearch ≥ k (number of results)
- For 99%+ recall: M=32, efConstruction=400, efSearch=200
- Tutorial — Getting started
- Troubleshooting — Common issues
- API Reference — Full API documentation
- Competitive Analysis — EdgeVec vs alternatives