2.0+ Gelem/s throughput for vector distance calculations
Sub-millisecond search latency for 10k vectors
~5ns Hamming distance with 36+ GiB/s throughput
| Operation | Time | Throughput | Notes |
|---|---|---|---|
| Dot Product (768-dim) | 374ns | 2.05 Gelem/s | Per pair |
| L2 Distance (768-dim) | 358ns | 2.14 Gelem/s | Per pair |
| Cosine Similarity (768-dim) | 339ns | 2.27 Gelem/s | Per pair |
| Hamming (768-bit) | 4.5ns | 40 GiB/s | Binary quantized |
| Search (10k, k=10) | 938us | 1.07 Kelem/s | HNSW + SIMD |
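The throughput column follows from the time column by simple division. The sketch below shows that arithmetic; the function names are hypothetical and not part of EdgeVec, and it assumes the search row counts one query as one "element".

```rust
/// Throughput of one distance call in Gelem/s.
/// `dim` is the vector dimension and `time_ns` the measured time per pair.
/// Elements per nanosecond is numerically equal to Gelem/s.
fn gelem_per_sec(dim: u32, time_ns: f64) -> f64 {
    f64::from(dim) / time_ns
}

/// Search throughput in thousands of queries per second,
/// given a per-query latency in microseconds.
/// Assumption: the "Kelem/s" figure for search counts queries.
fn kqueries_per_sec(latency_us: f64) -> f64 {
    1_000.0 / latency_us
}
```

For example, `gelem_per_sec(768, 374.0)` gives about 2.05, and `kqueries_per_sec(938.0)` gives about 1.07, matching the dot product and search rows.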
Native Benchmarks (cargo bench)
Dot Product by Dimension
| Dimension | Time (ns) | Throughput |
|---|---|---|
| 128 | 55 | 2.33 Gelem/s |
| 256 | 106 | 2.41 Gelem/s |
| 384 | 188 | 2.04 Gelem/s |
| 512 | 254 | 2.01 Gelem/s |
| 768 | 374 | 2.05 Gelem/s |
| 1024 | 503 | 2.04 Gelem/s |
| 1536 | 761 | 2.02 Gelem/s |
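For readers who want a rough point of comparison, here is a minimal scalar dot product with a standard-library timing loop. It is only a sketch: `dot_product` and `mean_ns_per_call` are hypothetical names, this is not EdgeVec's kernel, and a plain `Instant` loop is far less rigorous than the `cargo bench` harness that produced the numbers above.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Scalar dot product over two equal-length f32 slices.
/// The compiler may auto-vectorize this loop in release builds.
fn dot_product(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Mean wall-clock time per call in nanoseconds over `iters` calls.
/// `black_box` keeps the optimizer from removing the work being timed.
fn mean_ns_per_call(a: &[f32], b: &[f32], iters: u32) -> f64 {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(dot_product(black_box(a), black_box(b)));
    }
    start.elapsed().as_nanos() as f64 / f64::from(iters)
}
```

Results from a loop like this depend heavily on build profile, CPU, and warm-up, so they should not be compared directly with the table.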
L2 Squared Distance by Dimension
| Dimension | Time (ns) | Throughput |
|---|---|---|
| 128 | 66 | 1.93 Gelem/s |
| 256 | 119 | 2.16 Gelem/s |
| 384 | 184 | 2.09 Gelem/s |
| 512 | 229 | 2.23 Gelem/s |
| 768 | 358 | 2.14 Gelem/s |
| 1024 | 462 | 2.22 Gelem/s |
| 1536 | 693 | 2.22 Gelem/s |
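Squared L2 distance omits the final square root, which does not change the ordering of neighbors. A minimal scalar sketch follows; the function name `l2_squared` is hypothetical and this is not EdgeVec's SIMD implementation.

```rust
/// Squared Euclidean distance: the sum of squared component differences.
/// Skipping the square root preserves nearest-neighbor ranking
/// because sqrt is monotonic on non-negative values.
fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter()
        .zip(b)
        .map(|(x, y)| {
            let d = x - y;
            d * d
        })
        .sum()
}
```

Calling `.sqrt()` on the result recovers the true L2 distance when the actual value is needed rather than just a ranking.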
Cosine Similarity by Dimension
| Dimension | Time (ns) | Throughput |
|---|---|---|
| 128 | 58 | 2.19 Gelem/s |
| 256 | 107 | 2.40 Gelem/s |
| 384 | 181 | 2.12 Gelem/s |
| 512 | 244 | 2.10 Gelem/s |
| 768 | 339 | 2.27 Gelem/s |
| 1024 | 474 | 2.16 Gelem/s |
| 1536 | 732 | 2.10 Gelem/s |
Hamming Distance (Binary Quantization)
| Operation | Time | Throughput |
|---|---|---|
| 768-bit (96 bytes) | 4.5ns | 40 GiB/s |
| Batch 10k | 79us | 127 Melem/s |
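Hamming distance on binary-quantized vectors reduces to XOR plus a population count, which is why it is so much cheaper than the floating-point kernels. The sketch below assumes a 96-byte layout as in the table; `hamming_768` is a hypothetical name, not EdgeVec's API.

```rust
use std::convert::TryInto;

/// Hamming distance between two 768-bit vectors stored as 96 bytes each.
/// Processes 8 bytes at a time: XOR marks differing bits, and
/// `count_ones` (popcount) counts them.
fn hamming_768(a: &[u8; 96], b: &[u8; 96]) -> u32 {
    a.chunks_exact(8)
        .zip(b.chunks_exact(8))
        .map(|(ca, cb)| {
            // chunks_exact(8) guarantees 8-byte slices, so the conversion cannot fail
            let wa = u64::from_le_bytes(ca.try_into().unwrap());
            let wb = u64::from_le_bytes(cb.try_into().unwrap());
            (wa ^ wb).count_ones()
        })
        .sum()
}
```

Two all-zero vectors have distance 0, and an all-zero vector against an all-ones vector has distance 768.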
Search by Collection Size
| Collection | Search Time | Throughput |
|---|---|---|
| 1,000 vectors | 380us | 2.63 Kelem/s |
| 10,000 vectors | 938us | 1.07 Kelem/s |
Batch Distance Calculations (10k pairs)
| Dimension | L2 Time | Dot Time |
|---|---|---|
| 128 | 914us | 814us |
| 384 | 2.99ms | 3.56ms |
| 768 | 6.73ms | 6.70ms |
SIMD Threshold Analysis (W30.1 Fix)
EdgeVec v0.7.0 lowered the WASM SIMD threshold from 256 to 16 dimensions:
| Dimension | v0.6.x (threshold=256) | v0.7.0 (threshold=16) |
|---|---|---|
| 16 | Scalar | SIMD |
| 128 | Scalar | SIMD |
| 256+ | SIMD | SIMD |
This means 128-dimensional embeddings (common for small models) now use SIMD.
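The threshold behavior in the table can be sketched as a simple dispatch on dimension. All names here (`Kernel`, `select_kernel`, and the two constants) are hypothetical and chosen for illustration; the sketch assumes the comparison is inclusive, which is consistent with 16-dimensional vectors using SIMD at threshold 16.

```rust
/// Which code path a distance call takes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kernel {
    Scalar,
    Simd,
}

/// Thresholds taken from the table above (names are illustrative).
const SIMD_THRESHOLD_V06: usize = 256;
const SIMD_THRESHOLD_V07: usize = 16;

/// Pick the SIMD path when the dimension meets or exceeds the threshold.
fn select_kernel(dim: usize, threshold: usize) -> Kernel {
    if dim >= threshold {
        Kernel::Simd
    } else {
        Kernel::Scalar
    }
}
```

With these definitions, `select_kernel(128, SIMD_THRESHOLD_V06)` yields `Kernel::Scalar` while `select_kernel(128, SIMD_THRESHOLD_V07)` yields `Kernel::Simd`, matching the 128-dimension row.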
Performance Targets
Original Targets (DAY_2_TASKS.md)
The original Day 2 spec targeted a 2.5x speedup over the scalar baseline:
| Metric | Original Target | Achieved | Notes |
|---|---|---|---|
| Dot Product (768-dim) | <200ns (2.5x) | 374ns | 1.3x baseline |
| L2 Distance (768-dim) | <250ns (2.4x) | 358ns | 1.7x baseline |
| Search (10k, k=10) | ~2ms (2.5x) | 938us | 2.1x better than target |
| Hamming Distance | <40ns (2.5x) | 4.5ns | 22x better than target |
Analysis: The 2.5x speedup targets were based on theoretical SIMD parallelism.
In practice, the dot product and L2 kernels reached about 1.3x and 1.7x over the
scalar baseline, while Hamming distance and search far exceeded their targets.
The scalar baseline on modern x86_64 is already auto-vectorized by the compiler,
which limits the additional gain from explicit SIMD.
Adjusted Targets (Production)
For the v0.7.0 release, we use these validated performance budgets: