This guide explains how to use CrossVector's benchmarking tool to measure and compare performance across different database backends and embedding providers.
The benchmark tool (scripts/benchmark.py) provides comprehensive performance testing for:
- 4 Database Backends: PgVector, AstraDB, Milvus, ChromaDB
- 2 Embedding Providers: OpenAI, Gemini
- 7 Operation Types: Bulk create, individual create, vector search, metadata search, Query DSL operators, updates, deletes
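A full run enumerates the cross-product of backends and embedding providers, with all seven operation types executed for each pair. Schematically (plain Python for illustration, not the tool's internals):

```python
from itertools import product

# Names as accepted by --backends and --embedding-providers.
BACKENDS = ["pgvector", "astradb", "milvus", "chroma"]
PROVIDERS = ["openai", "gemini"]

# Each (backend, provider) pair is one benchmark run; the seven
# operation types all execute within each run.
matrix = list(product(BACKENDS, PROVIDERS))
print(len(matrix))  # 8 combinations
```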
```bash
# Test all backends with both embeddings (10 documents)
python scripts/benchmark.py --num-docs 10

# Full benchmark with 1000 documents
python scripts/benchmark.py

# Test specific configuration
python scripts/benchmark.py --backends pgvector milvus --embedding-providers openai --num-docs 100
```

Usage:

```bash
python scripts/benchmark.py [OPTIONS]
```
Options:

```text
--num-docs INT               Number of documents to test (default: 1000)
--backends NAME [NAME ...]   Specific backends: pgvector, astradb, milvus, chroma
--embedding-providers NAME   Embedding providers: openai, gemini
--skip-slow                  Skip slow cloud backends (astradb, milvus) for faster testing
--search-limit INT           Number of results to return in search operations (default: 100)
--collection-name STR        Custom collection name (default: auto-generate with UUID8)
--timeout INT                Timeout per backend test in seconds (default: 60)
--output PATH                Output file path (default: benchmark.md)
--use-fixtures PATH          Path to pre-generated fixtures JSON file
--add-vectors                Generate and add vectors to fixture documents
```

Bulk Create:
Measures throughput for batch document upsert with automatic embedding generation.
Metrics:
- Duration (seconds)
- Throughput (docs/sec)
Individual Create:
Tests single document creation with embedding generation.
Metrics:
- Average duration per document
Vector Search:
Semantic similarity search using embedded queries.
Metrics:
- Average query duration (10 queries tested)
- Queries per second
Metadata Search:
Metadata filtering without vector similarity (if supported by the backend).
Metrics:
- Average query duration
- Support status
Tests all 10 Query DSL operators:
- Comparison: eq, ne, gt, gte, lt, lte
- Array: in, nin
- Logical: and (&), or (|)
Metrics:
- Average operator query duration
- Number of operators successfully tested
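As a plain-Python illustration of what the comparison and array operators test (schematic semantics only, not CrossVector's actual DSL API):

```python
# Schematic semantics of the benchmarked operators; "and" (&) and
# "or" (|) combine the boolean results of these predicates.
OPERATORS = {
    "eq":  lambda field, value: field == value,
    "ne":  lambda field, value: field != value,
    "gt":  lambda field, value: field > value,
    "gte": lambda field, value: field >= value,
    "lt":  lambda field, value: field < value,
    "lte": lambda field, value: field <= value,
    "in":  lambda field, value: field in value,
    "nin": lambda field, value: field not in value,
}

# Example: rating >= 3 AND rating IN [3, 4, 5]
doc = {"rating": 4}
matches = OPERATORS["gte"](doc["rating"], 3) and OPERATORS["in"](doc["rating"], [3, 4, 5])
print(matches)  # True
```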
Updates:
Document update performance.
Metrics:
- Average update duration (100 updates tested)
Deletes:
Batch deletion throughput.
Metrics:
- Duration
- Throughput (docs/sec)
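All of the per-operation metrics above reduce to timing a callable; a minimal sketch of such a wrapper is shown below (the actual implementation inside scripts/benchmark.py may differ):

```python
import time

def benchmark_operation(name, fn):
    """Run fn once and return (duration_seconds, result).

    `name` is only used for reporting in the real tool; it is
    kept here to match the call shape shown later in this guide.
    """
    start = time.perf_counter()
    result = fn()
    duration = time.perf_counter() - start
    return duration, result

# Throughput for a batch operation is then num_docs / duration;
# averages come from repeating the call and dividing.
duration, result = benchmark_operation("example", lambda: sum(range(1000)))
```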
Embedding Providers (at least one required):
```bash
# OpenAI
export OPENAI_API_KEY=sk-...

# Gemini
export GEMINI_API_KEY=AI...
```

Database Backends (optional; unconfigured backends are skipped):
```bash
# PgVector
export PGVECTOR_HOST=localhost
export PGVECTOR_PORT=5432
export PGVECTOR_DATABASE=vector_db
export PGVECTOR_USER=postgres
export PGVECTOR_PASSWORD=postgres

# Or use a connection string:
export PGVECTOR_CONNECTION_STRING=postgresql://user:pass@host:port/db
```
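The discrete variables and PGVECTOR_CONNECTION_STRING are two routes to the same configuration; a small illustrative snippet assembling one from the other (fallback values mirror the defaults shown above):

```python
import os

# Fall back to the documented defaults when a variable is unset.
host = os.environ.get("PGVECTOR_HOST", "localhost")
port = os.environ.get("PGVECTOR_PORT", "5432")
database = os.environ.get("PGVECTOR_DATABASE", "vector_db")
user = os.environ.get("PGVECTOR_USER", "postgres")
password = os.environ.get("PGVECTOR_PASSWORD", "postgres")

conn_str = f"postgresql://{user}:{password}@{host}:{port}/{database}"
```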
```bash
# AstraDB
export ASTRADB_API_ENDPOINT=https://...apps.astra.datastax.com
export ASTRADB_APPLICATION_TOKEN=AstraCS:...

# Milvus
export MILVUS_API_ENDPOINT=https://...
export MILVUS_API_TOKEN=...

# ChromaDB (optional for cloud)
export CHROMA_HOST=api.trychroma.com
export CHROMA_API_KEY=ck-...
export CHROMA_TENANT=...
export CHROMA_DATABASE=Test
```

Test that everything is configured correctly:
```bash
python scripts/benchmark.py --num-docs 1 --backends pgvector --embedding-providers openai
```

Compare all backends with small dataset:
```bash
python scripts/benchmark.py --num-docs 10
```

This runs 8 combinations (4 backends × 2 embeddings) with 10 documents each.
Full performance test with larger dataset:
```bash
python scripts/benchmark.py --num-docs 1000 --output benchmark_full.md
```

Note: This will:
- Make ~1000+ API calls to embedding providers
- Take 30-60 minutes depending on network and API rate limits
- Cost approximately $0.10-0.20 in API fees
```bash
# Only PgVector with both embeddings
python scripts/benchmark.py --backends pgvector --num-docs 100

# All backends with only OpenAI
python scripts/benchmark.py --embedding-providers openai --num-docs 100

# PgVector vs Milvus
python scripts/benchmark.py --backends pgvector milvus --num-docs 100
```

Results are saved as a markdown file (default: benchmark.md) with:
- Performance Summary Table - Quick comparison across all combinations
- Detailed Results - Individual metrics for each backend+embedding pair
- Notes - Configuration and methodology details
## Performance Summary
| Backend | Embedding | Model | Dim | Upsert | Search (avg) | Update (avg) | Delete (batch) | Status |
|---------|-----------|-------|-----|--------|--------------|--------------|----------------|--------|
| pgvector | openai | text-embedding-3-small | 1536 | 7.06s | 21.26ms | 6.21ms | 22.63ms | OK |
| astradb | openai | text-embedding-3-small | 1536 | 18.89s | 23.86s | 1.11s | 15.15s | OK |
| milvus | openai | text-embedding-3-small | 1536 | 7.94s | 654.43ms | 569.52ms | 2.17s | OK |
| chroma | openai | text-embedding-3-small | 1536 | 17.08s | 654.76ms | 1.23s | 4.73s | OK |
| pgvector | gemini | models/gemini-embedding-001 | 1536 | 6.65s | 18.72ms | 6.40ms | 20.25ms | OK |
| astradb | gemini | models/gemini-embedding-001 | 1536 | 11.25s | 6.71s | 903.37ms | 15.05s | OK |
| milvus | gemini | models/gemini-embedding-001 | 1536 | 6.14s | 571.90ms | 561.38ms | 1.91s | OK |
| chroma | gemini | models/gemini-embedding-001 | 1536 | 18.93s | 417.28ms | 1.24s | 4.63s | OK |

Bulk Create:
- Lower duration = better
- Higher throughput (docs/sec) = better
- Gemini typically slower due to API rate limits
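For example, taking the pgvector/openai bulk-create figure from the table above, and assuming that table came from the default 1000-document run:

```python
# 1000 documents upserted in 7.06 s (pgvector / openai row above).
num_docs = 1000
duration_s = 7.06
throughput = num_docs / duration_s
print(f"{throughput:.0f} docs/sec")  # ≈ 142 docs/sec
```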
Search:
- Lower average duration = better
- Milvus typically fastest among the cloud backends (local PgVector was fastest overall in the table above)
- Gemini-backed search was often faster than OpenAI-backed search in these runs
Updates & Deletes:
- Lower duration = better
- PgVector typically fast for updates due to SQL efficiency
Query DSL Operators:
- Should test 10/10 operators successfully
- Duration typically <1ms for metadata operations
For reliable results:
- Stable Network: Run benchmarks on a stable network connection
- Isolated Environment: Avoid running other heavy processes
- Warm-up: First run may be slower due to cold starts
- Multiple Runs: Run 2-3 times and use median values for important decisions
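For the multiple-runs advice, Python's statistics module makes the median trivial (durations below are hypothetical):

```python
import statistics

# Hypothetical upsert durations (seconds) from three repeated runs.
upsert_durations = [7.06, 6.71, 7.31]
print(statistics.median(upsert_durations))  # 7.06
```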
Be aware of rate limits:
- OpenAI: 3,500 requests/min (Tier 2)
- Gemini: 1,500 requests/min (free tier)
For large benchmarks (--num-docs 1000+), the tool will automatically pace requests.
- If you see Gemini `RESOURCE_EXHAUSTED` errors, rerun with `--embedding-providers openai` or reduce `--num-docs`.
- To avoid embedding API calls entirely, provide fixtures with vectors (e.g., `--use-fixtures scripts/benchmark/data/openai_3.json --add-vectors`) or let the tool generate static vectors when configured.
- Keep long runs to a single backend to reduce concurrent calls (e.g., `--backends pgvector`).
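Client-side pacing amounts to enforcing a minimum interval between requests derived from the provider's limit; an illustrative sketch (not the tool's actual pacing code):

```python
import time

def paced(requests_per_min):
    """Generator that sleeps as needed so consecutive next() calls
    never exceed requests_per_min, then yields a timestamp."""
    min_interval = 60.0 / requests_per_min
    last = 0.0
    while True:
        wait = min_interval - (time.monotonic() - last)
        if wait > 0:
            time.sleep(wait)
        last = time.monotonic()
        yield last

# Gemini free tier (1,500 req/min) -> at most one request every 40 ms.
```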
When optimizing performance:
```bash
# Before changes
python scripts/benchmark.py --num-docs 100 --output benchmark_before.md

# Make your changes to code

# After changes
python scripts/benchmark.py --num-docs 100 --output benchmark_after.md

# Compare the two markdown files
diff benchmark_before.md benchmark_after.md
```

Or use a markdown diff tool for better visualization.
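For programmatic comparison instead of a raw diff, the summary rows parse easily; a sketch that assumes the column layout of the sample output shown earlier:

```python
def parse_summary_row(line):
    """Split one Performance Summary table row into a dict."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    backend, embedding, model, dim, upsert, search, update, delete, status = cells
    return {"backend": backend, "embedding": embedding, "model": model,
            "upsert": upsert, "search": search, "update": update,
            "delete": delete, "status": status}

row = parse_summary_row(
    "| pgvector | openai | text-embedding-3-small | 1536 | 7.06s | 21.26ms | 6.21ms | 22.63ms | OK |"
)
print(row["search"])  # 21.26ms
```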
If you see:
```text
AstraDB not available: Missing ASTRADB_API_ENDPOINT
```
Solution: Set the required environment variables or the backend will be skipped.
If you see rate limit errors:
```text
bulk_create failed: Rate limit exceeded
```
Solutions:
- Reduce `--num-docs`
- Wait and retry
- Check API quota/billing
If benchmarks are unexpectedly slow:
- Check network latency to database
- Verify database is not under load
- Check API rate limits aren't being hit
- Try reducing `--num-docs` for faster iterations
Modify scripts/benchmark.py to use custom test data:
```python
# In the generate_documents() function
SAMPLE_TEXTS = [
    "Your custom text 1",
    "Your custom text 2",
    # ...
]
```

Extend the benchmark_backend() method to add custom metrics:
```python
# In BenchmarkRunner.benchmark_backend(), after the existing benchmarks:

# Custom metric
print("\nCustom Metric...")
duration, result = benchmark_operation("custom", lambda: engine.custom_operation())
results["custom_metric"] = {"duration": duration}
```

Approximate costs for running benchmarks:
| Documents | OpenAI Cost | Gemini Cost | Total Time |
|---|---|---|---|
| 10 | $0.001 | Free | 2-5 min |
| 100 | $0.01 | Free | 10-15 min |
| 1000 | $0.10 | Free | 30-60 min |
Note: Costs are approximate and depend on:
- Embedding model used
- Document text length
- Current API pricing
For Gemini, the free tier typically covers benchmarking needs.
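A back-of-the-envelope estimate is simple to compute; both the average token count and the per-million-token price below are placeholder assumptions, not quoted pricing:

```python
# Rough embedding-cost estimate (all figures are assumptions).
num_docs = 1000
avg_tokens_per_doc = 500            # assumed average document length
price_per_million_tokens = 0.02     # assumed USD price per 1M tokens

cost = num_docs * avg_tokens_per_doc * price_per_million_tokens / 1_000_000
print(f"${cost:.3f}")  # $0.010
```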
- Start Small: Always test with `--num-docs 10` first
- Document Results: Save benchmark outputs with timestamps
- Consistent Environment: Run on same machine/network for comparisons
- Version Control: Commit benchmark results with code changes
- CI/CD Integration: Consider running small benchmarks in CI for regression testing
```bash
# Compare PgVector and Milvus with 50 docs
python scripts/benchmark.py --backends pgvector milvus --num-docs 50

# Test which embedding is faster for your use case
python scripts/benchmark.py --backends pgvector --num-docs 200

# Full benchmark before major release
python scripts/benchmark.py --num-docs 1000 --output release_v1.0_benchmark.md

# Test with more documents to stress search performance
python scripts/benchmark.py --backends milvus --num-docs 5000
```

Found a performance issue or want to add a new benchmark metric? See the Contributing Guide.