You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
-- Default parameters (good starting point)CREATEINDEXidx_hnswON items
USING hnsw (embedding vector_cosine_ops)
WITH (m =16, ef_construction =64);
-- High-quality index (slower build, better recall)CREATEINDEXidx_hnsw_hqON items
USING hnsw (embedding vector_cosine_ops)
WITH (m =32, ef_construction =128);
-- L2 distance indexCREATEINDEXidx_hnsw_l2ON items
USING hnsw (embedding vector_l2_ops);
-- Inner product indexCREATEINDEXidx_hnsw_ipON items
USING hnsw (embedding vector_ip_ops);
-- Halfvec index (pgvector 0.7+)CREATEINDEXidx_hnsw_halfON items
USING hnsw (embedding halfvec_cosine_ops);
IVFFlat
-- Guideline: lists = rows/1000 for <1M rows, sqrt(rows) for >=1M rows-- 100K rows → lists = 100CREATEINDEXidx_ivfflatON items
USING ivfflat (embedding vector_cosine_ops)
WITH (lists =100);
-- 4M rows → lists = 2000CREATEINDEXidx_ivfflat_largeON items
USING ivfflat (embedding vector_cosine_ops)
WITH (lists =2000);
Drop and Rebuild
-- Drop an indexDROPINDEX IF EXISTS idx_hnsw;
-- Rebuild IVFFlat after significant data changesDROPINDEX idx_ivfflat;
CREATEINDEXidx_ivfflatON items
USING ivfflat (embedding vector_cosine_ops)
WITH (lists =500);
-- Rebuild concurrently (no table lock, slower build)CREATEINDEXCONCURRENTLY idx_hnsw ON items
USING hnsw (embedding vector_cosine_ops);
Query-Time Parameter Tuning
-- IVFFlat: increase probes for better recall (default = 1)SETivfflat.probes=10;
-- Guideline: start with sqrt(lists)-- HNSW: increase ef_search for better recall (default = 40)SEThnsw.ef_search=100;
-- Must be >= LIMIT value-- Set per-transaction (resets after transaction ends)BEGIN;
SET LOCAL ivfflat.probes=20;
SELECT*FROM items ORDER BY embedding <=> $1LIMIT10;
COMMIT;
-- Set per-sessionSETivfflat.probes=10; -- lasts until connection closes-- Reset to defaults
RESET ivfflat.probes;
RESET hnsw.ef_search;
Index Build Optimization
-- Increase memory for large index buildsSET maintenance_work_mem ='2GB';
-- Use parallel workers for HNSW builds (pgvector 0.7+)SET max_parallel_maintenance_workers =7;
-- Build the indexCREATEINDEXON items USING hnsw (embedding vector_cosine_ops);
-- Reset after build
RESET maintenance_work_mem;
RESET max_parallel_maintenance_workers;
EXPLAIN ANALYZE Patterns
-- Check index usage
EXPLAIN ANALYZE
SELECT id, content
FROM items
ORDER BY embedding <=>'[0.1, 0.2, ...]'LIMIT10;
-- Good output: "Index Scan using idx_hnsw"-- Bad output: "Seq Scan" (index not used)-- Force index usage for testingSET enable_seqscan = off;
EXPLAIN ANALYZE SELECT ...;
SET enable_seqscan =on;
Distance Operator ↔ Operator Class Matrix
Query Operator
CREATE INDEX Operator Class
Metric
<->
vector_l2_ops
L2 / Euclidean
<=>
vector_cosine_ops
Cosine distance
<#>
vector_ip_ops
Negative inner product
<-> (halfvec)
halfvec_l2_ops
L2 (float16)
<=> (halfvec)
halfvec_cosine_ops
Cosine (float16)
<#> (halfvec)
halfvec_ip_ops
Inner product (float16)
Hybrid Search Snippets
-- Full-text search column (generated, auto-updates)ALTERTABLE items ADD COLUMN content_tsv tsvector
GENERATED ALWAYS AS (to_tsvector('english', content)) STORED;
CREATEINDEXON items USING GIN (content_tsv);
-- Hybrid: keyword filter + vector rankingSELECT id, content,
embedding <=> $1AS distance
FROM items
WHERE content_tsv @@ plainto_tsquery('english', 'search terms')
ORDER BY embedding <=> $1LIMIT10;
-- Pre-filter by metadata + vector rankingSELECT id, content
FROM items
WHERE category ='ai'ORDER BY embedding <=> $1LIMIT10;
Python — Query-Time Tuning
importpsycopgfrompgvector.psycopgimportregister_vectorconn=psycopg.connect("host=... dbname=... sslmode=require")
register_vector(conn)
# Set query-time parameterswithconn.cursor() ascur:
cur.execute("SET hnsw.ef_search = 100")
cur.execute(
"SELECT id, content, embedding <=> %s AS distance ""FROM items ORDER BY embedding <=> %s LIMIT 10",
(query_vec, query_vec)
)
results=cur.fetchall()
Important Defaults
Index Build Defaults
Setting
Default
HNSW m
16
HNSW ef_construction
64
IVFFlat lists
Must be specified
maintenance_work_mem
64 MB
max_parallel_maintenance_workers
2
Query-Time Defaults
Setting
Default
HNSW ef_search
40
IVFFlat probes
1
Parameter Tuning Guidelines
Parameter
Starting Point
Effect of Increase
IVFFlat lists
rows/1000 (< 1M), sqrt(rows) (>= 1M)
More clusters, finer partitioning
IVFFlat probes
sqrt(lists)
Better recall, slower queries
HNSW m
16
Better recall, larger index, slower build
HNSW ef_construction
64
Better recall, slower build
HNSW ef_search
40
Better recall, slower queries
Common Exam Gotchas
Index Selection Gotchas
IVFFlat cannot build on an empty table. k-means needs data for centroids. An empty-table IVFFlat index produces no results.
HNSW can build on empty tables. The graph accommodates incremental inserts.
IVFFlat needs periodic rebuilds after significant data changes; HNSW does not.
HNSW is recommended for most workloads on Azure Flexible Server.
Operator Class Gotchas
Operator class must match the query operator. Index with vector_cosine_ops only accelerates <=> queries, not <->.
halfvec uses separate operator classes (halfvec_cosine_ops, not vector_cosine_ops).
Query Pattern Gotchas
Vector indexes require ORDER BY ... LIMIT k. Without LIMIT, the planner does a sequential scan.
probes = 1 is the IVFFlat default — far too low for production. Always increase probes.
ef_search must be >= LIMIT. If LIMIT 50 but ef_search is 40, results are incomplete.
Build Performance Gotchas
Default maintenance_work_mem (64 MB) is too low for large indexes. Increase to 1-2 GB before building.