Skip to content

Latest commit

 

History

History
180 lines (133 loc) · 5.95 KB

File metadata and controls

180 lines (133 loc) · 5.95 KB

Cheat Sheet - pgvector Embeddings

Azure CLI — Enable pgvector Extension

# Allowlist pgvector in azure.extensions server parameter
az postgres flexible-server parameter set \
  --resource-group myRG --server-name myserver \
  --name azure.extensions --value pgvector

# Verify the parameter
az postgres flexible-server parameter show \
  --resource-group myRG --server-name myserver \
  --name azure.extensions

psql — pgvector Setup

-- Install pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Check installed version
SELECT extversion FROM pg_extension WHERE extname = 'vector';

-- Update pgvector (if newer version available)
ALTER EXTENSION vector UPDATE;

Creating Vector Tables

-- Create a table with a vector column (1536 dims for text-embedding-ada-002)
CREATE TABLE items (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding vector(1536)
);

-- Insert a vector (square-bracket literal format)
INSERT INTO items (content, embedding)
VALUES ('hello world', '[0.1, 0.2, 0.3, ...]');

-- Query nearest neighbors using cosine distance
SELECT id, content, embedding <=> '[0.1, 0.2, 0.3, ...]' AS distance
FROM items
ORDER BY embedding <=> '[0.1, 0.2, 0.3, ...]'
LIMIT 5;

Distance Operators Quick Reference

Operator Name Use Case ORDER BY
<-> L2 / Euclidean distance General-purpose distance ASC (smaller = closer)
<=> Cosine distance Text/semantic similarity ASC (0 = identical)
<#> Negative inner product Maximum inner product search ASC (negative, so ASC = highest similarity)

Index Types

Property IVFFlat HNSW
Create syntax CREATE INDEX ON items USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100); CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
Build time Faster Slower
Query speed Fast (with enough lists) Faster
Accuracy Lower (approximate, tunable via probes) Higher (better recall out of the box)
Empty table Cannot build on empty table — needs data first Can build on empty table
When to use Large datasets, acceptable recall, faster index builds Best recall, query-heavy workloads, data arrives incrementally

Operator Classes for Indexes

Operator class Distance operator
vector_l2_ops <->
vector_cosine_ops <=>
vector_ip_ops <#>

Python — pgvector with psycopg

# pip install pgvector psycopg[binary]
from pgvector.psycopg import register_vector
import psycopg

conn = psycopg.connect("host=myserver.postgres.database.azure.com dbname=mydb "
                        "user=adminuser password=secret sslmode=require")
register_vector(conn)

# Insert embedding
cur = conn.cursor()
cur.execute("INSERT INTO items (content, embedding) VALUES (%s, %s)",
            ("hello world", [0.1, 0.2, 0.3]))

# Query nearest neighbors
query_vec = [0.1, 0.2, 0.3]
cur.execute("SELECT id, content, embedding <=> %s AS distance "
            "FROM items ORDER BY embedding <=> %s LIMIT 5",
            (query_vec, query_vec))
results = cur.fetchall()

Python — Azure OpenAI Embeddings

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://myendpoint.openai.azure.com/",
    api_key="your-api-key",
    api_version="2024-02-01"
)

response = client.embeddings.create(
    input=["text to embed"],
    model="text-embedding-ada-002"
)

embedding = response.data[0].embedding  # list of 1536 floats

Important Defaults

pgvector Dimension Limits

pgvector Version Max Dimensions
0.5.x 2,000
0.7.0+ 16,000

Azure OpenAI Embedding Models

Model Default Dimensions Notes
text-embedding-ada-002 1,536 Fixed dimensions
text-embedding-3-small 1,536 Supports dimensions parameter
text-embedding-3-large 3,072 Supports dimensions parameter (can reduce)

Vector Storage

Formula Example (1536 dims)
4 x dimensions + 8 bytes per vector 4 x 1536 + 8 = 6,152 bytes (~6 KB)

Index Defaults

Setting Default
IVFFlat lists Must be specified (guideline: rows / 1000 for < 1M rows, sqrt(rows) for > 1M)
HNSW m 16
HNSW ef_construction 64
HNSW ef_search (query-time) 40
IVFFlat probes (query-time) 1

Common Exam Gotchas

Extension Setup Gotchas

  1. Must allowlist pgvector in azure.extensions BEFORE running CREATE EXTENSION. Without the server parameter, the extension install fails.
  2. Vector literal format uses square brackets: '[0.1, 0.2, 0.3]' — not curly braces or parentheses.
  3. register_vector(conn) must be called per connection to use Python lists or numpy arrays as vector parameters.

Distance and Similarity Gotchas

  1. Cosine distance <=> returns distance (0 = identical), not similarity (1 = identical). To get similarity: 1 - (embedding <=> query_vec).
  2. Negative inner product <#> is negated so that ORDER BY ASC returns highest similarity first.
  3. Azure OpenAI embeddings are already L2-normalized — cosine distance and inner product give the same ranking.

Dimension and Model Gotchas

  1. pgvector 0.5 supports max 2,000 dimensions; 0.7+ supports max 16,000 dimensions. Know which version is deployed.
  2. text-embedding-3-large outputs 3,072 dimensions by default but supports the dimensions parameter to reduce output size.

Indexing Gotchas

  1. IVFFlat requires data before building the index; HNSW can be built on an empty table. If you create an IVFFlat index on an empty table, queries will return no results.
  2. Increasing IVFFlat probes improves recall but slows queries. Default is 1 — typically too low for production.