The quickstart is the simplest way to get started with DiskANN in SQL Server. It requires no external resources and is a good way to become familiar with the DiskANN syntax and capabilities. Once you are comfortable with the quickstart, explore the Wikipedia sample in this folder, which provides a complete end-to-end example.
SQL Server 2025 introduces a new VECTOR_SEARCH function that allows you to perform approximate nearest neighbor search using the DiskANN algorithm. This function is designed to work with vector columns in SQL Server, enabling efficient similarity search on high-dimensional data.
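As a rough illustration of the flow described above, the following T-SQL sketch creates a small table, builds a DiskANN index, and runs an approximate search. The table `dbo.sample_docs`, its columns, the index name, and the 3-dimensional vectors are hypothetical and chosen only for readability. The syntax follows the SQL Server 2025 preview and may change; preview builds may also impose limitations on vector indexes, so check the current documentation before relying on it.

```sql
-- Assumption: vector index features are gated behind the preview flag.
ALTER DATABASE SCOPED CONFIGURATION SET PREVIEW_FEATURES = ON;
GO

-- Hypothetical table; a single-column integer clustered primary key is assumed.
DROP TABLE IF EXISTS dbo.sample_docs;
CREATE TABLE dbo.sample_docs
(
    id        INT NOT NULL CONSTRAINT pk_sample_docs PRIMARY KEY CLUSTERED,
    title     NVARCHAR(100) NOT NULL,
    embedding VECTOR(3) NOT NULL
);
GO

-- Illustrative rows only; real embeddings typically have hundreds of dimensions.
INSERT INTO dbo.sample_docs (id, title, embedding) VALUES
    (1, N'alpha', CAST('[0.10, 0.20, 0.30]' AS VECTOR(3))),
    (2, N'beta',  CAST('[0.90, 0.10, 0.05]' AS VECTOR(3))),
    (3, N'gamma', CAST('[0.12, 0.18, 0.33]' AS VECTOR(3)));
GO

-- Build the approximate (DiskANN) index on the vector column.
CREATE VECTOR INDEX vec_idx ON dbo.sample_docs (embedding)
WITH (METRIC = 'cosine', TYPE = 'diskann');
GO

-- Query vector; in practice this comes from an embedding model.
DECLARE @qv VECTOR(3) = CAST('[0.10, 0.20, 0.30]' AS VECTOR(3));

-- Approximate nearest neighbor search: return the 2 closest rows.
SELECT t.id, t.title, s.distance
FROM VECTOR_SEARCH(
        TABLE      = dbo.sample_docs AS t,
        COLUMN     = embedding,
        SIMILAR_TO = @qv,
        METRIC     = 'cosine',
        TOP_N      = 2
     ) AS s
ORDER BY s.distance;
```

The `METRIC` passed to `VECTOR_SEARCH` is kept the same as the one used when creating the index, so that the index can serve the query.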
The samples in this folder demonstrate how to use the VECTOR_SEARCH function with DiskANN. The samples include:
- Creating a table with a vector column, importing data from a CSV file, and inserting data into the table.
- Creating an approximate vector index on the table using the `CREATE VECTOR INDEX` statement.
- Performing approximate nearest neighbor search using the `VECTOR_SEARCH` function.
- Performing hybrid search using the `VECTOR_SEARCH` function along with full-text search.
- Semantic reranking using the Cohere rerank model via the `sp_invoke_external_rest_endpoint` stored procedure. For more details on semantic reranking, refer to the Semantic Reranking Sample.
- Use half-precision floating-point values to store embeddings, giving a more compact representation of vectors.
- Use the Vectorizer to generate embeddings for text data.
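For the half-precision item in the list above, a minimal sketch of a table declaration might look like the following. The table and column names are hypothetical, the 1536-dimension size is only an example, and the `float16` base type is assumed to require the preview feature flag. Half precision stores each dimension in 2 bytes instead of the 4 bytes used by single precision, at the cost of reduced numeric precision.

```sql
-- Assumption: the float16 base type is a preview feature and must be enabled.
ALTER DATABASE SCOPED CONFIGURATION SET PREVIEW_FEATURES = ON;
GO

-- Hypothetical table storing embeddings as half-precision vectors.
DROP TABLE IF EXISTS dbo.sample_docs_fp16;
CREATE TABLE dbo.sample_docs_fp16
(
    id        INT NOT NULL CONSTRAINT pk_sample_docs_fp16 PRIMARY KEY CLUSTERED,
    -- The second argument selects the base type; the default is float32.
    embedding VECTOR(1536, float16) NOT NULL
);
GO
```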
To quickly generate embeddings for existing text data, you can use the Vectorizer, which is available as a sample open-source project: azure-sql-db-vectorizer
A full end-to-end sample using Streamlit is available here: https://github.com/Azure-Samples/azure-sql-diskann