feat: Add select-algorithm samples for DocumentDB vector index selection (5 languages) #74
Merged
Add DocumentDB vector index algorithm selection samples demonstrating HNSW, IVF, and DiskANN index types across TypeScript, Python, Go, Java, and .NET. Each sample creates indexes with documented defaults, performs vector searches, and compares results. CI updated to validate all new samples in the existing workflow matrix.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
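As a rough illustration of what "creates indexes with documented defaults" can look like, the sketch below builds a `createIndexes` command document for each of the three algorithms. It is not taken from the samples in this PR. The option names (`m`, `efConstruction`, `numLists`, `maxDegree`, `lBuild`) and the default values are assumptions based on the public vector search documentation, and the collection name `hotels`, the field name `contentVector`, and the 1536-dimension default are hypothetical.

```python
from typing import Any

# Per-algorithm index options. These values are ASSUMED defaults, not values
# read from the samples in this PR; check the product docs before relying on them.
ALGORITHM_OPTIONS: dict[str, dict[str, Any]] = {
    "hnsw": {"kind": "vector-hnsw", "m": 16, "efConstruction": 64},
    "ivf": {"kind": "vector-ivf", "numLists": 100},
    "diskann": {"kind": "vector-diskann", "maxDegree": 32, "lBuild": 50},
}


def build_create_index_command(
    collection: str,
    algorithm: str,
    *,
    field: str = "contentVector",  # hypothetical embedding field name
    dimensions: int = 1536,        # hypothetical embedding size
    similarity: str = "COS",
) -> dict[str, Any]:
    """Return a createIndexes command document for one vector index."""
    if algorithm not in ALGORITHM_OPTIONS:
        raise ValueError(f"unknown algorithm: {algorithm!r}")
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")

    # Merge the algorithm-specific options with the shared ones.
    options = {
        **ALGORITHM_OPTIONS[algorithm],
        "similarity": similarity,
        "dimensions": dimensions,
    }
    return {
        "createIndexes": collection,
        "indexes": [
            {
                "name": f"vector_{algorithm}_{similarity.lower()}",
                "key": {field: "cosmosSearch"},
                "cosmosSearchOptions": options,
            }
        ],
    }
```

With PyMongo, a command document like this could be passed to `db.command(...)`. Building the document in a pure function keeps the per-algorithm differences in one table and makes them checkable without a live cluster.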
- Rename MONGO_CLUSTER_NAME to DOCUMENTDB_CLUSTER_NAME in all 5 language samples
- Add DOCUMENTDB_CLUSTER_NAME dual-output in Bicep (preserves backward compat)
- Replace Data Explorer cleanup guidance with VS Code extension
- Strengthen algorithm guidance: DiskANN recommended for enterprise (16K dims, disk-based)
- Remove python-dotenv from pip install (repo rule Azure-Samples#10)
- Fix Python filename refs (select_algorithm.py -> compare_all.py)
- Revert out-of-scope vector-search-* changes to origin/main

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use 'Azure Databases extension' consistently in all 5 language quickstarts to match the actual marketplace listing (ms-azuretools.vscode-cosmosdb). The section intro previously said 'Azure DocumentDB extension' while the link tab already used the correct 'Azure Databases extension' name.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace Azure Databases extension (ms-azuretools.vscode-cosmosdb) with DocumentDB for VS Code (ms-azuretools.vscode-documentdb) per Khelan's PM feedback to align with recommended developer experience.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tructions
- Add rule 14: always use ms-azuretools.vscode-documentdb
- Remove ms-azuretools.vscode-cosmosdb exception from rule 1

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Create ai/includes/choosing-algorithm.md with enhanced content
- Add quick-reference decision table (IVF/DiskANN/HNSW by scenario)
- Elevate DiskANN-as-default recommendation with IMPORTANT callout
- Add operational benefits: easier backups, faster recovery
- Add dimension future-proofing context (models evolving past 8K)
- Replace duplicated sections in all 5 quickstarts with include ref
- Addresses Khelan Modi feedback points #3 and #4

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Checklist covers branding/naming, tooling references, index selection guidance, and DiskANN-as-default requirements. Derived from Khelan Modi (DocumentDB PM) feedback on PR Azure-Samples#74.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add bounded retry logic (5 attempts, 2s backoff) for index readiness in all 5 languages
- Fix Go: validate LOAD_SIZE_BATCH/EMBEDDING_DIMENSIONS > 0, track comparison failures
- Fix TypeScript: exit non-zero on total failure, remove 'all' as valid algo/similarity value
- Fix Python quickstart: correct download URL path (ai/data/ not data/)
- Standardize data file path guidance across all quickstarts
- Remove ALGORITHM=all / SIMILARITY=all from all docs (use unset for all combos)
- Fix quickstart entrypoints to match actual code (TS, Java, Go, .NET)
- Replace .NET appsettings real values with placeholders, document Section__Key overrides
- Align copilot-instructions: DiskANN 32/50 for select-algorithm, document naming exception

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
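The bounded retry described in this commit (5 attempts, 2s backoff) could be sketched as below. This is a minimal illustration, not the code from the samples. It assumes the backoff is a fixed 2-second delay between attempts, and the helper name `retry_until_ready` and its `sleep` parameter are hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_until_ready(
    operation: Callable[[], T],
    *,
    attempts: int = 5,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds or the attempt budget is spent.

    The delay between attempts is fixed (an assumption; the commit message
    does not say whether the backoff is fixed or growing).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:  # e.g. a search issued before the index is ready
            last_error = exc
            if attempt < attempts:
                # Wait only when another attempt will follow.
                sleep(delay_seconds)
    raise RuntimeError(f"operation failed after {attempts} attempts") from last_error
```

Injecting `sleep` keeps the helper testable without real waiting, and the hard attempt limit means an index that never becomes ready surfaces as an error instead of a hang.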
…v patterns
- Add exit-code-on-all-fail to .NET, Java, Python (matching Go/TS)
- Replace all 5 quickstart output blocks with actual output/*.txt content
- Fix file tree layouts to match actual project structure
- Fix version refs: .NET 8 (not 9), Java 17 (not 21)
- Remove dotenv/.env-file patterns (Java dotenv, TS --env-file)
- Fix devcontainer extensions: vscode-cosmosdb -> vscode-documentdb
- Fix Python CosmosDB branding -> DocumentDB
- Standardize TS retry to 6 attempts, remove fixed waits
- Make TS scalar indexes optional (skip in compare-all)
- Clarify compare-all always runs 9 combos (ignores ALGORITHM/SIMILARITY)
- Add Diff column explanation to all quickstarts

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
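The compare-all behavior in this commit (always 9 combinations, non-zero exit only when every one fails) could look roughly like the sketch below. It is illustrative only. The three similarity values `COS`, `L2`, and `IP` are an assumption inferred from "9 combos" across 3 algorithms, and `compare_all` and `run_one` are hypothetical names.

```python
from itertools import product
from typing import Callable

ALGORITHMS = ("ivf", "hnsw", "diskann")
SIMILARITIES = ("COS", "L2", "IP")  # ASSUMED similarity set; 3 x 3 gives the 9 combos


def compare_all(run_one: Callable[[str, str], None]) -> int:
    """Run every algorithm/similarity pair and return a process exit code.

    Returns 1 only when all combinations fail, so a partial failure still
    produces a comparison and a zero exit code.
    """
    combos = list(product(ALGORITHMS, SIMILARITIES))
    failures = 0
    for algorithm, similarity in combos:
        try:
            run_one(algorithm, similarity)
        except Exception as exc:
            failures += 1
            print(f"{algorithm}/{similarity} failed: {exc}")
    return 1 if failures == len(combos) else 0
```

An entry point would typically end with `sys.exit(compare_all(run_one))`, which lets CI detect a run in which nothing succeeded.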
- Python: updated scores to match pymongo float precision (0.6183/0.5057/0.8735/0.9942)
- .NET: added Summary line and Done footer to output
- Java: fixed output order (table before cleanup), DISKANN casing, added Summary
- devcontainer: removed stale vscode-cosmosdb extension
- appsettings.json: reverted to placeholder values

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…count phrasing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Adds select-algorithm samples demonstrating how to choose the optimal vector index algorithm (HNSW, IVF, DiskANN) for Azure Cosmos DB for MongoDB (vCore) / DocumentDB, in 5 languages: TypeScript, Python, Go, Java, and .NET.
What's included
- ai/select-algorithm-typescript/
- ai/select-algorithm-python/
- ai/select-algorithm-go/
- ai/select-algorithm-java/
- ai/select-algorithm-dotnet/

Each sample includes:
CI
- validate-samples.yml: Builds all 5 language samples on PR and push

Key parameters (from docs)
Documentation references