
Commit 84c6ffa

diberry and Copilot committed
fix(java): fix OIDC auth and compilation errors, re-capture all output with UTF-8
- Fix Java OIDC auth: use callback pattern matching vector-search-java
- Fix Java compile: pass MongoDatabase to createIndex, handle InterruptedException
- Re-run all 5 language samples and capture output with proper UTF-8 encoding
- Fix garbled Unicode characters in TypeScript, Python, Go output files

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 609cd61 commit 84c6ffa

6 files changed

Lines changed: 169 additions & 145 deletions


ai/select-algorithm-go/output/compare_all.txt

Lines changed: 24 additions & 24 deletions
@@ -1,5 +1,5 @@
11
======================================================================
2-
COMPARE ALL: 3 Algorithms × 3 Similarity Metrics (9 combinations)
2+
COMPARE ALL: 3 Algorithms × 3 Similarity Metrics (9 combinations)
33
======================================================================
44
Query: "luxury hotel near the beach"
55
Top K: 3
@@ -16,24 +16,24 @@ Generating query embedding...
1616
Embedding generated (1536 dimensions)
1717

1818
Running 9 vector searches (create/search/drop per combo)...
19-
Γ£ô vector_ivf_cos (created)
20-
Γ£ù vector_ivf_cos (dropped)
21-
Γ£ô vector_ivf_l2 (created)
22-
Γ£ù vector_ivf_l2 (dropped)
23-
Γ£ô vector_ivf_ip (created)
24-
Γ£ù vector_ivf_ip (dropped)
25-
Γ£ô vector_hnsw_cos (created)
26-
Γ£ù vector_hnsw_cos (dropped)
27-
Γ£ô vector_hnsw_l2 (created)
28-
Γ£ù vector_hnsw_l2 (dropped)
29-
Γ£ô vector_hnsw_ip (created)
30-
Γ£ù vector_hnsw_ip (dropped)
31-
Γ£ô vector_diskann_cos (created)
32-
Γ£ù vector_diskann_cos (dropped)
33-
Γ£ô vector_diskann_l2 (created)
34-
Γ£ù vector_diskann_l2 (dropped)
35-
Γ£ô vector_diskann_ip (created)
36-
Γ£ù vector_diskann_ip (dropped)
19+
vector_ivf_cos (created)
20+
vector_ivf_cos (dropped)
21+
vector_ivf_l2 (created)
22+
vector_ivf_l2 (dropped)
23+
vector_ivf_ip (created)
24+
vector_ivf_ip (dropped)
25+
vector_hnsw_cos (created)
26+
vector_hnsw_cos (dropped)
27+
vector_hnsw_l2 (created)
28+
vector_hnsw_l2 (dropped)
29+
vector_hnsw_ip (created)
30+
vector_hnsw_ip (dropped)
31+
vector_diskann_cos (created)
32+
vector_diskann_cos (dropped)
33+
vector_diskann_l2 (created)
34+
vector_diskann_l2 (dropped)
35+
vector_diskann_ip (created)
36+
vector_diskann_ip (dropped)
3737

3838
====================================================================================================
3939
COMPARISON RESULTS
@@ -50,16 +50,16 @@ Running 9 vector searches (create/search/drop per combo)...
5050
DISKANN L2 Ocean Water Resort &.. 0.8736 Windy Ocean Motel 0.9943 -0.1208
5151
DISKANN IP Ocean Water Resort &.. 0.6184 Windy Ocean Motel 0.5056 0.1128
5252

53-
🎯 Highest #1 score: IVF/COS (0.6184)
54-
📊 Biggest separation: IVF/COS (diff: 0.1128)
53+
🎯 Highest #1 score: IVF/COS (0.6184)
54+
📊 Biggest separation: IVF/COS (diff: 0.1128)
5555

5656
====================================================================================================
5757
KEY INSIGHTS
5858
====================================================================================================
59-
🔑 All algorithms return the same top results — algorithm choice
59+
🔑 All algorithms return the same top results algorithm choice
6060
affects performance at scale, not accuracy on small datasets.
61-
📐 COS and IP produce identical scores (normalized embeddings).
62-
📏 L2 scores are distances (lower = closer), not similarities.
61+
📐 COS and IP produce identical scores (normalized embeddings).
62+
📏 L2 scores are distances (lower = closer), not similarities.
6363
====================================================================================================
6464

6565
Cleanup: dropping collection 'hotels'...
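
Reviewer note: the removed lines show `Γ£ô` and `Γ£ù` where check and cross marks were expected. That pattern is what results when the UTF-8 bytes of U+2713 and U+2717 are decoded as code page 437. This is an inference from the byte values, not something the commit states. A minimal Java sketch that reproduces the effect and writes through an explicitly UTF-8 stream (the class and method names are hypothetical):

```java
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    /** Encodes text as UTF-8, then decodes those bytes with a different charset. */
    static String misdecode(String text, Charset wrongCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, wrongCharset);
    }

    public static void main(String[] args) {
        // Wrap stdout in a UTF-8 stream so output does not depend on the
        // platform default encoding.
        PrintStream out = new PrintStream(System.out, true, StandardCharsets.UTF_8);
        out.println("\u2713 vector_ivf_cos (created)");

        // IBM437 is an optional charset, so check for it before use.
        if (Charset.isSupported("IBM437")) {
            Charset cp437 = Charset.forName("IBM437");
            // Assumption: the capture tool decoded the bytes as CP437.
            out.println(misdecode("\u2713", cp437) + " vector_ivf_cos (created)");
        }
    }
}
```

Whether the original capture actually went through a CP437 console is not confirmed by the diff; the sketch only shows that the observed characters are consistent with that path.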
Lines changed: 55 additions & 45 deletions
@@ -1,58 +1,68 @@
1-
============================================================
2-
Compare All Algorithms × Metrics
3-
9 combinations: IVF, HNSW, DiskANN × COS, L2, IP
4-
============================================================
1+
==============================================
2+
Azure DocumentDB - Compare All Algorithms
3+
==============================================
4+
Query: "luxury hotel near the beach"
5+
Top K: 3
6+
Metrics: COS, L2, IP
7+
Algos: IVF, HNSW, DiskANN
58

6-
Loaded 50 documents with embeddings
7-
Inserted 50/50 documents
9+
Loading data from: ../data/Hotels_Vector.json
10+
Loaded 50 documents
11+
Inserting 50 documents in batches of 100...
12+
Inserted batch 1-50
13+
Data insertion complete.
814

9-
Query: "luxury hotel near the beach"
10-
Top K: 3
11-
Embedding generated (reused for all searches)
15+
Generating embedding for: "luxury hotel near the beach"
16+
Embedding generated (1536 dimensions)
1217

13-
Running searches (create/search/drop per combo)...
14-
✔ vector_ivf_cos (created)
15-
✗ vector_ivf_cos (dropped)
16-
✔ vector_ivf_l2 (created)
17-
✗ vector_ivf_l2 (dropped)
18-
✔ vector_ivf_ip (created)
19-
✗ vector_ivf_ip (dropped)
20-
✔ vector_hnsw_cos (created)
21-
✗ vector_hnsw_cos (dropped)
22-
✔ vector_hnsw_l2 (created)
23-
✗ vector_hnsw_l2 (dropped)
24-
✔ vector_hnsw_ip (created)
25-
✗ vector_hnsw_ip (dropped)
26-
✔ vector_diskann_cos (created)
27-
✗ vector_diskann_cos (dropped)
28-
✔ vector_diskann_l2 (created)
29-
✗ vector_diskann_l2 (dropped)
30-
✔ vector_diskann_ip (created)
31-
✗ vector_diskann_ip (dropped)
18+
Running searches (create/search/drop per combo)...
19+
✓ vector_ivf_cos (created)
20+
✗ vector_ivf_cos (dropped)
21+
✓ vector_ivf_l2 (created)
22+
✗ vector_ivf_l2 (dropped)
23+
✓ vector_ivf_ip (created)
24+
✗ vector_ivf_ip (dropped)
25+
✓ vector_hnsw_cos (created)
26+
✗ vector_hnsw_cos (dropped)
27+
✓ vector_hnsw_l2 (created)
28+
✗ vector_hnsw_l2 (dropped)
29+
✓ vector_hnsw_ip (created)
30+
✗ vector_hnsw_ip (dropped)
31+
✓ vector_diskann_cos (created)
32+
✗ vector_diskann_cos (dropped)
33+
✓ vector_diskann_l2 (created)
34+
✗ vector_diskann_l2 (dropped)
35+
✓ vector_diskann_ip (created)
36+
✗ vector_diskann_ip (dropped)
37+
38+
Cleanup: dropping comparison collection...
39+
Cleanup: dropped collection 'hotels'
3240

3341
╔════════════════════════════════════════════════════════════════════════════════════════════════════════╗
34-
║ COMPARISON TABLE — All Algorithms × Metrics
42+
║ COMPARISON TABLE — All Algorithms × Metrics ║
3543
╠════════════════════════════════════════════════════════════════════════════════════════════════════════╣
36-
║ ALGO SIMILAR. #1 RESULT #1 SCORE #2 RESULT #2 SCORE DIFF ║
44+
║ ALGO SIMILAR. #1 RESULT #1 SCORE #2 RESULT #2 SCORE DIFF
3745
╠════════════════════════════════════════════════════════════════════════════════════════════════════════╣
38-
║ IVF COS Ocean Water Resort &.. 0.6184 Windy Ocean Motel 0.5056 0.1128 ║
39-
║ IVF L2 Ocean Water Resort &.. 0.8736 Windy Ocean Motel 0.9943 -0.1208
40-
║ IVF IP Ocean Water Resort &.. 0.6184 Windy Ocean Motel 0.5056 0.1128
41-
║ HNSW COS Ocean Water Resort &.. 0.6184 Windy Ocean Motel 0.5056 0.1128 ║
42-
║ HNSW L2 Ocean Water Resort &.. 0.8736 Windy Ocean Motel 0.9943 -0.1208
43-
║ HNSW IP Ocean Water Resort &.. 0.6184 Windy Ocean Motel 0.5056 0.1128
44-
DiskANN COS Ocean Water Resort &.. 0.6184 Windy Ocean Motel 0.5056 0.1128 ║
45-
DiskANN L2 Ocean Water Resort &.. 0.8736 Windy Ocean Motel 0.9943 -0.1208
46-
DiskANN IP Ocean Water Resort &.. 0.6184 Windy Ocean Motel 0.5056 0.1128
46+
║ IVF COS Ocean Water Resort &.. 0.6184 Windy Ocean Motel 0.5057 0.1128
47+
║ IVF L2 Ocean Water Resort &.. 0.8735 Windy Ocean Motel 0.9942 -0.1207
48+
║ IVF IP Ocean Water Resort &.. 0.6183 Windy Ocean Motel 0.5056 0.1127
49+
║ HNSW COS Ocean Water Resort &.. 0.6184 Windy Ocean Motel 0.5057 0.1128
50+
║ HNSW L2 Ocean Water Resort &.. 0.8735 Windy Ocean Motel 0.9942 -0.1207
51+
║ HNSW IP Ocean Water Resort &.. 0.6183 Windy Ocean Motel 0.5056 0.1127
52+
DISKANN COS Ocean Water Resort &.. 0.6184 Windy Ocean Motel 0.5057 0.1128
53+
DISKANN L2 Ocean Water Resort &.. 0.8735 Windy Ocean Motel 0.9942 -0.1207
54+
DISKANN IP Ocean Water Resort &.. 0.6183 Windy Ocean Motel 0.5056 0.1127
4755
╠════════════════════════════════════════════════════════════════════════════════════════════════════════╣
48-
🎯 Highest score: IVF/COS (0.6184) ║
49-
📊 Biggest separation: 0.1128 ║
56+
Highest score: IVF/COS (0.6184)
57+
Biggest separation: 0.1128
5058
╠════════════════════════════════════════════════════════════════════════════════════════════════════════╣
5159
║ KEY INSIGHTS ║
52-
🔑 All algorithms return the same top results — algorithm choice ║
60+
All algorithms return the same top results — algorithm choice
5361
║ affects performance at scale, not accuracy on small datasets. ║
54-
📐 COS and IP produce identical scores (normalized embeddings). ║
55-
📏 L2 scores are distances (lower = closer), not similarities. ║
62+
COS and IP produce identical scores (normalized embeddings).
63+
L2 scores are distances (lower = closer), not similarities.
5664
╚════════════════════════════════════════════════════════════════════════════════════════════════════════╝
5765

58-
Cleanup: dropped collection 'hotels'
66+
==============================================
67+
Comparison complete.
68+
==============================================
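
Reviewer note: the KEY INSIGHTS lines say COS and IP agree for normalized embeddings and that L2 is a distance. For unit-length vectors the three metrics are tied together: the inner product equals the cosine similarity c, and the L2 distance equals sqrt(2 - 2c). Plugging the reported COS score 0.6184 into that identity gives roughly 0.874, in line with the L2 column. A small Java sketch on made-up vectors (not the hotel embeddings; all names are hypothetical):

```java
final class MetricRelations {
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Returns a copy of v scaled to unit length. */
    static double[] normalize(double[] v) {
        double length = Math.sqrt(dot(v, v));
        double[] unit = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            unit[i] = v[i] / length;
        }
        return unit;
    }

    static double cosine(double[] a, double[] b) {
        return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
    }

    static double l2(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** L2 distance implied by a cosine similarity, valid only for unit vectors. */
    static double l2FromCosine(double cosineSimilarity) {
        return Math.sqrt(2.0 - 2.0 * cosineSimilarity);
    }

    public static void main(String[] args) {
        // Illustrative vectors only.
        double[] a = normalize(new double[] {1.0, 2.0, 3.0});
        double[] b = normalize(new double[] {2.0, 1.0, 0.5});
        System.out.printf("COS=%.4f IP=%.4f L2=%.4f L2 from COS=%.4f%n",
                cosine(a, b), dot(a, b), l2(a, b), l2FromCosine(cosine(a, b)));
    }
}
```

The identity holds only when both vectors have unit length. Small last-digit differences between columns, like those in the table, would be expected from rounding or from approximate index search.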

ai/select-algorithm-java/src/main/java/com/azure/documentdb/selectalgorithm/CompareAll.java

Lines changed: 4 additions & 4 deletions
@@ -81,9 +81,9 @@ public static void run() {
  String indexName = String.format("vector_%s_%s", algo, metric.toLowerCase());

  // Create index for this combo
- createIndex(collection, vectorField, dimensions, algo, metric);
+ createIndex(database, collection, vectorField, dimensions, algo, metric);
  System.out.printf(" ✓ %s (created)%n", indexName);
- Thread.sleep(2000);
+ try { Thread.sleep(2000); } catch (InterruptedException ie) { Thread.currentThread().interrupt(); }

  // Search
  List<Document> searchResults = performSearch(
@@ -132,7 +132,7 @@ public static void run() {
      printComparisonTable(results, topK);
  }

- private static void createIndex(MongoCollection<Document> collection,
+ private static void createIndex(MongoDatabase database, MongoCollection<Document> collection,
          String vectorField, int dimensions,
          String algo, String metric) {
      String indexName = String.format("vector_%s_%s", algo, metric.toLowerCase());
@@ -164,7 +164,7 @@ private static void createIndex(MongoCollection<Document> collection,
          .append("indexes", List.of(indexDefinition));

  try {
-     collection.getDatabase().runCommand(command);
+     database.runCommand(command);
  } catch (Exception e) {
      // Idempotent: ignore if index already exists
      if (!e.getMessage().contains("already exists")) {
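
Reviewer note: two patterns in this file could be factored into small helpers. The sketch below is a suggestion and not part of the commit. `sleepQuietly` and `isAlreadyExists` are hypothetical names. The first restores the interrupt flag as the new line 86 does, and also tells the caller whether the sleep completed. The second guards against `getMessage()` returning null, which would otherwise raise a NullPointerException inside the catch block.

```java
final class RetryHelpers {
    /**
     * Sleeps for the given time. If the thread is interrupted, restores the
     * interrupt flag and returns false so the caller can decide to stop.
     */
    static boolean sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Null-safe check for the "already exists" text used by the idempotency guard. */
    static boolean isAlreadyExists(Throwable error) {
        String message = error.getMessage();
        return message != null && message.contains("already exists");
    }

    public static void main(String[] args) {
        boolean completed = sleepQuietly(10);
        System.out.println("sleep completed: " + completed);
        System.out.println(isAlreadyExists(new IllegalStateException("Index already exists")));
        System.out.println(isAlreadyExists(new IllegalStateException()));
    }
}
```

Matching on message text is fragile in general. If the driver exposes a stable error code for this case, checking that would likely be more robust, but the diff does not show one.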

ai/select-algorithm-java/src/main/java/com/azure/documentdb/selectalgorithm/Utils.java

Lines changed: 20 additions & 6 deletions
@@ -44,15 +44,29 @@ public static MongoClient getMongoClient() {
      throw new IllegalStateException("MONGO_CLUSTER_NAME environment variable is required");
  }

- String connectionUri = String.format(
-         "mongodb+srv://%s.global.mongocluster.cosmos.azure.com/", clusterName);
+ String managedIdentityPrincipalId = getEnv("AZURE_MANAGED_IDENTITY_CLIENT_ID", "");

- DefaultAzureCredential credential = new DefaultAzureCredentialBuilder().build();
+ DefaultAzureCredential azureCredential = new DefaultAzureCredentialBuilder().build();
+
+ MongoCredential.OidcCallback callback = (MongoCredential.OidcCallbackContext context) -> {
+     var token = azureCredential.getToken(
+             new com.azure.core.credential.TokenRequestContext()
+                     .addScopes("https://ossrdbms-aad.database.windows.net/.default")
+     ).block();
+
+     if (token == null) {
+         throw new RuntimeException("Failed to obtain Azure AD token");
+     }
+
+     return new MongoCredential.OidcCallbackResult(token.getToken());
+ };

  MongoCredential mongoCredential = MongoCredential.createOidcCredential(null)
-         .withMechanism(MongoCredential.MONGODB_OIDC_MECHANISM)
-         .withMechanismProperty("ENVIRONMENT", "azure")
-         .withMechanismProperty("TOKEN_RESOURCE", "https://ossrdbms-aad.database.windows.net");
+         .withMechanismProperty("OIDC_CALLBACK", callback);
+
+ String connectionUri = String.format(
+         "mongodb+srv://%s@%s.mongocluster.cosmos.azure.com/?authMechanism=MONGODB-OIDC&tls=true&retrywrites=false&maxIdleTimeMS=120000",
+         managedIdentityPrincipalId, clusterName);

  MongoClientSettings settings = MongoClientSettings.builder()
          .applyConnectionString(new ConnectionString(connectionUri))
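
Reviewer note: `AZURE_MANAGED_IDENTITY_CLIENT_ID` defaults to an empty string, so the formatted URI would start with `mongodb+srv://@...` when the variable is unset. A minimal sketch of a URI builder that omits the user-info part in that case and percent-encodes it otherwise. `ConnectionUriSketch` and `buildUri` are hypothetical names, and the host suffix and query string are copied from the diff. Whether the service accepts a URI without a user name is not confirmed by the diff, so treat this as a suggestion only.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class ConnectionUriSketch {
    private static final String HOST_SUFFIX = ".mongocluster.cosmos.azure.com";
    private static final String OPTIONS =
            "authMechanism=MONGODB-OIDC&tls=true&retrywrites=false&maxIdleTimeMS=120000";

    /** Builds the connection URI, skipping the user-info part when clientId is blank. */
    static String buildUri(String clientId, String clusterName) {
        if (clusterName == null || clusterName.isBlank()) {
            throw new IllegalArgumentException("clusterName is required");
        }
        String userInfo = "";
        if (clientId != null && !clientId.isBlank()) {
            // Percent-encode so reserved characters cannot break the URI.
            userInfo = URLEncoder.encode(clientId.trim(), StandardCharsets.UTF_8) + "@";
        }
        return "mongodb+srv://" + userInfo + clusterName.trim() + HOST_SUFFIX + "/?" + OPTIONS;
    }

    public static void main(String[] args) {
        // Placeholder values for illustration.
        System.out.println(buildUri("00000000-0000-0000-0000-000000000000", "mycluster"));
        System.out.println(buildUri("", "mycluster"));
    }
}
```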
