 export NUM_PER_BATCH=4096 # default batch size for enVector
 ```

-## Run Benchmark
+## Run Our ANN Benchmark
+
+We provide an enVector-customized ANN approach, called "GAS", designed for efficient IVF-FLAT-based ANN search over an encrypted index.
+We evaluated enVector on two benchmark datasets that we provide:
+- `PUBMED768D400K`
+- `BLOOMBERG768D368K`
+
+Run the provided shell script (`./scripts/run_benchmark.sh`) as follows:
+
+```bash
+./scripts/run_benchmark.sh --type flat # FLAT
+./scripts/run_benchmark.sh --type ivf # IVF-FLAT with enVector-customized ANN (GAS)
+```
+
+For more details on benchmarking enVector with ANN (GAS), refer to `run_benchmark.sh` or `envector_{benchmark}_config.yml` in the `scripts` directory, or use the following command:

-Refer to `./scripts/run_benchmark.sh` or `./scripts/envector_benchmark_config.yml` for benchmarks with enVector with ANN (VCT), or use the following command:

 ```bash
-export NUM_PER_BATCH=500000 # set to the database size for efficiency with IVF_FLAT
+export NUM_PER_BATCH=500000 # set to the database size when using IVF_FLAT
 --centroids-path "./centroids/kmeans_centroids.npy" \
+--nlist 250 \
+--nprobe 6 # centroids (--centroids-path) built by sklearn, etc.
+```
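
The `--centroids-path` option above points at a `.npy` file of centroids "built by sklearn, etc." The sketch below shows one way such a file might be produced with scikit-learn's `KMeans`. The helper name `build_centroids`, the toy random vectors, and the `(nlist, dim)` float32 layout are assumptions for illustration only; the layout enVector actually expects is not stated here and should be checked against the benchmark scripts.

```python
import os
import tempfile

import numpy as np
from sklearn.cluster import KMeans


def build_centroids(vectors, nlist, out_path, seed=0):
    """Fit k-means with `nlist` clusters and save the centroids as a .npy file.

    Hypothetical helper: the (nlist, dim) float32 layout is an assumption.
    """
    kmeans = KMeans(n_clusters=nlist, n_init=10, random_state=seed)
    kmeans.fit(vectors)
    centroids = kmeans.cluster_centers_.astype(np.float32)

    # Create the target directory (e.g. ./centroids) if it does not exist yet.
    out_dir = os.path.dirname(out_path)
    if out_dir:
        os.makedirs(out_dir, exist_ok=True)
    np.save(out_path, centroids)
    return centroids


# Toy stand-in for real embeddings; replace with the dataset's vectors.
rng = np.random.default_rng(0)
toy_vectors = rng.standard_normal((1000, 16)).astype(np.float32)

# Written to a temporary directory here; a real run would use the path
# passed to --centroids-path.
demo_path = os.path.join(tempfile.mkdtemp(), "kmeans_centroids.npy")
demo_centroids = build_centroids(toy_vectors, nlist=8, out_path=demo_path)
print(demo_centroids.shape)
```

For a real run, `nlist` should match the value passed to `--nlist`, and the vectors should come from the same embedding model as the benchmark dataset.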
+
+Note that the benchmark datasets provided by VectorDBBench, including Performance1536D500K, use an **unknown** embedding model (described only as an OpenAI model), so we cannot use our GAS approach for ANN on them.
+
+### CLI Options
+
+enVector Types for VectorDBBench
+- `envectorflat`: FLAT as index type for enVector
+- `envectorivfflat`: IVF_FLAT as index type for enVector
+
+Common Options for enVector
+- `--uri`: enVector server URI
+- `--eval-mode`: FHE evaluation mode on server. Use `mm` for enhanced performance.
+
+ANN Options for enVector
+- `--nlist`: Number of coarse clusters for IVF_FLAT
+- `--nprobe`: Number of clusters to scan during search for IVF_FLAT
+- `--train-centroids`: Whether to use trained centroids for IVF_FLAT
+- `--centroids-path`: Path to the trained centroids
+- `--is-vct`: Whether to use the VCT approach for IVF_GAS
+- `--vct-path`: Path to the trained VCT metadata for IVF_GAS
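
To make the roles of `--nlist` and `--nprobe` concrete, here is a minimal plaintext sketch of generic IVF-FLAT search. It is not enVector's or GAS's implementation: there is no encryption, the L2 metric is an assumption, and the function names and randomly sampled "centroids" are hypothetical stand-ins for trained k-means centroids.

```python
import math
import random


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_inverted_lists(vectors, centroids):
    """Assign every vector id to the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for idx, vec in enumerate(vectors):
        nearest = min(range(len(centroids)),
                      key=lambda c: l2_distance(vec, centroids[c]))
        lists[nearest].append(idx)
    return lists


def ivf_search(query, vectors, centroids, lists, nprobe, k):
    """Scan only the `nprobe` clusters closest to the query; return top-k ids."""
    ranked = sorted(range(len(centroids)),
                    key=lambda c: l2_distance(query, centroids[c]))
    candidates = [i for c in ranked[:nprobe] for i in lists[c]]
    return sorted(candidates, key=lambda i: l2_distance(query, vectors[i]))[:k]


random.seed(0)
dim, n, nlist = 8, 500, 10  # nlist = number of coarse clusters
vectors = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
# Toy centroids: randomly sampled data points, not trained k-means centroids.
centroids = random.sample(vectors, nlist)
lists = build_inverted_lists(vectors, centroids)
query = [random.gauss(0, 1) for _ in range(dim)]

# nprobe == nlist scans every cluster, so it matches brute-force search.
exact = ivf_search(query, vectors, centroids, lists, nprobe=nlist, k=5)
# A smaller nprobe scans fewer vectors and may miss some true neighbors.
approx = ivf_search(query, vectors, centroids, lists, nprobe=2, k=5)
print("exact :", exact)
print("approx:", approx)
print("recall@5:", len(set(exact) & set(approx)) / 5)
```

Raising `nprobe` toward `nlist` trades search speed for recall; setting the two equal degenerates to a full scan.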
+
+Benchmark Options:
+These follow the conventions of VectorDBBench;
+see [VectorDBBench Options](https://github.com/zilliztech/VectorDBBench?tab=readme-ov-file#custom-dataset-for-performance-case) for details.