1 | 1 | # SISAP 2026 — deglib |
2 | 2 |
|
3 | | -The original C++/Python implementation has been moved to the **[python/](file:///c:/Lang/Python/sisap26-deglib/python)** subdirectory. |
| 3 | +Submission for the [SISAP 2026 Indexing Challenge](https://sisap-challenges.github.io/2026/). |
| 4 | +The index is a [**Dynamic Exploration Graph (DEG)**](https://github.com/Visual-Computing/DynamicExplorationGraph/tree/evp) |
| 5 | +combined with [**EVP (Equi-Voronoi Polytope) quantization**](https://github.com/MetricSearch/metric_space_rust), |
| 6 | +implemented in C++ (`cpp/`) and driven by the official baseline Python harness |
| 7 | +(`submission/`). Everything ships as a **single Docker image** that TIRA builds |
| 8 | +from this repo. |
4 | 9 |
|
5 | | -You can find the documentation for this code in the legacy README: |
6 | | -* **[python/README.md](file:///c:/Lang/Python/sisap26-deglib/python/README.md)** |
| 10 | +## Approach |
7 | 11 |
|
8 | | -The root directory is reserved for the upcoming Rust implementation. |
| 12 | +- **Index:** deglib's even-regular exploration graph, built once per run. A |
| 13 | + parameter sweep then produces several operating points (build/recall trade-offs) |
| 14 | + from that single build. |
| 15 | +- **Task 1** (k-NN self-join, scored on **build + search time**): graph mode `mode4`, |
| 16 | + i.e. EVP build + EVP explore + exact FP16 inner-product rerank. Here **search = |
| 17 | + explore + rerank**. Task 1 ranks on the *total* (build + search), so `search.py` |
| 18 | + writes that sum into the `buildtime` attribute and sets `querytime` to 0. The |
| 19 | + attribute name does not imply that search is part of building: search is a real |
| 20 | + cost, often about half the total. |
| 21 | +- **Task 2** (MIPS search, scored on **query time**): graph mode `mode5` — |
| 22 | + L2-converted FP32 build (with FLAS pre-sort) + FP16 inner-product search. |
| 23 | +- **Task 3** (sparse SPLADE) is out of scope and skipped cleanly (exit 0) so the |
| 24 | + mandatory spot-check CI stays green. |
| 25 | + |
| 26 | +The C++ binary computes neighbors **and** distances during search; the thin Python |
| 27 | +entrypoint adapts the output to the official result format. Per-dataset parameters |
| 28 | +live in `TASK1_PROFILES` / `TASK2_PROFILES` in [`submission/search.py`](submission/search.py) |
| 29 | +— unknown datasets fail fast rather than silently using bad parameters. |
| 30 | + |
| 31 | +## Challenge tasks & constraints |
| 32 | + |
| 33 | +Both tasks run under the same hard limits: **8 vCPUs, 24 GB RAM, ≤ 8 h, read-only |
| 34 | +dataset, no internet** in the container (the eval node is an AMD EPYC 7F72, no |
| 35 | +AVX-512). The goal is **≥ 0.8 average recall**; among the operating points that reach |
| 36 | +it, the fastest on the scored metric wins. |
| 37 | + |
| 38 | +| | Task 1 | Task 2 | |
| 39 | +|----------------|-----------------------------------------------|----------------------------------------------------| |
| 40 | +| Dataset family | Wikipedia BGE-M3 (FP16, 1024-dim, normalized) | Llama-Dev (FP32, 128-dim) | |
| 41 | +| Problem | k-NN **graph** self-join, k = 15 | k-NN **search**, k = 30 | |
| 42 | +| Distance | inner product | inner product (via L2 lift) | |
| 43 | +| Scored metric | build + search wall-clock (`buildtime`) | query time (`querytime`) | |
| 44 | +| Build threads | all 8 | **1** — graph built single-threaded, per the rules | |
| 45 | + |
| 46 | +### Datasets |
| 47 | + |
| 48 | +| Task | Variant | File | Vectors | |
| 49 | +|------|--------------|-------------------------------------------|------------------------------------------| |
| 50 | +| 1 | spot-check | `benchmark-dev-gooaq-small.h5` | 10,000 (384-dim — off-family smoke test) | |
| 51 | +| 1 | small (dev) | `benchmark-dev-wikipedia-bge-m3-small.h5` | 200,000 | |
| 52 | +| 1 | large (eval) | `benchmark-dev-wikipedia-bge-m3.h5` | 6,350,000 | |
| 53 | +| 2 | spot-check | `benchmark-dev-llama-small.h5` | 14,000 | |
| 54 | +| 2 | dev/eval | `llama-dev.h5` | 256,921 | |
| 55 | + |
| 56 | +## Graph modes |
| 57 | + |
| 58 | +The `deglib_sisap` binary implements seven graph modes per task (`mode1`…`mode7`). |
| 59 | +The profiles in `search.py` currently use **`mode4` (Task 1)** and **`mode5` (Task 2)**. |
| 60 | +⭐ marks the strongest submission candidates in each table; besides the profiled mode, |
| 61 | +that is `mode7`, a close alternative that is implemented but not wired into a profile. |
| 62 | +All modes share the same save-mode contract (one result file per operating point |
| 63 | +holding neighbor ids **and** distances), so they are drop-in benchmark alternatives. |
| 64 | + |
| 65 | +**Task 1** — EVP variants |
| 66 | + |
| 67 | +| Mode | Name | Description | |
| 68 | +|-------------|--------------------------------|----------------------------------------------| |
| 69 | +| mode1 | fp16 | FP16 build + FP16 explore | |
| 70 | +| mode2 | evp-linear | EVP quantization + brute-force linear search | |
| 71 | +| mode3 | evp | EVP build + EVP explore (no rerank) | |
| 72 | +| **mode4** ⭐ | evp-rerank | EVP build + EVP explore + FP16 rerank | |
| 73 | +| mode5 | evp-build-fp16-external-search | EVP build + FP16 external graph search | |
| 74 | +| mode6 | evp-asymmetric | EVP build + asymmetric FP16-vs-EVP search | |
| 75 | +| mode7 ⭐ | evp-asymmetric-rerank | EVP build + asymmetric search + FP16 rerank | |
| 76 | + |
| 77 | +**Task 2** — L2-lift variants |
| 78 | + |
| 79 | +| Mode | Name | Description | |
| 80 | +|-------------|-------------------------|-----------------------------------------| |
| 81 | +| mode1 | baseline | FP32 build + FP32 inner-product explore | |
| 82 | +| mode2 | fp16-build-fp16-explore | FP16 build + FP16 IP explore | |
| 83 | +| mode3 | baseline-fp16 | FP32 build + FP16 IP explore | |
| 84 | +| mode4 | l2-converted | FP32 L2(d+1) build + FP32 L2 explore | |
| 85 | +| **mode5** ⭐ | l2-fp16-ip | FP32 L2(d+1) build + FP16 IP explore | |
| 86 | +| mode6 | l2-fp16-l2 | FP32 L2(d+1) build + FP16 L2 explore | |
| 87 | +| mode7 ⭐ | l2-fp16-d2 | FP32 L2(d+2) build + FP16 L2 explore | |
| 88 | + |
| 89 | +## Repository layout |
| 90 | + |
| 91 | +| Path | Contents | |
| 92 | +|---------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------| |
| 93 | +| [`cpp/`](cpp/) | deglib (DEG) C++ library and the SISAP binary under `cpp/sisap/` (`task1.cpp`, `task2.cpp`, `sisap.cpp`, per-mode headers in `task1/`, `task2/`). | |
| 94 | +| [`submission/`](submission/) | TIRA entrypoint `search.py` plus the vendored baseline harness (`eval.py`, `datasets.py`, `plot.py`, `show_operating_points.py`, `data/*/config.json`). | |
| 95 | +| [`Dockerfile`](Dockerfile) | Two-stage image: build the binary (AVX2), then a thin Python runtime that runs `search.py`. | |
| 96 | +| [`.github/workflows/ci.yml`](.github/workflows/ci.yml) | Builds the image and runs all three spot-checks through the exact TIRA command schema, then evaluates + plots. | |
| 97 | +| `python/` | Legacy reference implementation (not used by the submission). | |
| 98 | + |
| 99 | +## How it runs on TIRA |
| 100 | + |
| 101 | +TIRA builds the image from the repo, mounts the dataset (no internet inside the |
| 102 | +container), and invokes: |
| 103 | + |
| 104 | +```bash |
| 105 | +python3 /app/search.py \ |
| 106 | + --input $inputDataset/*.h5 \ |
| 107 | + --task-description $inputDataset/config.json \ |
| 108 | + --output $outputDir |
| 109 | +``` |
| 110 | + |
| 111 | +`search.py` reads the task config, decompresses the input on the fly when needed |
| 112 | +(the C++ HDF5 reader only handles contiguous datasets, so gzip/chunked inputs are |
| 113 | +materialized to an uncompressed temp file via `h5py`), drives the binary once per |
| 114 | +profile, and writes one result file per operating point. |
| 115 | + |
| 116 | +## Output format |
| 117 | + |
| 118 | +One HDF5 file per operating point under `$outputDir`, each with: |
| 119 | + |
| 120 | +- datasets `knns` (1-based neighbor ids) and `dists` (float), both the same shape: |
| 121 | + **`n × (k+1)` for Task 1**, **`n × k` for Task 2**. If a query returns fewer than k |
| 122 | + candidates, `knns` is padded with the node's own id (Task 1) or `0` (Task 2); both |
| 123 | + are harmless, since the evaluator scores by set membership; |
| 124 | +- root attributes `algo`, `dataset`, `task`, `buildtime`, `querytime`, `params`. |
| 125 | + |
| 126 | +Task 1 prepends the self-reference in column 0 (the extra `+1` column), matching the |
| 127 | +ground-truth layout the evaluator uses; Task 2 has no self column. Only `knns` is |
| 128 | +scored — `recall = mean_i |knns[i,:k] ∩ gt[i,:k]| / k`. |
| 129 | + |
| 130 | +## Build & run locally |
| 131 | + |
| 132 | +```bash |
| 133 | +# Build the submission image |
| 134 | +docker build -t sisap-deglib . |
| 135 | + |
| 136 | +# Run one task the way TIRA does (dataset dir holds the .h5 + config.json) |
| 137 | +mkdir -p results |
| 138 | +docker run --rm --cpus=8 --memory=24g \ |
| 139 | + -v "$PWD/your-dataset-dir:/app/data/ds:ro" \ |
| 140 | + -v "$PWD/results:/app/results:rw" \ |
| 141 | + sisap-deglib \ |
| 142 | + python3 /app/search.py --input '/app/data/ds/*.h5' \ |
| 143 | + --task-description /app/data/ds/config.json --output /app/results |
| 144 | + |
| 145 | +# Score the results against the dataset ground truth (run from submission/, |
| 146 | +# like CI does, so eval.py can import the harness modules) |
| 147 | +cd submission && PYTHONPATH=. python3 eval.py --results ../results res.csv |
| 148 | +``` |
| 149 | + |
| 150 | +### Building just the C++ binary |
| 151 | + |
| 152 | +```bash |
| 153 | +cmake -S cpp -B cpp/build -DCMAKE_BUILD_TYPE=Release -DFORCE_AVX2=ON |
| 154 | +cmake --build cpp/build --target deglib_sisap -j"$(nproc)" |
| 155 | + |
| 156 | +# Usage: deglib_sisap <task1|task2> <input.h5> <mode> [options] |
| 157 | +# Save mode writes one .bin per operating point into the --output directory: |
| 158 | +cpp/build/bin/deglib_sisap task2 dataset.h5 mode5 \ |
| 159 | + --no-recall --output results_dir \ |
| 160 | + --k-top 30 --max-dist 5000,7000 --eps-search 0.18,0.2 --flas |
| 161 | +``` |
| 162 | + |
| 163 | +`--march`/AVX note: the build is pinned to **AVX2** because the eval node, an AMD |
| 164 | +EPYC 7F72 (Zen 2), has no AVX-512. |
| 165 | + |
| 166 | +## Continuous integration |
| 167 | + |
| 168 | +On every push the CI builds the image and runs all three spot-checks through the |
| 169 | +same command schema TIRA uses, under the eval node's resource limits |
| 170 | +(`--cpus=8 --memory=24g`), then runs `eval.py` / `plot.py` / |
| 171 | +`show_operating_points.py`. There is no hard recall gate — it builds, runs and |
| 172 | +reports, which is what the challenge requires for a valid public submission. |
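
The README's "Output format" section quotes the scoring rule `recall = mean_i |knns[i,:k] ∩ gt[i,:k]| / k`. A minimal sketch of that formula follows. The function name `mean_recall` and the plain-list inputs are illustrative assumptions; the vendored `eval.py` may compute this differently, and whether Task 1's self column counts toward `k` is left to the caller here.

```python
def mean_recall(knns, gt, k):
    """Mean over rows of |knns[i][:k] ∩ gt[i][:k]| / k (hypothetical helper)."""
    if len(knns) != len(gt) or len(knns) == 0:
        raise ValueError("knns and gt need the same non-zero number of rows")
    if k <= 0:
        raise ValueError("k must be positive")
    hits = 0
    for found, truth in zip(knns, gt):
        # Set intersection ignores order, and a repeated padding id
        # counts at most once.
        hits += len(set(found[:k]) & set(truth[:k]))
    return hits / (len(knns) * k)


# Row 0 recovers two of three true neighbors, row 1 recovers one.
found = [[1, 2, 3], [4, 5, 6]]
truth = [[3, 2, 9], [7, 8, 4]]
print(mean_recall(found, truth, k=3))  # 0.5
```

Because the score is a set intersection, a row padded with `0` (never a valid 1-based id) simply contributes no hits for those slots.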
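
The Task 2 rows mention an "L2 lift" and an `L2(d+1)` build without spelling out the transform. One standard reduction from maximum inner-product search to L2 nearest-neighbor search appends a single coordinate: each database vector `x` gets `sqrt(M² − ‖x‖²)`, where `M` is the largest database norm, and the query gets `0`. Then `‖q' − x'‖² = ‖q‖² + M² − 2·(q·x)`, so ascending L2 distance gives the same order as descending inner product. The sketch below illustrates that textbook construction only. It is an assumption that deglib's `d+1` lift has this form, all function names are hypothetical, and it says nothing about the `d+2` variant.

```python
import math


def lift_database(vectors):
    """Append sqrt(M^2 - ||x||^2) to every vector, M being the largest norm."""
    sq_norms = [sum(c * c for c in v) for v in vectors]
    max_sq = max(sq_norms)
    # max(0.0, ...) guards against tiny negative values from rounding.
    return [list(v) + [math.sqrt(max(0.0, max_sq - s))]
            for v, s in zip(vectors, sq_norms)]


def lift_query(query):
    """Queries get a zero in the extra coordinate."""
    return list(query) + [0.0]


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


database = [[1.0, 0.0], [0.0, 3.0], [2.0, 2.0]]
query = [1.0, 1.0]

lifted = lift_database(database)
lifted_query = lift_query(query)

# Rank by descending inner product on the original vectors ...
by_ip = sorted(range(len(database)),
               key=lambda i: -inner_product(query, database[i]))
# ... and by ascending squared L2 distance on the lifted vectors.
by_l2 = sorted(range(len(database)),
               key=lambda i: squared_l2(lifted_query, lifted[i]))
print(by_ip, by_l2)  # both rankings list the same order
```

The extra coordinate only equalizes the database norms, so an index built for L2 on the lifted vectors can answer inner-product queries on the originals.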