
Commit d2103c3

cleanup readmes
1 parent 86e7258 commit d2103c3

3 files changed

Lines changed: 109 additions & 77 deletions

File tree

README.md

Lines changed: 63 additions & 76 deletions
Original file line number | Diff line number | Diff line change
@@ -2,37 +2,33 @@
22

33
Submission for the [SISAP 2026 Indexing Challenge](https://sisap-challenges.github.io/2026/).
44
The index is a [**Dynamic Exploration Graph (DEG)**](https://github.com/Visual-Computing/DynamicExplorationGraph/tree/evp)
5-
combined with [**EVP (Equi-Voronoi Polytope) quantization**](https://github.com/MetricSearch/metric_space_rust),
5+
combined with [**EVP (Equi-Voronoi Polytope) quantization**](https://github.com/MetricSearch/metric_space_rust) and
6+
[**FLAS (Fast Linear Assignment Sorter)**](https://github.com/Visual-Computing/LAS_FLAS) for optimized insertion order,
67
implemented in C++ (`cpp/`) and driven by the official baseline Python harness
7-
(`submission/`). Everything ships as a **single Docker container** that TIRA builds
8+
(`submission/`). Everything ships as a **single Docker container** built by TIRA
89
from this repo.
910

1011
## Approach
1112

12-
- **Index:** deglib's even-regular exploration graph, built once per run, then a
13-
parameter sweep produces several operating points (build/recall trade-offs) from
14-
that single build.
15-
- **Task 1** (k-NN self-join, scored on **build + search time**): graph mode `mode4`
16-
EVP build + EVP explore + exact FP16 inner-product rerank. Here **search = explore +
17-
rerank**. Task 1 ranks on the *total* (build + search), which `search.py` packs into
18-
the `buildtime` attribute with `querytime` = 0 — so `buildtime` is the sum, not a
19-
claim that search is part of building. Search is a real component, often ≈ half the
20-
total.
21-
- **Task 2** (MIPS search, scored on **query time**): graph mode `mode5`
22-
L2-converted FP32 build (with FLAS pre-sort) + FP16 inner-product search.
23-
- **Task 3** (sparse SPLADE) is out of scope and skipped cleanly (exit 0) so the
24-
mandatory spot-check CI stays green.
13+
We configure the even-regular Dynamic Exploration Graph (deglib) library to target the specific constraints and scoring metrics of the two tasks:
14+
15+
- **Task 1** (k-NN self-join, scored on **total build + search time**):
16+
We build the graph using EVP-quantized representations (`EvpBits` metric). Since this is a self-join, every database element has a corresponding vertex in the graph. We optimize the search by starting the traversal directly at the target vertex's position, bypassing the entry-point routing phase. We evaluate two configurations on this graph:
17+
- **`mode4` (evp-rerank):** The traversal walks the local neighborhood starting from the target vertex using quantized `EvpBits` distances. The retrieved candidates are then reranked using exact FP16 inner products.
18+
- **`mode7` (evp-asymmetric-rerank):** The traversal walks the local neighborhood starting from the target vertex using an asymmetric distance function (the vertex's original FP16 vector vs. the EVP-quantized vertices in the graph), followed by exact FP16 inner-product reranking of the retrieved candidates.
19+
- **Task 2** (MIPS search, scored on **query time**):
20+
To perform maximum inner product search (MIPS) on the deglib graph, we reduce it to an L2 nearest-neighbor search by extending the vectors' dimensionality. We build the graph once and sweep both `eps_search` and `max_dist` on it to produce multiple operating points. We evaluate two configurations:
21+
- **`mode5` (l2-fp16-ip):** Vectors are extended to $d+1$ dimensions to transform inner product into L2 distance during the build (sped up by pre-sorting the vectors with **FLAS**). Queries are searched using fast FP16 inner-product exploration.
22+
- **`mode7` (l2-fp16-d2):** Vectors are extended to $d+2$ dimensions for the graph build (also utilizing FLAS). Query search is performed using fast FP16 L2 distance exploration.
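
The $d+1$ lift described above can be illustrated with a minimal sketch. It assumes the commonly used construction in which each database vector gets one extra coordinate $\sqrt{M^2 - \lVert x \rVert^2}$ (with $M$ the largest norm in the database) and the query gets a trailing zero; the README does not spell out the exact formula used by deglib, so the function names and the construction here are illustrative assumptions, not the submission's code.

```python
import math


def lift_database(vectors):
    """Append one coordinate so that every lifted vector has the same norm M.

    Assumed construction: x' = [x, sqrt(M^2 - ||x||^2)], M = max norm in the set.
    """
    max_sq_norm = max(sum(x * x for x in v) for v in vectors)
    lifted = []
    for v in vectors:
        sq_norm = sum(x * x for x in v)
        # max(0.0, ...) guards against tiny negative values caused by rounding.
        lifted.append(list(v) + [math.sqrt(max(0.0, max_sq_norm - sq_norm))])
    return lifted


def lift_query(query):
    """Queries get a zero in the extra coordinate: q' = [q, 0]."""
    return list(query) + [0.0]


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Toy data, made up for illustration only.
database = [[1.0, 0.0], [0.5, 2.0], [-1.0, 1.0], [3.0, 0.5]]
query = [0.2, 1.0]

# Ranking by descending inner product on the original vectors ...
by_ip = sorted(range(len(database)),
               key=lambda i: -inner_product(query, database[i]))

# ... matches ranking by ascending L2 distance on the lifted vectors, because
# ||q' - x'||^2 = ||q||^2 + M^2 - 2 * <q, x>.
lifted_db = lift_database(database)
lifted_q = lift_query(query)
by_l2 = sorted(range(len(database)),
               key=lambda i: squared_l2(lifted_q, lifted_db[i]))

print(by_ip, by_l2)
```

Because the lifted distance depends on the inner product only through the term $-2\langle q, x\rangle$, an L2 graph built on the lifted vectors can answer MIPS queries without changing the graph construction itself.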
2523

2624
The C++ binary computes neighbors **and** distances during search; the thin Python
27-
entrypoint adapts the output to the official result format. Per-dataset parameters
28-
live in `TASK1_PROFILES` / `TASK2_PROFILES` in [`submission/search.py`](submission/search.py)
29-
— unknown datasets fail fast rather than silently using bad parameters.
25+
entrypoint adapts the output to the official result format.
3026

3127
## Challenge tasks & constraints
3228

3329
Both tasks run under the same hard limits: **8 vCPUs, 24 GB RAM, ≤ 8 h, read-only
3430
dataset, no internet** in the container (the eval node is an AMD EPYC 7F72, no
35-
AVX-512). The goal is **≥ 0.8 average recall**; among the operating points that reach
31+
AVX-512). The goal is **≥ 0.8 average recall**; among the operating points reaching
3632
it, the fastest on the scored metric wins.
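
The selection rule can be sketched as follows. The field names (`recall`, `time_s`) and the sample values are hypothetical placeholders chosen for illustration; they are not results from this submission or fields of the official evaluator.

```python
def pick_operating_point(points, min_recall=0.8):
    """Return the fastest operating point whose recall reaches min_recall.

    Returns None when no operating point reaches the threshold.
    """
    eligible = [p for p in points if p["recall"] >= min_recall]
    if not eligible:
        return None
    return min(eligible, key=lambda p: p["time_s"])


# Hypothetical operating points from one parameter sweep (placeholder numbers).
points = [
    {"name": "a", "recall": 0.75, "time_s": 10.0},  # fastest, but below 0.8
    {"name": "b", "recall": 0.82, "time_s": 14.0},
    {"name": "c", "recall": 0.95, "time_s": 30.0},
]

best = pick_operating_point(points)
print(best["name"])
```

For Task 1 the time would be the combined build + search time, and for Task 2 the query time, as listed in the table below.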
3733

3834
| | Task 1 | Task 2 |
@@ -41,7 +37,7 @@ it, the fastest on the scored metric wins.
4137
| Problem | k-NN **graph** self-join, k = 15 | k-NN **search**, k = 30 |
4238
| Distance | inner product | inner product (via L2 lift) |
4339
| Scored metric | build + search wall-clock (`buildtime`) | query time (`querytime`) |
44-
| Build threads | all 8 | **1** — graph built single-threaded, per the rules |
40+
| Build threads | all 8 | 1 (configured build thread) |
4541

4642
### Datasets
4743

@@ -53,79 +49,70 @@ it, the fastest on the scored metric wins.
5349
| 2 | spot-check | `benchmark-dev-llama-small.h5` | 14,000 |
5450
| 2 | dev/eval | `llama-dev.h5` | 256,921 |
5551

56-
## Graph modes
57-
58-
The `deglib_sisap` binary implements seven graph modes per task (`mode1` to `mode7`).
59-
The profiles in `search.py` currently use **`mode4` (Task 1)** and **`mode5` (Task 2)**;
60-
⭐ marks the strongest submission candidates (the other ⭐, `mode7`, is a close
61-
alternative that is implemented but not wired into a profile). All modes share the
62-
same save-mode contract (one result file per operating point holding neighbor ids
63-
**and** distances), so they are drop-in benchmark alternatives.
64-
65-
**Task 1** — EVP variants
66-
67-
| Mode | Name | Description |
68-
|-------------|--------------------------------|----------------------------------------------|
69-
| mode1 | fp16 | FP16 build + FP16 explore |
70-
| mode2 | evp-linear | EVP quantization + brute-force linear search |
71-
| mode3 | evp | EVP build + EVP explore (no rerank) |
72-
| **mode4** | evp-rerank | EVP build + EVP explore + FP16 rerank |
73-
| mode5 | evp-build-fp16-external-search | EVP build + FP16 external graph search |
74-
| mode6 | evp-asymmetric | EVP build + asymmetric FP16-vs-EVP search |
75-
| mode7 ⭐ | evp-asymmetric-rerank | EVP build + asymmetric search + FP16 rerank |
76-
77-
**Task 2** — L2-lift variants
78-
79-
| Mode | Name | Description |
80-
|-------------|-------------------------|-----------------------------------------|
81-
| mode1 | baseline | FP32 build + FP32 inner-product explore |
82-
| mode2 | fp16-build-fp16-explore | FP16 build + FP16 IP explore |
83-
| mode3 | baseline-fp16 | FP32 build + FP16 IP explore |
84-
| mode4 | l2-converted | FP32 L2(d+1) build + FP32 L2 explore |
85-
| **mode5** | l2-fp16-ip | FP32 L2(d+1) build + FP16 IP explore |
86-
| mode6 | l2-fp16-l2 | FP32 L2(d+1) build + FP16 L2 explore |
87-
| mode7 ⭐ | l2-fp16-d2 | FP32 L2(d+2) build + FP16 L2 explore |
88-
8952
## Repository layout
9053

9154
| Path | Contents |
9255
|---------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------|
9356
| [`cpp/`](cpp/) | deglib (DEG) C++ library and the SISAP binary under `cpp/sisap/` (`task1.cpp`, `task2.cpp`, `sisap.cpp`, per-mode headers in `task1/`, `task2/`). |
94-
| [`submission/`](submission/) | TIRA entrypoint `search.py` plus the vendored baseline harness (`eval.py`, `datasets.py`, `plot.py`, `show_operating_points.py`, `data/*/config.json`). |
95-
| [`Dockerfile`](Dockerfile) | Two-stage image: build the binary (AVX2), then a thin Python runtime that runs `search.py`. |
57+
| [`submission/`](submission/) | TIRA entrypoint `search.py` and evaluation tools (see [submission/README.md](submission/README.md)). |
58+
| [`Dockerfile`](Dockerfile) | Two-stage image: build the C++ binary (AVX2), then a thin Python runtime running `search.py`. |
9659
| [`.github/workflows/ci.yml`](.github/workflows/ci.yml) | Builds the image and runs all three spot-checks through the exact TIRA command schema, then evaluates + plots. |
9760
| `python/` | Legacy reference implementation (not used by the submission). |
9861

99-
## How it runs on TIRA
62+
## Submission via TIRA
63+
64+
Code submissions for SISAP 2026 are handled only through TIRA ([tira.io/task-overview/sisap-2026](https://www.tira.io/task-overview/sisap-2026)), which provides a reproducible, containerized evaluation framework.
65+
66+
### Step 1 — Register your team
67+
68+
1. Sign up or log in at [tira.io](https://www.tira.io) (GitHub login supported).
69+
2. Navigate to [tira.io/task-overview/sisap-2026](https://www.tira.io/task-overview/sisap-2026) and click **Register**.
70+
3. Optionally add team members via [tira.io/g?type=my](https://www.tira.io/g?type=my).
71+
72+
### Step 2 — Verify locally
73+
74+
To test the containerized submission pipeline locally, make sure the `tira` client is installed in the project's virtual environment (via uv or pip):
75+
76+
```bash
77+
# Install/update the tira client
78+
uv pip install --upgrade tira
79+
80+
# Run a dry run against one of the spot-check datasets:
81+
.venv/bin/tira-cli code-submission \
82+
--path . \
83+
--command 'python3 /app/search.py --input $inputDataset/*.h5 --task-description $inputDataset/config.json --output $outputDir' \
84+
--task sisap-2026 \
85+
--dataset task-1-spot-check-20260602-training \
86+
--dry-run
87+
```
88+
*(On Windows, use `.\.venv\Scripts\tira-cli` instead of `.venv/bin/tira-cli`)*
89+
90+
Use `task-2-spot-check-20260602-training` if your approach only targets Task 2.
91+
92+
### Step 3 — Authenticate and submit
10093

101-
TIRA builds the image from the repo, mounts the dataset (no internet inside the
102-
container), and invokes:
94+
Retrieve your authentication token from the TIRA task page (**Submit** → **Code Submissions** → **New Submission** → **I want to submit from my local machine**), then:
10395

10496
```bash
105-
python3 /app/search.py \
106-
--input $inputDataset/*.h5 \
107-
--task-description $inputDataset/config.json \
108-
--output $outputDir
97+
.venv/bin/tira-cli login --token AUTH-TOKEN
98+
.venv/bin/tira-cli verify-installation --task sisap-2026 --team YOUR-TEAM
99+
100+
.venv/bin/tira-cli code-submission \
101+
--path . \
102+
--command 'python3 /app/search.py --input $inputDataset/*.h5 --task-description $inputDataset/config.json --output $outputDir' \
103+
--task sisap-2026 \
104+
--dataset task-1-spot-check-20260602-training
109105
```
110106

111-
`search.py` reads the task config, decompresses the input on the fly when needed
112-
(the C++ HDF5 reader only handles contiguous datasets, so gzip/chunked inputs are
113-
materialized to an uncompressed temp file via `h5py`), drives the binary once per
114-
profile, and writes one result file per operating point.
107+
### Step 4 — Trigger evaluation in the TIRA UI
115108

116-
## Output format
109+
1. Navigate to the task page and click **Submit** → **Code Submissions**.
110+
2. Select your submission, choose a dataset and hardware configuration.
111+
3. The organizers will handle execution on all datasets once your submission looks correct.
117112

118-
One HDF5 file per operating point under `$outputDir`, each with:
113+
---
119114

120-
- datasets `knns` (1-based neighbor ids; if a query returns fewer than k candidates
121-
the padding slots are the node's own id for Task 1 and `0` for Task 2 — both
122-
harmless, since the evaluator scores by set membership) and `dists` (float), both
123-
the same shape — **`n × (k+1)` for Task 1**, **`n × k` for Task 2**;
124-
- root attributes `algo`, `dataset`, `task`, `buildtime`, `querytime`, `params`.
125115

126-
Task 1 prepends the self-reference in column 0 (the extra `+1` column), matching the
127-
ground-truth layout the evaluator uses; Task 2 has no self column. Only `knns` is
128-
scored — `recall = mean_i |knns[i,:k] ∩ gt[i,:k]| / k`.
129116

130117
## Build & run locally
131118

cpp/readme.md

Lines changed: 27 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -123,4 +123,30 @@ Once compiled, the executable `deglib_sisap` can be run from the build output di
123123
* `--output <path>`: Path to write retrieved neighbor indices to a binary `.ivecs` file.
124124
* `--flas` (Task 2 only): Enables FLAS pre-sorting of training vectors before graph building.
125125

126-
126+
## Graph modes
127+
128+
The `deglib_sisap` binary implements seven graph modes per task (`mode1` to `mode7`). All modes share the same save-mode contract (writing one result file per operating point holding neighbor IDs and distances), so they are drop-in alternatives.
129+
130+
### Task 1 — EVP variants
131+
132+
| Mode | Name | Description |
133+
|-------------|--------------------------------|----------------------------------------------|
134+
| mode1 | fp16 | FP16 build + FP16 explore |
135+
| mode2 | evp-linear | EVP quantization + brute-force linear search |
136+
| mode3 | evp | EVP build + EVP explore (no rerank) |
137+
| **mode4** | evp-rerank | EVP build + EVP explore + FP16 rerank |
138+
| mode5 | evp-build-fp16-external-search | EVP build + FP16 external graph search |
139+
| mode6 | evp-asymmetric | EVP build + asymmetric FP16-vs-EVP search |
140+
| mode7 ⭐ | evp-asymmetric-rerank | EVP build + asymmetric search + FP16 rerank |
141+
142+
### Task 2 — L2-lift variants
143+
144+
| Mode | Name | Description |
145+
|-------------|-------------------------|-----------------------------------------|
146+
| mode1 | baseline | FP32 build + FP32 inner-product explore |
147+
| mode2 | fp16-build-fp16-explore | FP16 build + FP16 IP explore |
148+
| mode3 | baseline-fp16 | FP32 build + FP16 IP explore |
149+
| mode4 | l2-converted | FP32 L2(d+1) build + FP32 L2 explore |
150+
| **mode5** | l2-fp16-ip | FP32 L2(d+1) build + FP16 IP explore |
151+
| mode6 | l2-fp16-l2 | FP32 L2(d+1) build + FP16 L2 explore |
152+
| mode7 ⭐ | l2-fp16-d2 | FP32 L2(d+2) build + FP16 L2 explore |

submission/README.md

Lines changed: 19 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,19 @@
1+
# Submission Harness
2+
3+
This directory contains the Python runner and evaluation tools for the SISAP 2026 Indexing Challenge submission.
4+
5+
## Internal Execution Details
6+
7+
When running under TIRA, `search.py` reads the task config, decompresses the input on the fly when needed (the C++ HDF5 reader only handles contiguous datasets, so gzip/chunked inputs are materialized to an uncompressed temp file via `h5py`), drives the binary once per profile, and writes one result file per operating point.
8+
9+
## Output Format
10+
11+
One HDF5 file is generated per operating point under `$outputDir`. Each file contains:
12+
13+
- **Datasets:**
14+
- `knns` (1-based neighbor IDs; if a query returns fewer than $k$ candidates, padding slots are the vertex's own ID for Task 1 and `0` for Task 2. This padding is harmless, as the evaluator scores by set membership).
15+
- `dists` (float).
16+
- Both datasets have the same shape: **`n × (k+1)` for Task 1**, **`n × k` for Task 2**.
17+
- **Root Attributes:** `algo`, `dataset`, `task`, `buildtime`, `querytime`, `params`.
18+
19+
Task 1 prepends the self-reference in column 0 (the extra `+1` column), matching the ground-truth layout the evaluator uses. Task 2 has no self column. Only `knns` is scored: `recall = mean_i |knns[i,:k] ∩ gt[i,:k]| / k`.
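
The recall formula above can be sketched in plain Python. This is an illustrative reimplementation under the stated formula, not the official evaluator; the function name and the toy inputs are assumptions made for the example.

```python
def average_recall(knns, gt, k):
    """mean_i |knns[i,:k] ∩ gt[i,:k]| / k, computed by set membership.

    knns and gt are row-aligned lists of neighbor-ID lists.
    """
    hits = 0
    for found, truth in zip(knns, gt):
        # Sets ignore order and collapse repeated padding IDs.
        hits += len(set(found[:k]) & set(truth[:k]))
    return hits / (len(knns) * k)


# Toy example with two queries and k = 3 (IDs made up for illustration).
knns = [[1, 2, 3], [4, 5, 6]]
gt = [[1, 2, 9], [6, 5, 4]]

recall = average_recall(knns, gt, k=3)
print(recall)
```

Since only set membership matters, the order of neighbors within a row does not affect the score, and duplicate padding entries contribute at most one hit.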
