cleanup readmes

Neiko2002 · Neiko2002 · commit d2103c3bde7c · 2026-06-16T20:29:25.000+02:00
diff --git a/README.md b/README.md
@@ -2,37 +2,33 @@
 
 Submission for the [SISAP 2026 Indexing Challenge](https://sisap-challenges.github.io/2026/).
 The index is a [**Dynamic Exploration Graph (DEG)**](https://github.com/Visual-Computing/DynamicExplorationGraph/tree/evp)
-combined with [**EVP (Equi-Voronoi Polytope) quantization**](https://github.com/MetricSearch/metric_space_rust),
+combined with [**EVP (Equi-Voronoi Polytope) quantization**](https://github.com/MetricSearch/metric_space_rust) and
+[**FLAS (Fast Linear Assignment Sorter)**](https://github.com/Visual-Computing/LAS_FLAS) for optimized insertion order,
 implemented in C++ (`cpp/`) and driven by the official baseline Python harness
-(`submission/`). Everything ships as a **single Docker container** that TIRA builds
+(`submission/`). Everything ships as a **single Docker container** built by TIRA
 from this repo.
 
 ## Approach
 
-- **Index:** deglib's even-regular exploration graph, built once per run, then a
-  parameter sweep produces several operating points (build/recall trade-offs) from
-  that single build.
-- **Task 1** (k-NN self-join, scored on **build + search time**): graph mode `mode4` —
-  EVP build + EVP explore + exact FP16 inner-product rerank. Here **search = explore +
-  rerank**. Task 1 ranks on the *total* (build + search), which `search.py` packs into
-  the `buildtime` attribute with `querytime` = 0 — so `buildtime` is the sum, not a
-  claim that search is part of building. Search is a real component, often ≈ half the
-  total.
-- **Task 2** (MIPS search, scored on **query time**): graph mode `mode5` —
-  L2-converted FP32 build (with FLAS pre-sort) + FP16 inner-product search.
-- **Task 3** (sparse SPLADE) is out of scope and skipped cleanly (exit 0) so the
-  mandatory spot-check CI stays green.
+We configure the even-regular Dynamic Exploration Graph (deglib) library to target the specific constraints and scoring metrics of the two tasks:
+
+- **Task 1** (k-NN self-join, scored on **total build + search time**):
+  We build the graph using EVP-quantized representations (`EvpBits` metric). Since this is a self-join, every database element has a corresponding vertex in the graph. We optimize the search by starting the traversal directly at the target vertex's position, bypassing the entry-point routing phase. We evaluate two configurations on this graph:
+  - **`mode4` (evp-rerank):** The traversal walks the local neighborhood starting from the target vertex using quantized `EvpBits` distances. The retrieved candidates are then reranked using exact FP16 inner products.
+  - **`mode7` (evp-asymmetric-rerank):** The traversal walks the local neighborhood starting from the target vertex using an asymmetric distance function (the vertex's original FP16 vector vs. the EVP-quantized vertices in the graph), followed by exact FP16 inner-product reranking of the retrieved candidates.
+- **Task 2** (MIPS search, scored on **query time**):
+  To perform maximum inner product search (MIPS) on the deglib graph, we reduce it to an L2 nearest-neighbor search by extending the vectors' dimensionality. We build the graph once and sweep both `eps_search` and `max_dist` on the built graph to produce multiple operating points. We evaluate two configurations:
+  - **`mode5` (l2-fp16-ip):** Vectors are extended to $d+1$ dimensions so that the graph can be built with L2 distance (the build is sped up by pre-sorting vectors using **FLAS**). Query search uses fast FP16 inner-product exploration.
+  - **`mode7` (l2-fp16-d2):** Vectors are extended to $d+2$ dimensions for the graph build (also utilizing FLAS). Query search is performed using fast FP16 L2 distance exploration.
 
 The C++ binary computes neighbors **and** distances during search; the thin Python
-entrypoint adapts the output to the official result format. Per-dataset parameters
-live in `TASK1_PROFILES` / `TASK2_PROFILES` in [`submission/search.py`](submission/search.py)
-— unknown datasets fail fast rather than silently using bad parameters.
+entrypoint adapts the output to the official result format.
 
 ## Challenge tasks & constraints
 
 Both tasks run under the same hard limits: **8 vCPUs, 24 GB RAM, ≤ 8 h, read-only
 dataset, no internet** in the container (the eval node is an AMD EPYC 7F72, no
-AVX-512). The goal is **≥ 0.8 average recall**; among the operating points that reach
+AVX-512). The goal is **≥ 0.8 average recall**; among the operating points reaching
 it, the fastest on the scored metric wins.
 
 |                | Task 1                                        | Task 2                                             |
@@ -41,7 +37,7 @@ it, the fastest on the scored metric wins.
 | Problem        | k-NN **graph** self-join, k = 15              | k-NN **search**, k = 30                            |
 | Distance       | inner product                                 | inner product (via L2 lift)                        |
 | Scored metric  | build + search wall-clock (`buildtime`)       | query time (`querytime`)                           |
-| Build threads  | all 8                                         | **1** — graph built single-threaded, per the rules |
+| Build threads  | all 8                                         | 1 (configured build thread)                        |
 
 ### Datasets
 
@@ -53,79 +49,70 @@ it, the fastest on the scored metric wins.
 | 2    | spot-check   | `benchmark-dev-llama-small.h5`            | 14,000                                   |
 | 2    | dev/eval     | `llama-dev.h5`                            | 256,921                                  |
 
-## Graph modes
-
-The `deglib_sisap` binary implements seven graph modes per task (`mode1`…`mode7`).
-The profiles in `search.py` currently use **`mode4` (Task 1)** and **`mode5` (Task 2)**;
-⭐ marks the strongest submission candidates (the other ⭐, `mode7`, is a close
-alternative that is implemented but not wired into a profile). All modes share the
-same save-mode contract (one result file per operating point holding neighbor ids
-**and** distances), so they are drop-in benchmark alternatives.
-
-**Task 1** — EVP variants
-
-| Mode        | Name                           | Description                                  |
-|-------------|--------------------------------|----------------------------------------------|
-| mode1       | fp16                           | FP16 build + FP16 explore                    |
-| mode2       | evp-linear                     | EVP quantization + brute-force linear search |
-| mode3       | evp                            | EVP build + EVP explore (no rerank)          |
-| **mode4** ⭐ | evp-rerank                     | EVP build + EVP explore + FP16 rerank        |
-| mode5       | evp-build-fp16-external-search | EVP build + FP16 external graph search       |
-| mode6       | evp-asymmetric                 | EVP build + asymmetric FP16-vs-EVP search    |
-| mode7 ⭐     | evp-asymmetric-rerank          | EVP build + asymmetric search + FP16 rerank  |
-
-**Task 2** — L2-lift variants
-
-| Mode        | Name                    | Description                             |
-|-------------|-------------------------|-----------------------------------------|
-| mode1       | baseline                | FP32 build + FP32 inner-product explore |
-| mode2       | fp16-build-fp16-explore | FP16 build + FP16 IP explore            |
-| mode3       | baseline-fp16           | FP32 build + FP16 IP explore            |
-| mode4       | l2-converted            | FP32 L2(d+1) build + FP32 L2 explore    |
-| **mode5** ⭐ | l2-fp16-ip              | FP32 L2(d+1) build + FP16 IP explore    |
-| mode6       | l2-fp16-l2              | FP32 L2(d+1) build + FP16 L2 explore    |
-| mode7 ⭐     | l2-fp16-d2              | FP32 L2(d+2) build + FP16 L2 explore    |
-
 ## Repository layout
 
 | Path                                                    | Contents                                                                                                                                                |
 |---------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------|
 | [`cpp/`](cpp/)                                          | deglib (DEG) C++ library and the SISAP binary under `cpp/sisap/` (`task1.cpp`, `task2.cpp`, `sisap.cpp`, per-mode headers in `task1/`, `task2/`).       |
-| [`submission/`](submission/)                            | TIRA entrypoint `search.py` plus the vendored baseline harness (`eval.py`, `datasets.py`, `plot.py`, `show_operating_points.py`, `data/*/config.json`). |
-| [`Dockerfile`](Dockerfile)                              | Two-stage image: build the binary (AVX2), then a thin Python runtime that runs `search.py`.                                                             |
+| [`submission/`](submission/)                            | TIRA entrypoint `search.py` and evaluation tools (see [submission/README.md](submission/README.md)). |
+| [`Dockerfile`](Dockerfile)                              | Two-stage image: build the C++ binary (AVX2), then a thin Python runtime running `search.py`. |
 | [`.github/workflows/ci.yml`](.github/workflows/ci.yml)  | Builds the image and runs all three spot-checks through the exact TIRA command schema, then evaluates + plots.                                          |
 | `python/`                                               | Legacy reference implementation (not used by the submission).                                                                                           |
 
-## How it runs on TIRA
+## Submission via TIRA
+
+Code submissions for SISAP 2026 are handled exclusively through TIRA ([tira.io/task-overview/sisap-2026](https://www.tira.io/task-overview/sisap-2026)), which provides a reproducible, containerized evaluation framework.
+
+### Step 1 — Register your team
+
+1. Sign up or log in at [tira.io](https://www.tira.io) (GitHub login supported).
+2. Navigate to [tira.io/task-overview/sisap-2026](https://www.tira.io/task-overview/sisap-2026) and click **Register**.
+3. Optionally add team members via [tira.io/g?type=my](https://www.tira.io/g?type=my).
+
+### Step 2 — Verify locally
+
+To test the containerized submission pipeline locally, make sure the `tira` client is installed in your virtual environment (for example with uv or pip):
+
+```bash
+# Install/update the tira client
+uv pip install --upgrade tira
+
+# Run a dry run against one of the spot-check datasets:
+.venv/bin/tira-cli code-submission \
+    --path . \
+    --command 'python3 /app/search.py --input $inputDataset/*.h5 --task-description $inputDataset/config.json --output $outputDir' \
+    --task sisap-2026 \
+    --dataset task-1-spot-check-20260602-training \
+    --dry-run
+```
+*(On Windows, use `.\.venv\Scripts\tira-cli` instead of `.venv/bin/tira-cli`)*
+
+Use `task-2-spot-check-20260602-training` if your approach only targets Task 2.
+
+### Step 3 — Authenticate and submit
 
-TIRA builds the image from the repo, mounts the dataset (no internet inside the
-container), and invokes:
+Retrieve your authentication token from the TIRA task page (**Submit** → **Code Submissions** → **New Submission** → **I want to submit from my local machine**), then:
 
 ```bash
-python3 /app/search.py \
-    --input $inputDataset/*.h5 \
-    --task-description $inputDataset/config.json \
-    --output $outputDir
+.venv/bin/tira-cli login --token AUTH-TOKEN
+.venv/bin/tira-cli verify-installation --task sisap-2026 --team YOUR-TEAM
+
+.venv/bin/tira-cli code-submission \
+    --path . \
+    --command 'python3 /app/search.py --input $inputDataset/*.h5 --task-description $inputDataset/config.json --output $outputDir' \
+    --task sisap-2026 \
+    --dataset task-1-spot-check-20260602-training
 ```
 
-`search.py` reads the task config, decompresses the input on the fly when needed
-(the C++ HDF5 reader only handles contiguous datasets, so gzip/chunked inputs are
-materialized to an uncompressed temp file via `h5py`), drives the binary once per
-profile, and writes one result file per operating point.
+### Step 4 — Trigger evaluation in the TIRA UI
 
-## Output format
+1. Navigate to the task page, click **Submit** → **Code Submissions**.
+2. Select your submission, choose a dataset and hardware configuration.
+3. The organizers will handle execution on all datasets once your submission looks correct.
 
-One HDF5 file per operating point under `$outputDir`, each with:
+---
 
-- datasets `knns` (1-based neighbor ids; if a query returns fewer than k candidates
-  the padding slots are the node's own id for Task 1 and `0` for Task 2 — both
-  harmless, since the evaluator scores by set membership) and `dists` (float), both
-  the same shape — **`n × (k+1)` for Task 1**, **`n × k` for Task 2**;
-- root attributes `algo`, `dataset`, `task`, `buildtime`, `querytime`, `params`.
 
-Task 1 prepends the self-reference in column 0 (the extra `+1` column), matching the
-ground-truth layout the evaluator uses; Task 2 has no self column. Only `knns` is
-scored — `recall = mean_i |knns[i,:k] ∩ gt[i,:k]| / k`.
 
 ## Build & run locally
 
diff --git a/cpp/readme.md b/cpp/readme.md
@@ -123,4 +123,30 @@ Once compiled, the executable `deglib_sisap` can be run from the build output di
 * `--output <path>`: Path to write retrieved neighbor indices to a binary `.ivecs` file.
 * `--flas` (Task 2 only): Enables FLAS pre-sorting of training vectors before graph building.
 
-
+## Graph modes
+
+The `deglib_sisap` binary implements seven graph modes per task (`mode1`…`mode7`). All modes share the same save-mode contract (writing one result file per operating point holding neighbor IDs and distances), so they are drop-in alternatives. ⭐ marks the strongest submission candidates.
+
+### Task 1 — EVP variants
+
+| Mode        | Name                           | Description                                  |
+|-------------|--------------------------------|----------------------------------------------|
+| mode1       | fp16                           | FP16 build + FP16 explore                    |
+| mode2       | evp-linear                     | EVP quantization + brute-force linear search |
+| mode3       | evp                            | EVP build + EVP explore (no rerank)          |
+| **mode4** ⭐ | evp-rerank                     | EVP build + EVP explore + FP16 rerank        |
+| mode5       | evp-build-fp16-external-search | EVP build + FP16 external graph search       |
+| mode6       | evp-asymmetric                 | EVP build + asymmetric FP16-vs-EVP search    |
+| mode7 ⭐     | evp-asymmetric-rerank          | EVP build + asymmetric search + FP16 rerank  |
+
+### Task 2 — L2-lift variants
+
+| Mode        | Name                    | Description                             |
+|-------------|-------------------------|-----------------------------------------|
+| mode1       | baseline                | FP32 build + FP32 inner-product explore |
+| mode2       | fp16-build-fp16-explore | FP16 build + FP16 IP explore            |
+| mode3       | baseline-fp16           | FP32 build + FP16 IP explore            |
+| mode4       | l2-converted            | FP32 L2(d+1) build + FP32 L2 explore    |
+| **mode5** ⭐ | l2-fp16-ip              | FP32 L2(d+1) build + FP16 IP explore    |
+| mode6       | l2-fp16-l2              | FP32 L2(d+1) build + FP16 L2 explore    |
+| mode7 ⭐     | l2-fp16-d2              | FP32 L2(d+2) build + FP16 L2 explore    |
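
Reviewer note (not part of the commit): the "L2(d+1)" entries in the table above refer to lifting inner-product search into an L2 search. The diff does not spell out the exact transform deglib applies, so the sketch below assumes one standard reduction: append `sqrt(M² - ‖x‖²)` to every database vector (with `M` the largest norm in the database) and `0` to the query. All function names here are hypothetical and chosen for illustration only.

```python
import math


def lift_database(vectors):
    """Append one coordinate so that every lifted vector has squared norm M^2."""
    max_sq_norm = max(sum(x * x for x in v) for v in vectors)
    lifted = []
    for v in vectors:
        sq_norm = sum(x * x for x in v)
        # max(0.0, ...) guards against tiny negative values caused by rounding.
        extra = math.sqrt(max(0.0, max_sq_norm - sq_norm))
        lifted.append(list(v) + [extra])
    return lifted


def lift_query(query):
    """Queries get a zero in the extra coordinate, so inner products are unchanged."""
    return list(query) + [0.0]


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Toy data (assumed values, not taken from any challenge dataset).
database = [[1.0, 0.0], [0.5, 2.0], [-1.0, 1.5], [3.0, -0.5]]
query = [0.2, 1.0]

lifted_db = lift_database(database)
lifted_q = lift_query(query)

# Rank by descending inner product in d dimensions ...
by_ip = sorted(range(len(database)), key=lambda i: -inner_product(query, database[i]))
# ... and by ascending squared L2 distance in d+1 dimensions.
by_l2 = sorted(range(len(database)), key=lambda i: squared_l2(lifted_q, lifted_db[i]))

print(by_ip, by_l2)
```

Under this assumption `‖q' - x'‖² = ‖q‖² + M² - 2·(q·x)`, so the smallest L2 distance in the lifted space corresponds to the largest inner product in the original space, which is why both rankings agree.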
diff --git a/submission/README.md b/submission/README.md
@@ -0,0 +1,19 @@
+# Submission Harness
+
+This directory contains the Python runner and evaluation tools for the SISAP 2026 Indexing Challenge submission.
+
+## Internal Execution Details
+
+When running under TIRA, `search.py` reads the task config, decompresses the input on the fly when needed (the C++ HDF5 reader only handles contiguous datasets, so gzip/chunked inputs are materialized to an uncompressed temp file via `h5py`), drives the binary once per profile, and writes one result file per operating point.
+
+## Output Format
+
+One HDF5 file is generated per operating point under `$outputDir`. Each file contains:
+
+- **Datasets:**
+  - `knns`: 1-based neighbor IDs. If a query returns fewer than $k$ candidates, padding slots hold the vertex's own ID for Task 1 and `0` for Task 2; this padding is harmless because the evaluator scores by set membership.
+  - `dists`: float distances.
+  - Both datasets have the same shape: **`n × (k+1)` for Task 1**, **`n × k` for Task 2**.
+- **Root Attributes:** `algo`, `dataset`, `task`, `buildtime`, `querytime`, `params`.
+
+Task 1 prepends the self-reference in column 0 (the extra `+1` column), matching the ground-truth layout the evaluator uses. Task 2 has no self column. Only `knns` is scored: `recall = mean_i |knns[i,:k] ∩ gt[i,:k]| / k`.
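
Reviewer note (not part of the commit): the recall formula quoted above can be sketched in a few lines of plain Python. This is a minimal illustration under the assumption that `knns` and `gt` are row-aligned lists of 1-based IDs; `average_recall` is a hypothetical helper, not the evaluator's actual function.

```python
def average_recall(knns, gt, k):
    """Mean over rows of |knns[i][:k] ∩ gt[i][:k]| / k."""
    hits = 0
    for found, truth in zip(knns, gt):
        # Set intersection: order within the first k slots does not matter.
        hits += len(set(found[:k]) & set(truth[:k]))
    return hits / (len(knns) * k)


# Toy example: the second result row is padded with 0, which never matches a
# 1-based ground-truth ID and so only counts as a missed neighbor.
knns = [[1, 2, 3], [4, 5, 0]]
gt = [[3, 1, 9], [5, 4, 6]]

print(average_recall(knns, gt, k=3))
```

Because the score depends only on set membership, a padding slot can lower recall by being a miss but cannot produce a false hit.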