
Commit 62a4c34

ci and readme cleanup
1 parent d7ca058 commit 62a4c34

2 files changed

Lines changed: 16 additions & 15 deletions


.github/workflows/ci.yml

Lines changed: 1 addition & 2 deletions
@@ -26,7 +26,6 @@ jobs:
  dl() { mkdir -p "submission/data/$1"; curl -sfL --retry 5 --retry-delay 2 --retry-all-errors -o "submission/data/$1/$2" "$BASE/$1/$2"; }
  dl task-1-spot-check benchmark-dev-gooaq-small.h5
  dl task-2-spot-check benchmark-dev-llama-small.h5
- dl task-3-spot-check benchmark-dev-fiqa-small.h5

  - name: Run all spot-checks (TIRA command schema)
  run: |
@@ -35,7 +34,7 @@ jobs:
  # GitHub's hosted runner has only 2 — so cap --cpus to what's available.
  CPUS=$(( $(nproc) < 8 ? $(nproc) : 8 ))
  echo "Using --cpus=$CPUS (host has $(nproc) CPUs; TIRA uses 8)"
- for dir in task-1-spot-check task-2-spot-check task-3-spot-check; do
+ for dir in task-1-spot-check task-2-spot-check; do
  echo "=== $dir ==="
  mkdir -p "results/$dir"
  docker run --rm --user "$(id -u):$(id -g)" \
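
The `CPUS=` line in the hunk above caps the `--cpus` value at the smaller of the host's CPU count and 8. A minimal shell sketch of that cap follows. The helper name `cap_cpus` and the fallback to 1 when `nproc` is unavailable are assumptions for illustration and are not part of the repository:

```bash
#!/usr/bin/env bash
# cap_cpus is a hypothetical helper for illustration; it is not in the repository.
# Prints the smaller of the host CPU count ($1) and the requested limit ($2).
cap_cpus() {
  local host="$1" limit="$2"
  echo $(( host < limit ? host : limit ))
}

# Same arithmetic as the workflow line: CPUS=$(( $(nproc) < 8 ? $(nproc) : 8 ))
# The "|| echo 1" fallback is an assumption for hosts without nproc.
HOST_CPUS="$(nproc 2>/dev/null || echo 1)"
CPUS="$(cap_cpus "$HOST_CPUS" 8)"
echo "Using --cpus=$CPUS (host has $HOST_CPUS CPUs)"
```

With this cap, a 2-CPU host passes `--cpus=2` to `docker run`, while a 16-CPU host still passes `--cpus=8`.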

README.md

Lines changed: 15 additions & 13 deletions
@@ -112,11 +112,9 @@ uv run tira-cli code-submission \
  ### Step 4 — Trigger evaluation in the TIRA UI

  1. Navigate to the task page, click **Submit****Code Submissions**.
- 2. Select your submission, choose a dataset and hardware configuration.
+ 2. Select your submission, choose a dataset and hardware configuration and trigger a run.
  3. The organizers will handle execution on all datasets once your submission looks correct.

- ---
-


  ## Build & run locally
@@ -125,23 +123,30 @@ uv run tira-cli code-submission \
  # Build the submission image
  docker build -t sisap-deglib .

- # Run one task the way TIRA does (your-dataset-dir holds the .h5 and config-dir the config.json)
+ # Run one task the way TIRA does e.g. task-2-spot-check (<your-dataset-dir> holds the .h5)
  mkdir -p results
  docker run --rm --cpus=8 --memory=24g \
- -v "$PWD/your-dataset-dir:/app/dataset:ro" \
+ -v "$PWD/<your-dataset-dir>/:/app/dataset:ro" \
  -v "$PWD/results:/app/results:rw" \
  sisap-deglib \
- python3 /app/search.py --input '/app/dataset/*.h5' \
- --task-description /app/data/config-dir/config.json --output /app/results
+ python3 /app/search.py --input '/app/dataset/benchmark-dev-llama-small.h5' \
+ --task-description /app/data/task-2-spot-check/config.json --output /app/results
+
+ # Install/update workspace dependencies
+ uv sync
+
+ # Score the results against the dataset ground truth (saves res_task2.csv under results/)
+ uv --directory submission run eval.py --results ../results ../results/res_task2.csv

- # Score the results against the dataset ground truth
- uv --directory submission run eval.py --results ../results ../res.csv
+ # Plot the recall-vs-QPS curve (saves result_*.png under results/)
+ uv --directory submission run plot.py --task task2 ../results/res_task2.csv
+ mv submission/result_*.png results/
  ```

  ### Building just the C++ binary

  ```bash
- cmake -S cpp -B cpp/build -DCMAKE_BUILD_TYPE=Release -DFORCE_AVX2=ON
+ cmake -S cpp -B cpp/build -DCMAKE_BUILD_TYPE=Release
  cmake --build cpp/build --target deglib_sisap -j"$(nproc)"

  # Usage: deglib_sisap <task1|task2> <input.h5> <mode> [options]
@@ -151,9 +156,6 @@ cpp/build/bin/deglib_sisap task2 dataset.h5 mode5 \
  --k-top 30 --max-dist 5000,7000 --eps-search 0.18,0.2 --flas
  ```

- `--march`/AVX note: the build is pinned to **AVX2 (no AVX-512)** because the eval
- node is an AMD EPYC 7F72 (Zen 2) with 8 vCPU / 24 GB RAM and no AVX-512.
-
  ## Continuous integration

  On every push the CI builds the image and runs all three spot-checks through the
