Commit 1c8fc5a

Merge TeFlow into codebase (#36)
* chore(visualization): refactor the Open3D visualization, merge functions together.
* reset fire class usage directly.
* add easy multi-view screenshot saving.
* sync the viewpoint across different windows.
* visualize the lidar center tf if `slc` is set to True.
* fix(flow): add `index_flow` for the 2 Hz GT view etc.
* hotfix: skip compiling the VoteFlow CUDA lib if it is pre-installed already.
* !eval_mask: this will change the av2-val score for all methods. We found that the eval_mask provided by av2 sometimes includes ground points in the first few frames, and some ground points have ground-truth flows because of bbox labeling etc. The method trend should still be safe, since all methods here set ground-point flow to pose_flow, so they share the same error if ground points are included. The dataset changes may be reverted later since they only affect av2.
* !big changes on loss calculators: add teflowLoss into the codebase; update chamfer3D with CUDA stream-style batched busy compute.

AI summary:

- Added automatic collection of self-supervised loss function names in `src/lossfuncs/__init__.py`.
- Improved documentation and structure of self-supervised loss functions in `src/lossfuncs/selfsupervise.py`.
- Refactored loss calculation logic in `src/trainer.py` to support new self-supervised loss functions.
- Introduced `ssl_loss_calculator` method for handling self-supervised losses.
- Updated training step to differentiate between self-supervised and supervised loss calculations.
- Enhanced error handling during training and validation steps to skip problematic batches.

* docs(apptainer): update the Apptainer env for different cluster envs; update slurm and the command for TeFlow.
* rename the job id during training if a self-supervised loss is used.
* feat: also update the v3 challenge format for a test.
* fix(data): handle overlapping bboxes in nuScenes; add ground segmentation in Waymo.
* update training script.
* doc(rerun): update the rerun visualization scripts.
* docs: update the slurm file.
* loss(ssl): update comment to paper link.
* refactor(chamfer): streamline loss functions and update version to 1.0.6.
* docs: update the link comment for the chamfer distance speed test.
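The AI summary above mentions auto-collecting self-supervised loss-function names and an `ssl_loss_calculator` dispatcher. A rough, hypothetical sketch of that pattern (the stub loss names, signatures, and helper names are illustrative, not the repo's actual code):

```python
# Hypothetical sketch of "automatic collection" of self-supervised losses:
# gather every function whose name ends with "Loss", so the trainer can look
# losses up by config string. Illustrative only, not the repo's actual API.
import inspect


# Stand-ins for functions that would live in src/lossfuncs/selfsupervise.py.
def teflowLoss(est_flow, pc0, pc1):
    """Illustrative self-supervised loss stub."""
    return 0.0


def seflowLoss(est_flow, pc0, pc1):
    """Illustrative self-supervised loss stub."""
    return 0.0


def _collect_loss_fns(module_dict):
    """Collect {name: fn} for every function whose name ends with 'Loss'."""
    return {
        name: fn
        for name, fn in module_dict.items()
        if inspect.isfunction(fn) and name.endswith("Loss")
    }


# What src/lossfuncs/__init__.py might build at import time.
SSL_LOSS_FNS = _collect_loss_fns(globals())


def ssl_loss_calculator(loss_name, est_flow, pc0, pc1):
    """Dispatch to the configured self-supervised loss, as a trainer might."""
    if loss_name not in SSL_LOSS_FNS:
        raise KeyError(f"unknown self-supervised loss: {loss_name}")
    return SSL_LOSS_FNS[loss_name](est_flow, pc0, pc1)


print(sorted(SSL_LOSS_FNS))  # which losses were auto-collected
```

With this shape, a config value like `loss_fn=teflowLoss` can be resolved by name without hand-maintaining a registry.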
1 parent 3a10c77 commit 1c8fc5a

File tree

23 files changed: +988 -450 lines changed

README.md

Lines changed: 25 additions & 7 deletions
@@ -14,7 +14,7 @@ It is also an official implementation of the following papers (sorted by the tim
 - **TeFlow: Enabling Multi-frame Supervision for Self-Supervised Feed-forward Scene Flow Estimation**
   *Qingwen Zhang, Chenhan Jiang, Xiaomeng Zhu, Yunqi Miao, Yushan Zhang, Olov Andersson, Patric Jensfelt*
   Conference on Computer Vision and Pattern Recognition (**CVPR**) 2026
-  [ Strategy ] [ Self-Supervised ] - [ [arXiv](https://arxiv.org/abs/2602.19053) ] [ [Project]() ]
+  [ Strategy ] [ Self-Supervised ] - [ [arXiv](https://arxiv.org/abs/2602.19053) ] [ [Project](https://github.com/Kin-Zhang/TeFlow) ] → [here](#teflow)
 
 - **DeltaFlow: An Efficient Multi-frame Scene Flow Estimation Method**
   *Qingwen Zhang, Xiaomeng Zhu, Yushan Zhang, Yixi Cai, Olov Andersson, Patric Jensfelt*
@@ -96,10 +96,10 @@ You always can choose [Docker](https://en.wikipedia.org/wiki/Docker_(software))
 
 ```bash
 # option 1: pull from docker hub
-docker pull zhangkin/opensf
+docker pull zhangkin/opensf:full
 
 # run container
-docker run -it --net=host --gpus all -v /dev/shm:/dev/shm -v /home/kin/data:/home/kin/data --name opensf zhangkin/opensf /bin/zsh
+docker run -it --net=host --gpus all -v /dev/shm:/dev/shm -v /home/kin/data:/home/kin/data --name opensf zhangkin/opensf:full /bin/zsh
 
 # and better to read your own gpu device info to compile the cuda extension again:
 cd /home/kin/workspace/OpenSceneFlow && git pull
@@ -149,7 +149,9 @@ Train DeltaFlow with the leaderboard submit config. [Runtime: Around 18 hours in
 
 ```bash
 # total bz then it's 10x2 under above training setup.
-python train.py model=deltaFlow optimizer.lr=2e-3 epochs=20 batch_size=2 num_frames=5 loss_fn=deflowLoss train_aug=True "voxel_size=[0.15, 0.15, 0.15]" "point_cloud_range=[-38.4, -38.4, -3, 38.4, 38.4, 3]" +optimizer.scheduler.name=WarmupCosLR +optimizer.scheduler.max_lr=2e-3 +optimizer.scheduler.total_steps=20000
+python train.py model=deltaflow optimizer.lr=2e-3 epochs=20 batch_size=2 num_frames=5 \
+loss_fn=deflowLoss train_aug=True "voxel_size=[0.15, 0.15, 0.15]" "point_cloud_range=[-38.4, -38.4, -3, 38.4, 38.4, 3]" \
+optimizer.lr=2e-4 +optimizer.scheduler.name=WarmupCosLR +optimizer.scheduler.max_lr=2e-3 +optimizer.scheduler.warmup_epochs=2
 
 # Pretrained weight can be downloaded through (av2), check all other datasets in the same folder.
 wget https://huggingface.co/kin-zhang/OpenSceneFlow/resolve/main/deltaflow/deltaflow-av2.ckpt
@@ -206,6 +208,19 @@ Train Feed-forward SSL methods (e.g. SeFlow/SeFlow++/VoteFlow etc), we needed to
 1) process auto-label process for training. Check [dataprocess/README.md#self-supervised-process](dataprocess/README.md#self-supervised-process) for more details. We provide these inside the demo dataset already.
 2) specify the loss function, we set the config here for our best model in the leaderboard.
 
+#### TeFlow
+
+```bash
+# [Runtime: Around 20 hours in 10x3080Ti GPUs.]
+python train.py model=deltaflow epochs=15 batch_size=2 num_frames=5 train_aug=True \
+loss_fn=teflowLoss "voxel_size=[0.15, 0.15, 0.15]" "point_cloud_range=[-38.4, -38.4, -3, 38.4, 38.4, 3]" \
++ssl_label=seflow_auto "+add_seloss={chamfer_dis: 1.0, static_flow_loss: 1.0, dynamic_chamfer_dis: 1.0, cluster_based_pc0pc1: 1.0}" \
+optimizer.name=Adam optimizer.lr=2e-3 +optimizer.scheduler.name=StepLR +optimizer.scheduler.step_size=9 +optimizer.scheduler.gamma=0.5
+
+# Pretrained weight can be downloaded through (av2), check all other datasets in the same folder.
+wget https://huggingface.co/kin-zhang/OpenSceneFlow/resolve/main/teflow/teflow-av2.ckpt
+```
+
 #### SeFlow
 
 ```bash
@@ -217,6 +232,7 @@ wget https://huggingface.co/kin-zhang/OpenSceneFlow/resolve/main/seflow_best.ckp
 ```
 
 #### VoteFlow
+
 Extra packages needed for VoteFlow: [pytorch3d](https://pytorch3d.org/) (prefer 0.7.7) and [torch-scatter](https://github.com/rusty1s/pytorch_scatter?tab=readme-ov-file) (prefer 2.1.2):
 
 ```bash
@@ -280,6 +296,7 @@ python eval.py checkpoint=/home/kin/seflow_best.ckpt data_mode=test leaderboard_
 ```
 
 ### **📊 Range-Wise Metric (New!)**
+
 In the [SSF paper](https://arxiv.org/abs/2501.17821), we introduce a new distance-based evaluation metric for scene flow estimation. Below is an example output for SSF with point_cloud_range set to 204.8m and voxel_size=0.2m. Check more long-range results in the [SSF paper](https://arxiv.org/abs/2501.17821).
 
 | Distance | Static | Dynamic | NumPointsStatic | NumPointsDynamic |
@@ -293,6 +310,7 @@ In [SSF paper](https://arxiv.org/abs/2501.17821), we introduce a new distance-ba
 
 
 ### Submit result to public leaderboard
+
 To submit your result to the public leaderboard: if you select `data_mode=test`, a zip file will be produced for you to submit to the leaderboard.
 Note: The leaderboard result in the DeFlow&SeFlow main paper is [version 1](https://eval.ai/web/challenges/challenge-page/2010/evaluation), as [version 2](https://eval.ai/web/challenges/challenge-page/2210/overview) was updated after DeFlow&SeFlow.
 
@@ -337,13 +355,13 @@ For exporting easy comparsion with ground truth and other methods, we also provi
 python tools/visualization.py vis --res_name "['flow', 'seflow_best']" --data_dir /home/kin/data/av2/preprocess_v2/sensor/vis
 ```
 
-**Tips**: To quickly create qualitative results for all methods, you can use multiple results comparison mode, select a good viewpoint and then save screenshots for all frames by pressing `P` key. You will found all methods' results are saved in the output folder (default is `logs/imgs`). Enjoy it!
+**Tips**: To quickly create qualitative results for all methods, you can use multiple results comparison mode, select a good viewpoint and then save screenshots for all frames by pressing the `P` key. You will find all methods' results saved in the output folder (default: `logs/imgs`). [IrfanView](https://www.irfanview.com/) can help you easily crop the images in batch. Enjoy!
 
 
-_Rerun_: Another way to interact with [rerun](https://github.com/rerun-io/rerun) but please only vis scene by scene, not all at once.
+_Rerun_: Another way to interact with [rerun](https://github.com/rerun-io/rerun), here we vis scene by scene, you can also specify the result name to compare with GT or other methods.
 
 ```bash
-python tools/visualization_rerun.py --data_dir /home/kin/data/av2/h5py/demo/train --res_name "['flow', 'deflow']"
+python tools/visualization_rerun.py --scene_file /home/kin/data/av2/h5py/demo/val/25e5c600-36fe-3245-9cc0-40ef91620c22.h5 --res_name "['flow', 'deflow']"
 ```
 
 https://github.com/user-attachments/assets/07e8d430-a867-42b7-900a-11755949de21
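The range-wise metric introduced in the README above buckets endpoint error by each point's distance from the sensor. A minimal pure-Python sketch of the idea, with assumed bucket edges and names (not SSF's actual implementation):

```python
# Illustrative sketch of a range-wise scene flow metric: bucket per-point
# endpoint error (EPE) by the point's xy-distance from the sensor, and average
# within each bucket, separately for static and dynamic points. Bucket edges
# and all names are assumptions, not the repo's actual code.
import math
from collections import defaultdict

BUCKET_EDGES = [0.0, 35.0, 50.0, 100.0, 204.8]  # assumed distance bands (m)


def bucket_of(dist):
    """Return the label of the distance band containing `dist`, or None."""
    for lo, hi in zip(BUCKET_EDGES, BUCKET_EDGES[1:]):
        if lo <= dist < hi:
            return f"{lo:g}-{hi:g}m"
    return None


def range_wise_epe(points, est_flow, gt_flow, is_dynamic):
    """Mean EPE per (band, Static/Dynamic) cell over (x, y, z) tuples."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for p, ef, gf, dyn in zip(points, est_flow, gt_flow, is_dynamic):
        band = bucket_of(math.hypot(p[0], p[1]))  # range in the xy plane
        if band is None:
            continue
        epe = math.dist(ef, gf)  # endpoint error of this point's flow
        key = (band, "Dynamic" if dyn else "Static")
        sums[key] += epe
        counts[key] += 1
    return {k: sums[k] / counts[k] for k in sums}


points = [(10.0, 0.0, 0.0), (40.0, 0.0, 0.0)]
est = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
table = range_wise_epe(points, est, gt, [False, True])
print(table)  # {('0-35m', 'Static'): 0.0, ('35-50m', 'Dynamic'): 0.5}
```

Each cell of the table in the README (Static/Dynamic error per distance band) corresponds to one key of this dictionary.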

assets/README.md

Lines changed: 28 additions & 3 deletions
@@ -51,7 +51,30 @@ Then follow [this stackoverflow answers](https://stackoverflow.com/questions/596
 ```bash
 cd OpenSceneFlow && docker build -f Dockerfile -t zhangkin/opensf .
 ```
-
+
+### To Apptainer container
+
+If you want to build a **minimal** training env for an Apptainer container, you can use the following command:
+```bash
+apptainer build opensf.sif assets/opensf.def
+# zhangkin/opensf:full is created by Dockerfile
+```
+
+Then run it as a Python env with:
+```bash
+PYTHON="apptainer run --nv --writable-tmpfs opensf.sif"
+$PYTHON train.py
+```
+
+<!--
+In case the compiled package is not working for your CUDA capability, add the following code to the `assets/opensf.def` file before `exec`:
+```bash
+echo "Running pip install for local CUDA modules..."
+/opt/conda/bin/pip install /workspace/assets/cuda/chamfer3D
+/opt/conda/bin/pip install /workspace/assets/cuda/mmcv
+``` -->
+
+
 ## Installation
 
 We will use conda to manage the environment with mamba for faster package installation.
@@ -77,10 +100,11 @@ Checking important packages in our environment now:
 ```bash
 mamba activate opensf
 python -c "import torch; print(torch.__version__); print(torch.cuda.is_available()); print(torch.version.cuda)"
-python -c "import lightning.pytorch as pl; print(pl.__version__)"
+python -c "import lightning.pytorch as pl; print('pl version:', pl.__version__)"
+python -c "import spconv.pytorch as spconv; print('spconv import successfully')"
 python -c "from assets.cuda.mmcv import Voxelization, DynamicScatter;print('successfully import on our lite mmcv package')"
 python -c "from assets.cuda.chamfer3D import nnChamferDis;print('successfully import on our chamfer3D package')"
-python -c "from av2.utils.io import read_feather; print('av2 package ok')"
+python -c "from av2.utils.io import read_feather; print('av2 package ok')"
 ```
 
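The environment check above imports the repo's CUDA `chamfer3D` extension (`nnChamferDis`). For intuition, a brute-force CPU sketch of what a nearest-neighbor chamfer distance computes (illustrative only, not the extension's actual API):

```python
# Pure-Python sketch of the nearest-neighbor chamfer distance that the CUDA
# chamfer3D extension accelerates: for each point in one cloud, take the
# squared distance to its nearest neighbor in the other cloud, then average
# both directions. O(N*M) brute force, for intuition only.


def nn_sq_dist(p, cloud):
    """Squared distance from point p to its nearest neighbor in cloud."""
    return min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in cloud)


def chamfer_distance(pc0, pc1):
    """Symmetric sum of per-cloud mean nearest-neighbor squared distances."""
    d01 = sum(nn_sq_dist(p, pc1) for p in pc0) / len(pc0)
    d10 = sum(nn_sq_dist(q, pc0) for q in pc1) / len(pc1)
    return d01 + d10


pc0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pc1 = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
print(chamfer_distance(pc0, pc1))  # → 1.0
```

A CUDA implementation parallelizes the nearest-neighbor search per point; the commit message's "stream-style batch busy compute" refers to keeping the GPU busy across batch elements, which this sketch does not attempt.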
@@ -98,6 +122,7 @@ python -c "from av2.utils.io import read_feather; print('av2 package ok')"
 2. In cluster have error: `pandas ImportError: /lib64/libstdc++.so.6: version 'GLIBCXX_3.4.29' not found`
    Solved by `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/proj/berzelius-2023-154/users/x_qinzh/mambaforge/lib`
 
+4. The nvidia channel cannot be put into the env.yaml file, otherwise the cuda-toolkit will always resolve to the latest version; for me (2025-04-30) I struggled for an hour and `nvcc -V` still reported 12.8 at that time (it seems nvidia cannot be in the channel list?). Use py=3.10 for cuda >=12.1, and py<3.10 for cuda <=11.8.0; otherwise 10x/20x series GPUs won't work with the cuda compiler (half precision).
 
 3. torch_scatter problem: `OSError: /home/kin/mambaforge/envs/opensf-v2/lib/python3.10/site-packages/torch_scatter/_version_cpu.so: undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE`
    Solved by installing the torch-cuda version: `pip install https://data.pyg.org/whl/torch-2.0.0%2Bcu118/torch_scatter-2.1.2%2Bpt20cu118-cp310-cp310-linux_x86_64.whl`
