Commit 3d53932
Merge two optimization-based (NSFP, FastNSF) methods into codebase from HiMo project (#14)
* feat: initial nuScenes data support (not finished yet).
* env(mmcv): fix pip install in case we need `pip install -e` rather than using setup.py directly.
* env(mmcv): remove `ext` in the load module for the setup changes; keep both options in case an old env cannot load.
* feat(nsfp, fastnsf): update the two methods' module files.
* chore(runner): a shared runner for all optimization-based methods.
* perf: try to speed up with more GPUs; multi-GPU and multi-node sessions not tested yet, will test once the setup is available.
* env: upgrade the training env to Python 3.10 with CUDA 11.8 for future env upgrades.
* docs(readme): update the README and fix typos from Copilot reviewers.
* hotfix: multi-GPU metric evaluation; the CUDA device must be set during spawn.
* chore: apply Copilot pull-request review; checks finished on a 2-GPU setting (multi-node not tested yet).
* env(opensf): update the environment YAML to py310 and add requirements.txt for setups that already have torch, as the Dockerfile shows.
* fix(docker): torch is in the base env in the new Dockerfile.
* hotfix(nsfp): remove the pytorch3d dependency.
* env: update the environment file and Dockerfile (checked all good).
* fix(sftool): Argoverse 2 needs 0.2.1 for py38; revert to the previous version.
* fix(runner): fix some issues found when testing the runner; save mode passed the 1-GPU test and the visualization result is checked and corrected; multi-GPU still to be tested.
* env(data): add two more datasets used in the HiMo project.
* docs: update header info about HiMo.
* github(action): stale-issues workflow.
* docs(README): update the README with instructions for running NSFP and FastNSF.
* docs: add a contributing README for future users.
* style(header): update header info.
1 parent 6226b07 commit 3d53932

28 files changed: +1060 −288 lines changed

.github/issue_stale.yaml

Lines changed: 0 additions & 16 deletions
This file was deleted.

.github/workflows/issue_stale.yaml

Lines changed: 30 additions & 0 deletions
name: Close inactive issues
on:
  schedule:
    - cron: "35 11 * * 5"

env:
  DAYS_BEFORE_ISSUE_STALE: 30
  DAYS_BEFORE_ISSUE_CLOSE: 14

jobs:
  close-issues:
    runs-on: ubuntu-latest
    permissions:
      issues: write
      pull-requests: write
    steps:
      - uses: actions/stale@v5
        with:
          days-before-issue-stale: ${{ env.DAYS_BEFORE_ISSUE_STALE }}
          days-before-issue-close: ${{ env.DAYS_BEFORE_ISSUE_CLOSE }}
          stale-issue-label: "stale"
          stale-issue-message: |
            This issue is stale because it has been open for ${{ env.DAYS_BEFORE_ISSUE_STALE }} days with no activity.
            It will be closed if no further activity occurs. Let us know if you still need help!
          close-issue-message: |
            This issue is being closed because it has been stale for ${{ env.DAYS_BEFORE_ISSUE_CLOSE }} days with no activity.
            If you still need help, please feel free to leave comments.
          days-before-pr-stale: -1
          days-before-pr-close: -1
          repo-token: ${{ secrets.GITHUB_TOKEN }}

CONTRIBUTING.md

Lines changed: 43 additions & 0 deletions
# Contributing to OpenSceneFlow

We want to make contributing to this project as easy and transparent as possible. We welcome any contribution, from bug fixes to new features. If you're interested in adding your own scene flow method, this guide will walk you through the process.

## Adding a New Method

Here is a quick guide to integrating a new method into the OpenSceneFlow codebase.

### 1. Data Preparation

All data is expected to be processed into the `.h5` format. Each file represents a scene, and within the file, each data sample is indexed by a `timestamp` key.

For more details on the data processing pipeline, please see the [Data Processing README](./dataprocess/README.md#process).
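Given that layout (one `.h5` file per scene, one group per timestamp), loading a scene can be sketched as below; the `points` dataset name is an illustrative assumption, not the repo's exact schema:

```python
import h5py
import numpy as np

# Build a tiny fake scene to demonstrate the expected layout:
# one group per timestamp, datasets inside each group.
with h5py.File("demo_scene.h5", "w") as f:
    for ts in ["100", "101"]:
        g = f.create_group(ts)
        # 'points' is an illustrative field name, not the repo's schema
        g.create_dataset("points", data=np.zeros((5, 3)))

def read_scene(path):
    """Load every sample of a scene .h5 file, keyed by timestamp."""
    samples = {}
    with h5py.File(path, "r") as f:
        for ts in sorted(f.keys()):  # each top-level key is a timestamp
            samples[ts] = {k: f[ts][k][...] for k in f[ts].keys()}
    return samples

scene = read_scene("demo_scene.h5")
print(sorted(scene.keys()))           # -> ['100', '101']
print(scene["100"]["points"].shape)   # -> (5, 3)
```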
14+
15+
### 2. Model Implementation
16+
17+
All model source files are located in [`src/models`](./src/models). When adding your model, please remember to import your new model class in the [`src/models/__init__.py`](./src/models/__init__.py) file. Don't forget to add your model conf files in [`conf/model`](./conf/model).
18+
19+
* **For Feed-Forward Methods:** You can use `deflow` and `fastflow3d` as implementation examples.
20+
* **For Optimization-Based Methods:** Please refer to `nsfp` and `fastnsf` for guidance on structure and integration. A detailed example can be found in the [NSFP model file](./src/models/nsfp.py).
21+
22+
### 3. Custom Loss Functions
23+
24+
All loss functions are defined in [`src/lossfuncs.py`](./src/lossfuncs.py). If your model requires a new loss function, you can add it to this file by following the pattern of the existing functions. SeFlow provided a self-supervised loss function example for all feed-forward methods. Feel free to check.
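The existing losses boil down to a function of predicted and ground-truth flow returning a scalar. A minimal end-point-error sketch in plain NumPy (the function name is hypothetical; the repo's losses operate on torch tensors):

```python
import numpy as np

def my_epe_loss(est_flow, gt_flow):
    """Mean end-point error: average L2 distance between predicted
    and ground-truth flow vectors, both (N, 3) arrays."""
    return float(np.linalg.norm(est_flow - gt_flow, axis=1).mean())

pred = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
gt = np.zeros((2, 3))
print(my_epe_loss(pred, gt))  # -> 1.5, i.e. (1.0 + 2.0) / 2
```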
### 4.1 Training a Feed-Forward Model

1. Add a configuration file for your new model in the [`conf/model`](./conf/model) directory.
2. Begin training by running the following command:
```bash
python train.py model=your_model_name
```
3. **Note:** If your model's output dictionary (`res_dict`) has a different structure from the existing models, you may need to add a new pattern in the `training_step` and `validation_step` methods in the main training script.
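The `res_dict` pattern-matching mentioned in the note amounts to dispatching on the keys of the output dictionary. A minimal sketch (the key names `flow` and `flow_list` are illustrative assumptions, not the repo's exact schema):

```python
def compute_loss(res_dict, loss_fn, gt_flow):
    """Dispatch on the structure of the model output dictionary.
    Key names here are illustrative, not the repo's exact schema."""
    if "flow_list" in res_dict:  # e.g. a model with intermediate predictions
        flows = res_dict["flow_list"]
        return sum(loss_fn(f, gt_flow) for f in flows) / len(flows)
    if "flow" in res_dict:       # the common single-output case
        return loss_fn(res_dict["flow"], gt_flow)
    raise KeyError(f"Unknown res_dict structure: {list(res_dict)}")

# toy L1-style loss on plain floats, just to exercise the dispatch
l1 = lambda pred, gt: abs(pred - gt)
print(compute_loss({"flow": 2.0}, l1, 0.5))              # -> 1.5
print(compute_loss({"flow_list": [1.0, 2.0]}, l1, 1.0))  # -> 0.5
```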
### 4.2 Running an Optimization-Based Model

Our framework supports multi-GPU execution for optimization-based methods out of the box. You can follow the structure of existing methods like NSFP to run and evaluate your model.
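Multi-GPU execution for optimization-based methods essentially shards the frames across devices, and the commit notes that each spawned worker must set its CUDA device first. A minimal sharding sketch, with a hypothetical helper that does not depend on the repo's actual runner:

```python
def shard(items, num_workers):
    """Split a list of frame indices round-robin across workers,
    so each GPU gets a near-equal share."""
    return [items[rank::num_workers] for rank in range(num_workers)]

frames = list(range(10))
shards = shard(frames, 4)
print(shards)  # -> [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]

# In a real runner, each shard would be handed to a spawned worker that
# first selects its device (e.g. torch.cuda.set_device(rank)) before
# optimizing its frames -- an assumption based on the commit message.
```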
-----

Once the steps above are completed, other parts of the framework, such as evaluation (`eval`) and visualization (`save`), should integrate with your new model accordingly.

Thank you for your contribution!

Dockerfile

Lines changed: 17 additions & 15 deletions
@@ -1,14 +1,8 @@
-# check more: https://hub.docker.com/r/nvidia/cuda
-FROM nvidia/cuda:11.7.1-devel-ubuntu20.04
+FROM pytorch/pytorch:2.1.0-cuda11.8-cudnn8-devel
 ENV DEBIAN_FRONTEND noninteractive
+LABEL maintainer="Qingwen Zhang <https://kin-zhang.github.io/>"

-RUN apt update && apt install -y git curl vim rsync htop
-
-RUN curl -o ~/miniforge3.sh -LO https://github.com/conda-forge/miniforge/releases/latest/download/miniforge3-Linux-x86_64.sh && \
-    chmod +x ~/miniforge3.sh && \
-    ~/miniforge3.sh -b -p /opt/conda && \
-    rm ~/miniforge3.sh && \
-    /opt/conda/bin/conda clean -ya && /opt/conda/bin/conda init bash
+RUN apt update && apt install -y git tmux curl vim rsync libgl1 libglib2.0-0 ca-certificates

 # install zsh and oh-my-zsh
 RUN apt update && apt install -y wget git zsh tmux vim g++
@@ -20,15 +14,23 @@ RUN sh -c "$(wget -O- https://github.com/deluan/zsh-in-docker/releases/download/
     -p https://github.com/zsh-users/zsh-syntax-highlighting

 RUN printf "y\ny\ny\n\n" | bash -c "$(curl -fsSL https://raw.githubusercontent.com/Kin-Zhang/Kin-Zhang/main/scripts/setup_ohmyzsh.sh)"
-RUN /opt/conda/bin/conda init zsh && /opt/conda/bin/mamba init zsh
+RUN /opt/conda/bin/conda init zsh

 # change to conda env
 ENV PATH /opt/conda/bin:$PATH
+RUN /opt/conda/bin/conda config --set solver libmamba

-RUN mkdir -p /home/kin/workspace && cd /home/kin/workspace && git clone https://github.com/KTH-RPL/OpenSceneFlow.git
+RUN mkdir -p /home/kin/workspace && cd /home/kin/workspace && git clone https://github.com/Kin-Zhang/OpenSceneFlow
 WORKDIR /home/kin/workspace/OpenSceneFlow
-RUN apt-get update && apt-get install libgl1 -y
+
 # need read the gpu device info to compile the cuda extension
-RUN cd /home/kin/workspace/OpenSceneFlow && /opt/conda/bin/mamba env create -f environment.yaml
-# environment for dataprocessing inlucdes data-api
-RUN cd /home/kin/workspace/OpenSceneFlow && /opt/conda/bin/mamba env create -f envsftool.yaml
+RUN /opt/conda/bin/pip install -r /home/kin/workspace/OpenSceneFlow/requirements.txt
+RUN /opt/conda/bin/pip install FastGeodis --no-build-isolation
+RUN /opt/conda/bin/pip install --no-cache-dir -e ./assets/cuda/chamfer3D && /opt/conda/bin/pip install --no-cache-dir -e ./assets/cuda/mmcv
+
+# environment for dataprocessing includes data-api
+RUN /opt/conda/bin/conda env create -f envsftool.yaml
+RUN /opt/conda/envs/sftool/bin/pip install numpy==1.22
+
+# clean up apt cache
+RUN rm -rf /var/lib/apt/lists/* && rm -rf /root/.cache/pip

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 BSD 3-Clause License

-Copyright (c) 2024, Robotics, Perception and Learning @KTH
+Copyright (c) 2024, Qingwen Zhang, Robotics, Perception and Learning @KTH

 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are met:

README.md

Lines changed: 23 additions & 7 deletions
@@ -43,13 +43,13 @@ Additionally, *OpenSceneFlow* integrates following excellent works: [ICLR'24 Zer

 - [x] [FastFlow3D](https://arxiv.org/abs/2103.01306): RA-L 2021, a basic backbone model.
 - [x] [ZeroFlow](https://arxiv.org/abs/2305.10424): ICLR 2024, their pre-trained weight can covert into our format easily through [the script](tools/zerof2ours.py).
-- [ ] [NSFP](https://arxiv.org/abs/2111.01253): NeurIPS 2021, faster 3x than original version because of [our CUDA speed up](assets/cuda/README.md), same (slightly better) performance. Done coding, public after review.
-- [ ] [FastNSF](https://arxiv.org/abs/2304.09121): ICCV 2023. SSL optimization-based. Done coding, public after review.
+- [x] [NSFP](https://arxiv.org/abs/2111.01253): NeurIPS 2021, 3x faster than the original version thanks to [our CUDA speed up](assets/cuda/README.md), with the same (slightly better) performance.
+- [x] [FastNSF](https://arxiv.org/abs/2304.09121): ICCV 2023. SSL optimization-based.
 - [ ] [ICP-Flow](https://arxiv.org/abs/2402.17351): CVPR 2024. SSL optimization-based. Done coding, public after review.

 </details>

-💡: Want to learn how to add your own network in this structure? Check [Contribute section](assets/README.md#contribute) and know more about the code. Fee free to pull request and your bibtex [here](#cite-us).
+💡: Want to learn how to add your own network in this structure? Check the [Contribute section](CONTRIBUTING.md#adding-a-new-method) to learn more about the code. Feel free to open a pull request and add your bibtex [here](#cite-us).

 ---

@@ -76,6 +76,7 @@ cd OpenSceneFlow && mamba env create -f environment.yaml
 # You may need export your LD_LIBRARY_PATH with env lib
 # export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/kin/mambaforge/lib
 ```
+We also provide [requirements.txt](requirements.txt); please check its usage through the [Dockerfile](Dockerfile).

 ### Docker (Recommended for Isolation)

@@ -86,11 +87,13 @@ You always can choose [Docker](https://en.wikipedia.org/wiki/Docker_(software))
 docker pull zhangkin/opensf

 # run container
-docker run -it --gpus all -v /dev/shm:/dev/shm -v /home/kin/data:/home/kin/data --name opensceneflow zhangkin/opensf /bin/zsh
+docker run -it --net=host --gpus all -v /dev/shm:/dev/shm -v /home/kin/data:/home/kin/data --name opensf zhangkin/opensf /bin/zsh
+
 # and better to read your own gpu device info to compile the cuda extension again:
+cd /home/kin/workspace/OpenSceneFlow && git pull
 cd /home/kin/workspace/OpenSceneFlow/assets/cuda/mmcv && /opt/conda/envs/opensf/bin/python ./setup.py install
 cd /home/kin/workspace/OpenSceneFlow/assets/cuda/chamfer3D && /opt/conda/envs/opensf/bin/python ./setup.py install
-
+cd /home/kin/workspace/OpenSceneFlow
 mamba activate opensf
 ```

@@ -119,7 +122,7 @@ Some tips before running the code:
 * If you want to use [wandb](wandb.ai), replace all `entity="kth-rpl",` to your own entity otherwise tensorboard will be used locally.
 * Set correct data path by passing the config, e.g. `train_data=/home/kin/data/av2/h5py/demo/train val_data=/home/kin/data/av2/h5py/demo/val`.

-And free yourself from trainning, you can download the pretrained weight from [HuggingFace](https://huggingface.co/kin-zhang/OpenSceneFlow) and we provided the detail `wget` command in each model section.
+To free yourself from training, you can download the pretrained weights from [HuggingFace](https://huggingface.co/kin-zhang/OpenSceneFlow); the detailed `wget` command is provided in each model section. Optimization-based methods are training-free, so you can directly run [3. Evaluation](#3-evaluation) (see the evaluation section for details).

 ```bash
 mamba activate opensf
@@ -143,6 +146,8 @@ wget https://huggingface.co/kin-zhang/OpenSceneFlow/resolve/main/flow4d_best.ckp
 Extra pakcges needed for SSF model:
 ```bash
 pip install mmengine-lite torch-scatter
+# if torch-scatter does not work, reinstall it with:
+pip install https://data.pyg.org/whl/torch-2.0.0%2Bcu118/torch_scatter-2.1.2%2Bpt20cu118-cp310-cp310-linux_x86_64.whl
 ```

 Train SSF with the leaderboard submit config. [Runtime: Around 6 hours in 8x A100 GPUs.]
@@ -194,9 +199,12 @@ You can view Wandb dashboard for the training and evaluation results or upload r
 Since in training, we save all hyper-parameters and model checkpoints, the only thing you need to do is to specify the checkpoint path. Remember to set the data path correctly also.

 ```bash
-# it will directly prints all metric
+# (feed-forward): load the ckpt and run it; it will directly print all metrics
 python eval.py checkpoint=/home/kin/seflow_best.ckpt av2_mode=val

+# (optimization-based): this may take a really long time; consider running it inside tmux
+python eval.py model=nsfp
+
 # it will output the av2_submit.zip or av2_submit_v2.zip for you to submit to leaderboard
 python eval.py checkpoint=/home/kin/seflow_best.ckpt av2_mode=test leaderboard_version=1
 python eval.py checkpoint=/home/kin/seflow_best.ckpt av2_mode=test leaderboard_version=2
@@ -238,7 +246,10 @@ evalai challenge 2210 phase 4396 submit --file av2_submit_v2.zip --large --priva
 We provide a script to visualize the results of the model also. You can specify the checkpoint path and the data path to visualize the results. The step is quite similar to evaluation.

 ```bash
+# (feed-forward): load the ckpt
 python save.py checkpoint=/home/kin/seflow_best.ckpt dataset_path=/home/kin/data/av2/preprocess_v2/sensor/vis
+# (optimization-based): switch model by passing the model name
+python save.py model=nsfp dataset_path=/home/kin/data/av2/h5py/demo/val

 # The output of above command will be like:
 Model: DeFlow, Checkpoint from: /home/kin/model_zoo/v2/seflow_best.ckpt
@@ -252,6 +263,11 @@ python tools/visualization.py --res_name 'seflow_best' --data_dir /home/kin/data

 https://github.com/user-attachments/assets/f031d1a2-2d2f-4947-a01f-834ed1c146e6

+For easy comparison with ground truth and other methods, we also provide a multi-view Open3D visualization window:
+```bash
+python tools/visualization.py --mode mul --res_name "['flow', 'seflow_best']" --data_dir /home/kin/data/av2/preprocess_v2/sensor/vis
+```
+
 Or another way to interact with [rerun](https://github.com/rerun-io/rerun) but please only vis scene by scene, not all at once.

 ```bash

assets/README.md

Lines changed: 3 additions & 10 deletions
@@ -49,7 +49,7 @@ Then follow [this stackoverflow answers](https://stackoverflow.com/questions/596

 3. Then you can build the docker image:
 ```bash
-cd OpenSceneFlow && docker build -t zhangkin/OpenSceneFlow .
+cd OpenSceneFlow && docker build -f Dockerfile -t zhangkin/opensf .
 ```

 ## Installation
@@ -98,12 +98,5 @@ python -c "from assets.cuda.chamfer3D import nnChamferDis;print('successfully im
 Solved by `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/proj/berzelius-2023-154/users/x_qinzh/mambaforge/lib`

-## Contribute
-
-If you want to contribute to new model, here are tips you can follow:
-1. Dataloader: we believe all data could be process to `.h5`, we named as different scene and inside a scene, the key of each data is timestamp. Check [dataprocess/README.md](../dataprocess/README.md#process) for more details.
-2. Model: All model files can be found [here: src/models](../src/models). You can view deflow and fastflow3d to know how to implement a new model. Don't forget to add to the `__init__.py` [file to import class](../src/models/__init__.py).
-3. Loss: All loss files can be found [here: src/lossfuncs.py](../src/lossfuncs.py). There are three loss functions already inside the file, you can add a new one following the same pattern.
-4. Training: Once you have implemented the model, you can add the model to the config file [here: conf/model](../conf/model) and train the model using the command `python train.py model=your_model_name`. One more note here may: if your res_dict from model output is different, you may need add one pattern in `def training_step` and `def validation_step`.
-
-All others like eval and vis will be changed according to the model you implemented as you follow the above steps.
+3. torch_scatter problem: `OSError: /home/kin/mambaforge/envs/opensf-v2/lib/python3.10/site-packages/torch_scatter/_version_cpu.so: undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE`
+Solved by installing the torch-cuda version: `pip install https://data.pyg.org/whl/torch-2.0.0%2Bcu118/torch_scatter-2.1.2%2Bpt20cu118-cp310-cp310-linux_x86_64.whl`

assets/cuda/mmcv/scatter_points.py

Lines changed: 16 additions & 6 deletions
@@ -8,14 +8,24 @@

 # from utils import ext_loader
 import importlib
-def load_ext(name, funcs):
-    ext = importlib.import_module('mmcv.' + name)
-    for fun in funcs:
-        assert hasattr(ext, fun), f'{fun} miss in module {name}'
-    return ext

+def load_ext(possible_names, funcs):
+    """Try loading module from list of possible names, return first matching."""
+    for name in possible_names:
+        try:
+            ext = importlib.import_module('mmcv' + name)
+            missing = [f for f in funcs if not hasattr(ext, f)]
+            if missing:
+                print(f"Missing functions in 'mmcv{name}': {missing}")
+                continue
+            return ext  # success
+        except (ModuleNotFoundError, ImportError) as e:
+            print(f"Failed to import mmcv{name}: {e}")
+    raise ImportError(f"Could not load mmcv extension with functions: {funcs}")
+
+# Usage
 ext_module = load_ext(
-    '_ext',
+    ['', '._ext'],
     ['dynamic_point_to_voxel_forward', 'dynamic_point_to_voxel_backward'])
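The fallback loader above can be exercised with standard-library modules; the same pattern, self-contained and independent of mmcv (module names below are stdlib, chosen only to demonstrate the fallback):

```python
import importlib

def load_first(possible_names, attrs):
    """Return the first importable module that provides all attrs,
    mirroring the fallback pattern of the mmcv loader above."""
    for name in possible_names:
        try:
            mod = importlib.import_module(name)
        except ImportError as e:
            print(f"Failed to import {name}: {e}")
            continue
        if all(hasattr(mod, a) for a in attrs):
            return mod
    raise ImportError(f"No module among {possible_names} provides {attrs}")

# 'no_such_module' fails to import, 'math' lacks 'dumps', 'json' matches
mod = load_first(["no_such_module", "math", "json"], ["dumps", "loads"])
print(mod.__name__)  # -> json
```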

assets/cuda/mmcv/setup.py

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
     version='1.0.1',
     ext_modules=[
         CUDAExtension(
-            name='mmcv._ext',
+            name='mmcv',
             sources=[
                 "/".join(__file__.split("/")[:-1] + ["scatter_points_cuda.cu"]),
                 "/".join(__file__.split("/")[:-1] + ["scatter_points.cpp"]),

assets/cuda/mmcv/voxelize.py

Lines changed: 17 additions & 6 deletions
@@ -8,14 +8,25 @@

 # from utils import ext_loader
 import importlib
-def load_ext(name, funcs):
-    ext = importlib.import_module('mmcv.' + name)
-    for fun in funcs:
-        assert hasattr(ext, fun), f'{fun} miss in module {name}'
-    return ext

+def load_ext(possible_names, funcs):
+    """Try loading module from list of possible names, return first matching."""
+    for name in possible_names:
+        try:
+            ext = importlib.import_module('mmcv' + name)
+            missing = [f for f in funcs if not hasattr(ext, f)]
+            if missing:
+                print(f"Missing functions in 'mmcv{name}': {missing}")
+                continue
+            return ext  # success
+        except (ModuleNotFoundError, ImportError) as e:
+            print(f"Failed to import mmcv{name}: {e}")
+    raise ImportError(f"Could not load mmcv extension with functions: {funcs}")
+
+# Usage
 ext_module = load_ext(
-    '_ext', ['dynamic_voxelize_forward', 'hard_voxelize_forward'])
+    ['', '._ext'],
+    ['dynamic_voxelize_forward', 'hard_voxelize_forward'])
class _Voxelization(Function):
