Merge two optimization-based (NSFP, FastNSF) methods into codebase from HiMo project (#14)
* feat: nus data, initial version; haven't finished yet.
* env(mmcv): fix pip install,
in case we need `pip install -e` rather than using setup.py directly.
* env(mmcv): remove ext in the load module for the setup changes.
keep both options here in case the old env cannot load.
* feat(nsfp, fastnsf): update two methods module files.
* chore(runner): for all optimization-based methods.
* try to speed up if we have more GPUs, but I haven't tested multi-GPU and multi-node sessions. Will test later once I get the setup.
* upgrade training env to Python 3.10 with CUDA 11.8 for future env upgrades etc.
* docs(readme): update readme and fix typo from copilot reviewers.
* hotfix: multi-gpu metric evaluation
* need to set the cuda device during spawn.
* apply copilot pull request review.
* finished checking on the 2-GPU setting (not yet tested on multi-node).
* env(opensf): update environment yaml to py310, add requirements.txt for the case where torch is already installed,
* as the Dockerfile shows.
* fix(docker): torch is on the base env in the new dockerfile.
* hotfix(nsfp): we can remove the pytorch3d dependency.
* update env file (check all good)
* update dockerfile
* fix(sftool): Argoverse 2 needs 0.2.1 for py38, revert to the previous version.
* fix some issues with tests on the runner,
* but haven't tested on multi-GPU.
* env(data): update two more datasets used in the HiMo project.
* update header info about HiMo
* github(action): stale issues.
* fix(runner): save mode passed 1-gpu test now.
* visualization result also checked, corrected.
* todo: test on multi-GPU to see if it works.
* docs(README): update readme with running nsfp and fastnsf.
* docs: add contributing README for future users.
* style(header): update header info.
We want to make contributing to this project as easy and transparent as possible. We welcome any contributions, from bug fixes to new features. If you're interested in adding your own scene flow method, this guide will walk you through the process.

## Adding a New Method

Here is a quick guide to integrating a new method into the OpenSceneFlow codebase.

### 1. Data Preparation

All data is expected to be processed into the `.h5` format. Each file represents a scene, and within the file, each data sample is indexed by a `timestamp` key.
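As a concrete illustration of this layout, here is a minimal sketch using `h5py`: one group per `timestamp`, with datasets inside each group. The field names `lidar` and `flow` are assumptions for illustration only; see the Data Processing README for the exact keys the codebase uses.

```python
# Sketch of the expected .h5 layout (one scene per file, one group per
# timestamp). Field names "lidar"/"flow" are illustrative assumptions.
import os
import tempfile

import h5py
import numpy as np

def write_demo_scene(path):
    # one .h5 file == one scene; each sample is keyed by its timestamp
    with h5py.File(path, "w") as f:
        for ts in ("315967376019990000", "315967376119990000"):
            g = f.create_group(ts)
            g.create_dataset("lidar", data=np.zeros((100, 3), dtype=np.float32))
            g.create_dataset("flow", data=np.zeros((100, 3), dtype=np.float32))

def read_scene(path):
    # timestamps are the top-level keys; sort them to iterate in order
    with h5py.File(path, "r") as f:
        timestamps = sorted(f.keys())
        first = {k: f[timestamps[0]][k][:] for k in f[timestamps[0]]}
    return timestamps, first

path = os.path.join(tempfile.mkdtemp(), "demo_scene.h5")
write_demo_scene(path)
timestamps, sample = read_scene(path)
```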

For more details on the data processing pipeline, please see the [Data Processing README](./dataprocess/README.md#process).

### 2. Model Implementation
All model source files are located in [`src/models`](./src/models). When adding your model, please remember to import your new model class in the [`src/models/__init__.py`](./src/models/__init__.py) file. Don't forget to add your model conf files in [`conf/model`](./conf/model).
* **For Feed-Forward Methods:** You can use `deflow` and `fastflow3d` as implementation examples.
* **For Optimization-Based Methods:** Please refer to `nsfp` and `fastnsf` for guidance on structure and integration. A detailed example can be found in the [NSFP model file](./src/models/nsfp.py).
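The registration steps above can be sketched as follows. Every name here (`MyFlowModel`, the `pc0`/`flow` keys) is a hypothetical placeholder, and `torch` is deliberately left out so the structural pattern stays dependency-free; real models in `src/models` subclass `torch.nn.Module`.

```python
# Minimal structural sketch of a new method (hypothetical names throughout).
class MyFlowModel:
    def __init__(self, voxel_size=0.2):
        # hyperparameters would come from conf/model/my_flow_model.yaml
        self.voxel_size = voxel_size

    def forward(self, batch):
        # batch holds consecutive point clouds; return a res_dict the
        # runner can consume (key names are illustrative).
        pc0 = batch["pc0"]
        return {"flow": [[0.0 for _ in point] for point in pc0]}

# src/models/__init__.py would then gain one line:
#   from .my_flow_model import MyFlowModel

model = MyFlowModel()
res_dict = model.forward({"pc0": [[1.0, 2.0, 3.0]], "pc1": [[1.1, 2.0, 3.0]]})
```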
### 3. Custom Loss Functions
All loss functions are defined in [`src/lossfuncs.py`](./src/lossfuncs.py). If your model requires a new loss function, you can add it to this file by following the pattern of the existing functions. SeFlow provides a self-supervised loss example that works for all feed-forward methods; feel free to check it.
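A hypothetical new loss following that add-a-function pattern might look like this. The function name and signature are assumptions, and numpy stands in for the torch tensors the real losses operate on, just to keep the sketch runnable.

```python
# Hypothetical loss in the style of "add one function to src/lossfuncs.py".
import numpy as np

def my_epe_loss(res_dict, gt_flow):
    """Mean end-point error between predicted and ground-truth flow."""
    pred = np.asarray(res_dict["flow"], dtype=np.float64)
    gt = np.asarray(gt_flow, dtype=np.float64)
    # per-point L2 distance, averaged over all points
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

loss = my_epe_loss({"flow": [[1.0, 0.0, 0.0]]}, [[0.0, 0.0, 0.0]])
```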
### 4.1 Training a Feed-Forward Model
1. Add a configuration file for your new model in the [`conf/model`](./conf/model) directory.
2. Begin training by running the following command:
```bash
python train.py model=your_model_name
```
3. **Note:** If your model's output dictionary (`res_dict`) has a different structure from the existing models, you may need to add a new pattern in the `training_step` and `validation_step` methods in the main training script.
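The dispatch this note describes can be sketched as below: the step inspects the keys of `res_dict` and routes to the matching loss computation. The key names (`flow`, `coarse_flow`), the helper name, and the toy `mse` loss are all illustrative assumptions, not the codebase's actual API.

```python
# Hedged sketch of the res_dict dispatch pattern in training_step /
# validation_step. Key names are illustrative, not the real ones.
def training_step_pattern(res_dict, gt_flow, lossfn):
    if "coarse_flow" in res_dict:
        # a model emitting an extra output needs its own branch here
        return lossfn(res_dict["coarse_flow"], gt_flow) + lossfn(res_dict["flow"], gt_flow)
    # existing models: a single "flow" prediction
    return lossfn(res_dict["flow"], gt_flow)

mse = lambda pred, gt: sum((p - g) ** 2 for p, g in zip(pred, gt))
loss = training_step_pattern({"flow": [1.0, 2.0]}, [1.0, 1.0], mse)
```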
### 4.2 Running an Optimization-Based Model
Our framework supports multi-GPU execution for optimization-based methods out of the box. You can follow the structure of existing methods like NSFP to run and evaluate your model.
-----
Once the steps above are completed, other parts of the framework, such as evaluation (`eval`) and visualization (`save`), should integrate with your new model accordingly.
README.md (23 additions, 7 deletions)
@@ -43,13 +43,13 @@ Additionally, *OpenSceneFlow* integrates following excellent works: [ICLR'24 Zer
  - [x] [FastFlow3D](https://arxiv.org/abs/2103.01306): RA-L 2021, a basic backbone model.
  - [x] [ZeroFlow](https://arxiv.org/abs/2305.10424): ICLR 2024, their pre-trained weight can be converted into our format easily through [the script](tools/zerof2ours.py).
- - [ ] [NSFP](https://arxiv.org/abs/2111.01253): NeurIPS 2021, 3x faster than the original version because of [our CUDA speed up](assets/cuda/README.md), same (slightly better) performance. Done coding, public after review.
- - [ ] [FastNSF](https://arxiv.org/abs/2304.09121): ICCV 2023. SSL optimization-based. Done coding, public after review.
+ - [x] [NSFP](https://arxiv.org/abs/2111.01253): NeurIPS 2021, 3x faster than the original version because of [our CUDA speed up](assets/cuda/README.md), same (slightly better) performance.
+ - [ ] [ICP-Flow](https://arxiv.org/abs/2402.17351): CVPR 2024. SSL optimization-based. Done coding, public after review.
  </details>
- 💡: Want to learn how to add your own network in this structure? Check [Contribute section](assets/README.md#contribute) and know more about the code. Feel free to pull request and add your bibtex [here](#cite-us).
+ 💡: Want to learn how to add your own network in this structure? Check [Contribute section](CONTRIBUTING.md#adding-a-new-method) and know more about the code. Feel free to pull request and add your bibtex [here](#cite-us).
We also provide [requirements.txt](requirements.txt); please check usage through [Dockerfile](Dockerfile).

### Docker (Recommended for Isolation)
@@ -86,11 +87,13 @@ You always can choose [Docker](https://en.wikipedia.org/wiki/Docker_(software))
  docker pull zhangkin/opensf

  # run container
- docker run -it --gpus all -v /dev/shm:/dev/shm -v /home/kin/data:/home/kin/data --name opensceneflow zhangkin/opensf /bin/zsh
+ docker run -it --net=host --gpus all -v /dev/shm:/dev/shm -v /home/kin/data:/home/kin/data --name opensf zhangkin/opensf /bin/zsh

  # and better to read your own gpu device info to compile the cuda extension again:
+ cd /home/kin/workspace/OpenSceneFlow && git pull
  cd /home/kin/workspace/OpenSceneFlow/assets/cuda/mmcv && /opt/conda/envs/opensf/bin/python ./setup.py install
  cd /home/kin/workspace/OpenSceneFlow/assets/cuda/chamfer3D && /opt/conda/envs/opensf/bin/python ./setup.py install
+ cd /home/kin/workspace/OpenSceneFlow
  mamba activate opensf
    ```
@@ -119,7 +122,7 @@ Some tips before running the code:
  * If you want to use [wandb](wandb.ai), replace all `entity="kth-rpl",` with your own entity, otherwise tensorboard will be used locally.
  * Set correct data path by passing the config, e.g. `train_data=/home/kin/data/av2/h5py/demo/train val_data=/home/kin/data/av2/h5py/demo/val`.

- To free yourself from training, you can download the pretrained weight from [HuggingFace](https://huggingface.co/kin-zhang/OpenSceneFlow); we provide the detailed `wget` command in each model section.
+ To free yourself from training, you can download the pretrained weight from [HuggingFace](https://huggingface.co/kin-zhang/OpenSceneFlow); we provide the detailed `wget` command in each model section. Optimization-based methods are training-free, so you can run them directly with [3. Evaluation](#3-evaluation) (check more in the evaluation section).

Train SSF with the leaderboard submit config. [Runtime: around 6 hours on 8x A100 GPUs.]
@@ -194,9 +199,12 @@ You can view Wandb dashboard for the training and evaluation results or upload r
  Since in training we save all hyper-parameters and model checkpoints, the only thing you need to do is specify the checkpoint path. Remember to also set the data path correctly.

    ```bash
- # it will directly print all metrics
+ # (feed-forward): load ckpt and run it; it will directly print all metrics

We provide a script to visualize the results of the model as well. You can specify the checkpoint path and the data path to visualize the results. The step is quite similar to evaluation.
Solved by `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/proj/berzelius-2023-154/users/x_qinzh/mambaforge/lib`

- ## Contribute
- If you want to contribute a new model, here are tips you can follow:
- 1. Dataloader: we believe all data can be processed to `.h5`, named per scene; inside a scene, the key of each data sample is its timestamp. Check [dataprocess/README.md](../dataprocess/README.md#process) for more details.
- 2. Model: All model files can be found [here: src/models](../src/models). You can view deflow and fastflow3d to know how to implement a new model. Don't forget to add your class to the `__init__.py` [file to import it](../src/models/__init__.py).
- 3. Loss: All loss files can be found [here: src/lossfuncs.py](../src/lossfuncs.py). There are three loss functions already inside the file; you can add a new one following the same pattern.
- 4. Training: Once you have implemented the model, you can add the model to the config file [here: conf/model](../conf/model) and train the model using the command `python train.py model=your_model_name`. One more note: if your res_dict from the model output is different, you may need to add one pattern in `def training_step` and `def validation_step`.
- All others like eval and vis will be changed according to the model you implemented as you follow the above steps.

Solved by installing the torch-cuda version: `pip install https://data.pyg.org/whl/torch-2.0.0%2Bcu118/torch_scatter-2.1.2%2Bpt20cu118-cp310-cp310-linux_x86_64.whl`