, and build xformers from source / install a pre-built wheel.

Manually modify `pad_nd` according to https://github.com/Project-MONAI/MONAI/issues/7842.
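
A sketch of both xformers routes, assuming a CUDA-enabled PyTorch is already installed (the exact commands are not given in this README; these follow common practice and may need version pins matching your setup):

```sh
# Option 1: install a pre-built wheel matching your PyTorch/CUDA build.
pip install xformers

# Option 2: build from source (requires a CUDA toolchain; ninja speeds up the build).
pip install ninja
pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers
```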
## Data Preparation
Download the datasets (MIMIC-CXR, CT-RATE, etc.) and extract them to `data/origin/<data type>/<dataset name>`, where `<data type>` is `local` for image datasets with localization annotations (bounding boxes, segmentation) and `vision-language` for VQA and radiology report datasets.
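
For concreteness, a hypothetical layout for one vision-language dataset (the archive name below is a placeholder; which `<data type>` a given dataset belongs to depends on its annotations):

```sh
# Illustrative only: the archive name is a placeholder.
mkdir -p data/origin/vision-language/MIMIC-CXR
unzip mimic-cxr.zip -d data/origin/vision-language/MIMIC-CXR
# Datasets with bounding-box/segmentation annotations go under data/origin/local/ instead:
mkdir -p "data/origin/local/<dataset name>"
```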
Then run the pre-processing script for each dataset. For instance, for MIMIC-CXR, run the script at `scripts/data/vl/MIMIC-CXR/MIMIC-CXR.py`. Once pre-processing finishes, the processed data is placed under `data/processed/vision-language/MIMIC-CXR`, where `<split>.json` lists the data items for each split.
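
As a sketch, the MIMIC-CXR step might look like this (whether the script takes CLI flags is not stated here, and the split names shown are assumptions):

```sh
# Pre-process MIMIC-CXR using the script path given above.
python scripts/data/vl/MIMIC-CXR/MIMIC-CXR.py
# The per-split JSON files should then exist, e.g. train.json / validate.json / test.json:
ls data/processed/vision-language/MIMIC-CXR
```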
## Training
[THUDM/cogvlm-chat-hf](https://huggingface.co/THUDM/cogvlm-chat-hf) is used as the base VLM.
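
If you want to fetch the base model ahead of time, one option (a convenience step, not required by this README) is the Hugging Face CLI:

```sh
# Pre-download the base VLM into the local Hugging Face cache.
huggingface-cli download THUDM/cogvlm-chat-hf
```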
The example commands for running the three-stage training of VividMed are as follows. Adapt the number of devices and the batch size to your setup. Note that we disable `torch.compile` due to a compatibility issue with our dependencies; enable it if you are ready to address it.
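
As a purely hypothetical sketch, a distributed three-stage launch could look like the following (the script name, config paths, and flags are all assumptions, not the project's actual interface):

```sh
# Hypothetical: run the three stages sequentially with torchrun.
# Adjust --nproc_per_node and the per-device batch size to your hardware.
torchrun --nproc_per_node=8 scripts/train.py --config conf/stage1.yaml
torchrun --nproc_per_node=8 scripts/train.py --config conf/stage2.yaml
torchrun --nproc_per_node=8 scripts/train.py --config conf/stage3.yaml
```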