Commit c453abf

Add LMM Probe Results reproduction guide (#79)
1 parent 0953409 commit c453abf

1 file changed

Lines changed: 49 additions & 0 deletions

README.md

@@ -115,6 +115,55 @@ We train the model on a mixed dataset comprising 740K samples from LLaVA-OneVisi
</picture>
</p>

#### Reproducing LMM Probe Results

> [!NOTE]
> **To reproduce these results, use the `llava_next` folder contents.**

<details>
<summary>Click to expand reproduction instructions</summary>

1. **Navigate to the LLaVA-NeXT directory:**
   ```bash
   cd llava_next
   ```

2. **Set up the environment:**
   ```bash
   # Using Docker (recommended)
   docker build -t ov_encoder_llava:26.01 .
   docker run -it --gpus all --ipc host --net host --privileged \
       -v "$(pwd)":/workspace/OV-Encoder-Llava \
       -w /workspace/OV-Encoder-Llava \
       ov_encoder_llava:26.01 bash
   ```

3. **Prepare training data:**
   - Follow the [training data preparation guide](llava_next/README.md#training-data-preparation) to convert your video data to codec format
   - The training dataset should include:
     - 740K samples from LLaVA-OneVision
     - 800K samples from LLaVA-Video SFT

4. **Run Stage-2 fine-tuning:**
   ```bash
   # Configure the training script with your data paths
   bash scripts/sft_ov_encoder.sh
   ```

5. **Evaluate the model:**
   ```bash
   # For video benchmarks
   bash scripts/precompute_codec_patch/preprocess_video_benchmark.sh videomme
   TASKS="videomme" bash scripts/eval/eval_ov_encoder.sh

   # For image benchmarks
   TASKS="ai2d,chartqa,docvqa_val" bash scripts/eval/eval_ov_encoder.sh
   ```

For detailed documentation on training data format, evaluation setup, and troubleshooting, refer to the [LLaVA-NeXT README](llava_next/README.md).
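Before starting a long fine-tuning or evaluation run, it may help to confirm that the scripts named in the steps above are actually present. The following is a minimal sketch rather than part of the repository: the function name `missing_repro_files` is hypothetical, the script paths are copied from the steps above, and the `Dockerfile` entry is an assumption inferred from the `docker build ... .` command being run inside `llava_next`.

```bash
# Hypothetical helper, not part of the repository: print each file that the
# steps above refer to but that is absent under the given directory.
# Returns 0 when every file is found and 1 otherwise.
missing_repro_files() {
  local root="${1:-.}"  # directory to check, defaults to the current one
  local rc=0
  local path
  # Script paths are taken from the steps above; the Dockerfile entry is an
  # assumption based on `docker build` using the current directory as context.
  for path in \
    Dockerfile \
    scripts/sft_ov_encoder.sh \
    scripts/precompute_codec_patch/preprocess_video_benchmark.sh \
    scripts/eval/eval_ov_encoder.sh
  do
    if [ ! -f "${root}/${path}" ]; then
      echo "missing: ${path}"
      rc=1
    fi
  done
  return "${rc}"
}
```

After step 1, calling `missing_repro_files .` prints one `missing: ...` line per absent file and returns a non-zero status if any are missing, so it can gate the later steps, for example `missing_repro_files . && bash scripts/sft_ov_encoder.sh`.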

</details>

## ⚡ Quick Start

> [!IMPORTANT]
