Commit 7a700de (1 parent: cdc7456)
Add LLaVA-NeXT evaluation documentation to README (#78)

* Initial plan
* Add LLaVA-NeXT evaluation section to README

Co-authored-by: anxiangsir <31175974+anxiangsir@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>

1 file changed: README.md (55 additions, 0 deletions)
## 📊 Evaluation

### LLaVA-NeXT Evaluation

To evaluate the OneVision Encoder as a vision backbone for LLaVA-NeXT multimodal models, we use the lmms-eval framework with a range of vision-language benchmarks.
#### Setup

Navigate to the `llava_next` directory and follow the setup instructions:

<details>
<summary>Click to expand LLaVA-NeXT evaluation setup</summary>

```bash
cd llava_next

# Using Docker (recommended)
docker build -t ov_encoder_llava:26.01 .
docker run -it --gpus all --ipc host --net host --privileged \
    -v "$(pwd)":/workspace/OV-Encoder-Llava \
    -w /workspace/OV-Encoder-Llava \
    ov_encoder_llava:26.01 bash
```

</details>
#### Running Evaluation

For image benchmarks (ChartQA, DocVQA, AI2D, OCRBench, etc.):

<details>
<summary>Click to expand evaluation commands</summary>

```bash
# Evaluate on image benchmarks
TASKS="ai2d,chartqa,docvqa_val" bash scripts/eval/eval_ov_encoder.sh
```

</details>
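When sweeping many image benchmarks, running the script once per task keeps logs separate and lets one failing benchmark not abort the rest. A minimal sketch of that pattern is below; it only prints the commands (a dry run you can review and pipe to `bash`), and the `ocrbench` task name is an assumption based on common lmms-eval naming, so verify task names against your lmms-eval install.

```shell
# Dry run: print one eval command per benchmark; pipe to bash to execute.
# "ocrbench" is an assumed lmms-eval task name -- verify before running.
for task in ai2d chartqa docvqa_val ocrbench; do
  echo "TASKS=\"${task}\" bash scripts/eval/eval_ov_encoder.sh"
done
```

Quoting each `TASKS` value keeps comma-free single-task runs and comma-separated multi-task runs interchangeable.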
For video benchmarks (VideoMME, MVBench, PerceptionTest, etc.), run each benchmark separately:

<details>
<summary>Click to expand video evaluation commands</summary>

```bash
# Preprocess the video benchmark (one-time setup)
bash scripts/precompute_codec_patch/preprocess_video_benchmark.sh videomme

# Run the evaluation
TASKS="videomme" bash scripts/eval/eval_ov_encoder.sh
```

</details>
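Because each video benchmark needs its own one-time preprocessing pass before evaluation, a small wrapper loop can pair the two steps per benchmark. The sketch below prints the command pairs rather than executing them; only `videomme` appears in this README, so `mvbench` here is an assumed name, and you should check which names the preprocessing script actually accepts.

```shell
# Dry run: emit the preprocess + eval command pair for each video benchmark.
# Names other than "videomme" are assumptions -- confirm them first.
for bench in videomme mvbench; do
  echo "bash scripts/precompute_codec_patch/preprocess_video_benchmark.sh ${bench}"
  echo "TASKS=\"${bench}\" bash scripts/eval/eval_ov_encoder.sh"
done
```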
For more details, refer to the [LLaVA-NeXT documentation](llava_next/README.md).

### Attentive Probe Evaluation

#### Chunk-wise Sampling Evaluation