Commit f20d6bd

Merge pull request #4 from EvolvingLMMs-Lab/copilot/update-gif-and-method-display
Add method.jpg and case GIFs to README
2 parents bc8574a + c34fa59 commit f20d6bd

1 file changed

Lines changed: 21 additions & 0 deletions

README.md

@@ -26,6 +26,12 @@
 
 OneVision Encoder is a vision encoder designed for multimodal large language models, featuring efficient video representation with sparse video input. This project provides training code, data processing tools, and model evaluation utilities.
 
+### Method Overview
+
+<p align="center">
+<img src="asset/method.jpg" alt="OneVision Encoder Method Overview" width="800" style="max-width: 100%;">
+</p>
+
 ### Input Method Comparison
 
 <table>
@@ -50,6 +56,21 @@ OneVision Encoder is a vision encoder designed for multimodal large language mod
 <img src="pages/images/global_contrastive_comparison.gif" alt="Global Contrastive Comparison" width="800" style="max-width: 100%;">
 </p>
 
+### Case Demonstrations
+
+<table>
+<tr>
+<td align="center">
+<img src="asset/case4.gif" alt="Case 4 Demonstration" width="400"><br>
+<b>Case 4</b>
+</td>
+<td align="center">
+<img src="asset/case6.gif" alt="Case 6 Demonstration" width="400"><br>
+<b>Case 6</b>
+</td>
+</tr>
+</table>
+
 ### Pre-training Tips
 
 1. **Scale-up is the final step** - Maximize model capabilities before scaling, and ensure generalization phenomena emerge
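
As a quick sanity check on the reported "21 additions & 0 deletions", the two hunk headers in this diff can be parsed and their net line change summed. The sketch below is illustrative only and is not part of the repository; `net_line_change` and `HUNK_HEADERS` are hypothetical names, and the check relies on the stated assumption that the commit has no deletions, so the net change per hunk equals its added lines.

```python
import re

# Hunk headers copied from the diff above (trailing context text omitted).
HUNK_HEADERS = [
    "@@ -26,6 +26,12 @@",
    "@@ -50,6 +56,21 @@",
]

# Unified-diff hunk header: @@ -old_start[,old_len] +new_start[,new_len] @@
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def net_line_change(header: str) -> int:
    """Return new-side length minus old-side length for one hunk header."""
    match = HUNK_RE.match(header)
    if match is None:
        raise ValueError(f"not a unified-diff hunk header: {header!r}")
    old_len = int(match.group(2) or 1)  # a length defaults to 1 when omitted
    new_len = int(match.group(4) or 1)
    return new_len - old_len


# With zero deletions, the summed net change equals the number of added lines.
total = sum(net_line_change(h) for h in HUNK_HEADERS)
print(total)  # 21
```

The first hunk contributes 6 lines (the Method Overview block) and the second 15 (the Case Demonstrations table), which matches the commit summary.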
