Merge pull request #4 from EvolvingLMMs-Lab/copilot/update-gif-and-method-display

anxiangsir · web-flow · commit f20d6bd8721e · 2025-12-24T23:30:28.000+08:00
Add method.jpg and case GIFs to README
diff --git a/README.md b/README.md
@@ -26,6 +26,12 @@
 
 OneVision Encoder is a vision encoder designed for multimodal large language models, featuring efficient video representation with sparse video input. This project provides training code, data processing tools, and model evaluation utilities.
 
+### Method Overview
+
+<p align="center">
+  <img src="asset/method.jpg" alt="OneVision Encoder Method Overview" width="800" style="max-width: 100%;">
+</p>
+
 ### Input Method Comparison
 
 <table>
@@ -50,6 +56,21 @@ OneVision Encoder is a vision encoder designed for multimodal large language mod
   <img src="pages/images/global_contrastive_comparison.gif" alt="Global Contrastive Comparison" width="800" style="max-width: 100%;">
 </p>
 
+### Case Demonstrations
+
+<table>
+  <tr>
+    <td align="center">
+      <img src="asset/case4.gif" alt="Case 4 Demonstration" width="400"><br>
+      <b>Case 4</b>
+    </td>
+    <td align="center">
+      <img src="asset/case6.gif" alt="Case 6 Demonstration" width="400"><br>
+      <b>Case 6</b>
+    </td>
+  </tr>
+</table>
+
 ### Pre-training Tips
 
 1. **Scale-up is the final step** - Maximize model capabilities before scaling, and ensure generalization phenomena emerge