You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: docs/model_card.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -25,11 +25,11 @@
25
25
26
26
## Key Features
27
27
28
+
-**Unified Vision Foundation**: A single base model for consistent understanding of images, videos, and OCR.
28
29
-**Codec-Style Patch Selection**: Instead of sampling sparse frames densely (all patches from few frames), OneVision Encoder samples dense frames sparsely (important patches from many frames).
29
30
-**3D Rotary Position Embedding**: Uses a 4:6:6 split for temporal, height, and width dimensions to capture spatiotemporal relationships.
30
31
-**Global Contrastive Learning**: Trained with a 2M concept bank for better-separated semantic clusters.
31
32
-**Native Resolution Support**: Supports native resolution input without tiling or cropping.
32
-
-**Flash Attention 2**: Efficient attention implementation for improved performance and memory efficiency.
0 commit comments