Commit d7d82a4

Update README.md (#71)
1 parent c088a85 commit d7d82a4

1 file changed, +1 -3 lines changed

README.md

Lines changed: 1 addition & 3 deletions
@@ -57,10 +57,8 @@ Coupled with global contrastive learning over a 2M-scale concept memory bank, On
 
 - **Unified Vision Foundation**: A single base model for consistent understanding of images, videos, and OCR.
 - **Codec-Style Patch Selection**: Instead of sampling sparse frames densely (all patches from few frames), OneVision Encoder samples dense frames sparsely (important patches from many frames).
-- **3D Rotary Position Embedding**: Uses a 4:6:6 split for temporal, height, and width dimensions to capture spatiotemporal relationships.
+- **3D Rotary Position && Native Resolution**: Uses a 4:6:6 split for temporal, height, and width dimensions to capture spatiotemporal relationships. Supports native resolution input without tiling or cropping.
 - **Global Contrastive Learning**: Trained with a 2M concept bank for better-separated semantic clusters.
-- **Native Resolution Support**: Supports native resolution input without tiling or cropping.
-- **Open Training Data & Pipeline**: In addition to the model and code, we will open-source the curated training dataset and the full data processing pipeline.
 
 ### Video Processing Pipeline
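
For readers unfamiliar with the "4:6:6 split" mentioned in the changed bullet, here is a minimal sketch of one way such a split could work. It assumes the ratio divides the rotary channel pairs of each attention head among the temporal, height, and width axes; the README does not confirm this. The function names, the `base` frequency, and the interleaved pair layout are hypothetical and are not taken from the OneVision Encoder code.

```python
import numpy as np

def split_rotary_pairs(head_dim, ratio=(4, 6, 6)):
    """Divide head_dim // 2 rotary pairs among (t, h, w) in the given ratio.

    Assumption: the 4:6:6 ratio applies to rotary channel pairs per head.
    """
    if head_dim % 2:
        raise ValueError("head_dim must be even")
    pairs = head_dim // 2
    total = sum(ratio)
    if pairs % total:
        raise ValueError("head_dim // 2 must be divisible by sum(ratio)")
    unit = pairs // total
    return tuple(r * unit for r in ratio)

def rope_3d_angles(t, h, w, head_dim, base=10000.0):
    """Rotation angles for one patch at grid position (t, h, w)."""
    angles = []
    for pos, n in zip((t, h, w), split_rotary_pairs(head_dim)):
        # Standard RoPE-style geometric frequency schedule within each axis.
        inv_freq = base ** (-np.arange(n) / n)
        angles.append(pos * inv_freq)
    return np.concatenate(angles)

def apply_rope(x, angles):
    """Rotate consecutive channel pairs of a 1-D float vector x by the angles."""
    x1, x2 = x[0::2], x[1::2]
    cos, sin = np.cos(angles), np.sin(angles)
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin
    out[1::2] = x1 * sin + x2 * cos
    return out
```

Under these assumptions, a 64-channel head has 32 rotary pairs, which the 4:6:6 ratio divides into 8 temporal, 12 height, and 12 width pairs. Because each axis is rotated from its own grid coordinate, the same scheme would apply to any input grid size, which may be why the commit merges the native-resolution bullet into this one.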
