<img width="1524" height="295" alt="Image" src="https://github.com/user-attachments/assets/4b9dcb00-833d-4e5e-9a93-92244c588c9e" />

The following is the reproduction script:
```python
from datasets import load_dataset

dataset = load_dataset(
    "webdataset",
    data_dir="/external/datasets/llava-next-780k-webdataset/",
    split="train",
    streaming=True,
    features=None,
)
```