Commit 26a4936

Blaizzy and claude committed
Port Qwen3-VL processor to torch-free numpy implementation
HF's Qwen3-VL image/video processors hard-require torch/torchvision. Inline the numpy port adapted from mlx-vlm (commit 1bf7742, unreleased) so mlx-embeddings can run without torch installed, including real checkpoints like mlx-community/Qwen3-VL-2B-Instruct-4bit.

Drops the AutoImageProcessor.from_pretrained(use_fast=False) path, the _UnsupportedVideoProcessor stub, and the object.__new__(Qwen3VLProcessor) trick. Processor.from_pretrained now delegates to the local torch-free Qwen3VLProcessor.from_pretrained, which reads processor_config.json / preprocessor_config.json / video_preprocessor_config.json directly and builds numpy Qwen3VLImageProcessor / Qwen3VLVideoProcessor.

Small fixes on top of the mlx-vlm source:
- Flatten list-of-list image/video batches (HF's apply_chat_template nests them that way).
- Treat explicit None in preprocessor_config.json (min_pixels/max_pixels) the same as missing; the 2B Instruct checkpoint ships nulls alongside valid size.shortest_edge/longest_edge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
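
The two small fixes described in the message can be illustrated with a minimal sketch. This is not the code from the commit: the helper names `flatten_batch` and `resolve_pixel_bound` are hypothetical, and the fallback from `min_pixels`/`max_pixels` to `size.shortest_edge`/`size.longest_edge` is an assumption based on the wording above.

```python
from typing import Any, Dict, List, Optional


def flatten_batch(items: Optional[Any]) -> List[Any]:
    """Flatten one level of nesting: [[a, b], [c]] -> [a, b, c].

    Hypothetical helper. A flat list is returned unchanged, a single
    item is wrapped in a list, and None becomes an empty list.
    """
    if items is None:
        return []
    if not isinstance(items, (list, tuple)):
        return [items]
    flat: List[Any] = []
    for item in items:
        if isinstance(item, (list, tuple)):
            flat.extend(item)  # nested per-message batch
        else:
            flat.append(item)  # already a single image/video
    return flat


def resolve_pixel_bound(
    config: Dict[str, Any], key: str, size_key: str, default: int
) -> int:
    """Read a pixel bound, treating an explicit null like a missing key.

    Hypothetical helper. Assumed lookup order: config[key], then
    config["size"][size_key], then the caller-supplied default.
    """
    value = config.get(key)  # None for both "missing" and JSON null
    if value is None:
        value = (config.get("size") or {}).get(size_key)
    return default if value is None else value
```

With a config such as `{"min_pixels": None, "size": {"shortest_edge": 65536}}`, `resolve_pixel_bound(config, "min_pixels", "shortest_edge", default)` would return the `size` value rather than failing on the null. The numbers here are illustrative, not taken from the checkpoint.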
1 parent ea6d739 commit 26a4936

2 files changed

Lines changed: 655 additions & 134 deletions
