Commit 0f7df68

add generate_vlm_hidden_for_draft_model_batch.sh for vlm (#247)
1 parent 0e0a6e8 commit 0f7df68

3 files changed

Lines changed: 59 additions & 0 deletions

docs/source/features/speculative_decoding/eagle/vlm_eagle.md

Lines changed: 9 additions & 0 deletions
@@ -88,6 +88,15 @@ bash scripts/speculative/hunyuan_ocr/generate_vlm_hidden_for_draft_model.sh
 # For Qwen3-VL series
 bash scripts/speculative/qwen3_vl/generate_vlm_hidden_for_draft_model.sh
 ```
+- During offline hidden_states collection, if an overly long pixel_values array raises OverflowError: There was an overflow with type <class 'list'>., process the data in batches instead; see:
+
+```shell
+# For HunyuanOCR
+bash scripts/speculative/hunyuan_ocr/generate_vlm_hidden_for_draft_model_batch.sh
+# For Qwen3-VL series
+bash scripts/speculative/qwen3_vl/generate_vlm_hidden_for_draft_model_batch.sh
+```
+
 > Note: generating hidden states for qwen3_vl series models requires transformers>=5.0.0,
 or a cherry-pick of https://github.com/huggingface/transformers/pull/42609;
 otherwise the captured hidden states are unusable!
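
The batch scripts in this commit read their input from files named `split_$i.jsonl`, but the commit does not show how those files are produced. A minimal sketch, assuming the training data starts as a single JSONL file and that a round-robin split is acceptable; the helper name `split_jsonl` and the example paths are hypothetical and not part of the repository:

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not repository code): distribute the
# lines of one JSONL file round-robin over N files named
# split_0.jsonl .. split_{N-1}.jsonl inside the output directory.
split_jsonl() {
    local input="$1" outdir="$2" num_splits="${3:-32}"
    mkdir -p "$outdir" || return 1
    # Line NR goes to split number (NR - 1) mod N, so sizes differ by at most one.
    awk -v n="$num_splits" -v dir="$outdir" \
        '{ print > (dir "/split_" ((NR - 1) % n) ".jsonl") }' "$input"
}

# Example call (placeholder paths):
# split_jsonl train_data/all.jsonl train_data 32
```

Round-robin assignment keeps the splits nearly equal in size, which should keep the per-split `pixel_values` lists comparably short; whether 32 splits is enough to avoid the overflow depends on the dataset.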
scripts/speculative/hunyuan_ocr/generate_vlm_hidden_for_draft_model_batch.sh

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+#!/bin/bash
+
+DATASET_PATH=train_data
+MODEL_NAME=tencent/HunyuanOCR
+TARGET_BACKEND=hf
+MODEL_MAX_LENGTH=8192
+CHAT_TEMPLATE_TYPE=hunyuan_vl
+OUTPUT_DIR=train_data_hidden_states
+
+for ((i=0; i<32; i++)); do
+    SPLIT_PATH=$DATASET_PATH/split_$i.jsonl   # per-split input; DATASET_PATH stays the base directory
+    SPLIT_OUTPUT_DIR=$OUTPUT_DIR/split_$i     # per-split output; OUTPUT_DIR stays the base directory
+    torchrun --nproc_per_node=8 \
+        tools/generate_hidden_for_draft_model.py \
+        --modal_type VLM \
+        --dataset_path "$SPLIT_PATH" \
+        --model_name "$MODEL_NAME" \
+        --target_backend "$TARGET_BACKEND" \
+        --torch_dtype bfloat16 \
+        --model_max_length "$MODEL_MAX_LENGTH" \
+        --chat_template_type "$CHAT_TEMPLATE_TYPE" \
+        --outdir "$SPLIT_OUTPUT_DIR" \
+        --target_model_type hunyuan_vl \
+        --num_proc 8
+done
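
The loop expects 32 input files, `split_0.jsonl` through `split_31.jsonl`, under the dataset directory. A missing file would only surface partway through a long multi-GPU run, so a pre-flight check can be useful. The sketch below is an assumption about how such a check might look; `check_splits` is a hypothetical helper, not something the repository provides:

```shell
#!/bin/bash
# Hypothetical pre-flight check (not repository code): report every expected
# split file that is missing and return non-zero if any is absent.
check_splits() {
    local dataset_dir="$1" num_splits="${2:-32}" missing=0 i=0
    while [ "$i" -lt "$num_splits" ]; do
        if [ ! -f "$dataset_dir/split_$i.jsonl" ]; then
            echo "missing: $dataset_dir/split_$i.jsonl" >&2
            missing=1
        fi
        i=$((i + 1))
    done
    return "$missing"
}

# Example call before starting the torchrun loop (placeholder path):
# check_splits train_data 32 || exit 1
```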
scripts/speculative/qwen3_vl/generate_vlm_hidden_for_draft_model_batch.sh

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+#!/bin/bash
+
+DATASET_PATH=train_data
+MODEL_NAME=Qwen/Qwen3-VL-4B-Instruct
+TARGET_BACKEND=hf
+MODEL_MAX_LENGTH=8192
+CHAT_TEMPLATE_TYPE=qwen3_vl
+OUTPUT_DIR=train_data_hidden_states
+
+for ((i=0; i<32; i++)); do
+    SPLIT_PATH=$DATASET_PATH/split_$i.jsonl   # per-split input; DATASET_PATH stays the base directory
+    SPLIT_OUTPUT_DIR=$OUTPUT_DIR/split_$i     # per-split output; OUTPUT_DIR stays the base directory
+    torchrun --nproc_per_node=8 \
+        tools/generate_hidden_for_draft_model.py \
+        --modal_type VLM \
+        --dataset_path "$SPLIT_PATH" \
+        --model_name "$MODEL_NAME" \
+        --target_backend "$TARGET_BACKEND" \
+        --torch_dtype bfloat16 \
+        --model_max_length "$MODEL_MAX_LENGTH" \
+        --chat_template_type "$CHAT_TEMPLATE_TYPE" \
+        --outdir "$SPLIT_OUTPUT_DIR" \
+        --target_model_type qwen3_vl \
+        --num_proc 8
+done
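
The documentation changed in this commit states that the Qwen3-VL run needs transformers>=5.0.0 (or the cherry-picked fix), so it may be worth failing fast before launching 32 torchrun jobs. A minimal sketch, assuming `sort -V` (GNU coreutils version sort) is available; `version_ge` is a hypothetical helper, and pre-release suffixes such as `.dev0` are not ordered the way pip orders them:

```shell
#!/bin/bash
# Hypothetical helper (not repository code): succeed when version $1 is
# greater than or equal to version $2 under `sort -V` ordering.
version_ge() {
    # The smaller of the two versions sorts first; $1 >= $2 exactly when
    # that smallest entry is $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage sketch (requires transformers to be importable):
# installed=$(python -c 'import transformers; print(transformers.__version__)')
# version_ge "$installed" 5.0.0 || { echo "transformers $installed is older than 5.0.0" >&2; exit 1; }
```

This check does not detect a cherry-picked fix on an older release, so an environment patched that way would need the check skipped.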