docs/source/features/speculative_decoding/eagle
@@ -88,6 +88,15 @@ bash scripts/speculative/hunyuan_ocr/generate_vlm_hidden_for_draft_model.sh
 # For Qwen3-VL series
 bash scripts/speculative/qwen3_vl/generate_vlm_hidden_for_draft_model.sh
 ```
+- During offline hidden_states collection, an overly long pixel_values array can raise `OverflowError: There was an overflow with type <class 'list'>`. If that happens, process the data in batches instead:
+
+```shell
+# For HunyuanOCR
+bash scripts/speculative/hunyuan_ocr/generate_vlm_hidden_for_draft_model_batch.sh
+# For Qwen3-VL series
+bash scripts/speculative/qwen3_vl/generate_vlm_hidden_for_draft_model_batch.sh
+```
+
 > Note: generating hidden states for qwen3_vl series models requires upgrading to transformers>=5.0.0,
 or cherry-picking https://github.com/huggingface/transformers/pull/42609;
 otherwise the captured hidden states are unusable!
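Review note: the batch scripts added in this change read their input from `train_data/split_0.jsonl` through `train_data/split_31.jsonl`, but the diff does not show how those splits are produced. A minimal sketch of one way to create them follows, assuming the full dataset is a single JSONL file. The function name `split_jsonl` and the input name `train_data.jsonl` are hypothetical and not part of this PR.

```shell
#!/bin/bash
# Hypothetical helper (not part of this PR): shard one JSONL file into N
# round-robin splits named split_0.jsonl ... split_{N-1}.jsonl, which is the
# layout the *_batch.sh scripts iterate over.
split_jsonl() {
    local input=$1 outdir=$2 num_splits=${3:-32}
    mkdir -p "$outdir"
    # Remove stale shards left over from an earlier run.
    rm -f "$outdir"/split_*.jsonl
    # Line k (0-based) goes to split_(k mod N); each line stays intact,
    # so every shard is still valid JSONL.
    awk -v n="$num_splits" -v dir="$outdir" \
        '{ print > (dir "/split_" ((NR - 1) % n) ".jsonl") }' "$input"
}

# Example call (assumed input file name):
# split_jsonl train_data.jsonl train_data 32
```

Round-robin assignment keeps the shards close to equal in size. If the input has fewer lines than `num_splits`, the trailing split files are not created, so the loop bound in the batch scripts would need to match the number of files actually written.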
+#!/bin/bash
+
+DATASET_PATH=train_data
+MODEL_NAME=tencent/HunyuanOCR
+TARGET_BACKEND=hf
+MODEL_MAX_LENGTH=8192
+CHAT_TEMPLATE_TYPE=hunyuan_vl
+OUTPUT_DIR=train_data_hidden_states
+
+# Process the 32 dataset splits one at a time so each run handles fewer samples.
+for (( i = 0; i < 32; i++ )); do
+    # Derive per-split paths from the fixed base directories. Reassigning
+    # DATASET_PATH/OUTPUT_DIR in place would nest the paths on every iteration.
+    SPLIT_DATASET_PATH="${DATASET_PATH}/split_${i}.jsonl"
+    SPLIT_OUTPUT_DIR="${OUTPUT_DIR}/split_${i}"
+    torchrun --nproc_per_node=8 \
+        tools/generate_hidden_for_draft_model.py \
+        --modal_type VLM \
+        --dataset_path "$SPLIT_DATASET_PATH" \
+        --model_name "$MODEL_NAME" \
+        --target_backend "$TARGET_BACKEND" \
+        --torch_dtype bfloat16 \
+        --model_max_length "$MODEL_MAX_LENGTH" \
+        --chat_template_type "$CHAT_TEMPLATE_TYPE" \
+        --outdir "$SPLIT_OUTPUT_DIR" \
+        --target_model_type hunyuan_vl \
+        --num_proc 8
+done
+#!/bin/bash
+
+DATASET_PATH=train_data
+MODEL_NAME=Qwen/Qwen3-VL-4B-Instruct
+TARGET_BACKEND=hf
+MODEL_MAX_LENGTH=8192
+CHAT_TEMPLATE_TYPE=qwen3_vl
+OUTPUT_DIR=train_data_hidden_states
+
+# Process the 32 dataset splits one at a time so each run handles fewer samples.
+for (( i = 0; i < 32; i++ )); do
+    # Derive per-split paths from the fixed base directories. Reassigning
+    # DATASET_PATH/OUTPUT_DIR in place would nest the paths on every iteration.
+    SPLIT_DATASET_PATH="${DATASET_PATH}/split_${i}.jsonl"
+    SPLIT_OUTPUT_DIR="${OUTPUT_DIR}/split_${i}"
+    torchrun --nproc_per_node=8 \
+        tools/generate_hidden_for_draft_model.py \
+        --modal_type VLM \
+        --dataset_path "$SPLIT_DATASET_PATH" \
+        --model_name "$MODEL_NAME" \
+        --target_backend "$TARGET_BACKEND" \
+        --torch_dtype bfloat16 \
+        --model_max_length "$MODEL_MAX_LENGTH" \
+        --chat_template_type "$CHAT_TEMPLATE_TYPE" \
+        --outdir "$SPLIT_OUTPUT_DIR" \
+        --target_model_type qwen3_vl \
+        --num_proc 8
+done