@@ -42,15 +42,20 @@ We have pushed the processed train set to huggingface:
### 3. Training

1)
+
```bash
BiLLM_START_INDEX=31 WANDB_MODE=disabled CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 --master_port=1234 train.py \
--train_name_or_path SeanLee97/all_nli_angle_format_b \
--save_dir ckpts/bellm-llama-7b-nli \
- --model_name NousResearch/Llama-2-7b-hf \
- --ibn_w 1.0 --cosine_w 0.0 --angle_w 0.0 --learning_rate 5e-4 --maxlen 60 \
- --is_llm 1 --apply_lora 1 --lora_r 32 --lora_alpha 32 --lora_dropout 0.1 \
+ --model_name NousResearch/Llama-2-7b-chat-hf \
+ --prompt_template ' The representative word for sentence {text} is:"' \
+ --pooling_strategy avg \
+ --ibn_w 20.0 --cosine_w 0.0 --angle_w 1.0 --learning_rate 2e-4 --maxlen 60 \
+ --apply_lora 1 --lora_r 64 --lora_alpha 128 --lora_dropout 0.1 \
+ --is_llm 1 --apply_billm 1 --billm_model_class LlamaForCausalLM \
--push_to_hub 0 \
- --save_steps 200 --batch_size 256 --seed 42 --load_kbit 4 --gradient_accumulation_steps 4 --epochs 1 --fp16 1
+ --logging_steps 5 --save_steps 50 --warmup_steps 80 --batch_size 256 --seed 42 --load_kbit 4 \
+ --gradient_accumulation_steps 32 --epochs 3 --fp16 1
```

If you want to push the model to HuggingFace automatically, you can add the following extra arguments:
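As an illustration only (the exact flag names are an assumption modeled on AnglE-style training scripts and Hugging Face conventions; the README lists the real arguments, and `python train.py --help` will confirm what this `train.py` accepts):

```shell
# Hypothetical hub-push arguments -- replace the model id with your own repository.
--push_to_hub 1 \
--hub_model_id your-username/bellm-llama-7b-nli \
--hub_private_repo 1
```

Pushing from the training script requires being logged in to the Hub (e.g. via `huggingface-cli login`) on the training machine.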
@@ -72,7 +77,7 @@ BiLLM_START_INDEX=31 WANDB_MODE=disabled CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun -
--ibn_w 1.0 --cosine_w 0.0 --angle_w 0.0 --learning_rate 2e-4 --maxlen 60 \
--is_llm 1 --apply_lora 1 --lora_r 32 --lora_alpha 32 --lora_dropout 0.1 \
--push_to_hub 0 \
- --save_steps 200 --batch_size 256 --seed 42 --load_kbit 4 --gradient_accumulation_steps 64 --epochs 1 --fp16 1
+ --save_steps 200 --batch_size 256 --seed 42 --load_kbit 4 --gradient_accumulation_steps 32 --epochs 3 --fp16 1
```
