Higher memory usage on GPU0 than other GPUs when finetuning univla-7b on Libero datasets #79

Description

@PuzhenYuan

Thanks for your brilliant open-source work!

I have a question about fine-tuning univla-7b on the Libero datasets.
I found that GPU0 uses more memory than the other GPUs. Could you explain why, or give me some advice on fixing this?

My CLI command is as follows:

torchrun --standalone --nnodes 1 --nproc-per-node 8 finetune_libero.py \
    --vla_path /path/to/checkpoints/univla \
    --lam_path /path/to/checkpoints/lam-stage-2.ckpt \
    --data_root_dir /path/to/dataset/libero/modified_libero_rlds \
    --dataset_name libero_spatial_no_noops \
    --run_root_dir /path/to/UniVLA/runs \
    --adapter_tmp_dir /path/to/runs/adapter_tmp \
    --batch_size 8 \
    --max_steps 30005 \
    --save_steps 5000 \
    --learning_rate 3.5e-4 \
    --grad_accumulation_steps 1 \
    --image_aug True \
    --shuffle_buffer_size 10000 \
    --save_latest_checkpoint_only False \
    --run_id_note libero_spatial
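
For anyone trying to reproduce this, the imbalance can be quantified while the command above is running. The following is only a minimal sketch and not part of UniVLA: it assumes `nvidia-smi` is available on PATH, and the function names (`parse_memory_used`, `excess_over_median`, `query_nvidia_smi`) are hypothetical helpers introduced here for illustration.

```python
import statistics
import subprocess


def parse_memory_used(csv_text):
    """Parse 'index, memory.used' CSV rows into {gpu_index: used MiB}."""
    usage = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue  # skip blank lines
        index, used = (field.strip() for field in line.split(","))
        usage[int(index)] = int(used)
    return usage


def excess_over_median(usage):
    """Return {gpu_index: MiB used above the median across all GPUs}."""
    median = statistics.median(usage.values())
    return {gpu: used - median for gpu, used in usage.items()}


def query_nvidia_smi():
    """Ask nvidia-smi for per-GPU used memory in MiB (assumes it is on PATH)."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=index,memory.used",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_memory_used(result.stdout)
```

Calling `excess_over_median(query_nvidia_smi())` during training returns, for each GPU, how many MiB it uses above the median GPU. A clearly positive value for index 0 and values near zero elsewhere would confirm the behaviour described here. This only measures the gap; it does not explain its cause.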
