Feature/vlm bbox finetuning #949

Open
gokul-tqagi wants to merge 8 commits into Physical-Intelligence:main from TorqueAGI-AIBrain:feature/vlm-bbox-finetuning
@gokul-tqagi
No description provided.

gokul-tqagi and others added 8 commits May 5, 2026 15:19
Self-contained experiment setup under experiments/ for fine-tuning
pi0.5 on the lipbalm pick task using only the right arm from
bimanual ALOHA WidowX recordings.

- Data processing script extracts right arm (indices 7:14) and keeps
  cam_high + cam_right_wrist from 3 raw bimanual datasets
- Custom single-arm transforms following the DROID 2-camera pattern
- YAML config for training hyperparams and remote server targeting
- train.sh / eval.sh sync code+data to SSH servers and launch jobs
- No existing openpi source files are modified
- Switch to JAX backend for LoRA fine-tuning (PyTorch trainer
  lacks freeze_filter support)
- Add convert_to_lerobot_v21.py for v3.0 -> v2.1 format conversion
  (required by OpenPI's pinned LeRobot 0.1.0)
- Add validate_dataset.py to verify frame counts, action/state
  shapes, image integrity, and value match against source data
- Patch LeRobot get_safe_version to skip Hub check for local datasets
- Fix HF_LEROBOT_HOME env var (was LEROBOT_HOME, deprecated)
- batch_size=16 fits L40S 46GB with LoRA + JAX
- Evaluate checkpoint against dataset episodes with action
  prediction MAE per joint
- Patch LeRobot Hub check for local datasets (same as run_train)
- Support configurable episode/frame sampling
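As a rough illustration of two steps named in the bullets above (keeping only the right arm at indices 7:14, and scoring predictions with per-joint MAE), here is a minimal sketch in plain Python. The function names `extract_right_arm` and `per_joint_mae` are hypothetical and do not come from the PR, and the 14-dimensional bimanual vector length is an assumption inferred from the 7:14 slice. The actual scripts presumably operate on arrays rather than lists.

```python
from typing import List, Sequence

# From the commit message: the right arm occupies indices 7:14.
RIGHT_ARM = slice(7, 14)
# Assumption: each bimanual state/action vector has 14 entries (7 per arm).
BIMANUAL_DIM = 14


def extract_right_arm(frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Keep only the right-arm entries of each bimanual vector."""
    right = []
    for i, frame in enumerate(frames):
        if len(frame) != BIMANUAL_DIM:
            raise ValueError(f"frame {i} has {len(frame)} dims, expected {BIMANUAL_DIM}")
        right.append(list(frame[RIGHT_ARM]))
    return right


def per_joint_mae(
    predicted: Sequence[Sequence[float]], target: Sequence[Sequence[float]]
) -> List[float]:
    """Mean absolute error per joint, averaged over frames."""
    if not target or len(predicted) != len(target):
        raise ValueError("predicted and target must be non-empty and equally long")
    num_joints = len(target[0])
    totals = [0.0] * num_joints
    for pred_row, target_row in zip(predicted, target):
        for j in range(num_joints):
            totals[j] += abs(pred_row[j] - target_row[j])
    return [total / len(target) for total in totals]
```

Reporting one error value per joint, rather than a single scalar, makes it easier to see whether a checkpoint is off on one specific joint (for example the gripper) or uniformly across the arm.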
…ipeline

- Move run_train.py → train/, run_eval.py + export_results.py → eval/
- Fix all import paths (sys.path, __file__-relative) for new locations
- Update train.sh/eval.sh script paths and rsync excludes
- Add marker_pick.yaml config (74 episodes, 4 datasets)
- Generalize config.py: remove lipbalm defaults, derive prompt/name from YAML
- Fix convert_to_lerobot_v21: add convert_and_merge() for multi-dataset conversion
- Make process_single_arm.py --datasets required (no hardcoded defaults)
- Add .gitignore for results/, processed data, logs
- Rewrite README as generic OpenPI LoRA fine-tuning workflow guide
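The `__file__`-relative import fix mentioned in the list above could look roughly like the sketch below. The helper name `add_ancestor_to_sys_path` and the example layout `experiments/train/run_train.py` are assumptions for illustration; the PR text does not show the actual directory structure or code.

```python
import sys
from pathlib import Path


def add_ancestor_to_sys_path(script_file: str, levels_up: int = 1) -> Path:
    """Put an ancestor directory of script_file at the front of sys.path.

    levels_up=0 is the directory containing the script, levels_up=1 is its
    parent, and so on. Returns the directory that was selected.
    """
    root = Path(script_file).resolve().parents[levels_up]
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return root


# Typical use at the top of a script that was moved one directory deeper
# (hypothetical layout: experiments/train/run_train.py):
#     add_ancestor_to_sys_path(__file__, levels_up=1)
```

Anchoring on `__file__` rather than the current working directory means the imports keep working regardless of where the script is launched from, which matters when the same script is started both locally and over SSH.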
Pi0.5 LoRA fine-tuning pipeline for single-arm ALOHA tasks
  `<loc_NNNN>` bbox tokens (lost during π0.5 Stage-2) while preserving its
  action-generation quality, via a joint α·CE(loc tokens) + β·MSE(flow-matching)
  objective with the action expert kept frozen as an MSE anchor.
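To make the joint objective above concrete, here is a minimal plain-Python sketch of α·CE + β·MSE. Everything beyond that formula is an assumption: the function names, the default weights, and the input shapes are hypothetical, and the sketch does not show how the action expert is frozen or how the flow-matching targets are built. The PR trains with JAX; plain Python is used here only to keep the arithmetic visible.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[Sequence[float]], targets: Sequence[int]) -> float:
    """Mean token-level cross-entropy using a numerically stable log-sum-exp."""
    total = 0.0
    for row, target_id in zip(logits, targets):
        row_max = max(row)
        log_z = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_z - row[target_id]
    return total / len(targets)


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equally long vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def joint_loss(
    loc_logits: Sequence[Sequence[float]],
    loc_targets: Sequence[int],
    velocity_pred: Sequence[float],
    velocity_target: Sequence[float],
    alpha: float = 1.0,  # placeholder weight, not a value from the PR
    beta: float = 1.0,   # placeholder weight, not a value from the PR
) -> float:
    """alpha * CE over loc-token positions + beta * MSE on the flow-matching output."""
    ce_term = cross_entropy(loc_logits, loc_targets)
    mse_term = mse(velocity_pred, velocity_target)
    return alpha * ce_term + beta * mse_term
```

The two weights trade off how strongly the loc-token term is pushed against how tightly the action output is held in place; the values used in the PR are not stated in the text above.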
@jimmyt857 jimmyt857 removed their request for review May 22, 2026 03:52
2 participants