Feature/vlm bbox finetuning #949
Open
gokul-tqagi wants to merge 8 commits into
Self-contained experiment setup under experiments/ for fine-tuning pi0.5 on the lipbalm pick task using only the right arm from bimanual ALOHA WidowX recordings.
- Data processing script extracts right arm (indices 7:14) and keeps cam_high + cam_right_wrist from 3 raw bimanual datasets
- Custom single-arm transforms following the DROID 2-camera pattern
- YAML config for training hyperparams and remote server targeting
- train.sh / eval.sh sync code+data to SSH servers and launch jobs
- No existing openpi source files are modified
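The right-arm extraction described above can be sketched roughly as follows. This is a minimal illustration, not the PR's processing script: the `extract_right_arm` helper and the frame layout (a dict with `state`, `action`, and `images` keys) are hypothetical. Only the `7:14` slice and the two camera names come from the commit message.

```python
# Hypothetical sketch: the frame layout and helper name are assumptions.
# Only the 7:14 slice and the camera names come from the commit message.
RIGHT_ARM = slice(7, 14)
KEEP_CAMERAS = ("cam_high", "cam_right_wrist")


def extract_right_arm(frame: dict) -> dict:
    """Reduce one bimanual frame to right-arm state/action and two cameras."""
    return {
        "state": frame["state"][RIGHT_ARM],
        "action": frame["action"][RIGHT_ARM],
        "images": {name: frame["images"][name] for name in KEEP_CAMERAS},
    }
```

Applied to a 14-dimensional bimanual vector, this keeps the last seven entries and drops every camera that is not in `KEEP_CAMERAS`.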
- Switch to JAX backend for LoRA fine-tuning (PyTorch trainer lacks freeze_filter support)
- Add convert_to_lerobot_v21.py for v3.0 -> v2.1 format conversion (required by OpenPI's pinned LeRobot 0.1.0)
- Add validate_dataset.py to verify frame counts, action/state shapes, image integrity, and value match against source data
- Patch LeRobot get_safe_version to skip Hub check for local datasets
- Fix HF_LEROBOT_HOME env var (was LEROBOT_HOME, deprecated)
- batch_size=16 fits L40S 46GB with LoRA + JAX
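The kind of check validate_dataset.py is described as performing might look like the sketch below. It is an assumption-laden illustration, not the script itself: the `validate_episode` name, the list-of-dicts episode layout, and the tolerance are hypothetical. It covers frame counts, action/state shapes, and value match, and leaves out the image integrity check.

```python
import math


def validate_episode(source, converted, dim=7, tol=1e-6):
    """Compare a converted episode to its source; return a list of problems.

    Hypothetical layout: each episode is a list of frames, and each frame is
    a dict with "state" and "action" lists that are already single-arm.
    """
    problems = []
    if len(source) != len(converted):
        problems.append(f"frame count mismatch: {len(source)} vs {len(converted)}")
    for i, (src, out) in enumerate(zip(source, converted)):
        for key in ("state", "action"):
            if len(out[key]) != dim:
                problems.append(f"frame {i}: {key} has {len(out[key])} dims, expected {dim}")
            elif any(not math.isclose(a, b, abs_tol=tol) for a, b in zip(src[key], out[key])):
                problems.append(f"frame {i}: {key} values differ from source")
    return problems
```

An empty return value means the episode passed every check that this sketch implements.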
- Evaluate checkpoint against dataset episodes with action prediction MAE per joint
- Patch LeRobot Hub check for local datasets (same as run_train)
- Support configurable episode/frame sampling
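The per-joint MAE metric mentioned above can be illustrated with a short, dependency-free sketch. The `per_joint_mae` function and its list-of-lists input layout are assumptions for illustration; the PR's eval script may aggregate over action chunks or sampled frames differently.

```python
def per_joint_mae(predicted, target):
    """Mean absolute error per joint.

    Both arguments are lists of frames, and each frame is a list holding one
    value per joint (a hypothetical layout chosen for this sketch).
    """
    if not target or len(predicted) != len(target):
        raise ValueError("predicted and target must be non-empty and equally long")
    n_joints = len(target[0])
    totals = [0.0] * n_joints
    for pred_frame, true_frame in zip(predicted, target):
        for j in range(n_joints):
            totals[j] += abs(pred_frame[j] - true_frame[j])
    return [total / len(target) for total in totals]
```

Reporting the error per joint rather than as a single scalar makes it easier to see whether, say, the gripper dimension is much worse than the arm joints.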
…ipeline
- Move run_train.py → train/, run_eval.py + export_results.py → eval/
- Fix all import paths (sys.path, __file__-relative) for new locations
- Update train.sh/eval.sh script paths and rsync excludes
- Add marker_pick.yaml config (74 episodes, 4 datasets)
- Generalize config.py: remove lipbalm defaults, derive prompt/name from YAML
- Fix convert_to_lerobot_v21: add convert_and_merge() for multi-dataset conversion
- Make process_single_arm.py --datasets required (no hardcoded defaults)
- Add .gitignore for results/, processed data, logs
- Rewrite README as generic OpenPI LoRA fine-tuning workflow guide
Pi0.5 LoRA fine-tuning pipeline for single-arm ALOHA tasks
`<loc_NNNN>` bbox tokens (lost during π0.5 Stage-2) while preserving its action-generation quality, via a joint α·CE(loc tokens) + β·MSE(flow-matching) objective with the action expert kept frozen as an MSE anchor.
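Written out, the joint objective named above would take roughly the following form. This is a sketch under stated assumptions: the symbols $T_{\text{loc}}$ (positions of the `<loc_NNNN>` tokens), $o$ (observation and prompt), $v_\theta$ (predicted velocity), and $u_t$ (flow-matching target velocity) are introduced here for illustration, and the exact target and time convention used in the PR are not given in the description.

```latex
\mathcal{L}(\theta)
  = \alpha \, \mathcal{L}_{\text{CE}}(\theta)
  + \beta \, \mathcal{L}_{\text{MSE}}(\theta),
\qquad
\mathcal{L}_{\text{CE}}(\theta)
  = -\frac{1}{|T_{\text{loc}}|} \sum_{i \in T_{\text{loc}}}
    \log p_\theta\!\left(y_i \mid y_{<i},\, o\right),
\qquad
\mathcal{L}_{\text{MSE}}(\theta)
  = \mathbb{E}_{t,\,x_t}
    \left\| v_\theta(x_t, t, o) - u_t \right\|^2
```

With the action expert frozen, the $\beta$ term acts as the anchor the description mentions: gradients from it reach only the shared, trainable parameters, which penalizes changes there that would degrade the action predictions.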