🤖 Robotic Ultrasound Training with GR00T-N1#
This repository provides a complete workflow for training GR00T-N1 models for robotic ultrasound applications. NVIDIA Isaac GR00T N1 is the world's first open foundation model for generalized humanoid robot reasoning and skills. This cross-embodiment model takes multimodal input, including language and images, to perform manipulation tasks in diverse environments.
📋 Table of Contents#
- Overview
- Installation
- Data Collection
- Data Conversion
- Running Training
- Understanding Outputs
- References
🔍 Overview#
This workflow enables you to:
- Collect the robot trajectories and camera image data while the robot is performing a ultrasound scan
- Convert the collected HDF5 data to LeRobot format
- Fine-tune a GR00T-N1 model
- Deploy the trained model for inference
🛠️ Installation#
First, install the necessary environment and dependencies using our provided script:
# Install environment with dependencies
./tools/env_setup_robot_us.sh --policy gr00tn1
This script:
- Clones Isaac-GR00T repository
- Installs LeRobot and other dependencies
📊 Data Collection#
To train a GR00T-N1 model, you\'ll need to collect robotic ultrasound data. We provide a state machine implementation in the simulation environment that can generate training episodes that emulate a liver ultrasound scan.
See the simulation README for more information on how to collect data.
🔄 Data Conversion#
GR00T-N1 uses the LeRobot data format for training. To facilitate this, we provide a script that converts your HDF5 data into the required format. The script is located at:
workflows/robotic_ultrasound/scripts/training/convert_hdf5_to_lerobot.py
To run the conversion, navigate to the training folder (located one level above this GR00T-N1-specific README), and execute the following command:
python convert_hdf5_to_lerobot.py /path/to/your/hdf5/data --feature_builder_type gr00tn1
Replace /path/to/your/hdf5/data with the actual path to your dataset.
Key Arguments & Differences for GR00T-N1:
data_dir: Path to the directory containing HDF5 files. (default:path-to-i4h-workflows/workflows/robotic_ultrasound/scripts/simulation/data/hdf5/date-task-name)--feature_builder_type: Crucial for GR00T-N1. Set this togr00tn1. This ensures:- The correct feature keys are used (e.g.,
actionfor actions,observation.images.roomfor room camera, etc., as defined inGR00TN1FeatureDict). - The
modality.jsonfile specific to GR00T-N1 is copied to the output dataset\'smeta/directory. This file defines the expected data modalities and their properties for the GR00T-N1 model. - Default
include_videoforGR00TN1FeatureDictisTrue. --repo_id: Name for your dataset (default: "i4h/robotic_ultrasound")--task_prompt: Text description of the task (default: "Perform a liver ultrasound.")--image_shape: Shape of the image data as a comma-separated string, e.g., \'224,224,3\' (default: "224,224,3").--include_depth,--include_seg,--include_video: Flags to include respective data types. Note thatGR00TN1FeatureDictdefaultsinclude_videotoTrue.state_shapeandactions_shape: These are no longer direct CLI arguments. They default to(7,)and(6,)respectively within theGR00TN1FeatureDictclass but can be modified in the script if needed.
The script will:
- Create a LeRobot dataset with the specified name
- Process each HDF5 file and extract observations, states, and actions using the GR00T-N1 specific feature mapping.
- Save the data in LeRobot format.
- Copy
modality.jsonto~/.cache/huggingface/lerobot/<repo_id>/meta/modality.json. - Consolidate the dataset for training.
The converted dataset will be saved in ~/.cache/huggingface/lerobot/<repo_id>.
🚀 Running Training#
To start training with a GR00T-N1 configuration:
Please move to the current gr00t_n1 folder and execute:
python train.py --data_config single_panda_us --dataset_path `data_path`
Arguments:
--data_config: Data configuration name to use for traing(e.g.,single_panda_us)--dataset_path: Path to the training data--output_dir: Path to save the checkpoint
📊 Understanding Outputs#
Checkpoints#
Training checkpoints are typically saved in output_dir directory with a structure like:
output_dir/
├── config.json
├── experiment_cfg/
├── checkpoint-500/
├── checkpoint-1000/
└── ...
Each numbered directory contains a checkpoint saved at that training step.
References#
- NVIDIA Isaac GR00T N1: For more information on the GR00T N1 foundation model, refer to Isaac-GR00T.
- LeRobot Data Format: The data conversion process utilizes the LeRobot format. For details on LeRobot, see the LeRobot GitHub repository.