Data Recording & GR00T Post-Training#

Record demonstration data from an RL specialist policy and use it to fine-tune a GR00T vision-language-action model.

Overview#

Step 1: Record

Run RL policy to collect demonstrations (HDF5).

AGILE

Step 2: Convert

Transform HDF5 to LeRobot-compatible format.

AGILE

Step 3: Fine-tune

Post-train GR00T on the converted dataset.

Isaac-GR00T

Step 4: Evaluate

Closed-loop inference in simulation.

AGILE

Prerequisites#

Trained RL policy checkpoint (from scripts/train.py)
Isaac Lab environment with G1-PickPlace-Tracking-v0-Record task
Isaac-GR00T-N1.5 repository for fine-tuning

Step 1: Record Demonstration Data#

Setup RL environment with appropriate observation group for required data collection. As an example, G1-PickPlace-Tracking-v0-Record environment captures RGB images, proprioceptive states, and actions from an RL policy. Observations are defined in RecordObservationsCfg.

python scripts/record.py \
    --task G1-PickPlace-Tracking-v0-Record \
    --checkpoint <path/to/rl/checkpoint.pt> \
    --record \
    --record_output data/recording \
    --num_envs 100 \
    --num_steps 300 \
    --enable_cameras

Argument	Description
`--task`	Environment with camera sensor (`G1-PickPlace-Tracking-v0-Record`)
`--checkpoint`	Path to trained RL policy
`--record`	Enable HDF5 recording
`--record_output`	Output directory (creates `data.h5` inside)
`--num_envs`	Parallel environments (more = faster collection)
`--num_steps`	Total simulation steps to run

Output: data/recording/data.h5 containing episodes with observations and actions.

Inspect Recorded Data#

Use the provided notebook to visualize trajectories and verify data quality:

jupyter notebook scripts/data_recording/inspect_data.ipynb

File	Description
`data_recorder.py`	`MultiEnvDataRecorder` API
`inspect_data.ipynb`	Visualization notebook
`convert_to_gr00t.py`	HDF5 to LeRobot converter

Step 2: Convert to GR00T Format#

Convert the HDF5 dataset to LeRobot-compatible format for GR00T training:

python scripts/data_recording/convert_to_gr00t.py \
    -i data/recording/data.h5 \
    -o data/gr00t \
    --task "pick up the object"

This generates:

meta/ – Dataset metadata (episodes.jsonl, info.json, tasks.jsonl)
videos/ – MP4 videos from RGB observations
data/ – Parquet files with states and actions

Step 3: Configure Data Pipeline#

Once the GR00T dataset is generated, configure the data modalities for GR00T model training. The modality configuration specifies which observation channels (state, video, actions) are used and how they map to the training pipeline. You can tune this to find the best-performing combination for your task.

For the pick-place task, edit the modality.json file in the GR00T dataset directory:

{
  "state": {
    "base_ang_vel": { "start": 0, "end": 3 },
    "joint_pos": { "start": 6, "end": 23 },
    "joint_vel": { "start": 23, "end": 38 }
  },
  "action": {
    "action": { "start": 0, "end": 21 }
  },
  "video": {
    "image": { "original_key": "observation.images.image" }
  },
  "annotation": {
    "human.action.task_description": {}
  }
}

Next, add a corresponding data config class to gr00t/experiment/data_config.py in the Isaac-GR00T repository. This class defines the data keys, transform pipeline, and observation/action horizons used during training.

Register in DATA_CONFIG_MAP:

DATA_CONFIG_MAP["g1_pp_tracking_sim"] = G1PPTrackingSimDataConfig()

Step 4: Fine-tune GR00T#

Run fine-tuning in the Isaac-GR00T-N1.5 repository:

python scripts/gr00t_finetune.py \
    --dataset-path <path/to/gr00t/data> \
    --data-config g1_pp_tracking_sim \
    --video-backend torchvision_av \
    --num-gpus 1 \
    --max-steps 10000 \
    --output-dir outputs/gr00t-pp-tracking

Tip

Increase --max-steps and add more data for better performance.

Step 5: Closed-Loop Evaluation#

Launch GR00T N1.5 Inference Server#

In the Isaac-GR00T-N1.5 repository:

python scripts/inference_service.py \
    --server \
    --model_path <path/to/finetuned/checkpoint> \
    --embodiment-tag new_embodiment \
    --data-config g1_pp_tracking_sim \
    --denoising-steps 4 \
    --port 6666

Run Simulation Client#

In this repository, connect to the server and run closed-loop evaluation:

python scripts/record.py \
    --task G1-PickPlace-Tracking-v0-GR00T-Inference \
    --gr00t \
    --gr00t_host localhost \
    --gr00t_port 6666 \
    --gr00t_task_description "Pick up object" \
    --gr00t_action_horizon 4 \
    --num_envs 1 \
    --enable_cameras

Argument	Description
`--gr00t`	Use GR00T policy instead of RL checkpoint
`--gr00t_host`	Inference server hostname
`--gr00t_port`	Inference server port
`--gr00t_task_description`	Language instruction for the task
`--gr00t_action_horizon`	Steps to execute per action chunk

Note

Use --num_envs 1 for easier debugging. Increase for throughput testing.