Accurate, stable, and powerful humanoid soccer shooting via motion-guided curriculum reinforcement learning.
RoboNaldo trains Unitree G1 soccer-shooting policies in Isaac Lab.
This repository contains the simulation training code. For real-world hardware deployment, export the trained policy to ONNX and use it with the deployment repository (RoboNaldo_Deploy).
- source/whole_body_tracking/: Isaac Lab extension, G1 robot config, tracking task, rewards, observations, commands, and PPO config.
- source/whole_body_tracking/whole_body_tracking/tasks/tracking/yaml/: public right-foot task presets.
- scripts/rsl_rl/: direct train, play, and evaluation entrypoints.
- docs/: detailed setup, task-parameter, and reward references.
2026-06: Training and deployment code release.
Install Isaac Sim and Isaac Lab first. This codebase follows the Isaac Lab extension layout and is intended for the Isaac Lab Python environment.
Recommended baseline:
| Dependency | Version |
|---|---|
| Isaac Sim | 4.5.0 |
| Isaac Lab | 2.1.0 |
| Python | 3.10 |
Install the upstream BeyondMimic repository first in the same Isaac Lab Python environment:
git clone https://github.com/HybridRobotics/whole_body_tracking.git
cd whole_body_tracking
python -m pip install -e .
Then return to this repository before installing the RoboNaldo extension.
From the repository root:
python -m pip install -e source/whole_body_tracking
The Unitree G1 description is not committed to this repository. Before creating the environment, download it from the same asset source used by BeyondMimic:
mkdir -p source/whole_body_tracking/whole_body_tracking/assets
curl -L -o unitree_description.tar.gz https://storage.googleapis.com/qiayuanl_robot_descriptions/unitree_description.tar.gz
tar -xzf unitree_description.tar.gz -C source/whole_body_tracking/whole_body_tracking/assets/
rm unitree_description.tar.gz
test -f source/whole_body_tracking/whole_body_tracking/assets/unitree_description/urdf/g1/main.urdf
The code resolves this path through whole_body_tracking/assets.py, where
ASSET_DIR points to source/whole_body_tracking/whole_body_tracking/assets.
Do not add an assets/__init__.py; unlike the upstream BeyondMimic setup, this
repository already provides the Python module that owns ASSET_DIR.
The downloaded source/.../assets/ directory is ignored by .gitignore and
should not be committed. The soccer ball is created with Isaac Lab native
SphereCfg, so no separate ball mesh is required.
Training requires one retargeted kick motion in RoboNaldo NPZ format. This repository includes the open-source right-foot kick reference CSV retargeted by GVHMR+GMR:
motions/right_kick_reference.csv
You can replace it with your own motion. The included CSV has 612 frames at
50 Hz. Each row follows the csv_to_npz.py input layout: root position, root
quaternion in xyzw, then 29 Unitree G1 joint positions.
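As a rough illustration of that row layout, the sketch below splits one CSV row into its three parts. It assumes a 3-value root position and a 4-value quaternion, which gives 3 + 4 + 29 = 36 columns per row. The names `split_motion_row` and `clip_duration_s` are hypothetical helpers, not part of csv_to_npz.py, and the reordering to wxyz is shown only because Isaac Lab generally uses that convention; whether the converter reorders in the same way is an assumption.

```python
from typing import List, Sequence, Tuple

ROOT_POS_DIM = 3   # x, y, z (assumed)
ROOT_QUAT_DIM = 4  # x, y, z, w, as stored in the CSV
NUM_JOINTS = 29    # Unitree G1 joint positions
ROW_DIM = ROOT_POS_DIM + ROOT_QUAT_DIM + NUM_JOINTS  # 36 columns


def split_motion_row(
    row: Sequence[float],
) -> Tuple[List[float], List[float], List[float]]:
    """Split one CSV row into root position, wxyz quaternion, and joints."""
    if len(row) != ROW_DIM:
        raise ValueError(f"expected {ROW_DIM} columns, got {len(row)}")
    root_pos = list(row[:ROOT_POS_DIM])
    x, y, z, w = row[ROOT_POS_DIM:ROOT_POS_DIM + ROOT_QUAT_DIM]
    quat_wxyz = [w, x, y, z]  # reorder xyzw -> wxyz
    joint_pos = list(row[ROOT_POS_DIM + ROOT_QUAT_DIM:])
    return root_pos, quat_wxyz, joint_pos


def clip_duration_s(num_frames: int, fps: float) -> float:
    """Clip length in seconds for a fixed-rate motion."""
    return num_frames / fps
```

With these definitions, the included reference clip (612 frames at 50 Hz) is about 12.24 seconds long.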
| Item | How it is used |
|---|---|
| Reference CSV | committed at motions/right_kick_reference.csv |
| Converted motion NPZ | local file under motions/ or WandB registry artifact |
| Saved task preset | archived as expanded params/task_params.yaml in each run |
| Checkpoint | local model_*.pt or WandB run path via --wandb_path |
Convert the included reference CSV into NPZ:
python scripts/csv_to_npz.py \
--input_file motions/right_kick_reference.csv \
--input_fps 50 \
--output_name right_kick \
--headless
Optional: upload the converted NPZ to a W&B registry:
python scripts/upload_npz.py \
--artifact_path motions/right_kick.npz \
--entity <entity> \
--name right_kick
RoboNaldo uses a staged curriculum. Continue each stage from the previous stage checkpoint. Train the tracking prior on a plane first; after it tracks reliably, optionally fine-tune that checkpoint with the mixed-terrain tracking preset for robustness.
| Stage | Purpose | Right-foot preset |
|---|---|---|
| Stage 1a | Plane motion-tracking prior, no task reward | right_kick/tracking_params.yaml |
| Stage 1b, optional | Mixed-terrain tracking robustness fine-tune | right_kick/tracking_mixed_params.yaml |
| Stage 2a | Small-range static-ball adaptation | right_kick/task_params_1.yaml |
| Stage 2b | Wider stationary-ball shooting | right_kick/task_params_2.yaml |
| Stage 3 | Dynamic incoming-ball shooting with jump trigger/adaptive sampling | right_kick/task_params_3.yaml |
For training, use scripts/rsl_rl/train.py directly:
python scripts/rsl_rl/train.py \
--task Tracking-Body-Frame-Flat-G1-v0 \
--motion_file motions/right_kick.npz \
--yaml right_kick/tracking_params.yaml \
--headless \
--logger wandb \
--log_project_name kick \
--run_name right_kick_tracking
Use --registry_name <entity>/wandb-registry-motions/right_kick:latest instead
of --motion_file if you keep motions in a W&B artifact registry. For different
stages, change the --yaml argument value to switch among presets.
Resume training:
python scripts/rsl_rl/train.py \
--task Tracking-Body-Frame-Flat-G1-v0 \
--motion_file motions/right_kick.npz \
--yaml <yaml_file> \
--resume True \
--load_run <plane_tracking_run_folder> \
--checkpoint model_<iter>.pt \
--headless
It is recommended to use a small policy noise std for Stage 2 and Stage 3 resume runs, so the task policy does not destroy the learned kick prior with excessive exploration.
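The stage-to-stage hand-off can be sketched as a small command builder. This is only an illustration under stated assumptions: it reuses the flags and preset names shown in this README, `train_command` and `STAGE_PRESETS` are hypothetical names, and the run folder and checkpoint values are placeholders to fill in from your own runs.

```python
from typing import List, Optional

TASK = "Tracking-Body-Frame-Flat-G1-v0"
MOTION_FILE = "motions/right_kick.npz"

# (stage label, preset) in curriculum order; Stage 1b is optional.
STAGE_PRESETS = [
    ("1a", "right_kick/tracking_params.yaml"),
    ("1b", "right_kick/tracking_mixed_params.yaml"),
    ("2a", "right_kick/task_params_1.yaml"),
    ("2b", "right_kick/task_params_2.yaml"),
    ("3", "right_kick/task_params_3.yaml"),
]


def train_command(
    yaml_preset: str,
    load_run: Optional[str] = None,
    checkpoint: Optional[str] = None,
) -> List[str]:
    """Build a train.py argument list, resuming when load_run is given."""
    cmd = [
        "python", "scripts/rsl_rl/train.py",
        "--task", TASK,
        "--motion_file", MOTION_FILE,
        "--yaml", yaml_preset,
        "--headless",
    ]
    if load_run is not None:
        if checkpoint is None:
            raise ValueError("a resume run needs a checkpoint file name")
        cmd += ["--resume", "True", "--load_run", load_run,
                "--checkpoint", checkpoint]
    return cmd


if __name__ == "__main__":
    previous = None
    for stage, preset in STAGE_PRESETS:
        if previous is None:
            cmd = train_command(preset)  # Stage 1a starts from scratch
        else:
            # Placeholders: replace with the previous stage's real values.
            cmd = train_command(preset,
                                load_run=f"<stage_{previous}_run_folder>",
                                checkpoint="model_<iter>.pt")
        print(f"# Stage {stage}")
        print(" ".join(cmd))
        previous = stage
```

Running the script prints one command per stage, which makes it easy to check that every stage after 1a resumes from the stage before it.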
This release ships right-foot presets and the right-foot reference motion. A
left-foot curriculum should use mirrored motion data and change
main_foot_name to left_ankle_roll_link.
Use the Tracking-Body-Frame-Flat-G1-v0 task for the paper-style body-frame
observation setup, and the Tracking-Flat-G1-v0 task for the external-mocap-style
global observation setup.
Use scripts/rsl_rl/play.py for playback. For example, to play back a Stage-2 checkpoint:
python scripts/rsl_rl/play.py \
--task Tracking-Body-Frame-Flat-G1-v0 \
--wandb_path <your_checkpoint_path> \
--yaml right_kick/task_params_2.yaml \
--motion_file motions/right_kick.npz \
--num_envs 1 \
--headless
Use scripts/rsl_rl/eval.py for evaluation:
python scripts/rsl_rl/eval.py \
--task Tracking-Body-Frame-Flat-G1-v0 \
--wandb_path <your_checkpoint_path> \
--yaml <your_yaml_file> \
--motion_file motions/right_kick.npz \
--num_envs 6000 \
--headless
eval.py writes per-episode shot metrics and aggregate accuracy/speed summaries
under logs/rsl_rl/eval/.
Real-robot deployment expects an ONNX policy from this repo. The exporter writes
policy-obs.onnx with embedded metadata (joint names, PD gains, default poses,
observation/action layout, and motion anchor settings) for
RoboNaldo_Deploy.
| When | Output |
|---|---|
| W&B training (--logger wandb) | <run_folder>/<run_name>.onnx next to each saved model_*.pt |
| play.py playback | <checkpoint_folder>/exported/policy-obs.onnx |
Run play.py once on the checkpoint you plan to deploy (same --task, --yaml,
and --motion_file as training) to generate the ONNX artifact. Use the Stage-2
or Stage-3 task preset for shooting policies. Keep checkpoint paths and run IDs
generic in public docs; deployment should consume the exported ONNX artifact
rather than relying on a private checkpoint reference.
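To locate the artifact after a playback run, a small helper can derive the expected ONNX path from the checkpoint path. This is a sketch that only encodes the `<checkpoint_folder>/exported/policy-obs.onnx` layout listed in the table; `playback_onnx_path` is a hypothetical helper, and the checkpoint path in the usage note is an illustrative placeholder rather than a real run.

```python
from pathlib import Path


def playback_onnx_path(checkpoint: str) -> Path:
    """Return where play.py is expected to write the exported policy."""
    checkpoint_folder = Path(checkpoint).parent
    return checkpoint_folder / "exported" / "policy-obs.onnx"


def require_exported_policy(checkpoint: str) -> Path:
    """Return the ONNX path, or raise if play.py has not produced it yet."""
    onnx_path = playback_onnx_path(checkpoint)
    if not onnx_path.is_file():
        raise FileNotFoundError(
            f"{onnx_path} not found; run play.py on this checkpoint first"
        )
    return onnx_path
```

For a hypothetical checkpoint at `runs/example_run/model_1000.pt`, `playback_onnx_path` returns `runs/example_run/exported/policy-obs.onnx`.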
If RoboNaldo helps your research, please consider citing:
@article{robonaldo2026,
title={RoboNaldo: Accurate, stable, and powerful humanoid soccer shooting via motion-guided curriculum reinforcement learning},
author={OpenDriveLab},
journal={arXiv preprint arXiv:2606.11092},
year={2026},
url={https://arxiv.org/abs/2606.11092}
}
This repository builds on Isaac Lab (IsaacLab), BeyondMimic, and RSL-RL.
