Mimic Policy Training
This guide covers training, previewing, and exporting a K1 dance mimic policy.
Run the commands inside the Cyclo Lab container from /workspace/cyclo_lab.
Train
python scripts/reinforcement_learning/rsl_rl/train.py \
--task Cyclo-Mimic-K1-Rev1-Dance1 \
--num_envs 4096 \
--headless
Training output is written to:
logs/rsl_rl/cyclo_mimic_k1_rev1_dance1/*/
Each run directory contains:
logs/rsl_rl/cyclo_mimic_k1_rev1_dance1/*/
- model_*.pt
- params/
  - agent.yaml
  - env.yaml
  - sim2real.yaml
Play and Export
Run the matching play task after training.
By default, play.py loads the latest run and the latest model_*.pt checkpoint for the selected task.
It runs the policy in simulation and exports policy.onnx next to the checkpoint.
python scripts/reinforcement_learning/rsl_rl/play.py \
--task Cyclo-Mimic-K1-Rev1-Dance1-Play \
--num_envs 16
Add --headless when exporting without the Isaac Sim window.
To play a specific checkpoint instead of the latest one, add --checkpoint:
python scripts/reinforcement_learning/rsl_rl/play.py \
--task Cyclo-Mimic-K1-Rev1-Dance1-Play \
--checkpoint logs/rsl_rl/cyclo_mimic_k1_rev1_dance1/2026-01-01_12-00-00/model_1000.pt \
--num_envs 16
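The "latest run, latest checkpoint" default described above can be sketched in a few lines. This is an illustration only, not the code in play.py: the helper name find_latest_checkpoint is hypothetical, and it assumes that run directories carry sortable timestamp names (as in the 2026-01-01_12-00-00 example) and that "latest checkpoint" means the highest iteration number in model_*.pt.

```python
import re
from pathlib import Path
from typing import Optional


def find_latest_checkpoint(task_log_dir: Path) -> Optional[Path]:
    """Return the newest model_*.pt in the newest run directory, or None.

    Hypothetical helper; play.py may resolve the checkpoint differently.
    """
    # Assumption: run directory names are timestamps, so a plain sort
    # puts the most recent run last.
    run_dirs = sorted(p for p in task_log_dir.iterdir() if p.is_dir())
    if not run_dirs:
        return None
    latest_run = run_dirs[-1]

    best_path: Optional[Path] = None
    best_iteration = -1
    for ckpt in latest_run.glob("model_*.pt"):
        match = re.fullmatch(r"model_(\d+)\.pt", ckpt.name)
        if match is None:
            continue
        # Compare iterations as integers: as strings, "999" sorts after "1000".
        iteration = int(match.group(1))
        if iteration > best_iteration:
            best_iteration = iteration
            best_path = ckpt
    return best_path
```

Calling find_latest_checkpoint(Path("logs/rsl_rl/cyclo_mimic_k1_rev1_dance1")) returns a path that can be passed to --checkpoint, which is useful for confirming which file a default run would pick up.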
After play.py finishes, check the same training run directory.
The exported/ directory is added by play.py:
logs/rsl_rl/cyclo_mimic_k1_rev1_dance1/*/
- model_*.pt
- exported/
  - policy.onnx
  - policy.pt
- params/
  - agent.yaml
  - env.yaml
  - sim2real.yaml
Prepare for Sim2Real
Before moving to Sim2Real, make sure these required files are ready:
- exported/policy.onnx
- params/sim2real.yaml
- The motion reference CSV for the same motion used during training.
Then follow "How to Deploy Your Own Cyclo Lab-Trained Policy" to place the asset under the Sim2Real asset root and register the new mimic mode.
exported/policy.onnx and params/sim2real.yaml are a matched pair from one logs/rsl_rl/.../<run>/ directory.
Do not pair policy.onnx from one run with sim2real.yaml from another run.
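One way to keep the pair matched is to resolve both files from a single run directory and never pass them separately. The sketch below does that and reports anything missing. The function name missing_deploy_files and the constant REQUIRED_RUN_FILES are hypothetical, not part of Cyclo Lab; only the relative paths come from this guide.

```python
from pathlib import Path
from typing import List

# Relative paths taken from this guide; both are resolved against ONE run directory.
REQUIRED_RUN_FILES = ("exported/policy.onnx", "params/sim2real.yaml")


def missing_deploy_files(run_dir: Path, motion_csv: Path) -> List[str]:
    """List the required Sim2Real inputs that are missing.

    An empty list means all three files are present. Hypothetical helper.
    """
    missing = [rel for rel in REQUIRED_RUN_FILES if not (run_dir / rel).is_file()]
    # The motion reference CSV lives outside the run directory, so it is
    # passed in explicitly.
    if not motion_csv.is_file():
        missing.append(str(motion_csv))
    return missing
```

Because the function takes a single run_dir, it cannot accidentally combine policy.onnx from one run with sim2real.yaml from another.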
How to Add Your Own Mimic Task
Use this flow when you want to add a new mimic task for your own motion.
This guide assumes you already have a Soma-retargeter output CSV from the Kimodo > Soma-retargeter pipeline.
The examples below use my_motion as the new motion name.
Use your own motion name in the same places.
- Place and convert the motion file.
  First, place the Soma-retargeter output CSV under the K1 Rev.1 motion data directory:
  source/cyclo_lab/data/
  - motions/
    - K1_rev1/
      - dance1/
      - dance2/
      - my_motion/
        - my_motion_soma.csv
  my_motion_soma.csv is the Soma-retargeter output CSV.
  Convert it because Cyclo Lab does not train directly from the Soma-retargeter output CSV.
  python3 scripts/tools/motion/soma_retargeter_csv_converter.py \
  -f source/cyclo_lab/data/motions/K1_rev1/my_motion/my_motion_soma.csv \
  --output_name source/cyclo_lab/data/motions/K1_rev1/my_motion/my_motion.npz \
  --headless
  This creates the Cyclo Lab motion files used after this step:
  - my_motion.csv for Sim2Real deployment
  - my_motion.npz for training
  After conversion, my_motion_soma.csv is not used by training or Sim2Real deployment, so it can be removed from this folder.
- Create a new mimic task config.
  Create my_motion_env_cfg.py by duplicating the existing dance1_env_cfg.py. In the new file, set TRAJECTORY_FILE to the new NPZ path.
  TRAJECTORY_FILE = f"{CYCLO_LAB_ASSETS_DATA_DIR}/motions/K1_rev1/my_motion/my_motion.npz"
- Register the new task.
  The files you edit are here:
  source/cyclo_lab/cyclo_lab/simulation_tasks/manager_based/mimic/config/
  - k1_rev1/
    - agents/
    - __init__.py
    - base_env_cfg.py
    - dance1_env_cfg.py
    - dance2_env_cfg.py
    - my_motion_env_cfg.py (cloned from dance1_env_cfg.py)
  Add new entries to source/cyclo_lab/cyclo_lab/simulation_tasks/manager_based/mimic/config/k1_rev1/__init__.py for the training task and the play task:
  gym.register(
      id="Cyclo-Mimic-K1-Rev1-MyMotion",
      entry_point="isaaclab.envs:ManagerBasedRLEnv",
      disable_env_checker=True,
      kwargs={
          "env_cfg_entry_point": f"{__name__}.my_motion_env_cfg:K1Rev1EnvCfg",
          "rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:K1Rev1MimicPPORunnerCfg",
      },
  )
  gym.register(
      id="Cyclo-Mimic-K1-Rev1-MyMotion-Play",
      entry_point="isaaclab.envs:ManagerBasedRLEnv",
      disable_env_checker=True,
      kwargs={
          "env_cfg_entry_point": f"{__name__}.my_motion_env_cfg:K1Rev1PlayEnvCfg",
          "rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:K1Rev1MimicPPORunnerCfg",
      },
  )
  Check that both task IDs are registered:
  python3 scripts/tools/list_envs.py | grep MyMotion
  Check that the output includes Cyclo-Mimic-K1-Rev1-MyMotion and Cyclo-Mimic-K1-Rev1-MyMotion-Play.
- Train and export.
  Run train.py with the new task ID. Then run play.py to preview the policy and export exported/policy.onnx.
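The motion name appears in several places across these steps: the data folder, the config file name, and the two task IDs. The sketch below collects them in one place so they can be checked for typos before editing any files. It is a convenience illustration, not part of Cyclo Lab: names_for_motion and MimicTaskNames are hypothetical, the task ID pattern is inferred from the my_motion and dance1 examples in this guide, and it assumes the converted my_motion.csv is written next to my_motion.npz.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MimicTaskNames:
    """Names derived from one motion name (hypothetical container)."""

    train_task_id: str
    play_task_id: str
    env_cfg_file: str
    npz_path: str
    csv_path: str


def names_for_motion(motion_name: str) -> MimicTaskNames:
    """Derive the file names and task IDs used in the steps above."""
    # Assumption: snake_case motion names map to CamelCase in the task ID,
    # as in my_motion -> MyMotion and dance1 -> Dance1.
    camel = "".join(part.capitalize() for part in motion_name.split("_") if part)
    motion_dir = f"source/cyclo_lab/data/motions/K1_rev1/{motion_name}"
    train_id = f"Cyclo-Mimic-K1-Rev1-{camel}"
    return MimicTaskNames(
        train_task_id=train_id,
        play_task_id=f"{train_id}-Play",
        env_cfg_file=f"{motion_name}_env_cfg.py",
        npz_path=f"{motion_dir}/{motion_name}.npz",
        # Assumption: the converter writes the CSV beside the NPZ.
        csv_path=f"{motion_dir}/{motion_name}.csv",
    )
```

For example, names_for_motion("my_motion").train_task_id gives the ID to pass to train.py with --task, and play_task_id gives the ID for play.py.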