Skip to content

Commit

Permalink
update readme for detail
Browse files Browse the repository at this point in the history
  • Loading branch information
craipy-hub committed Dec 25, 2024
1 parent 7545c89 commit 757b051
Show file tree
Hide file tree
Showing 4 changed files with 600 additions and 52 deletions.
171 changes: 119 additions & 52 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,95 +1,162 @@
# Unitree RL GYM
<div align="center">
<h1 align="center">Unitree RL GYM</h1>
<p align="center">
<span> 🌎English </span> | <a href="README_zh.md"> 🇨🇳中文 </a>
</p>
</div>

This is a simple example of using Unitree Robots for reinforcement learning, including Unitree Go2, H1, H1_2, G1
<p align="center">
<strong>This is a repository for reinforcement learning implementation based on Unitree robots, supporting Unitree Go2, H1, H1_2, and G1.</strong>
</p>

| Isaac Gym | Mujoco | Physical |
<div align="center">

| <div align="center"> Isaac Gym </div> | <div align="center"> Mujoco </div> | <div align="center"> Physical </div> |
|--- | --- | --- |
| [<img src="https://oss-global-cdn.unitree.com/static/32f06dc9dfe4452dac300dda45e86b34.GIF" width="240px">](https://oss-global-cdn.unitree.com/static/5bbc5ab1d551407080ca9d58d7bec1c8.mp4) | [<img src="https://oss-global-cdn.unitree.com/static/244cd5c4f823495fbfb67ef08f56aa33.GIF" width="240px">](https://oss-global-cdn.unitree.com/static/5aa48535ffd641e2932c0ba45c8e7854.mp4) | [<img src="https://oss-global-cdn.unitree.com/static/78c61459d3ab41448cfdb31f6a537e8b.GIF" width="240px">](https://oss-global-cdn.unitree.com/static/0818dcf7a6874b92997354d628adcacd.mp4) |

## 1. Installation
</div>

---

## 📦 Installation and Configuration

1. Create a new python virtual env with python 3.8
Please refer to [setup.md](/doc/setup_en.md) for installation and configuration steps.

2. Install pytorch 2.3.1 with cuda-12.1:
## 🔁 Process Overview

```bash
pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121
```
3. Install Isaac Gym
The basic workflow for using reinforcement learning to achieve motion control is:

- Download and install Isaac Gym Preview 4 from [https://developer.nvidia.com/isaac-gym](https://developer.nvidia.com/isaac-gym)
- `cd isaacgym/python && pip install -e .`
- Try running an example `cd examples && python 1080_balls_of_solitude.py`
- For troubleshooting check docs isaacgym/docs/index.html
4. Install rsl_rl (PPO implementation)
`Train``Play``Sim2Sim``Sim2Real`

- Clone [https://github.com/leggedrobotics/rsl_rl](https://github.com/leggedrobotics/rsl_rl)
- `cd rsl_rl && git checkout v1.0.2 && pip install -e .`
- **Train**: Use the Gym simulation environment to let the robot interact with the environment and find a policy that maximizes the designed rewards. Real-time visualization during training is not recommended to avoid reduced efficiency.
- **Play**: Use the Play command to verify the trained policy and ensure it meets expectations.
- **Sim2Sim**: Deploy the Gym-trained policy to other simulators to ensure it’s not overly specific to Gym characteristics.
- **Sim2Real**: Deploy the policy to a physical robot to achieve motion control.

5. Install unitree_rl_gym
## 🛠️ User Guide

- Navigate to the folder `unitree_rl_gym`
- `pip install -e .`
### 1. Training

6. Install unitree_sdk2py (Optional for deploy on real robot)
Run the following command to start training:

- Clone [https://github.com/unitreerobotics/unitree_sdk2_python](https://github.com/unitreerobotics/unitree_sdk2_python)
- `cd unitree_sdk2_python & pip install -e .`
```bash
python legged_gym/scripts/train.py --task=xxx
```

## 2. Train in Isaac Gym
#### ⚙️ Parameter Description
- `--task`: Required parameter; values can be (go2, g1, h1, h1_2).
- `--headless`: Defaults to starting with a graphical interface; set to true for headless mode (higher efficiency).
- `--resume`: Resume training from a checkpoint in the logs.
- `--experiment_name`: Name of the experiment to run/load.
- `--run_name`: Name of the run to execute/load.
- `--load_run`: Name of the run to load; defaults to the latest run.
- `--checkpoint`: Checkpoint number to load; defaults to the latest file.
- `--num_envs`: Number of environments for parallel training.
- `--seed`: Random seed.
- `--max_iterations`: Maximum number of training iterations.
- `--sim_device`: Simulation computation device; specify CPU as `--sim_device=cpu`.
- `--rl_device`: Reinforcement learning computation device; specify CPU as `--rl_device=cpu`.

1. Train:
`python legged_gym/scripts/train.py --task=go2`
**Default Training Result Directory**: `logs/<experiment_name>/<date_time>_<run_name>/model_<iteration>.pt`

* To run on CPU add following arguments: `--sim_device=cpu`, `--rl_device=cpu` (sim on CPU and rl on GPU is possible).
* To run headless (no rendering) add `--headless`.
* **Important** : To improve performance, once the training starts press `v` to stop the rendering. You can then enable it later to check the progress.
* The trained policy is saved in `logs/<experiment_name>/<date_time>_<run_name>/model_<iteration>.pt`. Where `<experiment_name>` and `<run_name>` are defined in the train config.
* The following command line arguments override the values set in the config files:
* --task TASK: Task name.
* --resume: Resume training from a checkpoint
* --experiment_name EXPERIMENT_NAME: Name of the experiment to run or load.
* --run_name RUN_NAME: Name of the run.
* --load_run LOAD_RUN: Name of the run to load when resume=True. If -1: will load the last run.
* --checkpoint CHECKPOINT: Saved model checkpoint number. If -1: will load the last checkpoint.
* --num_envs NUM_ENVS: Number of environments to create.
* --seed SEED: Random seed.
* --max_iterations MAX_ITERATIONS: Maximum number of training iterations.
2. Play:`python legged_gym/scripts/play.py --task=go2`
---

* By default, the loaded policy is the last model of the last run of the experiment folder.
* Other runs/model iteration can be selected by setting `load_run` and `checkpoint` in the train config.
### 2. Play

To visualize the training results in Gym, run the following command:

```bash
python legged_gym/scripts/play.py --task=xxx
```

### 2.1 Play Demo
**Description**:

- Play’s parameters are the same as Train’s.
- By default, it loads the latest model from the experiment folder’s last run.
- You can specify other models using `load_run` and `checkpoint`.

#### 💾 Export Network

Play exports the Actor network, saving it in `logs/{experiment_name}/exported/policies`:
- Standard networks (MLP) are exported as `policy_1.pt`.
- RNN networks are exported as `policy_lstm_1.pt`.

### Play Results

| Go2 | G1 | H1 | H1_2 |
|--- | --- | --- | --- |
| [![go2](https://oss-global-cdn.unitree.com/static/ba006789e0af4fe3867255f507032cd7.GIF)](https://oss-global-cdn.unitree.com/static/d2e8da875473457c8d5d69c3de58b24d.mp4) | [![g1](https://oss-global-cdn.unitree.com/static/32f06dc9dfe4452dac300dda45e86b34.GIF)](https://oss-global-cdn.unitree.com/static/5bbc5ab1d551407080ca9d58d7bec1c8.mp4) | [![h1](https://oss-global-cdn.unitree.com/static/fa04e73966934efa9838e9c389f48fa2.GIF)](https://oss-global-cdn.unitree.com/static/522128f4640c4f348296d2761a33bf98.mp4) |[![h1_2](https://oss-global-cdn.unitree.com/static/83ed59ca0dab4a51906aff1f93428650.GIF)](https://oss-global-cdn.unitree.com/static/15fa46984f2343cb83342fd39f5ab7b2.mp4)|

## 3. Sim in Mujoco
---

### 3.1 Mujoco Usage
### 3. Sim2Sim (Mujoco)

To execute sim2sim in mujoco, execute the following command:
Run Sim2Sim in the Mujoco simulator:

```bash
python deploy/deploy_mujoco/deploy_mujoco.py {config_name}
```

`config_name`: The file name of the configuration file. The configuration file will be found under `deploy/deploy_mujoco/configs/`, for example `g1.yaml`, `h1.yaml`, `h1_2.yaml`.
#### Parameter Description
- `config_name`: Configuration file; default search path is `deploy/deploy_mujoco/configs/`.

**example**:
#### Example: Running G1

```bash
python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml
```

### 3.2 Mujoco Demo
#### ➡️ Replace Network Model

The default model is located at `deploy/pre_train/{robot}/motion.pt`; custom-trained models are saved in `logs/g1/exported/policies/policy_lstm_1.pt`. Update the `policy_path` in the YAML configuration file accordingly.

#### Simulation Results

| G1 | H1 | H1_2 |
|--- | --- | --- |
| [![mujoco_g1](https://oss-global-cdn.unitree.com/static/244cd5c4f823495fbfb67ef08f56aa33.GIF)](https://oss-global-cdn.unitree.com/static/5aa48535ffd641e2932c0ba45c8e7854.mp4) | [![mujoco_h1](https://oss-global-cdn.unitree.com/static/7ab4e8392e794e01b975efa205ef491e.GIF)](https://oss-global-cdn.unitree.com/static/8934052becd84d08bc8c18c95849cf32.mp4) | [![mujoco_h1_2](https://oss-global-cdn.unitree.com/static/2905e2fe9b3340159d749d5e0bc95cc4.GIF)](https://oss-global-cdn.unitree.com/static/ee7ee85bd6d249989a905c55c7a9d305.mp4) |

## 4. Deploy on Physical Robot

reference to [Deploy on Physical Robot(English)](deploy/deploy_real/README.md) | [实物部署(简体中文)](deploy/deploy_real/README.zh.md)
---

### 4. Sim2Real (Physical Deployment)

Before deploying to the physical robot, ensure it’s in debug mode. Detailed steps can be found in the [Physical Deployment Guide](deploy/deploy_real/README.md):

```bash
python deploy/deploy_real/deploy_real.py {net_interface} {config_name}
```

#### Parameter Description
- `net_interface`: Network card name connected to the robot, e.g., `enp3s0`.
- `config_name`: Configuration file located in `deploy/deploy_real/configs/`, e.g., `g1.yaml`, `h1.yaml`, `h1_2.yaml`.

#### Deployment Results

| G1 | H1 | H1_2 |
|--- | --- | --- |
| [![real_g1](https://oss-global-cdn.unitree.com/static/78c61459d3ab41448cfdb31f6a537e8b.GIF)](https://oss-global-cdn.unitree.com/static/0818dcf7a6874b92997354d628adcacd.mp4) | [![real_h1](https://oss-global-cdn.unitree.com/static/fa07b2fd2ad64bb08e6b624d39336245.GIF)](https://oss-global-cdn.unitree.com/static/ea0084038d384e3eaa73b961f33e6210.mp4) | [![real_h1_2](https://oss-global-cdn.unitree.com/static/a88915e3523546128a79520aa3e20979.GIF)](https://oss-global-cdn.unitree.com/static/12d041a7906e489fae79d55b091a63dd.mp4) |

---

## 🎉 Acknowledgments

This repository is built upon the support and contributions of the following open-source projects. Special thanks to:

- [legged\_gym](https://github.com/leggedrobotics/legged_gym): The foundation for training and running codes.
- [rsl\_rl](https://github.com/leggedrobotics/rsl_rl.git): Reinforcement learning algorithm implementation.
- [mujoco](https://github.com/google-deepmind/mujoco.git): Providing powerful simulation functionalities.
- [unitree\_sdk2\_python](https://github.com/unitreerobotics/unitree_sdk2_python.git): Hardware communication interface for physical deployment.

---

## 🔖 License

This project is licensed under the [BSD 3-Clause License](./LICENSE):
1. The original copyright notice must be retained.
2. The project name or organization name may not be used for promotion.
3. Any modifications must be disclosed.

For details, please read the full [LICENSE file](./LICENSE).

163 changes: 163 additions & 0 deletions README_zh.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,163 @@
<div align="center">
<h1 align="center">Unitree RL GYM</h1>
<p align="center">
<a href="README.md">🌎 English</a> | <span>🇨🇳 中文</span>
</p>
</div>

<p align="center">
🎮🚪 <strong>这是一个基于 Unitree 机器人实现强化学习的示例仓库,支持 Unitree Go2、H1、H1_2和 G1。</strong> 🚪🎮
</p>

<div align="center">

| <div align="center"> Isaac Gym </div> | <div align="center"> Mujoco </div> | <div align="center"> Physical </div> |
|--- | --- | --- |
| [<img src="https://oss-global-cdn.unitree.com/static/32f06dc9dfe4452dac300dda45e86b34.GIF" width="240px">](https://oss-global-cdn.unitree.com/static/5bbc5ab1d551407080ca9d58d7bec1c8.mp4) | [<img src="https://oss-global-cdn.unitree.com/static/244cd5c4f823495fbfb67ef08f56aa33.GIF" width="240px">](https://oss-global-cdn.unitree.com/static/5aa48535ffd641e2932c0ba45c8e7854.mp4) | [<img src="https://oss-global-cdn.unitree.com/static/78c61459d3ab41448cfdb31f6a537e8b.GIF" width="240px">](https://oss-global-cdn.unitree.com/static/0818dcf7a6874b92997354d628adcacd.mp4) |

</div>

---

## 📦 安装配置

安装和配置步骤请参考 [setup.md](/doc/setup_zh.md)

## 🔁 流程说明

强化学习实现运动控制的基本流程为:

`Train``Play``Sim2Sim``Sim2Real`

- **Train**: 通过 Gym 仿真环境,让机器人与环境互动,找到最满足奖励设计的策略。通常不推荐实时查看效果,以免降低训练效率。
- **Play**: 通过 Play 命令查看训练后的策略效果,确保策略符合预期。
- **Sim2Sim**: 将 Gym 训练完成的策略部署到其他仿真器,避免策略小众于 Gym 特性。
- **Sim2Real**: 将策略部署到实物机器人,实现运动控制。

## 🛠️ 使用指南

### 1. 训练

运行以下命令进行训练:

```bash
python legged_gym/scripts/train.py --task=xxx
```

#### ⚙️ 参数说明
- `--task`: 必选参数,值可选(go2, g1, h1, h1_2)
- `--headless`: 默认启动图形界面,设为 true 时不渲染图形界面(效率更高)
- `--resume`: 从日志中选择 checkpoint 继续训练
- `--experiment_name`: 运行/加载的 experiment 名称
- `--run_name`: 运行/加载的 run 名称
- `--load_run`: 加载运行的名称,默认加载最后一次运行
- `--checkpoint`: checkpoint 编号,默认加载最新一次文件
- `--num_envs`: 并行训练的环境个数
- `--seed`: 随机种子
- `--max_iterations`: 训练的最大迭代次数
- `--sim_device`: 仿真计算设备,指定 CPU 为 `--sim_device=cpu`
- `--rl_device`: 强化学习计算设备,指定 CPU 为 `--rl_device=cpu`

**默认保存训练结果**`logs/<experiment_name>/<date_time>_<run_name>/model_<iteration>.pt`

---

### 2. Play

如果想要在 Gym 中查看训练效果,可以运行以下命令:

```bash
python legged_gym/scripts/play.py --task=xxx
```

**说明**

- Play 启动参数与 Train 相同。
- 默认加载实验文件夹上次运行的最后一个模型。
- 可通过 `load_run``checkpoint` 指定其他模型。

#### 💾 导出网络

Play 会导出 Actor 网络,保存于 `logs/{experiment_name}/exported/policies` 中:
- 普通网络(MLP)导出为 `policy_1.pt`
- RNN 网络,导出为 `policy_lstm_1.pt`

### Play 效果

| Go2 | G1 | H1 | H1_2 |
|--- | --- | --- | --- |
| [![go2](https://oss-global-cdn.unitree.com/static/ba006789e0af4fe3867255f507032cd7.GIF)](https://oss-global-cdn.unitree.com/static/d2e8da875473457c8d5d69c3de58b24d.mp4) | [![g1](https://oss-global-cdn.unitree.com/static/32f06dc9dfe4452dac300dda45e86b34.GIF)](https://oss-global-cdn.unitree.com/static/5bbc5ab1d551407080ca9d58d7bec1c8.mp4) | [![h1](https://oss-global-cdn.unitree.com/static/fa04e73966934efa9838e9c389f48fa2.GIF)](https://oss-global-cdn.unitree.com/static/522128f4640c4f348296d2761a33bf98.mp4) |[![h1_2](https://oss-global-cdn.unitree.com/static/83ed59ca0dab4a51906aff1f93428650.GIF)](https://oss-global-cdn.unitree.com/static/15fa46984f2343cb83342fd39f5ab7b2.mp4)|

---

### 3. Sim2Sim (Mujoco)

支持在 Mujoco 仿真器中运行 Sim2Sim:

```bash
python deploy/deploy_mujoco/deploy_mujoco.py {config_name}
```

#### 参数说明
- `config_name`: 配置文件,默认查询路径为 `deploy/deploy_mujoco/configs/`

#### 示例:运行 G1

```bash
python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml
```

#### ➡️ 替换网络模型

默认模型位于 `deploy/pre_train/{robot}/motion.pt`;自己训练模型保存于`logs/g1/exported/policies/policy_lstm_1.pt`,只需替换 yaml 配置文件中 `policy_path`

#### 运行效果

| G1 | H1 | H1_2 |
|--- | --- | --- |
| [![mujoco_g1](https://oss-global-cdn.unitree.com/static/244cd5c4f823495fbfb67ef08f56aa33.GIF)](https://oss-global-cdn.unitree.com/static/5aa48535ffd641e2932c0ba45c8e7854.mp4) | [![mujoco_h1](https://oss-global-cdn.unitree.com/static/7ab4e8392e794e01b975efa205ef491e.GIF)](https://oss-global-cdn.unitree.com/static/8934052becd84d08bc8c18c95849cf32.mp4) | [![mujoco_h1_2](https://oss-global-cdn.unitree.com/static/2905e2fe9b3340159d749d5e0bc95cc4.GIF)](https://oss-global-cdn.unitree.com/static/ee7ee85bd6d249989a905c55c7a9d305.mp4) |


---

### 4. Sim2Real (实物部署)

实现实物部署前,确保机器人进入调试模式。详细步骤请参考 [实物部署指南](deploy/deploy_real/README.zh.md)

```bash
python deploy/deploy_real/deploy_real.py {net_interface} {config_name}
```

#### 参数说明
- `net_interface`: 连接机器人网卡名称,如 `enp3s0`
- `config_name`: 配置文件,存在于 `deploy/deploy_real/configs/`,如 `g1.yaml``h1.yaml``h1_2.yaml`

#### 运行效果

| G1 | H1 | H1_2 |
|--- | --- | --- |
| [![real_g1](https://oss-global-cdn.unitree.com/static/78c61459d3ab41448cfdb31f6a537e8b.GIF)](https://oss-global-cdn.unitree.com/static/0818dcf7a6874b92997354d628adcacd.mp4) | [![real_h1](https://oss-global-cdn.unitree.com/static/fa07b2fd2ad64bb08e6b624d39336245.GIF)](https://oss-global-cdn.unitree.com/static/ea0084038d384e3eaa73b961f33e6210.mp4) | [![real_h1_2](https://oss-global-cdn.unitree.com/static/a88915e3523546128a79520aa3e20979.GIF)](https://oss-global-cdn.unitree.com/static/12d041a7906e489fae79d55b091a63dd.mp4) |

---

## 🎉 致谢

本仓库开发离不开以下开源项目的支持与贡献,特此感谢:

- [legged\_gym](https://github.com/leggedrobotics/legged_gym): 构建训练与运行代码的基础。
- [rsl\_rl](https://github.com/leggedrobotics/rsl_rl.git): 强化学习算法实现。
- [mujoco](https://github.com/google-deepmind/mujoco.git): 提供强大仿真功能。
- [unitree\_sdk2\_python](https://github.com/unitreerobotics/unitree_sdk2_python.git): 实物部署硬件通信接口。


---

## 🔖 许可证

本项目根据 [BSD 3-Clause License](./LICENSE) 授权:
1. 必须保留原始版权声明。
2. 禁止以项目名或组织名作举。
3. 声明所有修改内容。

详情请阅读完整 [LICENSE 文件](./LICENSE)

Loading

0 comments on commit 757b051

Please sign in to comment.