Commit

update rl notes
matheecs committed Aug 8, 2024
1 parent 4a0bddd commit 436eb07
Showing 1 changed file with 52 additions and 50 deletions.
102 changes: 52 additions & 50 deletions _posts/2024-07-31-RL.md
@@ -9,9 +9,7 @@ author: "Jixiang Zhang"

## Building the parallel simulation environment

<p align="center">
<img src="{{site.baseurl}}/images/rl.jpg" width="500"/>
</p>
<p align="center"><img src="{{site.baseurl}}/images/rl.jpg" width="500"/></p>

## Interaction between the training algorithm and the environment

@@ -20,57 +18,30 @@ author: "Jixiang Zhang"

### class `LeggedRobotCfg`

* env
* terrain
* init_state
* control
* sim
* viewer
* noise
* normalization
* commands
* asset
* domain_rand
* rewards
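| Section       | Function                 |
| ------------- | ------------------------ |
| env           | Environment information  |
| terrain       | Terrain information      |
| init_state    | Initial state            |
| control       | Joint control            |
| sim           | Simulation parameters    |
| viewer        | Viewer settings          |
| noise         | Noise parameters         |
| normalization | Scaling parameters       |
| commands      | Command parameters       |
| asset         | Robot model              |
| domain_rand   | Domain randomization     |
| rewards       | Reward parameters        |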
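
The sections listed above are typically grouped as nested classes inside a single config class, so that a robot-specific config can override one section without restating the others. The sketch below illustrates that pattern only: the class names `LeggedRobotCfgSketch` and `MyRobotCfgSketch`, the helper `section_names`, and every field value are hypothetical and are not copied from the repository.

```python
class LeggedRobotCfgSketch:
    """Illustrative stand-in for a nested config class (not the real LeggedRobotCfg)."""

    class env:
        num_envs = 4096          # assumed value: number of parallel environments
        episode_length_s = 20.0  # assumed value: episode length in seconds

    class control:
        action_scale = 0.5       # assumed value: scale applied to policy actions

    class rewards:
        class scales:
            tracking_lin_vel = 1.0  # hypothetical positive reward weight
            torques = -1e-5         # hypothetical penalty weight


class MyRobotCfgSketch(LeggedRobotCfgSketch):
    """Overrides one field of one section; everything else is inherited."""

    class env(LeggedRobotCfgSketch.env):
        num_envs = 1024


def section_names(cfg):
    """Return the names of the nested section classes defined directly on cfg."""
    return sorted(name for name, value in vars(cfg).items() if isinstance(value, type))


print(section_names(LeggedRobotCfgSketch))
print(MyRobotCfgSketch.env.num_envs, MyRobotCfgSketch.env.episode_length_s)
```

Subclassing the nested section (`class env(LeggedRobotCfgSketch.env)`) rather than redefining it from scratch keeps the untouched fields, such as `episode_length_s` here, at their base values.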

### class `LeggedRobotCfgPPO`

* policy
* runner
* algorithm
| Section   | Function             |
| --------- | -------------------- |
| policy    | Policy network       |
| runner    | ActorCritic          |
| algorithm | Algorithm parameters |

```bash
Learning iteration 20291/40000

Computation: 215205 steps/s (collection: 0.333s, learning 0.124s)
Value function loss: 0.2949
Surrogate loss: -0.0028
Mean action noise std: 0.51
Mean reward: 628.45
Mean episode length: 1985.67
Mean episode rew_action_rate: -1.6791
Mean episode rew_ankle_action_rate: -0.3818
Mean episode rew_ankle_dof_acc: -0.0201
Mean episode rew_collision: -0.0008
Mean episode rew_dof_acc: -0.1378
Mean episode rew_dof_pos_limits: -0.0000
Mean episode rew_feet_contact_forces: -0.2618
Mean episode rew_feet_distance: 0.9603
Mean episode rew_lin_vel_z: -0.0237
Mean episode rew_orientation: 9.7049
Mean episode rew_target_joint_pos_l: 9.0367
Mean episode rew_target_joint_pos_r: 7.1679
Mean episode rew_termination: -0.0000
Mean episode rew_torque_limits: -0.0351
Mean episode rew_torques: -0.1538
Mean episode rew_tracking_ang_vel: 1.7254
Mean episode rew_tracking_lin_x_vel: 3.2664
Mean episode rew_tracking_lin_y_vel: 2.3819
```

<p align="center">
<img src="{{site.baseurl}}/images/classes.png" width="500"/>
</p>
<p align="center"> <img src="{{site.baseurl}}/images/classes.png" width="500"/></p>

## Constructing the simulation environment: `create_sim`

@@ -192,4 +163,35 @@ Mean episode rew_tracking_lin_y_vel: 2.3819
* 🏆 _reward_target_ankle_pos **exp**
* 🏆 _reward_target_hip_roll_pos **exp**
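
The **exp** tag on the reward terms above suggests that a tracking error is passed through an exponential kernel, so the reward is 1 at zero error and decays toward 0 as the error grows. The exact formula used in the repository is not shown in this diff, so the following is only a minimal sketch under that assumption; the function name `exp_reward` and the width parameter `sigma` are hypothetical.

```python
import math


def exp_reward(target, actual, sigma=0.25):
    """Hypothetical exponential kernel: exp(-sum of squared errors / sigma).

    target and actual are equal-length sequences of joint positions;
    sigma (assumed name) controls how quickly the reward decays with error.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    squared_error = sum((t - a) ** 2 for t, a in zip(target, actual))
    return math.exp(-squared_error / sigma)


# Perfect tracking gives the maximum reward; larger errors give less.
print(exp_reward([0.1, -0.2], [0.1, -0.2]))
print(exp_reward([0.0], [0.1]) > exp_reward([0.0], [0.5]))
```

Because the kernel is bounded in (0, 1], a term shaped this way can only add reward; its weight in the total is then set by a separate scale.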

**Training log**

```bash
Learning iteration 20291/40000

Computation: 215205 steps/s (collection: 0.333s, learning 0.124s)
Value function loss: 0.2949
Surrogate loss: -0.0028
Mean action noise std: 0.51
Mean reward: 628.45
Mean episode length: 1985.67
Mean episode rew_action_rate: -1.6791
Mean episode rew_ankle_action_rate: -0.3818
Mean episode rew_ankle_dof_acc: -0.0201
Mean episode rew_collision: -0.0008
Mean episode rew_dof_acc: -0.1378
Mean episode rew_dof_pos_limits: -0.0000
Mean episode rew_feet_contact_forces: -0.2618
Mean episode rew_feet_distance: 0.9603
Mean episode rew_lin_vel_z: -0.0237
Mean episode rew_orientation: 9.7049
Mean episode rew_target_joint_pos_l: 9.0367
Mean episode rew_target_joint_pos_r: 7.1679
Mean episode rew_termination: -0.0000
Mean episode rew_torque_limits: -0.0351
Mean episode rew_torques: -0.1538
Mean episode rew_tracking_ang_vel: 1.7254
Mean episode rew_tracking_lin_x_vel: 3.2664
Mean episode rew_tracking_lin_y_vel: 2.3819
```
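
To see which terms dominate a log like the one above, the `Mean episode rew_*` lines can be collected into a dictionary and split into bonuses and penalties. This is a small standalone sketch, not part of the training code; the helper names `parse_reward_terms` and `split_by_sign` are made up for illustration, and the sample text reuses three lines from the log.

```python
import re

# Matches lines such as "Mean episode rew_torques: -0.1538".
LINE_RE = re.compile(r"^\s*Mean episode (rew_\w+):\s*(-?\d+(?:\.\d+)?)\s*$")


def parse_reward_terms(log_text):
    """Map each rew_* term name in the log text to its float value."""
    terms = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            terms[match.group(1)] = float(match.group(2))
    return terms


def split_by_sign(terms):
    """Separate strictly positive terms (bonuses) from strictly negative ones (penalties)."""
    bonuses = {name: value for name, value in terms.items() if value > 0}
    penalties = {name: value for name, value in terms.items() if value < 0}
    return bonuses, penalties


sample = """\
Mean reward: 628.45
Mean episode rew_orientation: 9.7049
Mean episode rew_torques: -0.1538
Mean episode rew_termination: -0.0000
"""

terms = parse_reward_terms(sample)
bonuses, penalties = split_by_sign(terms)
print(sorted(bonuses), sorted(penalties))
```

Terms that print as `-0.0000` parse to zero and land in neither group, which makes it easy to spot penalties that are configured but currently inactive.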

<https://github.com/engineai-robotics/engineai_legged_gym>
