polish(pu): add unizero quick start in readme
puyuan1996 committed Jul 24, 2024
1 parent a0bf161 commit 00f82fb
Showing 3 changed files with 20 additions and 2 deletions.
9 changes: 9 additions & 0 deletions README.md
@@ -121,6 +121,8 @@ LightZero is a library with a [PyTorch](https://pytorch.org/) implementation of
- [Stochastic MuZero](https://openreview.net/pdf?id=X6D9bAHhBQ1)
- [EfficientZero](https://arxiv.org/abs/2111.00210)
- [Gumbel MuZero](https://openreview.net/pdf?id=bERaNdoegnO&)
- [ReZero](https://arxiv.org/abs/2404.16364)
- [UniZero](https://arxiv.org/abs/2406.10667)

The environments and algorithms currently supported by LightZero are shown in the table below:

@@ -215,6 +217,13 @@ cd LightZero
python3 -u zoo/board_games/tictactoe/config/tictactoe_muzero_bot_mode_config.py
```
Train a UniZero agent to play [Pong](https://gymnasium.farama.org/environments/atari/pong/):
```bash
cd LightZero
python3 -u zoo/atari/config/atari_unizero_config.py
```
## 📚 Documentation
The LightZero documentation can be found [here](https://opendilab.github.io/LightZero/). It contains tutorials and the API reference.
11 changes: 10 additions & 1 deletion README.zh.md
@@ -107,7 +107,8 @@ LightZero is an MCTS algorithm library implemented with [PyTorch](https://pytorch.org/),
- [Stochastic MuZero](https://openreview.net/pdf?id=X6D9bAHhBQ1)
- [EfficientZero](https://arxiv.org/abs/2111.00210)
- [Gumbel MuZero](https://openreview.net/pdf?id=bERaNdoegnO&)

- [ReZero](https://arxiv.org/abs/2404.16364)
- [UniZero](https://arxiv.org/abs/2406.10667)

The environments and algorithms currently supported by LightZero are shown in the table below:

@@ -196,6 +197,14 @@ python3 -u zoo/atari/config/atari_muzero_config.py
cd LightZero
python3 -u zoo/board_games/tictactoe/config/tictactoe_muzero_bot_mode_config.py
```

Use the following code to quickly train a UniZero agent in the [Pong](https://gymnasium.farama.org/environments/atari/pong/) environment:

```bash
cd LightZero
python3 -u zoo/atari/config/atari_unizero_config.py
```

## 📚 Documentation

The LightZero documentation can be found [here](https://opendilab.github.io/LightZero/). It contains tutorials and the API reference.
2 changes: 1 addition & 1 deletion lzero/envs/tests/test_ding_env_wrapper.py
@@ -26,6 +26,6 @@ def test(self):

obs = ding_env.reset()

- assert isinstance(obs[0], np.ndarray)
+ assert isinstance(obs, (np.ndarray, float))
action = ding_env.random_action()
print('random_action: {}, action_space: {}'.format(action.shape, ding_env.action_space))
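
The commit does not say why the assertion changed from `obs[0]` to `obs`. One plausible reading, sketched below with hypothetical observations that are not taken from the LightZero test suite, is that indexing a 1-D observation returns a NumPy scalar rather than an `np.ndarray`, so the old check would reject a flat vector observation. The helper name `is_valid_obs` is an assumption introduced only for illustration.

```python
import numpy as np


def is_valid_obs(obs):
    """Mirror the updated assertion: accept an array or a plain Python float."""
    return isinstance(obs, (np.ndarray, float))


# Hypothetical observations, chosen only to illustrate the two cases.
vector_obs = np.zeros(4, dtype=np.float32)          # 1-D observation
image_obs = np.zeros((3, 8, 8), dtype=np.float32)   # image-like observation

# Indexing a 1-D array yields a NumPy scalar, which is not an ndarray,
# so a check on obs[0] fails for an ordinary vector observation.
first_element_is_array = isinstance(vector_obs[0], np.ndarray)

# Indexing a multi-dimensional array yields a sub-array, so the old check passes there.
first_slice_is_array = isinstance(image_obs[0], np.ndarray)

print(first_element_is_array, first_slice_is_array)
print(is_valid_obs(vector_obs), is_valid_obs(image_obs), is_valid_obs(0.5))
```

Checking the whole observation against a tuple of types works for both shapes. Note that the `float` branch only matches Python floats and `np.float64`; an `np.float32` scalar would not pass it.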
