RL Toolkit

Papers

Installation with PyPI

On PC AMD64 with Ubuntu/Debian

Install dependencies
```
apt update -y
apt install swig -y
```
Install RL-Toolkit
```
pip3 install rl-toolkit[all]
```

Run (for Server)
```
python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 server
```

Run (for Agent)
```
python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 agent --db_server localhost
```

Run (for Learner)
```
python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 learner --db_server 192.168.1.2
```

Run (for Tester)
```
python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 tester -f save/model/actor.h5
```
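When the four roles are started from a launcher script, the invocations above can be assembled programmatically. The following is a minimal sketch that uses only the flags shown in this README (`-c`, `-e`, `--db_server`, `-f`); the helper name `build_command` and its parameters are hypothetical and not part of RL-Toolkit.

```python
import shlex

ROLES = ("server", "agent", "learner", "tester")


def build_command(role, config="./rl_toolkit/config.yaml",
                  env="MinitaurBulletEnv-v0", db_server=None, model_file=None):
    """Return the argv list for one role (hypothetical helper, not an RL-Toolkit API)."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role!r}")
    argv = ["python3", "-m", "rl_toolkit", "-c", config, "-e", env, role]
    if role in ("agent", "learner"):
        # Agent and learner need the address of the machine running the server.
        if db_server is None:
            raise ValueError(f"{role} requires db_server")
        argv += ["--db_server", db_server]
    elif role == "tester":
        # The tester loads a saved actor model from disk.
        if model_file is None:
            raise ValueError("tester requires model_file")
        argv += ["-f", model_file]
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(build_command("agent", db_server="localhost")))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues.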

On NVIDIA Jetson

Install dependencies
TensorFlow for JetPack: follow the instructions here for installation.
```
sudo apt install swig -y
```

Install Reverb
Download Bazel 3.7.2 for arm64 here, then move it into ~/bin and add that directory to PATH:
```
mkdir -p ~/bin
mv ~/Downloads/bazel-3.7.2-linux-arm64 ~/bin/bazel
chmod +x ~/bin/bazel
export PATH=$PATH:~/bin
```

Clone Reverb and check out the version that corresponds to the TF version installed on the NVIDIA Jetson!
```
git clone https://github.com/deepmind/reverb
cd reverb/
git checkout r0.9.0
```
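Before choosing a Reverb branch, it helps to confirm which TensorFlow version the Jetson's Python interpreter actually sees. The sketch below only reports the installed version. It assumes the distribution is registered under the name `tensorflow`, and it deliberately does not encode any Reverb-to-TF version mapping, which should be taken from the Reverb release notes. The helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: the JetPack wheel registers itself as "tensorflow".
    tf_version = installed_version("tensorflow")
    if tf_version is None:
        print("tensorflow is not installed in this interpreter")
    else:
        print(f"tensorflow {tf_version} found; check out the matching Reverb release branch")
```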

Make changes in Reverb before building!
In .bazelrc
```
- build:manylinux2010 --crosstool_top=//third_party/toolchains/preconfig/ubuntu16.04/gcc7_manylinux2010:toolchain
+ # build:manylinux2010 --crosstool_top=//third_party/toolchains/preconfig/ubuntu16.04/gcc7_manylinux2010:toolchain

- build --copt=-mavx --copt=-DEIGEN_MAX_ALIGN_BYTES=64
+ build --copt=-DEIGEN_MAX_ALIGN_BYTES=64
```

In WORKSPACE
```
- PROTOC_SHA256 = "15e395b648a1a6dda8fd66868824a396e9d3e89bc2c8648e3b9ab9801bea5d55"
+ # PROTOC_SHA256 = "15e395b648a1a6dda8fd66868824a396e9d3e89bc2c8648e3b9ab9801bea5d55"
+ PROTOC_SHA256 = "7877fee5793c3aafd704e290230de9348d24e8612036f1d784c8863bc790082e"
```
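The `PROTOC_SHA256` value is the checksum the build expects for the protoc archive it downloads, so it has to change when the archive changes from x86_64 to aarch_64. If the build reports a checksum mismatch, the digest of a manually downloaded archive can be compared against the value in WORKSPACE. This is a minimal sketch using the standard library only; the function names are hypothetical, and the archive path is whatever file you downloaded.

```python
import hashlib
import sys


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python3 check_sha256.py <archive> <expected sha256>
    archive, expected = sys.argv[1], sys.argv[2]
    print("OK" if checksum_matches(archive, expected) else "MISMATCH")
```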

In oss_build.sh
```
-  bazel test -c opt --copt=-mavx --config=manylinux2010 --test_output=errors //reverb/cc/...
+  bazel test -c opt --copt="-march=armv8-a+crypto" --test_output=errors //reverb/cc/...

 # Builds Reverb and creates the wheel package.
-  bazel build -c opt --copt=-mavx $EXTRA_OPT --config=manylinux2010 reverb/pip_package:build_pip_package
+  bazel build -c opt --copt="-march=armv8-a+crypto" $EXTRA_OPT reverb/pip_package:build_pip_package
```

In reverb/cc/platform/default/repo.bzl
```
 urls = [
-        "https://github.com/protocolbuffers/protobuf/releases/download/v%s/protoc-%s-linux-x86_64.zip" % (version, version),
+        "https://github.com/protocolbuffers/protobuf/releases/download/v%s/protoc-%s-linux-aarch_64.zip" % (version, version),
 ]
```

In reverb/pip_package/build_pip_package.sh
```
-  "${PYTHON_BIN_PATH}" setup.py bdist_wheel ${PKG_NAME_FLAG} ${RELEASE_FLAG} ${TF_VERSION_FLAG} --plat manylinux2010_x86_64 > /dev/null
+  "${PYTHON_BIN_PATH}" setup.py bdist_wheel ${PKG_NAME_FLAG} ${RELEASE_FLAG} ${TF_VERSION_FLAG}  > /dev/null
```
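The edits above are plain text substitutions, so they can be scripted if the build has to be repeated. The following is a rough sketch, not a tested patch set: `replace_once` and `patch_file` are hypothetical helpers, and the sketch assumes each original string occurs exactly once in its file. If that assumption does not hold for the checked-out Reverb version, the helper raises instead of editing the file.

```python
from pathlib import Path


def replace_once(text, old, new):
    """Replace `old` with `new`, requiring exactly one occurrence of `old`."""
    count = text.count(old)
    if count != 1:
        raise ValueError(f"expected exactly one occurrence of {old!r}, found {count}")
    return text.replace(old, new)


def patch_file(path, old, new):
    """Apply replace_once to a file in place (hypothetical helper)."""
    target = Path(path)
    target.write_text(replace_once(target.read_text(), old, new))


if __name__ == "__main__":
    # Demonstration on an in-memory string; no files are modified here.
    line = '"protoc-%s-linux-x86_64.zip" % (version, version),'
    print(replace_once(line, "linux-x86_64.zip", "linux-aarch_64.zip"))
```

Run from the Reverb checkout, a call such as `patch_file("reverb/cc/platform/default/repo.bzl", "linux-x86_64.zip", "linux-aarch_64.zip")` would apply the repo.bzl change, provided the single-occurrence assumption holds.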

Build and install
```
bash oss_build.sh --clean true --tf_dep_override "tensorflow~=2.9.1" --release --python "3.8"
bash ./bazel-bin/reverb/pip_package/build_pip_package --dst /tmp/reverb/dist/ --release
pip3 install /tmp/reverb/dist/dm_reverb-*
```

Cleaning
```
cd ../
rm -R reverb/
```

Install RL-Toolkit
```
pip3 install rl-toolkit
```

Run (for Server)
```
python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 server
```

Run (for Agent)
```
python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 agent --db_server localhost
```

Run (for Learner)
```
python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 learner --db_server 192.168.1.2
```

Run (for Tester)
```
python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 tester -f save/model/actor.h5
```

Environments

| Environment | Observation space | Observation bounds | Action space | Action bounds | Reward bounds |
|---|---|---|---|---|---|
| BipedalWalkerHardcore-v3 | (24, ) | [-inf, inf] | (4, ) | [-1.0, 1.0] | [-1.0, 1.0] |
| Walker2DBulletEnv-v0 | (22, ) | [-inf, inf] | (6, ) | [-1.0, 1.0] | [-1.0, 1.0] |
| AntBulletEnv-v0 | (28, ) | [-inf, inf] | (8, ) | [-1.0, 1.0] | [-1.0, 1.0] |
| HalfCheetahBulletEnv-v0 | (26, ) | [-inf, inf] | (6, ) | [-1.0, 1.0] | [-1.0, 1.0] |
| HopperBulletEnv-v0 | (15, ) | [-inf, inf] | (3, ) | [-1.0, 1.0] | [-1.0, 1.0] |
| HumanoidBulletEnv-v0 | (44, ) | [-inf, inf] | (17, ) | [-1.0, 1.0] | [-1.0, 1.0] |
| MinitaurBulletEnv-v0 | (28, ) | [-167.72488, 167.72488] | (8, ) | [-1.0, 1.0] | [-1.0, 1.0] |

Results

| Environment | SAC + gSDE | SAC + gSDE + Huber loss | SAC + TQC + gSDE | RL-Toolkit |
|---|---|---|---|---|
| BipedalWalkerHardcore-v3 | 13 ± 18⁽²⁾ | 239 ± 118 | 228 ± 18⁽²⁾ | 205 ± 134 |
| Walker2DBulletEnv-v0 | 2270 ± 28⁽¹⁾ | 2732 ± 96 | 2535 ± 94⁽²⁾ | 3123 ± 594 |
| AntBulletEnv-v0 | 3106 ± 61⁽¹⁾ | 3460 ± 119 | 3700 ± 37⁽²⁾ | 3993 ± 214 |
| HalfCheetahBulletEnv-v0 | 2945 ± 95⁽¹⁾ | 3003 ± 226 | 3041 ± 157⁽²⁾ | 2762 ± 153 |
| HopperBulletEnv-v0 | 2515 ± 50⁽¹⁾ | 2555 ± 405 | 2401 ± 62⁽²⁾ | 2151 ± 664 |

Releases

SAC + gSDE + Huber loss is stored here, branch r2.0
SAC + TQC + gSDE + LogCosh + Reverb is stored here, branch r4.0

Frameworks: TensorFlow, Reverb, OpenAI Gym, PyBullet, WandB, OpenCV

markub3327/rl-toolkit