Hardware Conditioned Policies for Multi-Robot Transfer Learning

Tao Chen, Adithya Murali, Abhinav Gupta

The Robotics Institute, Carnegie Mellon University

This is a PyTorch-based implementation of our NeurIPS 2018 paper on hardware-conditioned policies. The idea is to augment the policy input (state) with a hardware-specific encoding vector for better multi-robot skill transfer. The encoding vector can be either explicitly constructed (HCP-E) or learned implicitly via back-propagation (HCP-I); a minimal sketch of the conditioning idea follows the citation below. The approach is compatible with most existing deep reinforcement learning algorithms, and we demonstrate it with DDPG+HER and PPO. If you find this work useful in your research, please cite:

@inproceedings{chen2018hardware,
  title={Hardware Conditioned Policies for Multi-Robot Transfer Learning},
  author={Chen, Tao and Murali, Adithyavairavan and Gupta, Abhinav},
  booktitle={Advances in Neural Information Processing Systems},
  pages={9355--9366},
  year={2018}
}
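
For intuition, the sketch below shows one way a hardware encoding can be concatenated with the state before entering the policy network. This is a minimal illustration with assumed class names and dimensions, not the actual network code from this repo:

import torch
import torch.nn as nn

class HardwareConditionedPolicy(nn.Module):
    # Illustrative actor network: the hardware encoding is concatenated
    # with the state, so a single set of weights can control robots with
    # different kinematics and dynamics.
    def __init__(self, state_dim, encoding_dim, action_dim, hidden_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + encoding_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, action_dim),
            nn.Tanh(),  # bound actions to [-1, 1]
        )

    def forward(self, state, encoding):
        return self.net(torch.cat([state, encoding], dim=-1))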

The code has been tested on Ubuntu 16.04.

Installation

  1. Install Anaconda

  2. Download the code repo:

cd ~
git clone https://github.com/taochenshh/hcp.git
cd hcp

  3. Create the Python environment:

conda env create -f environment.yml
conda activate hcp

  4. Install MuJoCo and mujoco-py 1.50
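
Optionally, you can sanity-check the setup before moving on (this check is an extra suggestion, not part of the repo's instructions):

python -c "import torch; import mujoco_py; print('setup ok')"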

HCP-E Usage

  1. Generate the robot XML files:
cd gen_robots
chmod +x gen_multi_dof_simrobot.sh
## generate both peg_insertion and reacher environments
./gen_multi_dof_simrobot.sh peg_insertion reacher
## generate peg_insertion environments only
./gen_multi_dof_simrobot.sh peg_insertion
## generate reacher environments only
./gen_multi_dof_simrobot.sh reacher
  2. Train the policy model:
cd ../HCP-E

## HCP-E: peg_insertion
python main.py --env=peg_insertion --with_kin --train_ratio=0.9 --save_interval=200 --robot_dir=../xml/gen_xmls/simrobot/peg_insertion --save_dir=peg_data/HCP-E

## HCP-E: reacher
cd util
python gen_start_and_goal.py
cd ..
python main.py --env=reacher --with_kin --train_ratio=0.9 --save_interval=200 --robot_dir=../xml/gen_xmls/simrobot/reacher --save_dir=reacher_data/HCP-E
  3. Test the policy model:
## HCP-E: peg_insertion
python main.py --env=peg_insertion --with_kin --train_ratio=0.9 --save_interval=200 --robot_dir=../xml/gen_xmls/simrobot/peg_insertion --save_dir=peg_data/HCP-E --test

## HCP-E: reacher
python main.py --env=reacher --with_kin --train_ratio=0.9 --save_interval=200 --robot_dir=../xml/gen_xmls/simrobot/reacher --save_dir=reacher_data/HCP-E --test

Add --render at the end of the command if you want to visually test the policy.
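
The --with_kin flag enables the explicit variant: the HCP-E encoding is constructed from the robot's kinematic structure rather than learned. As a hedged sketch of the idea (the exact vector layout in this repo may differ), one could flatten each joint's position and rotation axis in the base frame and zero-pad to a fixed DOF count, so robots with different numbers of joints share one encoding length:

import numpy as np

def explicit_kin_encoding(joint_positions, joint_axes, max_dof=7):
    # Hypothetical HCP-E-style encoding: 3-D position plus 3-D rotation
    # axis per joint, expressed in the robot base frame, zero-padded so
    # every robot yields a vector of length 6 * max_dof.
    enc = np.zeros((max_dof, 6), dtype=np.float32)
    for i, (pos, axis) in enumerate(zip(joint_positions, joint_axes)):
        enc[i, :3] = pos
        enc[i, 3:] = axis
    return enc.ravel()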

HCP-I Usage

  1. Generate the robot XML files:
cd gen_robots
python gen_hoppers.py --robot_num=1000
  2. Train the policy model:
cd ../HCP-I

python main.py --env=hopper --with_embed --robot_dir=../xml/gen_xmls/hopper --save_dir=hopper_data/HCP-I
  3. Test the policy model:
python main.py --env=hopper --with_embed --robot_dir=../xml/gen_xmls/hopper --save_dir=hopper_data/HCP-I --test

Add --render at the end of the command if you want to visually test the policy.
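
The --with_embed flag enables the implicit variant: each robot is assigned a learnable embedding vector that is trained by back-propagating the reinforcement learning loss (PPO here) through it, with no hand-designed features. A minimal sketch with assumed names and sizes:

import torch
import torch.nn as nn

class RobotEmbedding(nn.Module):
    # Hypothetical HCP-I-style embedding table: one trainable vector per
    # robot, indexed by robot ID and optimized jointly with the policy.
    def __init__(self, num_robots=1000, embed_dim=32):
        super().__init__()
        self.table = nn.Embedding(num_robots, embed_dim)

    def forward(self, robot_ids):
        return self.table(robot_ids)

# Usage: look up the encoding for robot 42 and feed it to the policy
# together with the state, just like the explicit encoding above.
# embedding = RobotEmbedding()
# encoding = embedding(torch.tensor([42]))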
