A Deep Deterministic Policy Gradient (DDPG) actor-critic reinforcement learning solution to the Unity ML-Agents (Udacity) Reacher environment.
Reacher is an environment in which 20 agents each control a double-jointed arm to reach a target location. The target (goal location) moves, and an agent receives a reward of +0.1 for every step its hand is inside the goal location. The goal of each agent is therefore to keep its hand at the target location for as many time steps as possible.
- Set-up: Double-jointed arm which can move to target locations.
- Goal: Each agent must move its hand to the goal location and keep it there.
- Agents: The environment contains 20 agents with the same Behavior Parameters.
- Agent Reward Function (independent per agent): +0.1 for each step the agent's hand is in the goal location.
- Vector Observation space (State Space): 26 variables corresponding to the position, rotation, velocity, and angular velocities of the two arm rigid bodies.
- Actions (Action Space): 4 continuous actions, corresponding to the torque applicable to the two joints.
- Benchmark Mean Reward: 30
- Turns: An episode completes after 1000 frames.
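As an illustration of this interaction loop, the sketch below drives the 20 arms with random torques using the unityagents package from the Udacity project. The executable path is just an example; it depends on your platform and on where you place the environment downloaded in the steps below.

import numpy as np
from unityagents import UnityEnvironment

# path is an example; see the download/unzip instructions below
env = UnityEnvironment(file_name='./Reacher_Windows_x86_64/Reacher.exe')
brain_name = env.brain_names[0]

env_info = env.reset(train_mode=True)[brain_name]
num_agents = len(env_info.agents)                        # 20 parallel arms
scores = np.zeros(num_agents)

while True:
    actions = np.clip(np.random.randn(num_agents, 4), -1, 1)  # random torques in [-1, 1]
    env_info = env.step(actions)[brain_name]
    scores += env_info.rewards                           # +0.1 per step the hand is in the goal
    if np.any(env_info.local_done):                      # episode ends after 1000 frames
        break

print('Average score over the 20 agents:', scores.mean())
env.close()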
To set up your Python environment and run the code in this repository, follow the instructions below.
Create (and activate) a new environment with Python 3.6.
- Linux or Mac:
conda create --name ddpg-rl python=3.6
source activate ddpg-rl
- Windows:
conda create --name ddpg-rl python=3.6
activate ddpg-rl
Clone the repository and install dependencies
git clone https://github.com/kotsonis/ddpg-reacher.git
cd ddpg-reacher
pip install -r requirements.txt
- Download the environment from one of the links below. You need only select the environment that matches your operating system:
- Linux: click here
- Mac OSX: click here
- Windows (32-bit): click here
- Windows (64-bit): click here
(For Windows users) Check out this link if you need help determining whether your computer is running a 32-bit or 64-bit version of the Windows operating system.
- Place the file in the ddpg-reacher folder, and unzip (or decompress) the file.
- Edit config.py and set the self.reacher_location entry to point to the right location. Example:
  self.reacher_location = './Reacher_Windows_x86_64/Reacher.exe'
  Alternatively, you can pass the environment location to train.py with the reacher_location= argument.
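For orientation, a hypothetical sketch of how that entry could appear inside config.py is shown below; the actual Config class in this repository holds many more hyperparameters than this.

class Config:
    def __init__(self):
        # path to the unzipped Unity Reacher executable (adjust for your OS)
        self.reacher_location = './Reacher_Windows_x86_64/Reacher.exe'
        # ... remaining hyperparameters (batch size, learning rates, etc.) ...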
To train an agent, train.py reads the hyperparameters from config.py and accepts command-line options to modify parameters and/or set saving options. You can list the CLI options by running
python train.py -h
To run training with the parameters that produced a solution, you can run:
python train.py --save-model_dir=model --output-image=reacher_v3.png --episodes=200 --batch-size=256 --eps-decay=0.99 --n_step=7
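For reference, the core of a DDPG learning step with n-step bootstrapped targets looks roughly like the sketch below. Network sizes, optimizer settings, and names here are illustrative assumptions rather than the exact implementation in this repository; see Report.md for the actual details.

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_SIZE, ACTION_SIZE = 26, 4     # Reacher observation / action sizes
GAMMA, TAU, N_STEP = 0.99, 1e-3, 7  # discount, soft-update rate, n-step horizon

def make_actor():
    # maps a state to a torque vector in [-1, 1]
    return nn.Sequential(nn.Linear(STATE_SIZE, 128), nn.ReLU(),
                         nn.Linear(128, ACTION_SIZE), nn.Tanh())

def make_critic():
    # maps a (state, action) pair to a scalar Q-value
    return nn.Sequential(nn.Linear(STATE_SIZE + ACTION_SIZE, 128), nn.ReLU(),
                         nn.Linear(128, 1))

actor, critic = make_actor(), make_critic()
actor_target, critic_target = copy.deepcopy(actor), copy.deepcopy(critic)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_update(states, actions, n_step_returns, next_states, dones):
    """One DDPG learning step on a sampled mini-batch.

    `n_step_returns` is assumed to already hold the discounted sum of the next
    N_STEP rewards, and `next_states` the state N_STEP steps ahead."""
    # critic update: regress Q(s, a) toward the bootstrapped n-step target
    with torch.no_grad():
        next_actions = actor_target(next_states)
        q_next = critic_target(torch.cat([next_states, next_actions], dim=1))
        q_target = n_step_returns + (GAMMA ** N_STEP) * q_next * (1 - dones)
    q_expected = critic(torch.cat([states, actions], dim=1))
    critic_loss = F.mse_loss(q_expected, q_target)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # actor update: maximize the critic's value of the actor's own actions
    actor_loss = -critic(torch.cat([states, actor(states)], dim=1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    # soft-update both target networks toward the local networks
    for target, local in ((actor_target, actor), (critic_target, critic)):
        for t_param, l_param in zip(target.parameters(), local.parameters()):
            t_param.data.copy_(TAU * l_param.data + (1.0 - TAU) * t_param.data)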
You can watch the agent play with the trained model as follows:
python play.py
You can also specify the number of episodes you want the agent to play, as well as a non-default trained model location, as follows:
python play.py --episodes 20 --save-model_dir=./new_reacher
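Conceptually, playing amounts to loading the trained actor weights and running the policy greedily, without exploration noise. The sketch below is illustrative only: the checkpoint name, network architecture, and executable path are assumptions, not the exact contents of play.py.

import numpy as np
import torch
import torch.nn as nn
from unityagents import UnityEnvironment

env = UnityEnvironment(file_name='./Reacher_Windows_x86_64/Reacher.exe')
brain_name = env.brain_names[0]

# architecture must match whatever was saved during training (hypothetical here)
actor = nn.Sequential(nn.Linear(26, 128), nn.ReLU(), nn.Linear(128, 4), nn.Tanh())
actor.load_state_dict(torch.load('model/actor.pth'))  # hypothetical checkpoint name
actor.eval()

env_info = env.reset(train_mode=False)[brain_name]    # train_mode=False plays in real time
scores = np.zeros(len(env_info.agents))
while True:
    states = torch.from_numpy(env_info.vector_observations).float()
    with torch.no_grad():
        actions = actor(states).numpy()               # deterministic policy, no noise added
    env_info = env.step(actions)[brain_name]
    scores += env_info.rewards
    if np.any(env_info.local_done):
        break

print('Episode score (mean over the 20 agents): {:.2f}'.format(scores.mean()))
env.close()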
You can read about the implementation details and the results obtained in Report.md