```sh
# with docker-compose
$ docker-compose up

# with docker
$ docker build gym -t gym
$ docker run gym
$ docker build . -t multi-agent-learning
$ docker run multi-agent-learning
```
An implementation of Multi-Agent Reinforcement Learning with Deep SARSA Agents using tensorflow-js. n agents each train a SARSA policy network on gridworld, a game similar to the gym environment in gym-gridworld.
Intuition: distributing exploratory moves over n agents speeds up exploration of the game and overall convergence.
The algorithm in this codebase obtains an action from the predictions of n simultaneously trained SARSA policy networks. On startup, a master process spawns n worker processes, one per agent. The workers communicate with each other via IPC messages using node cluster. To obtain an action, agent A broadcasts a prediction request to the other n-1 agents and generates one prediction from its own SARSA network model. Every worker runs the same agent, trains on the same game, and sends and answers prediction requests. Once all agents have responded with their prediction, the action with the largest Q-value among all n predictions is chosen.

Training is epsilon-greedy: a factor epsilon in [epsilon_min, 1] gives the fraction of moves in which a random action is taken instead of the policy action to transition to the next state. Epsilon decays as the number of learning episodes grows. Each policy network has 4 hidden layers activated with ReLU.
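As a rough sketch (not the repository's actual code; `selectAction` and `decayEpsilon` are hypothetical names), the epsilon-greedy selection over the aggregated predictions could look like this:

```js
// Sketch: epsilon-greedy action selection over the n agents' predictions.
// With probability epsilon a random action is explored; otherwise the
// action with the largest Q-value across all agents' predictions wins.
function selectAction(qValuesPerAgent, epsilon, numActions) {
  if (Math.random() < epsilon) {
    return Math.floor(Math.random() * numActions); // exploratory move
  }
  let bestAction = 0;
  let bestQ = -Infinity;
  for (const qValues of qValuesPerAgent) {
    qValues.forEach((q, action) => {
      if (q > bestQ) {
        bestQ = q;
        bestAction = action;
      }
    });
  }
  return bestAction; // greedy move backed by all n networks
}

// Multiplicative decay per episode, floored at epsilon_min; the arguments
// correspond to epsilonDecay and epsilonMin in config.json below.
function decayEpsilon(epsilon, epsilonDecay, epsilonMin) {
  return Math.max(epsilonMin, epsilon * epsilonDecay);
}
```

The policy network itself can be sketched with tensorflow-js. Only the 4 ReLU hidden layers come from the description above; the layer width, input/output shapes, Adam optimizer, and MSE loss are assumptions:

```js
const tf = require('@tensorflow/tfjs');

// Sketch: a SARSA policy network with 4 ReLU-activated hidden layers and
// a linear head that emits one Q-value per action.
function buildPolicyNetwork(stateSize, numActions, learningRate) {
  const model = tf.sequential();
  model.add(tf.layers.dense({ inputShape: [stateSize], units: 64, activation: 'relu' }));
  for (let i = 0; i < 3; i++) {
    model.add(tf.layers.dense({ units: 64, activation: 'relu' })); // hidden layers 2-4
  }
  model.add(tf.layers.dense({ units: numActions, activation: 'linear' }));
  model.compile({ optimizer: tf.train.adam(learningRate), loss: 'meanSquaredError' });
  return model;
}
```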
Multiple collaborating agents need fewer episodes to solve gridworld. I think of it like a team of people who initially explore a maze individually by picking random routes. As they build up knowledge, they choose their turns at intersections based on their collectively learned experience and get to the end quicker than each would on their own.
Plot: score over episodes, each agent sharing experience with its 2 adjacent neighbors.

Plot: score over episodes, each agent sharing experience with all other agents.
./config.json

```json
{
  "environmentId": "MountainCar-v0",
  "workers": 16,
  "nearestNeighbor": true,
  "agent": {
    "discountFactor": 0.9,
    "episodes": 1500,
    "learningRate": 0.001,
    "epsilonDecay": 0.9998,
    "epsilonMin": 0.01
  },
  "gymApi": {
    "protocol": "http:",
    "hostname": "127.0.0.1",
    "port": "5000",
    "pathname": "/v1/envs/"
  }
}
```
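The `gymApi` block points the agents at the HTTP gym API server. As a small sketch (assuming `config.json` resolves from the working directory), the base URL is assembled from its fields:

```js
// Sketch: building the gym API base URL from config.json.
const { gymApi } = require('./config.json');
const baseUrl = `${gymApi.protocol}//${gymApi.hostname}:${gymApi.port}${gymApi.pathname}`;
// -> "http://127.0.0.1:5000/v1/envs/"
```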
MIT License. Please see the LICENSE file for details.