More Stable Multi-Agent Reinforcement Learning

This project implements and tests several multi-agent reinforcement learning algorithms to see whether they lead agents to more stable training and/or to the desired behavior (a Nash equilibrium).

Right now we test most of the algorithms on the Iterated Prisoner's Dilemma to see whether tit-for-tat behavior emerges from this kind of training.
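As a point of reference for what "tit-for-tat" means here, the following is a minimal sketch of the Iterated Prisoner's Dilemma with hand-written policies. It is not this project's environment: the payoff values (3, 0, 5, 1) are a common textbook choice assumed for illustration, and the names `PAYOFFS`, `tit_for_tat`, `always_defect`, and `play` are hypothetical.

```python
# Actions: 0 = cooperate, 1 = defect.
COOPERATE, DEFECT = 0, 1

# Assumed payoffs (reward for player A, reward for player B).
# These textbook values are an illustration, not taken from this project.
PAYOFFS = {
    (COOPERATE, COOPERATE): (3, 3),
    (COOPERATE, DEFECT): (0, 5),
    (DEFECT, COOPERATE): (5, 0),
    (DEFECT, DEFECT): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate on the first round, then copy the opponent's last action."""
    return COOPERATE if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Defect on every round, regardless of what the opponent did."""
    return DEFECT


def play(policy_a, policy_b, rounds=10):
    """Play the iterated game and return the total reward of each player."""
    history_a, history_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        # Each policy only sees the other player's past actions.
        action_a = policy_a(history_b)
        action_b = policy_b(history_a)
        reward_a, reward_b = PAYOFFS[(action_a, action_b)]
        total_a += reward_a
        total_b += reward_b
        history_a.append(action_a)
        history_b.append(action_b)
    return total_a, total_b


if __name__ == "__main__":
    print(play(tit_for_tat, tit_for_tat))    # mutual cooperation every round
    print(play(tit_for_tat, always_defect))  # exploited once, then mutual defection
```

A learned policy would replace the hand-written functions; comparing its action sequence against `tit_for_tat` is one simple way to check whether the behavior has emerged.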

Algorithms

Right now, I plan to implement three algorithms:

  1. Multiagent learning using a variable learning rate (https://www.sciencedirect.com/science/article/pii/S0004370202001212)
  2. Consensus Optimization (https://arxiv.org/abs/1705.10461)
  3. Learning with Opponent-Learning Awareness, LOLA (https://arxiv.org/abs/1709.04326)
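To give a feel for the stability problem these methods target, here is a small sketch of the idea behind consensus optimization on a toy two-player game. It is an illustration under stated assumptions, not this project's implementation: the game is the bilinear zero-sum game f(x, y) = x * y rather than the Iterated Prisoner's Dilemma, the gradients are derived by hand, and `simulate`, `distance_from_equilibrium`, `gamma`, and `lr` are hypothetical names.

```python
def simulate(gamma, lr=0.1, steps=200, start=(1.0, 1.0)):
    """Run simultaneous gradient updates on f(x, y) = x * y.

    Player 1 picks x to minimise f, player 2 picks y to maximise f.
    The unique equilibrium is (0, 0).
    gamma = 0 gives plain simultaneous gradient descent/ascent;
    gamma > 0 adds the consensus regulariser L = 0.5 * ||v||^2,
    where v is the combined gradient field of both players.
    """
    x, y = start
    for _ in range(steps):
        # Descent direction for each player: v = (df/dx, -df/dy) = (y, -x).
        vx, vy = y, -x
        # L = 0.5 * (vx**2 + vy**2) = 0.5 * (x**2 + y**2), so grad L = (x, y).
        gx, gy = x, y
        # Both players step at the same time, each also descending on L.
        x, y = x - lr * (vx + gamma * gx), y - lr * (vy + gamma * gy)
    return x, y


def distance_from_equilibrium(point):
    """Euclidean distance of a point from the equilibrium (0, 0)."""
    return (point[0] ** 2 + point[1] ** 2) ** 0.5


if __name__ == "__main__":
    plain = simulate(gamma=0.0)
    consensus = simulate(gamma=1.0)
    print("plain:", distance_from_equilibrium(plain))
    print("consensus:", distance_from_equilibrium(consensus))
```

On this toy game the plain updates spiral away from the equilibrium, while the regularised updates contract toward it. Whether the same effect carries over to learned policies in the Iterated Prisoner's Dilemma is the kind of question this project is meant to test.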