Multi-Agent Reinforcement Learning papers

This is a collection of Multi-Agent Reinforcement Learning (MARL) papers. Each category is a potential starting point for your research. Some papers are listed more than once because they belong to multiple categories.

For MARL papers with code and MARL resources, please refer to MARL Papers with Code and MARL Resources Collection.

I will keep updating this repository and welcome suggestions (missing important papers, missing categories, invalid links, etc.). This is only a first draft so far, and I'll add more resources in the next few months.

This repository is not for commercial purposes.

My email: chenhao915@mails.ucas.ac.cn

Overview

Reviews

Recent Reviews (Since 2019)

Other Reviews (Before 2019)

Environments

Environment Paper Code Accepted at Year
StarCraft The StarCraft Multi-Agent Challenge https://github.com/oxwhirl/smac NIPS 2019
StarCraft SMACv2: A New Benchmark for Cooperative Multi-Agent Reinforcement Learning https://github.com/oxwhirl/smacv2 2022
StarCraft Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks https://github.com/uoe-agents/epymarl NIPS 2021
Football Google Research Football: A Novel Reinforcement Learning Environment https://github.com/google-research/football AAAI 2020
PettingZoo PettingZoo: Gym for Multi-Agent Reinforcement Learning https://github.com/Farama-Foundation/PettingZoo NIPS 2021
Melting Pot Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot https://github.com/deepmind/meltingpot ICML 2021
MuJoCo MuJoCo: A physics engine for model-based control https://github.com/deepmind/mujoco IROS 2012
MALib MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning https://github.com/sjtu-marl/malib 2021
MAgent MAgent: A many-agent reinforcement learning platform for artificial collective intelligence https://github.com/Farama-Foundation/MAgent AAAI 2018
Neural MMO Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents https://github.com/openai/neural-mmo 2019
MPE Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments https://github.com/openai/multiagent-particle-envs NIPS 2017
Pommerman Pommerman: A multi-agent playground https://github.com/MultiAgentLearning/playground 2018
HFO Half Field Offense: An Environment for Multiagent Learning and Ad Hoc Teamwork https://github.com/LARG/HFO AAMAS Workshop 2016

Dealing With Credit Assignment Issue

Value Decomposition

Paper Code Accepted at Year
VDN: Value-Decomposition Networks For Cooperative Multi-Agent Learning https://github.com/oxwhirl/pymarl AAMAS 2017
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning https://github.com/oxwhirl/pymarl ICML 2018
QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning https://github.com/oxwhirl/pymarl ICML 2019
NDQ: Learning Nearly Decomposable Value Functions Via Communication Minimization https://github.com/TonghanWang/NDQ ICLR 2020
CollaQ: Multi-Agent Collaboration via Reward Attribution Decomposition https://github.com/facebookresearch/CollaQ 2020
SQDDPG: Shapley Q-Value: A Local Reward Approach to Solve Global Reward Games https://github.com/hsvgbkhgbv/SQDDPG AAAI 2020
QPD: Q-value Path Decomposition for Deep Multiagent Reinforcement Learning ICML 2020
Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning https://github.com/oxwhirl/wqmix NIPS 2020
QTRAN++: Improved Value Transformation for Cooperative Multi-Agent Reinforcement Learning 2020
QPLEX: Duplex Dueling Multi-Agent Q-Learning https://github.com/wjh720/QPLEX ICLR 2021

Other Methods

Paper Code Accepted at Year
COMA: Counterfactual Multi-Agent Policy Gradients https://github.com/oxwhirl/pymarl AAAI 2018
LICA: Learning Implicit Credit Assignment for Cooperative Multi-Agent Reinforcement Learning https://github.com/mzho7212/LICA NIPS 2020

Policy Gradient

Paper Code Accepted at Year
MADDPG: Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments https://github.com/openai/maddpg NIPS 2017
COMA: Counterfactual Multi-Agent Policy Gradients https://github.com/oxwhirl/pymarl AAAI 2018
IPPO: Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge? 2020
MAPPO: The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games https://github.com/marlbenchmark/on-policy 2021
MAAC: Actor-Attention-Critic for Multi-Agent Reinforcement Learning https://github.com/shariqiqbal2810/MAAC ICML 2019
DOP: Off-Policy Multi-Agent Decomposed Policy Gradients https://github.com/TonghanWang/DOP ICLR 2021
M3DDPG: Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient AAAI 2019

Communication

Communication Without Bandwidth Constraint

Paper Code Accepted at Year
CommNet: Learning Multiagent Communication with Backpropagation https://github.com/facebookarchive/CommNet NIPS 2016
BiCNet: Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games https://github.com/Coac/CommNet-BiCnet 2017
VAIN: Attentional Multi-agent Predictive Modeling NIPS 2017
IC3Net: Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks https://github.com/IC3Net/IC3Net 2018
VBC: Efficient Communication in Multi-Agent Reinforcement Learning via Variance Based Control NIPS 2019
Graph Convolutional Reinforcement Learning for Multi-Agent Cooperation 2018
NDQ: Learning Nearly Decomposable Value Functions Via Communication Minimization https://github.com/TonghanWang/NDQ ICLR 2020
RIAL/DIAL: Learning to Communicate with Deep Multi-Agent Reinforcement Learning https://github.com/iassael/learning-to-communicate NIPS 2016
ATOC: Learning Attentional Communication for Multi-Agent Cooperation NIPS 2018
Fully decentralized multi-agent reinforcement learning with networked agents https://github.com/cts198859/deeprl_network ICML 2018
TarMAC: Targeted Multi-Agent Communication ICML 2019

Communication Under Limited Bandwidth

Paper Code Accepted at Year
SchedNet: Learning to Schedule Communication in Multi-Agent Reinforcement Learning 2019
Learning Multi-agent Communication under Limited-bandwidth Restriction for Internet Packet Routing 2019
Gated-ACML: Learning Agent Communication under Limited Bandwidth by Message Pruning AAAI 2020
Learning Efficient Multi-agent Communication: An Information Bottleneck Approach ICML 2020
Coordinating Multi-Agent Reinforcement Learning with Limited Communication AAMAS 2013

Emergent

Paper Code Accepted at Year
Multiagent Cooperation and Competition with Deep Reinforcement Learning PloS one 2017
Multi-agent Reinforcement Learning in Sequential Social Dilemmas 2017
Emergent preeminence of selfishness: an anomalous Parrondo perspective Nonlinear Dynamics 2019
Emergent Coordination Through Competition 2019
Biases for Emergent Communication in Multi-agent Reinforcement Learning NIPS 2019
Towards Graph Representation Learning in Emergent Communication 2020
Emergent Tool Use From Multi-Agent Autocurricula https://github.com/openai/multi-agent-emergence-environments ICLR 2020
On Emergent Communication in Competitive Multi-Agent Teams AAMAS 2020
QED: Quasi-Equivalence Discovery for Zero-Shot Emergent Communication 2021
Incorporating Pragmatic Reasoning Communication into Emergent Language NIPS 2020

Opponent Modeling

Paper Code Accepted at Year
Bayesian Opponent Exploitation in Imperfect-Information Games IEEE Conference on Computational Intelligence and Games 2018
LOLA: Learning with Opponent-Learning Awareness AAMAS 2018
Variational Autoencoders for Opponent Modeling in Multi-Agent Systems 2020
Stable Opponent Shaping in Differentiable Games 2018
Opponent Modeling in Deep Reinforcement Learning https://github.com/hhexiy/opponent ICML 2016
Game Theory-Based Opponent Modeling in Large Imperfect-Information Games AAMAS 2011
Agent Modelling under Partial Observability for Deep Reinforcement Learning NIPS 2021

Game Theoretic

Paper Code Accepted at Year
α-Rank: Multi-Agent Evaluation by Evolution Scientific reports 2019
α^α-Rank: Practically Scaling α-Rank through Stochastic Optimisation AAMAS 2020
A Game Theoretic Framework for Model Based Reinforcement Learning ICML 2020
Fictitious Self-Play in Extensive-Form Games ICML 2015
Combining Deep Reinforcement Learning and Search for Imperfect-Information Games NIPS 2020
Real World Games Look Like Spinning Tops NIPS 2020
PSRO: A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning NIPS 2017
Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games NIPS 2020
A Game-Theoretic Model and Best-Response Learning Method for Ad Hoc Coordination in Multiagent Systems AAMAS 2013
Neural Replicator Dynamics: Multiagent Learning via Hedging Policy Gradients AAMAS 2020

Hierarchical

Paper Code Accepted at Year
Hierarchical multi-agent reinforcement learning AAMAS 2006
Hierarchical Cooperative Multi-Agent Reinforcement Learning with Skill Discovery AAMAS 2020
Hierarchical Critics Assignment for Multi-agent Reinforcement Learning 2019
Hierarchical Reinforcement Learning for Multi-agent MOBA Game 2019
Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction 2018
HAMA: Multi-Agent Actor-Critic with Hierarchical Graph Attention Network AAAI 2020

Ad Hoc Teamwork

Paper Code Accepted at Year
CollaQ: Multi-Agent Collaboration via Reward Attribution Decomposition https://github.com/facebookresearch/CollaQ 2020
A Game-Theoretic Model and Best-Response Learning Method for Ad Hoc Coordination in Multiagent Systems AAMAS 2013
Half Field Offense: An Environment for Multiagent Learning and Ad Hoc Teamwork https://github.com/LARG/HFO AAMAS Workshop 2016
Open Ad Hoc Teamwork using Graph-based Policy Learning https://github.com/uoe-agents/GPL ICML 2021
A Survey of Ad Hoc Teamwork: Definitions, Methods, and Open Problems 2022
Towards open ad hoc teamwork using graph-based policy learning ICML 2021
Learning with generated teammates to achieve type-free ad-hoc teamwork IJCAI 2021
Online ad hoc teamwork under partial observability ICLR 2022

League Training

Paper Code Accepted at Year
AlphaStar: Grandmaster level in StarCraft II using multi-agent reinforcement learning Nature 2019

Curriculum Learning

Paper Code Accepted at Year
Diverse Auto-Curriculum is Critical for Successful Real-World Multiagent Learning Systems AAMAS 2021
From Few to More: Large-Scale Dynamic Multiagent Curriculum Learning https://github.com/starry-sky6688/MARL-Algorithms AAAI 2020
EPC: Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning https://github.com/qian18long/epciclr2020 ICLR 2020
Emergent Tool Use From Multi-Agent Autocurricula https://github.com/openai/multi-agent-emergence-environments ICLR 2020
Learning to Teach in Cooperative Multiagent Reinforcement Learning AAAI 2019
StarCraft Micromanagement with Reinforcement Learning and Curriculum Transfer Learning IEEE Transactions on Emerging Topics in Computational Intelligence 2018
Cooperative Multi-agent Control using deep reinforcement learning https://github.com/sisl/MADRL AAMAS 2017
Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems NIPS 2021

Mean Field

Paper Code Accepted at Year
Mean Field Multi-Agent Reinforcement Learning ICML 2018
Efficient Ridesharing Order Dispatching with Mean Field Multi-Agent Reinforcement Learning The World Wide Web Conference 2019
Bayesian Multi-type Mean Field Multi-agent Imitation Learning NIPS 2020

Transfer Learning

Paper Code Accepted at Year
A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems Journal of Artificial Intelligence Research 2019
Parallel Knowledge Transfer in Multi-Agent Reinforcement Learning 2020

Meta Learning

Paper Code Accepted at Year
A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning ICML 2021
Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments 2017

Fairness

Paper Code Accepted at Year
FEN: Learning Fairness in Multi-Agent Systems NIPS 2019
Fairness in Multiagent Resource Allocation with Dynamic and Partial Observations AAMAS 2018
Fairness in Multi-agent Reinforcement Learning for Stock Trading 2019

Exploration

Dense Reward Exploration

Paper Code Accepted at Year
MAVEN: Multi-Agent Variational Exploration https://github.com/starry-sky6688/MARL-Algorithms NIPS 2019
Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning ICML 2019
Episodic Multi-agent Reinforcement Learning with Curiosity-driven Exploration NIPS 2021
Celebrating Diversity in Shared Multi-Agent Reinforcement Learning https://github.com/lich14/CDS NIPS 2021

Sparse Reward Exploration

Paper Code Accepted at Year
EITI/EDTI: Influence-Based Multi-Agent Exploration https://github.com/TonghanWang/EITI-EDTI ICLR 2020
Cooperative Exploration for Multi-Agent Deep Reinforcement Learning ICML 2021
Centralized Model and Exploration Policy for Multi-Agent RL 2021
REMAX: Relational Representation for Multi-Agent Exploration AAMAS 2022

Uncategorized

Paper Code Accepted at Year
CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning ICLR 2020
Coordinated Exploration via Intrinsic Rewards for Multi-Agent Reinforcement Learning 2019
Exploration by Maximizing Renyi Entropy for Reward-Free RL Framework AAAI 2021
Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory AAAI 2021
LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning https://github.com/yalidu/liir NIPS 2019

Graph Neural Network

Paper Code Accepted at Year
Multi-Agent Game Abstraction via Graph Attention Neural Network https://github.com/starry-sky6688/MARL-Algorithms AAAI 2020
Graph Convolutional Reinforcement Learning for Multi-Agent Cooperation ICLR 2020
Multi-Agent Reinforcement Learning with Graph Clustering 2020
Learning to Coordinate with Coordination Graphs in Repeated Single-Stage Multi-Agent Decision Problems ICML 2018

Model-based

Paper Code Accepted at Year
Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping 2020

NAS

Paper Code Accepted at Year
MANAS: Multi-Agent Neural Architecture Search 2019

Safe Multi-Agent Reinforcement Learning

Paper Code Accepted at Year
MAMPS: Safe Multi-Agent Reinforcement Learning via Model Predictive Shielding 2019
Safer Deep RL with Shallow MCTS: A Case Study in Pommerman 2019

From Single-Agent to Multi-Agent

Paper Code Accepted at Year
IQL: Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents https://github.com/oxwhirl/pymarl ICML 1993
IPPO: Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge? 2020
MAPPO: The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games https://github.com/marlbenchmark/on-policy 2021
MADDPG: Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments https://github.com/openai/maddpg NIPS 2017

Discrete-Continuous Hybrid Action Space / Parameterized Action Space

Paper Code Accepted at Year
Deep Reinforcement Learning in Parameterized Action Space 2015
DMAPQN: Deep Multi-Agent Reinforcement Learning with Discrete-Continuous Hybrid Action Spaces IJCAI 2019
H-PPO: Hybrid actor-critic reinforcement learning in parameterized action space IJCAI 2019
P-DQN: Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space 2018

Role

Paper Code Accepted at Year
ROMA: Multi-Agent Reinforcement Learning with Emergent Roles https://github.com/TonghanWang/ROMA ICML 2020
RODE: Learning Roles to Decompose Multi-Agent Tasks https://github.com/TonghanWang/RODE ICLR 2021
Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing https://github.com/uoe-agents/seps ICML 2021

Diversity

Paper Code Accepted at Year
Diverse Auto-Curriculum is Critical for Successful Real-World Multiagent Learning Systems AAMAS 2021
Q-DPP: Multi-Agent Determinantal Q-Learning https://github.com/QDPP-GitHub/QDPP ICML 2020
Diversity is All You Need: Learning Skills without a Reward Function 2018
Modelling Behavioural Diversity for Learning in Open-Ended Games ICML 2021
Diverse Agents for Ad-Hoc Cooperation in Hanabi CoG 2019
Generating Behavior-Diverse Game AIs with Evolutionary Multi-Objective Deep Reinforcement Learning IJCAI 2020
Quantifying environment and population diversity in multi-agent reinforcement learning 2021

Sparse Reward

Paper Code Accepted at Year
Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems NIPS 2021
Individual Reward Assisted Multi-Agent Reinforcement Learning https://github.com/MDrW/ICML2022-IRAT ICML 2022

Large Scale

Paper Code Accepted at Year
From Few to More: Large-Scale Dynamic Multiagent Curriculum Learning https://github.com/starry-sky6688/MARL-Algorithms AAAI 2020
PooL: Pheromone-inspired Communication Framework for Large Scale Multi-Agent Reinforcement Learning 2022
Factorized Q-learning for large-scale multi-agent systems ICDAI 2019
EPC: Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning https://github.com/qian18long/epciclr2020 ICLR 2020
Mean Field Multi-Agent Reinforcement Learning ICML 2018
A Study of AI Population Dynamics with Million-agent Reinforcement Learning AAMAS 2018

DTDE

Paper Code Accepted at Year
Networked Multi-Agent Reinforcement Learning in Continuous Spaces IEEE conference on decision and control 2018
Value Propagation for Decentralized Networked Deep Multi-agent Reinforcement Learning NIPS 2019
Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents ICML 2018

Decision Transformer

Paper Code Accepted at Year
Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Conquers All StarCraftII Tasks 2021
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem https://github.com/PKU-MARL/Multi-Agent-Transformer 2022
Towards Flexible Inference in Sequential Decision Problems via Bidirectional Transformers 2022

Offline MARL

Paper Code Accepted at Year
Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Conquers All StarCraftII Tasks 2021
Believe what you see: Implicit constraint approach for offline multi-agent reinforcement learning NIPS 2021

Adversarial

Paper Code Accepted at Year
Certifiably Robust Policy Learning against Adversarial Communication in Multi-agent Systems 2022
Distributed Multi-Agent Deep Reinforcement Learning for Robust Coordination against Noise 2022
On the Robustness of Cooperative Multi-Agent Reinforcement Learning IEEE Security and Privacy Workshops 2020
Towards Comprehensive Testing on the Robustness of Cooperative Multi-agent Reinforcement Learning CVPR workshop 2022
Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient AAAI 2019
Multi-agent Deep Reinforcement Learning with Extremely Noisy Observations NIPS Deep Reinforcement Learning Workshop 2018
Policy Regularization via Noisy Advantage Values for Cooperative Multi-agent Actor-Critic methods 2021

Multi-Agent Path Finding

  • TODO

To be Categorized

Paper Code Accepted at Year
Mind-aware Multi-agent Management Reinforcement Learning https://github.com/facebookresearch/M3RL ICLR 2019
Emergence of grounded compositional language in multi-agent populations https://github.com/bkgoksel/emergent-language AAAI 2018
Emergent Complexity via Multi-Agent Competition https://github.com/openai/multiagent-competition ICLR 2018
TLeague: A Framework for Competitive Self-Play based Distributed Multi-Agent Reinforcement Learning https://github.com/tencent-ailab/TLeague 2020
UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers https://github.com/hhhusiyi-monash/UPDeT ICLR 2021
SIDE: State Inference for Partially Observable Cooperative Multi-Agent Reinforcement Learning https://github.com/deligentfool/SIDE AAMAS 2022
UNMAS: Multiagent Reinforcement Learning for Unshaped Cooperative Scenarios https://github.com/James0618/unmas TNNLS 2021
Context-Aware Sparse Deep Coordination Graphs https://github.com/TonghanWang/CASEC-MACO-benchmark ICLR 2022

TODO

  • Multi-Agent Path Finding
  • Generalization in MARL

Citation

If you find this repository useful, please cite our repo:

@misc{chen2021multi,
  author={Chen, Hao},
  title={Multi-Agent Reinforcement Learning Papers},
  year={2021},
  publisher = {GitHub},
  journal = {GitHub Repository},
  howpublished = {\url{https://github.com/TimeBreaker/Multi-Agent-Reinforcement-Learning-papers}}
}