Accommodating Picky Customers: Regret Bound and Exploration Complexity for Multi-Objective Reinforcement Learning

Code for the experiments in Appendix A of the paper:

Accommodating Picky Customers: Regret Bound and Exploration Complexity for Multi-Objective Reinforcement Learning

Jingfeng Wu, Vladimir Braverman, Lin F. Yang

Usage

Comparison with the Optimal Single-Objective RL Algorithm: python mdp.py
Comparison with Q-Learning: python q-learning.py
The Effect of Number of Objectives: python dim.py
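
To run all three experiments in one go, a small driver along the following lines may be convenient. This is only a sketch and is not part of the repository: the names `EXPERIMENTS`, `build_command`, and `run_all` are hypothetical, and it assumes the three scripts sit in the directory passed as `repo_dir` and take no command-line arguments, as the commands above suggest.

```python
import subprocess
import sys
from pathlib import Path

# Script names come from the Usage list above; the labels are shorthand
# descriptions, not identifiers used by the repository.
EXPERIMENTS = {
    "optimal single-objective RL comparison": "mdp.py",
    "Q-learning comparison": "q-learning.py",
    "effect of number of objectives": "dim.py",
}


def build_command(script):
    """Return the argv list equivalent to `python <script>`."""
    return [sys.executable, script]


def run_all(repo_dir="."):
    """Run each experiment in turn and collect exit codes.

    Scripts that are not found in repo_dir are skipped and recorded as None.
    """
    results = {}
    for label, script in EXPERIMENTS.items():
        path = Path(repo_dir) / script
        if not path.exists():
            print(f"skipping {label}: {script} not found in {repo_dir}")
            results[script] = None
            continue
        print(f"running {label}: python {script}")
        completed = subprocess.run(build_command(script), cwd=repo_dir)
        results[script] = completed.returncode
    return results


if __name__ == "__main__":
    run_all()
```

Saved under any name in the repository root and run with `python`, it executes the scripts in the order listed above and reports any that are missing instead of failing.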