Accommodating Picky Customers: Regret Bound and Exploration Complexity for Multi-Objective Reinforcement Learning
This repository implements the experiments in Appendix A of the paper above, by Jingfeng Wu, Vladimir Braverman, and Lin F. Yang.
Each experiment is run with its own script:
- Comparison with the Optimal Single-Objective RL Algorithm:
python mdp.py
- Comparison with Q-Learning:
python q-learning.py
- The Effect of Number of Objectives:
python dim.py
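
To run all three experiments in one go, a small driver along the following lines may be convenient. This is only a sketch and is not part of the repository: the names `EXPERIMENTS`, `build_commands`, and `run_all` are hypothetical, and it assumes the scripts take no command-line arguments and are launched from the directory that contains them.

```python
import subprocess
import sys

# Script names are taken from the list above; the labels mirror its headings.
EXPERIMENTS = [
    ("Comparison with the Optimal Single-Objective RL Algorithm", "mdp.py"),
    ("Comparison with Q-Learning", "q-learning.py"),
    ("The Effect of Number of Objectives", "dim.py"),
]


def build_commands(python=sys.executable):
    """Return one argv list per experiment, in the order listed above."""
    return [[python, script] for _, script in EXPERIMENTS]


def run_all(repo_dir="."):
    """Run each experiment in turn from repo_dir (assumed to hold the scripts).

    check=True raises CalledProcessError on the first non-zero exit code,
    so later experiments are skipped if an earlier one fails.
    """
    for (label, _), cmd in zip(EXPERIMENTS, build_commands()):
        print(f"== {label} ==")
        subprocess.run(cmd, check=True, cwd=repo_dir)
```

Calling `run_all()` from the repository root would then be equivalent to issuing the three `python ...` commands above one after another.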