Contains the implementation of all the RL algorithms. Each algorithm has two versions:
- One version using tables for the value functions
- One version using neural networks to approximate the value functions
Contains the implementation of the agent used in the 5 experiments
Contains all the parameters that we used for the algorithms/ensemble methods for each experiment
Contains the implementations of all the environments (mazes) needed for the 5 experiments, as well as functions to generate such environments.
Script used to run the experiments on a cluster and distribute the trials across multiple cores.
- Q-learning
- SARSA
- Actor-Critic
- QV-learning
- ACLA
- Belief State
- Maze observations
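As an illustration of the tabular version of the listed algorithms, here is a minimal sketch of a single Q-learning update step; the state/action counts, learning rate, and discount factor are illustrative, not the parameters used in the experiments.

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One tabular Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

# Illustrative usage: 4 states, 2 actions, one transition
Q = np.zeros((4, 2))
Q = q_learning_update(Q, s=0, a=1, r=1.0, s_next=2)
```

The neural-network versions replace the table `Q` with a function approximator trained on the same TD target.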
- Exp 1 (Simple maze + base algo)
- Exp 2 (Partially observable maze + neural net)
- Exp 3 (Dynamic obstacles maze + neural net)
- Exp 4 (Dynamic Goal maze + neural net)
- Exp 5 (Generalized maze + neural net)
- Q-learning
- SARSA
- Actor-Critic
- QV-learning
- ACLA
- Majority voting
- Rank voting
- Boltzmann multiplication
- Boltzmann addition
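A minimal sketch of the simplest of these ensemble methods, majority voting: each algorithm casts a vote for its greedy action and the most-voted action wins. The value arrays below are illustrative stand-ins for the action preferences produced by the listed algorithms.

```python
import numpy as np

def majority_vote(action_values_per_algo):
    """Each algorithm votes for its greedy action; return the most-voted one."""
    n_actions = action_values_per_algo[0].shape[0]
    votes = np.zeros(n_actions)
    for q in action_values_per_algo:
        votes[np.argmax(q)] += 1.0
    return int(np.argmax(votes))

# Illustrative usage: two of three algorithms prefer action 1
algos = [np.array([0.1, 0.9]), np.array([0.2, 0.8]), np.array([0.7, 0.3])]
action = majority_vote(algos)
```

Rank voting and the Boltzmann variants differ only in how the per-algorithm preferences are combined (rank scores, or products/sums of Boltzmann probabilities) before the final argmax or softmax.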
- Simple Dyna maze (9x6)
- Dyna maze with dynamic goal (9x6)
- Dyna maze with dynamic obstacles (9x6)
- Generalized maze (9x6)
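To illustrate the shape of these environments, here is a minimal 9x6 grid-maze sketch in the spirit of the Dyna maze; the start, goal, and wall positions are placeholders, not the actual experiment layouts.

```python
class GridMaze:
    """Minimal 9x6 grid maze: move in 4 directions, reward 1 at the goal."""

    ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # left, right, down, up

    def __init__(self, width=9, height=6, start=(0, 0), goal=(8, 5),
                 walls=frozenset()):
        self.width, self.height = width, height
        self.start, self.goal, self.walls = start, goal, walls
        self.pos = start

    def reset(self):
        self.pos = self.start
        return self.pos

    def step(self, action):
        dx, dy = self.ACTIONS[action]
        x, y = self.pos[0] + dx, self.pos[1] + dy
        # Stay in place when the move would hit a wall or leave the grid
        if 0 <= x < self.width and 0 <= y < self.height and (x, y) not in self.walls:
            self.pos = (x, y)
        done = self.pos == self.goal
        reward = 1.0 if done else 0.0
        return self.pos, reward, done
```

The dynamic-goal and dynamic-obstacle variants would additionally relocate `goal` or `walls` during an episode.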