Forward Reinforcement Learning (RL) is not feasible for every problem, because designing a good reward function can be very complex. In those cases, Inverse Reinforcement Learning (IRL) can be a game changer. Imitation learning is a closely related technique, and it has shown strong results on some problems.
In this project, I created an agent that tries to imitate an expert and learns path navigation in the process. Thanks to the OpenAI Gym simulator for providing such a wonderful platform for creating the dynamics of the environment.
The project is divided into two steps:
- Train the expert using the Proximal Policy Optimization (PPO) algorithm.
- Train the agent on the expert trajectories from step 1 using a GAN architecture.
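
To make the second step more concrete, here is a minimal sketch of the discriminator side of a GAN-style imitation setup. It is not the project's actual code: it assumes a GAIL-like arrangement in which a discriminator learns to tell expert state-action pairs from agent ones, and its output is turned into a surrogate reward for the agent. The names `LinearDiscriminator`, `surrogate_reward`, and `train_toy_discriminator`, the logistic model, and the toy 2-D data are all hypothetical. A real implementation would most likely use a neural network and feed the reward into the policy update.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class LinearDiscriminator:
    """Hypothetical logistic discriminator: D(x) = P(x came from the expert).

    x is a feature vector, assumed here to be a state-action pair
    concatenated into a flat list of floats.
    """

    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def prob_expert(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return sigmoid(z)

    def update(self, expert_batch, agent_batch):
        # One gradient-descent step on binary cross-entropy,
        # with expert samples labelled 1 and agent samples labelled 0.
        samples = [(x, 1.0) for x in expert_batch]
        samples += [(x, 0.0) for x in agent_batch]
        grad_w = [0.0] * len(self.w)
        grad_b = 0.0
        for x, y in samples:
            err = self.prob_expert(x) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        n = len(samples)
        self.w = [wi - self.lr * g / n for wi, g in zip(self.w, grad_w)]
        self.b -= self.lr * grad_b / n

    def surrogate_reward(self, x):
        # -log(1 - D(x)): larger when the pair looks expert-like.
        # This is one common choice, assumed here for illustration.
        return -math.log(max(1.0 - self.prob_expert(x), 1e-8))


def train_toy_discriminator(seed=0, steps=200, batch_size=32):
    # Toy data (an assumption): expert pairs cluster near (1, 1),
    # agent pairs cluster near (-1, -1).
    rng = random.Random(seed)
    disc = LinearDiscriminator(n_features=2)
    for _ in range(steps):
        expert = [[1 + rng.gauss(0, 0.3), 1 + rng.gauss(0, 0.3)]
                  for _ in range(batch_size)]
        agent = [[-1 + rng.gauss(0, 0.3), -1 + rng.gauss(0, 0.3)]
                 for _ in range(batch_size)]
        disc.update(expert, agent)
    return disc
```

In a full training loop, the agent's trajectories would replace the fixed toy agent samples, and the policy would be updated (for example with PPO) to maximize `surrogate_reward`, so that its behaviour becomes harder to distinguish from the expert's.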
Design:
More details can be found in the report.