This project presents our implementation of Deep Recurrent Q-Learning (DRQL) that incorporates transfer learning for feature extraction, a customized LSTM for temporal recurrence, and a domain-informed reward function. This tailored approach aims to speed up convergence relative to the vanilla implementation outlined in the original paper. Performance is evaluated on two adaptive Atari 2600 games, Assault-v5 and Bowling, in which game difficulty scales with player proficiency. We compare the convergence of our optimized reward function against the vanilla version under the StepLR and CosineAnnealingLR learning rate schedulers, with accompanying theoretical explanations. Additionally, we propose an efficient windowed episodic memory employing bootstrapped sequential updates to reduce GPU memory usage.
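As a rough illustration of how these pieces fit together, the sketch below combines a pretrained CNN backbone (frozen, for transfer learning) with an LSTM Q-head, and sets up the two schedulers named above, which are the standard `torch.optim.lr_scheduler` classes. The ResNet-18 backbone, layer sizes, learning rate, and scheduler hyperparameters are illustrative assumptions, not the repo's exact configuration; see the notebook for the actual setup.

```python
import torch
import torch.nn as nn
from torchvision import models

class RecurrentQNet(nn.Module):
    def __init__(self, n_actions, hidden_size=256):
        super().__init__()
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        # Transfer learning: reuse the pretrained convolutional stack as a
        # frozen feature extractor (the final FC classification head is dropped).
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        for p in self.features.parameters():
            p.requires_grad = False
        # LSTM supplies temporal recurrence over the per-frame features.
        self.lstm = nn.LSTM(input_size=512, hidden_size=hidden_size, batch_first=True)
        self.q_head = nn.Linear(hidden_size, n_actions)

    def forward(self, frames, hidden=None):
        # frames: (batch, time, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.features(frames.flatten(0, 1)).flatten(1)  # (b*t, 512)
        out, hidden = self.lstm(feats.view(b, t, -1), hidden)   # (b, t, hidden_size)
        return self.q_head(out), hidden                         # per-step Q-values

net = RecurrentQNet(n_actions=7)  # Assault-v5 exposes a 7-action space
trainable = [p for p in net.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)

# The two schedulers compared in the report (one per training run):
step_sched = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.5)
cosine_sched = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1000)
```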
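The proposed windowed episodic memory could look roughly like the following minimal sketch: a fixed-capacity buffer that evicts old transitions (bounding memory use) and samples contiguous windows so the recurrent network trains on real temporal order, with the window's last transition supplying the bootstrap target. The class name, capacity, and window length here are hypothetical placeholders; the repo's implementation may, for instance, handle episode boundaries differently.

```python
import random
from collections import deque

class WindowedEpisodicMemory:
    """Fixed-capacity buffer serving contiguous windows of transitions, so
    old experience is discarded instead of accumulating in memory."""

    def __init__(self, capacity=10_000, window=8):
        self.buffer = deque(maxlen=capacity)  # oldest transitions evicted first
        self.window = window

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Each sample is a contiguous window; its final transition supplies
        # the bootstrap target r + gamma * max_a Q(s', a).
        assert len(self.buffer) >= self.window, "not enough transitions yet"
        starts = [random.randrange(len(self.buffer) - self.window + 1)
                  for _ in range(batch_size)]
        return [[self.buffer[s + i] for i in range(self.window)]
                for s in starts]
```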
| Assault-v5 | Bowling |
| --- | --- |
```bash
python3 -m venv mlproj
source mlproj/bin/activate
pip install -r requirements.txt
```
A detailed report with code, experimentation, and results is available in the project's Jupyter notebook.
- Rohan Kalbag
- Vansh Kapoor
- Sankalp Bhamare