A Flappy Bird game agent trained with a Double Dueling Deep Q-Network and Prioritized Experience Replay, implemented in PyTorch.
Here I will explain how to run the game, which plays automatically using the saved model, and also briefly cover the basics of Q-Learning, Deep Q-Learning, the Dueling architecture, and Prioritized Experience Replay.
You will need Python 3.X.X with some packages, which you can install directly using requirements.txt.
pip install -r requirements.txt
Use the following command to run the game, where '--model' indicates the location of the saved DQN model.
python3 play_game.py --model checkpoints/flappy_best_model.dat
Q-Learning is an off-policy learning method in reinforcement learning and a development over the on-policy Temporal Difference control algorithm. Q-Learning tries to estimate a state-action value function for a target policy that deterministically selects the action with the highest value.
The problem with traditional Q-Learning is that it is not suitable for continuous environments (like Flappy Bird), where an agent can be in an infinite number of states. It is not feasible to store all states in the table that traditional Q-Learning relies on, so we use Deep Q-Learning in these environments. For reference, the tabular update that breaks down here is sketched below.
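As a rough illustration of what traditional Q-Learning stores, here is a minimal sketch of the tabular update rule (the q_table dictionary, alpha, and gamma are hypothetical names for this example, not code from this repository):

from collections import defaultdict

alpha, gamma = 0.1, 0.99          # learning rate and discount factor (example values)
q_table = defaultdict(float)      # maps (state, action) pairs to Q-values

def q_learning_update(state, action, reward, next_state, actions):
    # Off-policy target: reward plus the discounted best Q-value in the next state
    best_next = max(q_table[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q_table[(state, action)]
    q_table[(state, action)] += alpha * td_error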
Deep Q-Learning is based on a deep neural network that takes the current state, in the form of an image or a vector of continuous values, and approximates the Q-values for each action in that state.
Take a look at this article, which explains Deep Q-Learning.
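As a rough sketch (not the exact network used in this repository), a Deep Q-Network in PyTorch can be a simple module that maps a state vector to one Q-value per action; SimpleDQN and its layer sizes below are illustrative:

import torch.nn as nn

class SimpleDQN(nn.Module):
    # Illustrative network: state vector in, one Q-value per action out.
    def __init__(self, state_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128),
            nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, x):
        return self.net(x)   # shape: (batch, n_actions)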
Here I have used the Dueling architecture to calculate Q-values. The Q-value Q(s,a) corresponds to how good it is to be in a state and take a particular action in that state. So we can decompose Q(s,a) as the sum of:
- V(s): the value of being in that state
- A(s,a): the advantage of taking that action in that state (how much better it is to take this action versus all other possible actions in that state)
Q(s,a) = V(s) + A(s,a)
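A minimal sketch of how a dueling head can implement this decomposition in PyTorch (DuelingHead and its layer sizes are illustrative; subtracting the mean advantage is the usual trick from the Dueling DQN paper to keep V and A identifiable, not necessarily exactly what this repository does):

import torch.nn as nn

class DuelingHead(nn.Module):
    # Splits shared features into a state-value stream V(s) and an
    # advantage stream A(s,a), then combines them into Q(s,a).
    def __init__(self, feature_dim, n_actions):
        super().__init__()
        self.value = nn.Linear(feature_dim, 1)               # V(s)
        self.advantage = nn.Linear(feature_dim, n_actions)   # A(s,a)

    def forward(self, features):
        v = self.value(features)         # (batch, 1)
        a = self.advantage(features)     # (batch, n_actions)
        # Mean-advantage subtraction keeps the V/A split identifiable.
        return v + a - a.mean(dim=1, keepdim=True)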
The idea behind PER is that some experiences may be more important than others for our training but occur less frequently. Because we sample the batch uniformly (selecting experiences at random), these rich but rare experiences have practically no chance of being selected. We want to prioritize experiences where there is a big difference between our prediction and the TD target, since that means we have a lot to learn from them. Each experience is therefore assigned a priority:
p_t = |δ_t| + ε

where,
p_t = priority of the experience
δ_t = magnitude of the TD error
ε = a small constant that ensures the priority never becomes 0
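A minimal sketch of how these priorities can be turned into sampling probabilities (the alpha exponent and the helper names below follow the usual PER formulation and are not taken from this repository's code):

import numpy as np

eps, alpha = 1e-5, 0.6   # example values: small constant and priority exponent

def priorities_from_td_errors(td_errors):
    # p_t = |delta_t| + eps, so no experience ever has zero priority.
    return np.abs(td_errors) + eps

def sampling_indices(priorities, batch_size):
    # Sample experiences with probability proportional to priority**alpha.
    probs = priorities ** alpha
    probs = probs / probs.sum()
    return np.random.choice(len(priorities), size=batch_size, p=probs)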
Take a look at this article, which explains the Double Dueling DQN and PER.
- Aditya Jain : Portfolio
This project is licensed under the MIT License - see the LICENSE.md file for details
- The Game has been taken from this repository
- Thanks to Siraj Raval for the Move37 course on theschool.ai, which helped me understand these concepts.