Deep Q-learning

Implementation of the DeepQ algorithm with Double Q.

How it works

DeepQ is an extension of standard Q-learning.

(Figure: Q-learning)
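The figure presumably shows the standard tabular Q-learning update, in which a state-action value is nudged toward the observed reward plus the discounted value of the best next action:

```latex
Q(s, a) \leftarrow Q(s, a) + \alpha \Big( r + \gamma \max_{a'} Q(s', a') - Q(s, a) \Big)
```

Here α is the learning rate and γ is the discount factor.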

With DeepQ, rather than storing Q-values in a table, they are approximated using a neural network. This allows for more accurate Q-value estimates as well as the ability to model continuous state spaces.
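A minimal sketch of such a Q-network, assuming a PyTorch implementation with a flat state vector and a discrete action space (the class name and layer sizes are illustrative, not taken from this folder):

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps a state vector to one Q-value per discrete action."""

    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # state: (batch, state_dim) -> Q-values: (batch, n_actions)
        return self.net(state)
```

Acting greedily then amounts to taking the argmax over the network's output for the current state.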

DeepQ also includes experience replay, in which the agent stores the state, action, and outcome of every step in memory and then randomly samples from that memory during training.
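A replay buffer can be as simple as the following sketch (class and method names are illustrative and may not match this implementation); uniform random sampling breaks up the correlation between consecutive transitions:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size memory of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity: int = 100_000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are evicted first

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int):
        # Uniform random sample of past transitions, unzipped into separate tuples
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```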

Double-Q is also implemented: the target, i.e. the expected future reward, is modeled by a separate network whose weights are intermittently copied over from the 'online' network that makes the predictions. This helps learning by providing a more stable target to pursue.
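A sketch of the Double-Q target and the intermittent weight copy, assuming the PyTorch networks above (function names and the discount value are illustrative; the code in this folder may differ). The online network chooses the next action, while the target network evaluates it:

```python
import torch

def double_q_targets(online_net, target_net, rewards, next_states, dones, gamma=0.99):
    """Compute Double-Q targets: y = r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    with torch.no_grad():
        # Action selection with the online network
        best_actions = online_net(next_states).argmax(dim=1, keepdim=True)
        # Action evaluation with the target network
        next_q = target_net(next_states).gather(1, best_actions).squeeze(1)
        # Zero out the bootstrap term for terminal transitions
        return rewards + gamma * next_q * (1.0 - dones)

def sync_target(online_net, target_net):
    """'Hard' update: copy the online weights into the target network."""
    target_net.load_state_dict(online_net.state_dict())
```

Calling `sync_target` every fixed number of steps keeps the target network lagging behind the online network, which is what provides the more stable target.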

Examples

See the experiments folder for example implementations.

Roadmap

  • Prioritized replay
  • Dueling Q
  • Soft updates
  • More environments

References