demo-q-learning

some toy demos of q learning with neural network function approximators

files:

└── src
    ├── envs
    │   └── GridWorld.py          # a grid world
    ├── agent
    │   ├── Linear.py             # a linear network/regression 
    │   └── MLP.py                # a feed-forward network 
    ├── run_lqn_agent_minimal.py  # run a linear q network, update weights by hand (no autodiff)
    ├── run_lqn_agent.py          # run a linear q network     
    ├── run_mlp_agent.py          # run a feed-forward q network 
    ├── run_rnn_agent.py          # run an lstm q network
    └── utils.py

results:

here's the q learning update rule; the agent is also epsilon greedy:
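a minimal sketch of the standard q learning update with epsilon-greedy action selection, written for a linear q function with the gradient worked out by hand (no autodiff). this is an illustration, not the code in `src`: the names `epsilon_greedy` and `q_update`, the weight matrix `W` of shape (n_actions, n_features), and the default `alpha` and `gamma` values are all assumptions.

```python
import numpy as np


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a uniformly random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))


def q_update(W, s, a, r, s_next, done, alpha=0.1, gamma=0.9):
    """One q learning step for a linear q function Q(s, .) = W @ s.

    W is (n_actions, n_features) and is updated in place;
    s and s_next are feature vectors. Returns the TD error.
    """
    # bootstrap from the best next-state value unless the episode ended
    target = r if done else r + gamma * np.max(W @ s_next)
    td_error = target - (W @ s)[a]
    # for a linear Q, the gradient of Q(s, a) w.r.t. row W[a] is just s
    W[a] += alpha * td_error * s
    return td_error
```

in a training loop this would be called once per environment step: choose `a = epsilon_greedy(W @ s, epsilon, rng)`, step the environment, then call `q_update` with the observed reward and next state.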

here's the learning curve from one agent:

here's a sample path from a trained agent; red dot = reward, black dot = bomb:
