Room world from Sutton et al (Between MDP and semi-MDP)

tmomose/sutton_rooms

sutton_rooms

Environment

Room world from Sutton et al. (Between MDPs and semi-MDPs)

[Figure: room map]

Paper: Between MDPs and semi-MDPs

Experiment

This repository uses the room environment to test some ideas for hierarchical reinforcement learning (HRL) and for planning within HRL.

  1. Plain flat Q-learning
    1. A single version. [code]
  2. Hierarchical Q-learning
    1. Basic version (semi-MDP; two-layer hierarchy with a predefined deterministic lower-level policy) [code]
    2. Intra-option learning version (the lower level is trainable) [code]
  3. Planning hierarchical Q-learning
    1. Basic version (same as hierarchical Q-learning, but the upper level outputs a 2-step plan; no replanning) [code]
    2. Version with replanning [code]
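
To make the difference between the flat and the basic hierarchical variants concrete, here is a minimal sketch of the two tabular update rules as they are usually written: one-step Q-learning, and semi-MDP Q-learning over options as described in the Sutton et al. paper. It is not taken from this repository's code. The function names, the dict-based Q-table, and the default values of `alpha` and `gamma` are all assumptions made for illustration.

```python
from collections import defaultdict


def make_q_table():
    """Hypothetical Q-table: Q[state][action_or_option] -> value, default 0.0."""
    return defaultdict(lambda: defaultdict(float))


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Flat one-step Q-learning update for a primitive action a taken in s."""
    best_next = max(Q[s_next][b] for b in actions)
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])


def smdp_q_update(Q, s, o, rewards, s_next, options, alpha=0.1, gamma=0.9):
    """Semi-MDP Q-learning update applied once option o has terminated.

    rewards holds the per-step rewards collected while o was executing,
    so the option lasted k = len(rewards) primitive steps.
    """
    k = len(rewards)
    # Discounted return accumulated during the option's execution.
    option_return = sum((gamma ** i) * r for i, r in enumerate(rewards))
    best_next = max(Q[s_next][p] for p in options)
    # The bootstrap term is discounted by gamma**k because k steps elapsed.
    Q[s][o] += alpha * (option_return + (gamma ** k) * best_next - Q[s][o])


if __name__ == "__main__":
    # Illustrative state and option names only; they are not from the repository.
    Q = make_q_table()
    smdp_q_update(Q, "room_A", "go_to_east_hallway", [0.0, 0.0, 1.0],
                  "room_B", ["go_to_east_hallway", "go_to_south_hallway"])
    print(dict(Q["room_A"]))
```

The only structural difference is that the semi-MDP update treats a whole option execution as one transition, so the accumulated reward and the discount on the bootstrap term both depend on how many primitive steps the option took.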
