sutton_rooms

Environment

Rooms world from Sutton, Precup, and Singh, "Between MDPs and semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning".

This repository uses the rooms environment to test some ideas for hierarchical reinforcement learning (HRL) and for planning in HRL.
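For orientation, here is a minimal sketch of a rooms-style gridworld. The layout, the class name `ToyRoomWorld`, the reward of 1 at the goal, and the `slip_prob` parameter are all assumptions made for illustration; the grid and dynamics in `room_world.py` may differ.

```python
import random

# Hypothetical, simplified four-room layout: '#' = wall, '.' = free cell, 'G' = goal.
LAYOUT = [
    "#########",
    "#...#...#",
    "#.......#",
    "#...#...#",
    "##.###.##",
    "#...#...#",
    "#...#..G#",
    "#.......#",
    "#########",
]

# Primitive actions as (row delta, column delta): up, down, left, right.
ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}


class ToyRoomWorld:
    """Illustrative rooms gridworld; not the repository's implementation."""

    def __init__(self, layout=LAYOUT, slip_prob=0.0, seed=0):
        self.grid = [list(row) for row in layout]
        self.slip_prob = slip_prob  # chance that a random action replaces the chosen one
        self.rng = random.Random(seed)
        self.goal = next(
            (r, c)
            for r, row in enumerate(self.grid)
            for c, ch in enumerate(row)
            if ch == "G"
        )
        self.state = None

    def reset(self, start=(1, 1)):
        self.state = start
        return self.state

    def step(self, action):
        if self.rng.random() < self.slip_prob:
            action = self.rng.choice(list(ACTIONS))
        dr, dc = ACTIONS[action]
        r, c = self.state
        nr, nc = r + dr, c + dc
        # Moving into a wall leaves the agent where it is.
        if self.grid[nr][nc] != "#":
            self.state = (nr, nc)
        done = self.state == self.goal
        reward = 1.0 if done else 0.0
        return self.state, reward, done
```

States are `(row, column)` tuples, so they can be used directly as keys of a tabular Q-function.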

Plain flat Q-learning
1. Just one version. [code]
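As a reference point for the flat baseline, the following is a minimal sketch of tabular Q-learning with an epsilon-greedy policy. The function signature, the hyperparameter values, and the toy corridor task are assumptions chosen to keep the example self-contained; they are not taken from `q_learning_test.py`.

```python
import random
from collections import defaultdict


def q_learning(reset, step, n_actions, episodes=200, alpha=0.5,
               gamma=0.9, epsilon=0.1, max_steps=100, seed=0):
    """Tabular Q-learning; `reset()` returns a state, `step(s, a)` returns (s2, r, done)."""
    rng = random.Random(seed)
    Q = defaultdict(lambda: [0.0] * n_actions)
    for _ in range(episodes):
        s = reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)  # explore
            else:
                best = max(Q[s])  # exploit, breaking ties at random
                a = rng.choice([i for i, q in enumerate(Q[s]) if q == best])
            s2, r, done = step(s, a)
            # One-step target: no bootstrapping from terminal states.
            target = r if done else r + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
            if done:
                break
    return Q


# Hypothetical toy task: a corridor with states 0..4 and the goal at state 4.
def corridor_reset():
    return 0


def corridor_step(s, a):
    s2 = max(0, s - 1) if a == 0 else s + 1  # action 0 = left, action 1 = right
    done = s2 == 4
    return s2, (1.0 if done else 0.0), done


Q = q_learning(corridor_reset, corridor_step, n_actions=2)
```

After training, moving right should have the higher value in every non-terminal corridor state.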
Hierarchical Q-learning
1. Basic version (SMDP; two-layer hierarchy with a predefined deterministic lower-level policy) [code]
2. Intra-option learning version (the lower-level policy is trainable) [code]
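The two hierarchical variants above differ mainly in when the upper-level values are updated. The sketch below shows the textbook SMDP Q-learning update (applied once per completed option) and the textbook intra-option Q-learning update (applied after every primitive step to all options consistent with the action taken). The function names, the `Q[state][option]` table layout, and the `option_policies` and `beta` arguments are assumptions for illustration; how the repository trains its lower level may differ from this.

```python
def smdp_q_update(Q, s, o, rewards, s_next, done, alpha=0.1, gamma=0.9):
    """SMDP Q-learning: option o started in s, collected `rewards`, ended in s_next."""
    k = len(rewards)  # number of primitive steps the option ran
    discounted_return = sum(gamma ** i * r for i, r in enumerate(rewards))
    bootstrap = 0.0 if done else gamma ** k * max(Q[s_next])
    Q[s][o] += alpha * (discounted_return + bootstrap - Q[s][o])


def intra_option_q_update(Q, s, a, r, s_next, done, option_policies, beta,
                          alpha=0.1, gamma=0.9):
    """Intra-option Q-learning for deterministic option policies.

    option_policies[o](s) gives the action option o takes in s;
    beta(o, s) gives the probability that option o terminates in s.
    """
    for o, policy in enumerate(option_policies):
        if policy(s) != a:
            continue  # only options that would have taken action a are updated
        if done:
            target = r
        else:
            b = beta(o, s_next)
            # Continue with o with probability 1 - b, otherwise switch to the best option.
            u = (1.0 - b) * Q[s_next][o] + b * max(Q[s_next])
            target = r + gamma * u
        Q[s][o] += alpha * (target - Q[s][o])
```

The SMDP update treats an option as a single opaque decision, while the intra-option update can learn about several options from one primitive transition.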
Planning Hierarchical Q-learning
1. Basic version (same as Hierarchical Q-learning, but the upper level outputs a 2-step plan; no replanning) [code]
2. Version with replanning [code]
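The difference between the two planning variants can be illustrated with a small control loop. This is only one possible reading of "replanning": here the upper level is a hypothetical `plan_fn(state)` that returns a short list of options, and `run_option(state, option)` is a hypothetical helper that executes one option to termination. Neither name comes from the repository.

```python
def run_episode(plan_fn, run_option, state, replan, max_options=20):
    """Execute upper-level plans, optionally replanning after every option."""
    trace = []  # options actually executed, in order
    plan = []
    done = False
    while not done and len(trace) < max_options:
        # Without replanning, a new plan is requested only once the old one is used up.
        # With replanning, a fresh plan is requested before every option.
        if replan or not plan:
            plan = list(plan_fn(state))
        option = plan.pop(0)
        state, done = run_option(state, option)
        trace.append(option)
    return state, trace


# Hypothetical toy task: rooms 0 -> 1 -> 2 -> 3, with the goal in room 3.
plan_calls = {"count": 0}


def toy_plan(state):
    plan_calls["count"] += 1
    return ["next_room", "next_room"]  # a 2-step plan


def toy_run_option(state, option):
    new_state = state + 1
    return new_state, new_state == 3


final_fixed, trace_fixed = run_episode(toy_plan, toy_run_option, 0, replan=False)
calls_fixed = plan_calls["count"]

plan_calls["count"] = 0
final_replan, trace_replan = run_episode(toy_plan, toy_run_option, 0, replan=True)
calls_replan = plan_calls["count"]
```

In this toy run both variants execute the same three options, but the replanning variant queries the upper level before each of them instead of once per 2-step plan.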
