Reinforcement learning with tabular methods: TD learning (Q-learning and SARSA) and a MENACE-like approach, applied to a Rubik's cube with the move set restricted to 180-degree turns.
reinforcement-learning q-learning epsilon-greedy sarsa simulated-annealing td-learning softmax menace-matchboxes
Updated Aug 1, 2021 - C