Reinforcement Learning Algorithms

Introduction

This repository includes implementations of the following algorithms:

  • Deep Q-Learning: Uses experience replay and a target network.
  • Multi-Armed Bandits: Exploration strategies such as epsilon-greedy and Upper Confidence Bound (UCB).
  • N-step Tree Backup: An off-policy n-step bootstrapping method that does not require importance sampling.
  • Off-Policy Learning: Algorithms such as Q-learning.
  • On-Policy Learning: Methods like SARSA.
  • Thompson Sampling: Bayesian approach for balancing exploration and exploitation.
  • Expected SARSA: A variant of SARSA whose update target averages the next-state action values under the current policy instead of using a single sampled action.
  • Gradient Preference-Based Methods: Policy gradient algorithms that learn numerical action preferences.
  • Policy Iteration: Classical dynamic programming algorithm for solving MDPs.
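
Several of the algorithms above differ mainly in the one-step target they bootstrap from. The sketch below is not taken from this repository's code; it is a minimal illustration, assuming a tabular setting and an epsilon-greedy policy, of how the SARSA (on-policy), Q-learning (off-policy), and Expected SARSA targets are computed from the same transition. The function names `epsilon_greedy_probs` and `td_targets` and all parameter values are hypothetical.

```python
import numpy as np


def epsilon_greedy_probs(q_values, epsilon):
    """Action probabilities of an epsilon-greedy policy over q_values."""
    n_actions = len(q_values)
    # Every action gets epsilon / n_actions; the greedy action gets the rest.
    probs = np.full(n_actions, epsilon / n_actions)
    probs[int(np.argmax(q_values))] += 1.0 - epsilon
    return probs


def td_targets(reward, next_q, next_action, gamma=0.99, epsilon=0.1):
    """One-step targets for SARSA, Q-learning, and Expected SARSA.

    next_q      -- action values Q(s', .) of the next state (illustrative input)
    next_action -- the action actually sampled in s' (used only by SARSA)
    """
    next_q = np.asarray(next_q, dtype=float)
    # SARSA: bootstrap from the action the behaviour policy actually took.
    sarsa = reward + gamma * next_q[next_action]
    # Q-learning: bootstrap from the greedy action, whatever was taken.
    q_learning = reward + gamma * next_q.max()
    # Expected SARSA: average Q(s', .) under the epsilon-greedy policy.
    policy = epsilon_greedy_probs(next_q, epsilon)
    expected_sarsa = reward + gamma * float(policy @ next_q)
    return {"sarsa": sarsa, "q_learning": q_learning, "expected_sarsa": expected_sarsa}


# Hypothetical transition: reward 1.0, three actions available in the next state.
targets = td_targets(reward=1.0, next_q=[0.0, 2.0, 1.0], next_action=2,
                     gamma=0.9, epsilon=0.3)
print(targets)
```

In each case the tabular update then moves `Q(s, a)` toward the chosen target by a step size; only the target differs between the three methods.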