This repo contains my homework and projects for a reinforcement learning course.
Black-box optimization methods (e.g., CEM, FCHC, GA) are used to search for optimal policies for the agent.
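Below is a minimal sketch of one such black-box method, the cross-entropy method (CEM), searching over policy parameters. The `episodic_return` function here is a hypothetical placeholder; in the actual homework it would run the parameterized policy in the environment and average the returns.

```python
# Minimal cross-entropy method (CEM) sketch for policy search.
import numpy as np

def episodic_return(theta):
    # Placeholder objective (assumption): replace with rollouts of pi_theta.
    return -np.sum((theta - 1.0) ** 2)

def cem(dim=5, pop_size=50, elite_frac=0.2, iters=30, seed=0):
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.ones(dim)
    n_elite = int(pop_size * elite_frac)
    for _ in range(iters):
        # Sample candidate policy parameters from the current distribution.
        samples = rng.normal(mean, std, size=(pop_size, dim))
        returns = np.array([episodic_return(s) for s in samples])
        # Keep the best (elite) candidates and refit the sampling distribution.
        elite = samples[np.argsort(returns)[-n_elite:]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean

if __name__ == "__main__":
    print(cem())
```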
Given a policy and the histories (trajectories) generated by it, the goal is to safely find improved policies for the problem.
TD learning is a policy evaluation algorithm that, like Monte Carlo methods, learns from experience. It chooses actions using π and observes what happens (sampling), rather than requiring knowledge of P and R. However, like dynamic programming methods, it produces estimates based on other estimates - it bootstraps. This lets it perform updates before the end of an episode.
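The sketch below illustrates the bootstrapped TD(0) update for state values on a small random-walk chain; the chain and the uniform-random policy are illustrative assumptions, not part of the repo.

```python
# TD(0) sketch: bootstrapped value estimation on a small random-walk chain.
import random

def td0(num_states=5, episodes=500, alpha=0.1, gamma=1.0, seed=0):
    random.seed(seed)
    V = [0.0] * (num_states + 2)            # terminal states at both ends
    for _ in range(episodes):
        s = num_states // 2 + 1              # start in the middle state
        while 0 < s < num_states + 1:
            s_next = s + random.choice([-1, 1])   # uniform-random policy
            r = 1.0 if s_next == num_states + 1 else 0.0
            # Bootstrapped update: target uses the current estimate V[s_next],
            # so the update happens before the episode ends.
            V[s] += alpha * (r + gamma * V[s_next] - V[s])
            s = s_next
    return V

if __name__ == "__main__":
    print(td0())
```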
This implementation includes two TD learning algorithms: Sarsa and Q-Learning.
- Sarsa uses TD to estimate q^π while simultaneously changing π to be (nearly) greedy with respect to q^π. It turns the Bellman equation into an update rule (see the sketch after this list).
- Q-Learning, in contrast to Sarsa, turns the Bellman optimality equation into an update rule.
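A minimal sketch of the two update rules in tabular form is shown below. `Q` is assumed to be a dict mapping (state, action) pairs to values, and the environment interaction loop (which would supply the transitions) is omitted; these names are illustrative, not the repo's API.

```python
# Tabular Sarsa vs. Q-Learning update steps (sketch).
import random
from collections import defaultdict

ACTIONS = [0, 1, 2, 3]   # assumed discrete action set

def epsilon_greedy(Q, s, epsilon=0.1):
    # Behave (nearly) greedily with respect to the current estimate of q^pi.
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    # On-policy target: uses the action a_next actually chosen by pi
    # (Bellman equation turned into an update rule).
    td_target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    # Off-policy target: uses the greedy (max) action over Q at s_next
    # (Bellman optimality equation turned into an update rule).
    td_target = r + gamma * max(Q[(s_next, b)] for b in ACTIONS)
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])

Q = defaultdict(float)   # example Q-table initialized to zero
```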
Some of the introduction is quoted from Prof. Philip S. Thomas's notes.