Policy generation for a mobile robot in an obstacle environment using Policy Iteration, Generalized Policy Iteration, and Value Iteration, for both deterministic and stochastic motion models. The process is assumed to be a Markov Decision Process (MDP), and the problem is solved using Dynamic Programming.
In this programming problem I solve the problem described in Figure 14.2 of the book “Probabilistic Robotics” using Dynamic Programming.
The robot can move in 8 directions (4 straight + 4 diagonal) on an occupancy grid (1 means occupied and 0 means free). The robot has two motion models: a) a deterministic model that always executes the commanded move perfectly, and b) a stochastic model that has a 20% probability of deviating +/-45 degrees from the commanded move. The reward for hitting an obstacle is -50.0, the reward for any other move that does not end at the goal is -1.0, and the reward for reaching the goal is 100.0. The goal location is W(8,11). Use gamma = 0.95.
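The action set, reward function, and stochastic outcome distribution described above can be sketched as follows. This is a minimal sketch, not the project's actual code: the helper names (`rotate`, `outcomes`, `reward`) are my own, and I assume the 20% deviation splits evenly into 10% for +45 degrees and 10% for -45 degrees, leaving 80% for the commanded move.

```python
import numpy as np

# 8 possible moves: 4 straight + 4 diagonal, as (row, col) offsets.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1),
           (-1, -1), (-1, 1), (1, -1), (1, 1)]

GOAL = (8, 11)      # goal cell W(8,11) from the problem statement
R_OBSTACLE = -50.0  # reward for hitting an obstacle
R_STEP = -1.0       # reward for any other non-goal move
R_GOAL = 100.0      # reward for reaching the goal
GAMMA = 0.95

def rotate(action, k):
    """Rotate a move by k * 45 degrees within the 8-neighbourhood."""
    # The 8 moves ordered by angle, so +/-1 index = +/-45 degrees.
    ring = [(-1, 0), (-1, 1), (0, 1), (1, 1),
            (1, 0), (1, -1), (0, -1), (-1, -1)]
    return ring[(ring.index(action) + k) % 8]

def outcomes(action, stochastic=True):
    """Return [(actual_move, probability), ...] for a commanded move."""
    if not stochastic:
        return [(action, 1.0)]
    # Assumed split: 80% commanded move, 10% each for +/-45 degrees.
    return [(action, 0.8), (rotate(action, 1), 0.1), (rotate(action, -1), 0.1)]

def reward(grid, cell):
    """Reward for landing on `cell` of the occupancy grid (1 = occupied)."""
    if cell == GOAL:
        return R_GOAL
    if grid[cell] == 1:
        return R_OBSTACLE
    return R_STEP
```

With these pieces, both models share one Bellman backup: the deterministic model is just the stochastic one with a single outcome of probability 1.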
Requirement: To generate the optimal policy for the robot using the following algorithms:
1) Policy Iteration (algorithm on page 80 of Sutton and Barto)
2) Value Iteration (algorithm on page 83 of Sutton and Barto)
3) Generalized Policy Iteration
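As an illustration of one of the required algorithms, here is a sketch of Value Iteration for the deterministic model on an occupancy grid. This is a hedged example rather than the assignment's reference solution: I assume moves off the map are disallowed, the goal is terminal, and obstacle cells keep value 0 while landing on one costs -50.0, per the reward description above.

```python
import numpy as np

GAMMA = 0.95
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1),
           (-1, -1), (-1, 1), (1, -1), (1, 1)]

def value_iteration(grid, goal, tol=1e-4):
    """Deterministic value iteration on an occupancy grid (1 = occupied).

    Returns the value function V and a greedy policy (action indices).
    """
    rows, cols = grid.shape
    V = np.zeros((rows, cols))
    policy = np.zeros((rows, cols), dtype=int)
    while True:
        delta = 0.0
        for r in range(rows):
            for c in range(cols):
                if (r, c) == goal or grid[r, c] == 1:
                    continue  # terminal / obstacle cells keep V = 0
                best = -np.inf
                for a, (dr, dc) in enumerate(ACTIONS):
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        continue  # assumption: no moves off the map
                    if (nr, nc) == goal:
                        q = 100.0          # reaching the goal
                    elif grid[nr, nc] == 1:
                        q = -50.0 + GAMMA * V[nr, nc]  # hitting an obstacle
                    else:
                        q = -1.0 + GAMMA * V[nr, nc]   # ordinary step
                    if q > best:
                        best, policy[r, c] = q, a
                delta = max(delta, abs(best - V[r, c]))
                V[r, c] = best
        if delta < tol:
            return V, policy
```

The stochastic variant replaces each `q` with an expectation over the commanded move and its +/-45-degree deviations; Policy Iteration and Generalized Policy Iteration reuse the same backup, differing only in how evaluation sweeps and greedy improvement are interleaved.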