AI_multi_arm_bandit_problem

Implementation of the 5-armed bandit problem with greedy and ɛ-greedy action selection algorithms, comparing the results of the ɛ-greedy action selection method (ɛ = 0.4) with those of the greedy one.

In this experiment, we implement the classical 5-armed bandit problem with two action selection algorithms: greedy and ɛ-greedy. The goal is to identify the bandit machine with the highest reward and exploit it. The greedy algorithm always exploits its current knowledge and never explores. The ɛ-greedy algorithm keeps exploring: with probability ɛ it picks an arm at random, otherwise it picks the arm that currently looks best, and over time it performs better than the greedy one. We implemented both algorithms in Python. Together, the two methods illustrate the balance between exploration and exploitation.
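The two selection rules described above can be sketched as follows. This is a minimal illustration and not the contents of program.py: the function name `run_bandit`, the Gaussian reward model with unit variance, the sample-average value estimates, and the arm means in `true_means` are all assumptions made for the example. Setting `epsilon=0` gives the pure greedy method.

```python
import random

def run_bandit(true_means, epsilon, steps=1000, seed=0):
    """Run one agent on a bandit; epsilon=0 is greedy, epsilon>0 is epsilon-greedy.

    Hypothetical helper: rewards are assumed to be Gaussian with unit variance.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running mean reward observed for each arm
    counts = [0] * n_arms       # number of times each arm was pulled
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick any arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimate, ties broken at random.
            best = max(estimates)
            arm = rng.choice([a for a, q in enumerate(estimates) if q == best])

        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the sample average for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward / steps


# Assumed arm means, chosen only for illustration.
true_means = [0.2, 0.5, 1.0, 1.5, 0.8]
for eps in (0.0, 0.4):
    _, counts, avg_reward = run_bandit(true_means, eps)
    print(f"epsilon={eps}: pulls per arm={counts}, average reward={avg_reward:.3f}")
```

Comparing the pull counts for the two values of ɛ shows how each method spreads its pulls: the greedy run can settle on one arm early, while the ɛ-greedy run keeps sampling every arm. Averaging over many seeds would give a fairer comparison than a single run.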

Files: Figure_1.png, Figure_2.png, README.md, program.py

surajrimal/AI_multi_arm_bandit_problem

Folders and files

Latest commit

History

Repository files navigation

AI_multi_arm_bandit_problem

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages