This repository contains an implementation of a combinatorial multi-armed bandit algorithm with Thompson Sampling. It solves combinatorial optimization problems involving a single agent as well as multiple agents in a decentralized environment. Full algorithm can be found here.
The repository includes the following files:
- src: Contains the implementation of the learning agent class.
- Jupiter Notebook Examples: Contains Jupyter Notebook examples that use the implemented learning agent to solve single-agent and multi-agent combinatorial optimization problems in a decentralized manner.
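
As a rough illustration of what a combinatorial Thompson Sampling learner does, here is a minimal single-agent sketch. It is not the class shipped in src: the names `CTSAgent`, `select`, `update`, and `run` are hypothetical, and the setup (Bernoulli rewards, Beta(1, 1) priors, a super arm made of the `k` arms with the highest sampled means, and per-arm semi-bandit feedback) is an assumption chosen for simplicity. The decentralized multi-agent case is not covered here.

```python
import random


class CTSAgent:
    """Hypothetical combinatorial Thompson Sampling agent (not the class in src)."""

    def __init__(self, n_arms, k, seed=None):
        self.n_arms = n_arms
        self.k = k  # number of base arms played together each round
        # Beta(1, 1) prior on every base arm (assumed Bernoulli rewards).
        self.alpha = [1.0] * n_arms
        self.beta = [1.0] * n_arms
        self.rng = random.Random(seed)

    def select(self):
        # Draw one sample per arm from its posterior, then play the k arms
        # with the largest samples (a simple top-k combinatorial oracle).
        samples = [self.rng.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        ranked = sorted(range(self.n_arms), key=lambda i: samples[i],
                        reverse=True)
        return ranked[:self.k]

    def update(self, super_arm, rewards):
        # Semi-bandit feedback: one 0/1 reward observed per played arm.
        for arm, reward in zip(super_arm, rewards):
            self.alpha[arm] += reward
            self.beta[arm] += 1 - reward


def run(true_means, k, rounds, seed=0):
    """Simulate the agent against Bernoulli arms with the given means."""
    agent = CTSAgent(len(true_means), k, seed)
    env = random.Random(seed + 1)
    for _ in range(rounds):
        super_arm = agent.select()
        rewards = [1 if env.random() < true_means[a] else 0
                   for a in super_arm]
        agent.update(super_arm, rewards)
    return agent


if __name__ == "__main__":
    agent = run(true_means=[0.9, 0.1, 0.8, 0.2], k=2, rounds=500)
    # Posterior mean estimate for each base arm after learning.
    print([round(a / (a + b), 2) for a, b in zip(agent.alpha, agent.beta)])
```

The top-k selection stands in for whatever combinatorial oracle a real problem needs; with other constraints, `select` would pass the sampled means to a problem-specific solver instead of sorting.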
For further information, contact me at sharyal.zafar@ens-rennes.fr.