
xrl-pucrio

Introduction

The goal of this project is to serve both as a basic introduction to the explainable reinforcement learning (XRL) world and as a sandbox for other XRL students and researchers to try out XRL techniques with as little setup work as possible.

This was done by implementing two different XRL techniques (Belief Maps and VIPER) in a single codebase, with non-technique-specific functions and classes kept as generic as possible; a rough sketch of that structure follows.
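As an illustration only, the technique-agnostic surface of such a codebase might look like the sketch below (class and method names here are hypothetical, not the repo's actual API):

```python
from abc import ABC, abstractmethod

class XRLTechnique(ABC):
    """Hypothetical base class for a technique-agnostic API.

    Names are illustrative; see the repo wiki for the actual structure.
    """

    @abstractmethod
    def train(self, env, n_episodes: int):
        """Train the underlying agent on a Gymnasium-style environment."""

    @abstractmethod
    def explain(self, observation):
        """Return a technique-specific explanation for one observation."""
```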

More information about contents, running instructions and project structure/architecture can be found in the repo wiki.

Repository created for the INF2102 course in the PUC-Rio Informatics MSc program.

Commands

Creating the environment from the conda_env.yml file:

conda env create -f conda_env.yml

To activate the resulting environment:

conda activate xrlpucrio

To run with default configuration:

python xrlpucrio.py

The -h option can be added to the above command to get more information about running options. Note that results are saved in the "results" folder as the program executes, but that folder is completely erased at the start of each run; if you want to keep your results, move them elsewhere once execution ends.
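For example, one minimal way to snapshot the default "results" folder before the next run erases it (the timestamped backup name is just a suggestion):

```python
import shutil
import time

# Copy the "results" folder to a timestamped backup; "results" is the
# default output folder described above.
shutil.copytree("results", f"results_{time.strftime('%Y%m%d-%H%M%S')}")
```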

To run all tests:

python -m unittest discover -v

The "main" tests (in files "test_run_hvalues.py" and "test_run_viper.py") take quite a while to run (around 10 minutes).

References

This codebase was inspired by multiple sources and other repositories.

For general RL code (such as the training loop), see the "Solving Blackjack with Q-Learning" tutorial by Gymnasium.
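As a point of reference, a minimal tabular Q-learning loop in the spirit of that tutorial might look like this (hyperparameters are illustrative, not this repo's exact code):

```python
import gymnasium as gym
import numpy as np
from collections import defaultdict

env = gym.make("Blackjack-v1")
q_values = defaultdict(lambda: np.zeros(env.action_space.n))
alpha, gamma, epsilon = 0.01, 0.95, 0.1

for episode in range(10_000):
    obs, info = env.reset()
    done = False
    while not done:
        # Epsilon-greedy action selection
        if np.random.random() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q_values[obs]))
        next_obs, reward, terminated, truncated, info = env.step(action)
        # Q-learning temporal-difference update
        target = reward + gamma * np.max(q_values[next_obs]) * (not terminated)
        q_values[obs][action] += alpha * (target - q_values[obs][action])
        obs = next_obs
        done = terminated or truncated
```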

For the Belief Map/H-values technique, see the paper "What Did You Think Would Happen? Explaining Agent Behaviour Through Intended Outcomes" and the rl-intention repo for its source code.
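A rough tabular sketch of the core idea, as I read it from that paper (a toy, not the paper's exact formulation): alongside Q-values, the agent learns H-values whose TD target uses a state-visitation indicator instead of the reward, so H accumulates where the agent expects to go:

```python
import numpy as np

n_states, n_actions = 16, 4
alpha, gamma = 0.1, 0.95

Q = np.zeros((n_states, n_actions))
# H[s, a] is a vector over states: the discounted expected future state
# visitations ("belief map") when taking a in s and then acting greedily.
H = np.zeros((n_states, n_actions, n_states))

def td_update(s, a, r, s_next):
    a_next = int(np.argmax(Q[s_next]))  # greedy follow-up action
    # Standard Q-learning update for the value estimate
    Q[s, a] += alpha * (r + gamma * Q[s_next, a_next] - Q[s, a])
    # Belief-map update: same bootstrapping, but the "reward" is an
    # indicator of the state just visited.
    indicator = np.eye(n_states)[s]
    H[s, a] += alpha * (indicator + gamma * H[s_next, a_next] - H[s, a])
```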

For the VIPER technique, see the paper "Verifiable Reinforcement Learning via Policy Extraction" and the viper repo for its source code.
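In rough outline, VIPER is a DAgger-style loop that distills an oracle policy into a decision tree, weighting samples by how much the action choice matters. A hedged sketch (function names and the oracle_q interface are assumptions, not the paper's or repo's exact API):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def viper_sketch(env, oracle_q, n_iters=5, n_rollouts=20, max_depth=4):
    """Distill an oracle into a decision tree via DAgger-style rollouts.

    Assumes oracle_q(obs) returns a vector of Q-values (one per action).
    """
    states, actions, weights = [], [], []
    tree = None
    for _ in range(n_iters):
        for _ in range(n_rollouts):
            obs, _ = env.reset()
            done = False
            while not done:
                q = np.asarray(oracle_q(obs))
                states.append(obs)
                actions.append(int(np.argmax(q)))         # oracle's label
                weights.append(float(q.max() - q.min()))  # how much the choice matters
                # Roll out under the current student tree (oracle on iteration 1)
                if tree is None:
                    act = int(np.argmax(q))
                else:
                    act = int(tree.predict(np.asarray(obs).reshape(1, -1))[0])
                obs, _, terminated, truncated, _ = env.step(act)
                done = terminated or truncated
        tree = DecisionTreeClassifier(max_depth=max_depth)
        tree.fit(np.asarray(states), np.asarray(actions),
                 sample_weight=np.asarray(weights))
    return tree
```

The full algorithm in the paper additionally evaluates the tree from each iteration and returns the best-performing one, which this sketch omits.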