This C++ toolbox aims to represent and solve common AI problems through an easy-to-use interface that should extend to many problems while keeping the code readable.
Current development includes MDPs, POMDPs and related algorithms. This toolbox has been developed taking inspiration from the Matlab MDPToolbox, which you can find here, and from the pomdp-solve software written by A. R. Cassandra, which you can find here.
This toolbox is aimed at Decision Theoretic Control algorithms. The general idea is to create algorithms that are able to interact with an environment in order to obtain some reward using actions, and to find the best policy of actions to use to do so.
The field divides into planning and reinforcement learning. Planning focuses on solving problems that we know how to model: think chess, or 2048. Reinforcement learning focuses on exploring an unknown environment and learning the best policy for it. An excellent introduction to the basics can be found freely online in this book.
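To make the planning side concrete, here is a minimal sketch of plain value iteration on a tiny, fully known MDP. It does not use this toolbox's API: the `Table3D`, `Mdp` and `Solution` types, the `makeTwoStateMdp` and `valueIteration` functions, and the two-state example itself are all assumptions made for illustration, written with the standard library only.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical dense tables (NOT the toolbox's own types):
// T[s][a][s1] is a transition probability, R[s][a][s1] a reward.
using Table3D = std::vector<std::vector<std::vector<double>>>;

struct Mdp {
    Table3D T;
    Table3D R;
};

struct Solution {
    std::vector<double> values;       // Estimated optimal value of each state.
    std::vector<std::size_t> policy;  // Greedy action for each state.
};

// Illustrative problem: two states, action 0 stays, action 1 switches
// state, and landing in state 1 pays a reward of 1.
Mdp makeTwoStateMdp() {
    Mdp m;
    m.T = {{{1.0, 0.0}, {0.0, 1.0}},
           {{0.0, 1.0}, {1.0, 0.0}}};
    m.R = {{{0.0, 1.0}, {0.0, 1.0}},
           {{0.0, 1.0}, {0.0, 1.0}}};
    return m;
}

// Repeats Bellman backups until the largest change in any state value
// is no greater than `tolerance`.
Solution valueIteration(const Table3D & T, const Table3D & R,
                        double discount, double tolerance = 1e-9) {
    assert(!T.empty() && !T[0].empty());
    assert(discount >= 0.0 && discount < 1.0);
    const std::size_t S = T.size();
    const std::size_t A = T[0].size();

    Solution sol{std::vector<double>(S, 0.0), std::vector<std::size_t>(S, 0)};
    double delta = 0.0;
    do {
        delta = 0.0;
        std::vector<double> next(S, 0.0);
        for (std::size_t s = 0; s < S; ++s) {
            double best = std::numeric_limits<double>::lowest();
            for (std::size_t a = 0; a < A; ++a) {
                // Expected return of taking action a in state s.
                double q = 0.0;
                for (std::size_t s1 = 0; s1 < S; ++s1)
                    q += T[s][a][s1] * (R[s][a][s1] + discount * sol.values[s1]);
                if (q > best) {
                    best = q;
                    sol.policy[s] = a;
                }
            }
            next[s] = best;
            delta = std::max(delta, std::fabs(best - sol.values[s]));
        }
        sol.values.swap(next);
    } while (delta > tolerance);
    return sol;
}
```

Calling `valueIteration(m.T, m.R, 0.9)` on the model returned by `makeTwoStateMdp()` should select the switching action in state 0 and the staying action in state 1, since that is the only way to keep collecting the reward. A reinforcement learning method would have to discover the same policy without ever reading `T` and `R` directly.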
There are many variants of these problems: single-agent, multi-agent, multi-objective, competitive, cooperative, partially observable and so on. This framework is a work in progress that tries to implement many DTC algorithms in one place, much like OpenCV does for Computer Vision algorithms.
Please note that the API may change over time (although most things at this point are stable), since with every algorithm I add I may decide to alter the API a bit to offer a more consistent interface throughout the library.
Decision Theoretic Control is a field in rapid development. There is a very large number of methods to solve problems, each with many variants. This framework only tries to implement the most influential methods, in their vanilla form (or the form that is, to my knowledge, most widely used in the research community), trying to keep the code as simple as possible.
If you need any of the variants, the code is structured so that it is easy to read and modify to suit your requirements, rather than providing an endless list of parameters to cover every variant. Some toolboxes do this, but in my opinion it makes the code very hard to digest, which in turn makes it hard to find out which parameters to set to get the algorithm variant you want.
Cassandra's POMDP format is a type of text file that contains a definition of an MDP or POMDP model. You can find some examples here. While it is absolutely not necessary to use this format, and you can define models via code, we do parse a reasonable subset of Cassandra's POMDP format, which allows you to reuse already defined problems with this library.
The user interface of the library in Python is pretty much the same as what you would get by simply using C++. See the examples folder to see just how much Python and C++ code resemble each other. Since Python does not allow templates, the classes are bound with as many instantiations as possible.
Additionally, the library allows the usage of native Python generative models (where you don't need to specify the transition and reward functions, you only sample the next state and reward). This allows you, for example, to directly use OpenAI gym environments with minimal code writing.
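The idea of a generative model is independent of the language, so it is sketched here in C++ for consistency with the rest of the library. This is not the toolbox's API or its Q-Learning implementation: the `ChainModel` class, its `getS`, `getA` and `sample` members, and the `qLearning` and `greedyAction` functions are hypothetical names chosen for illustration. The point is that the learner only ever calls `sample`, and never sees transition or reward tables.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical generative model: a chain of states where action 0 moves
// left and action 1 moves right. Landing in the last state pays 1.
class ChainModel {
public:
    explicit ChainModel(std::size_t states) : states_(states) {
        assert(states >= 2);
    }

    std::size_t getS() const { return states_; }
    std::size_t getA() const { return 2; }

    // Samples one step: returns the next state and the reward obtained.
    std::pair<std::size_t, double> sample(std::size_t s, std::size_t a) const {
        const std::size_t s1 = (a == 1) ? std::min(s + 1, states_ - 1)
                                        : (s == 0 ? 0 : s - 1);
        const double r = (s1 == states_ - 1) ? 1.0 : 0.0;
        return {s1, r};
    }

private:
    std::size_t states_;
};

// Tabular Q-learning driven by uniformly random exploration. Any type
// offering getS(), getA() and sample(s, a) can be used as the model.
template <typename Model>
std::vector<std::vector<double>> qLearning(const Model & model, double discount,
                                           double alpha, std::size_t steps,
                                           unsigned seed = 0) {
    assert(discount >= 0.0 && discount < 1.0);
    std::vector<std::vector<double>> Q(
        model.getS(), std::vector<double>(model.getA(), 0.0));
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pickAction(0, model.getA() - 1);

    std::size_t s = 0;
    for (std::size_t i = 0; i < steps; ++i) {
        const std::size_t a = pickAction(rng);
        const std::pair<std::size_t, double> step = model.sample(s, a);
        const std::vector<double> & nextQ = Q[step.first];
        // Standard Q-learning target: reward plus discounted best next value.
        const double target =
            step.second + discount * *std::max_element(nextQ.begin(), nextQ.end());
        Q[s][a] += alpha * (target - Q[s][a]);
        s = step.first;
    }
    return Q;
}

// Index of the highest-valued action in one row of the Q table.
std::size_t greedyAction(const std::vector<double> & q) {
    return static_cast<std::size_t>(std::max_element(q.begin(), q.end()) - q.begin());
}
```

With a model built as `ChainModel model(4)`, a call such as `qLearning(model, 0.9, 0.1, 50000)` should yield a table whose greedy action is "move right" in every state. Wrapping an external simulator would only require providing an equivalent `sample` function.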
That said, if you need to customize a specific implementation to make it perform better on your specific use-cases, or if you want to try something completely new, you will have to use C++.
Policies:
- Exploring Selfish Reinforcement Learning (ESRL)
- Q-Greedy Policy
- Softmax Policy
- Linear Reward Penalty
- Thompson Sampling (Normal distribution)
Algorithms:
- Dyna-Q
- Dyna2
- Expected SARSA
- Hysteretic Q-Learning
- Importance Sampling
- Linear Programming
- Monte Carlo Tree Search (MCTS)
- Policy Evaluation
- Policy Iteration
- Prioritized Sweeping
- Q-Learning
- Q(λ)
- SARSA(λ)
- SARSA
- Retrace(λ)
- Tree Backup(λ)
- Value Iteration
Policies:
- Normal Policy
- Epsilon-Greedy Policy
- Softmax Policy
- Q-Greedy Policy
- PGA-APP
- Win or Learn Fast Policy Iteration (WoLF)
Algorithms:
- Augmented MDP (AMDP)
- Blind Strategies
- Fast Informed Bound
- GapMin
- Incremental Pruning
- Linear Support
- PERSEUS
- POMCP with UCB1
- Point Based Value Iteration (PBVI)
- QMDP
- Real-Time Belief State Search (RTBSS)
- Witness
- rPOMCP
Policies:
- Normal Policy
Not in Python yet.
Algorithms:
- Learning with Linear Rewards (LLR)
- Multi-Agent Upper Confidence Exploration (MAUCE)
- Multi-Objective Variable Elimination (MOVE)
- Upper Confidence Variable Elimination (UCVE)
- Variable Elimination
Policies:
- Q-Greedy Policy
Not in Python yet.
Algorithms:
Policies:
- SingleAction Policy
- Epsilon-Greedy Policy
- Q-Greedy Policy
To build the library you need:
- cmake >= 3.9
- the boost library >= 1.54
- the Eigen 3.3 library.
- the lp_solve library (a shared library must be available to compile the Python wrapper).
In addition, full C++17 support is now required (this means at least g++-7)
Once you have all required dependencies, you can simply execute the following commands from the project's main folder:
mkdir build
cd build/
cmake ..
make
cmake can be called with a series of flags in order to customize the output, if building everything is not desirable. The following flags are available:
CMAKE_BUILD_TYPE # Defines the build type
MAKE_ALL # Builds all there is to build in the project
MAKE_LIB # Builds all the core C++ libraries (MDP, POMDP, etc.)
MAKE_MDP # Builds only the core C++ MDP library
MAKE_FMDP # Builds only the core C++ Factored/Multi-Agent and MDP libraries
MAKE_POMDP # Builds only the core C++ POMDP and MDP libraries
MAKE_TESTS # Builds the library's tests for the compiled core libraries
MAKE_EXAMPLES # Builds the library's examples using the compiled core libraries
MAKE_PYTHON # Builds Python bindings for the compiled core libraries
PYTHON_VERSION # Selects the Python version you want (2 or 3). If not
# specified, we try to guess based on your default interpreter.
These flags can be combined as needed. For example:
# Will build MDP and MDP Python 3 bindings
cmake -DCMAKE_BUILD_TYPE=Debug -DMAKE_MDP=1 -DMAKE_PYTHON=1 -DPYTHON_VERSION=3 ..
The default flags when nothing is specified are MAKE_ALL and CMAKE_BUILD_TYPE=Release.
The static library files will be available directly in the build directory. Three separate libraries are built: AIToolboxMDP, AIToolboxPOMDP and AIToolboxFMDP. In case you want to link against either the POMDP library or the Factored MDP library, you will also need to link against the MDP one, since both of them use MDP functionality.
A number of small tests are included which you can find in the test/ folder. You can execute them after building the project using the following command directly from the build directory, just after you finish make:
ctest
The tests also offer a brief introduction to the framework, until a more complete descriptive write-up is available. Only the tests for the parts of the library that you compiled will be built.
To compile the library's documentation you need the Doxygen tool. To use it, it is sufficient to execute the following command from the project's root folder:
doxygen
After that the documentation will be generated into an html folder in the main directory.
To compile a program that uses this library, simply link it against the compiled libraries you need, and possibly against the lp_solve libraries (if using POMDP or FMDP).
Please note that since both the POMDP and FMDP libraries rely on the MDP code, you MUST specify those libraries before the MDP library when linking, otherwise the link may fail with undefined reference errors. The POMDP and Factored MDP libraries are not currently dependent on each other, so their order does not matter.
For Python, you just need to import the AIToolbox.so module, and you'll be able to use the classes as exported to Python. All classes are documented, and in the Python CLI you can run
help(AIToolbox.MDP)
help(AIToolbox.POMDP)
to see the documentation for each specific class.
The latest documentation is available here. Keep in mind that it may not always be 100% up to date with the latest commits, while the one you compile yourself will of course be.
You can find the Python docs by typing help(AIToolbox) from the interpreter. It should show the exported API for each class, along with any differences in input/output.