Symsim is a test-bed for implementing reinforcement learning algorithms, formalizing their correctness properties, and testing them. It is implemented in Scala 3, in a purely functional style, and uses property-based testing.
There is no installation or release yet. See the section Adding a new agent below for how to clone, branch, and run the code.
The implementation is quite memory hungry right now, so we recommend the following sbt setup if you run out of memory:

    export SBT_OPTS="-Xmx3G -XX:+UseG1GC -Xss2M"

Place this in your .bashrc, or execute it in the current shell just before starting sbt.
So far discrete (exact) Q-Learning and SARSA are implemented, along with a bunch of simple examples.
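For orientation, the two update rules can be sketched in a few lines of Scala 3. This is a minimal illustration of textbook tabular Q-Learning and SARSA, not Symsim's implementation: the names QTable, qLearningUpdate, and sarsaUpdate, the Int encoding of states and actions, and the default value 0.0 for unseen pairs are all assumptions made for this example.

```scala
// Hypothetical sketch: a Q-table maps (state, action) pairs to estimated values.
type QTable = Map[(Int, Int), Double]

// Q-Learning (off-policy): bootstrap from the best action available in the next state.
// `actions` must be non-empty, otherwise `max` throws.
def qLearningUpdate(q: QTable, s: Int, a: Int, r: Double, s1: Int,
                    actions: Seq[Int], alpha: Double, gamma: Double): QTable =
  val old  = q.getOrElse((s, a), 0.0)
  val best = actions.map(a1 => q.getOrElse((s1, a1), 0.0)).max
  q.updated((s, a), old + alpha * (r + gamma * best - old))

// SARSA (on-policy): bootstrap from the action a1 that the agent actually takes next.
def sarsaUpdate(q: QTable, s: Int, a: Int, r: Double, s1: Int, a1: Int,
                alpha: Double, gamma: Double): QTable =
  val old  = q.getOrElse((s, a), 0.0)
  val next = q.getOrElse((s1, a1), 0.0)
  q.updated((s, a), old + alpha * (r + gamma * next - old))
```

The only difference between the two rules is the bootstrap term: Q-Learning takes the maximum over the next state's actions, while SARSA uses the value of the action selected by the current policy.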
Adding a new agent

- Git clone the repo, or git pull to get a fresh version (in that case you can skip the next step).
- Change directory to the cloned repo:

      cd symsim
- Create a new branch (the repo is configured not to allow pushing to main). Let our example be tic-tac-toe:

      git checkout -b tic-tac-toe
- Create a new package in src/main/scala/symsim/examples/concrete/. The existing one is called braking; let's call the new one tictactoe:

      mkdir -pv src/main/scala/symsim/examples/concrete/tictactoe

  The package goes under examples and concrete for "concrete execution RL".
- Inside the new directory create a file TicTacToe.scala:

      cp -iv src/main/scala/symsim/examples/concrete/braking/Car.scala src/main/scala/symsim/examples/concrete/tictactoe/TicTacToe.scala
      edit src/main/scala/symsim/examples/concrete/tictactoe/TicTacToe.scala

  Adjust the name of the package object from braking to tictactoe. Then change the four types (both names and definitions) to whatever makes sense for TicTacToe. For instance, create:

  - TicState, to represent the state of the game
  - TicObservableState, which might be just a renaming because the Tic Tac Toe state space is finite
  - TicAction, the possible moves
- Implement the TicTacToe agent.

  Edit this file from the top, eliminating the Car example and introducing the TicTacToe example. There are two parts: in the class at the top we give all the logic of the agent, and in the instances/constraints part at the bottom we use the type system to prove that our types have all the necessary properties for the machinery to work. It might be useful to consult the interface definition (which also has plenty of comments): src/main/scala/symsim/Agent.scala.
- Working with git and PRs.

  Throughout the process you can commit as usual. The first time you try to push, observe what git tells you to do to push to the remote branch. Follow the instruction, and then read the message from git again after the successful push to find the link for creating a pull request. Open that link and create a pull request called Adding Tic Tac Toe. You can mark it as work in progress (create a 'draft pull request' instead of a 'pull request') if you are not done. After this you can continue pushing as usual from your branch when you make new commits, and others in the project will be able to track and discuss your progress easily.
- Compiling

  To compile your code you can open sbt in the root directory (sbt is the only tool you have to install; you do not need to install scala):

      sbt
      ...> compile
- Running the learning

  There is a test tree corresponding to the main source tree. Under src/test/scala/symsim/examples/concrete/braking/ you will find the file Experiments.scala, which shows how the braking car learning is executed. So far, we disguise it as a test. You can copy this file to the corresponding directory for tictactoe and adjust it to instantiate the tic-tac-toe learning.
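As an illustration of the step that edits TicTacToe.scala, the state and action types could look roughly like the following. This is only a sketch under assumptions: the Mark enum, the nine-cell Vector encoding, and the play helper are hypothetical and are not taken from the Symsim sources or from the Agent interface, which remains the authority on what the types must provide.

```scala
// Hypothetical names throughout; none of this is taken from the Symsim sources.
enum Mark:
  case X, O

// A board is nine cells, listed row by row; None means the cell is empty.
case class TicState(cells: Vector[Option[Mark]]):
  require(cells.size == 9, "a tic-tac-toe board has exactly nine cells")

  // Indices of the cells that are still free.
  def emptyCells: Seq[Int] = cells.indices.filter(i => cells(i).isEmpty)

object TicState:
  val initial: TicState = TicState(Vector.fill(9)(Option.empty[Mark]))

// The state space is finite, so the observable state can simply alias the state.
type TicObservableState = TicState

// An action places a mark on the cell with the given index (0 to 8).
case class TicAction(cell: Int)

// Apply a move; the result is None if the index is out of range or the cell is taken.
def play(s: TicState, a: TicAction, m: Mark): Option[TicState] =
  if s.cells.isDefinedAt(a.cell) && s.cells(a.cell).isEmpty
  then Some(TicState(s.cells.updated(a.cell, Some(m))))
  else None
```

Keeping the state immutable and returning Option for illegal moves fits the purely functional style of the project, but other encodings (for example a two-dimensional board) would work equally well.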
- Create a new branch (the repo is configured not to allow pushing to main). Let us continue with the tic-tac-toe example:

      git checkout -b tic-tac-toe-tests
- Create a new package in src/test/scala/symsim/examples/concrete/ for the new agent:

      mkdir -pv src/test/scala/symsim/examples/concrete/tictactoe
- Inside the new directory create a file TicTacToeSpec.scala:

      cp -iv src/test/scala/symsim/examples/concrete/braking/CarSpec.scala src/test/scala/symsim/examples/concrete/tictactoe/TicTacToeSpec.scala
      edit src/test/scala/symsim/examples/concrete/tictactoe/TicTacToeSpec.scala

  Adjust the name of the package object from braking to tictactoe, and import the new agent instances:

      import TicTacToe.instances
- Then you can add your preferred tests by adding the following line for each test, replacing the question marks with the boolean property:

      property ("TITLE THAT YOU PREFER TO SHOW IN THE TERMINAL") = ???
- Test

  To test your code you can open sbt in the root directory:

      sbt
      ...> testOnly symsim.examples.concrete.tictactoe.TicTacToeSpec
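To make the property step above more concrete, here is a minimal sketch of what a filled-in spec might look like. It assumes that the specs are ScalaCheck Properties objects, which the property (...) = ... syntax suggests but which should be checked against CarSpec.scala. The Board type, the place function, and the spec name TicTacToeSketchSpec are hypothetical stand-ins for the real agent and its instances.

```scala
import org.scalacheck.{Gen, Properties}
import org.scalacheck.Prop.forAll

// Hypothetical stand-in for the agent's state: nine cells holding 'X', 'O' or ' '.
type Board = Vector[Char]

val emptyBoard: Board = Vector.fill(9)(' ')

// Place a mark on a cell; the board is returned unchanged if the cell is taken.
def place(b: Board, cell: Int, mark: Char): Board =
  if b(cell) == ' ' then b.updated(cell, mark) else b

object TicTacToeSketchSpec extends Properties("TicTacToeSketch"):

  // Generator for valid cell indices.
  val cells: Gen[Int] = Gen.choose(0, 8)

  property("placing a mark on the empty board fills exactly one cell") =
    forAll(cells) { cell =>
      place(emptyBoard, cell, 'X').count(_ == 'X') == 1
    }

  property("a taken cell is never overwritten") =
    forAll(cells) { cell =>
      val once = place(emptyBoard, cell, 'X')
      place(once, cell, 'O') == once
    }
```

Each property replaces the ??? placeholder with a boolean expression quantified over generated inputs; ScalaCheck reports the title string in the terminal together with the outcome.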
Symsim is developed at the SQUARE group at the IT University of Copenhagen, and at the SIRIUS Centre of the University of Oslo. The work is financially supported by the Danish DIREC initiative, under the bridge project Verifiable and Safe AI for Autonomous Systems.