
ashurrafiev/ClassParallelTM


Class-Parallel Tsetlin Machine

Feedback decision tree visualised:
tsetlin_decision_tree-combined.pdf

Usage

./tm [options]
option | type | description
-step-size | int | number of inputs per training step
-steps | int | total number of training steps
-s | float | learning rate s
-boost-pos | 0 or 1 | boost positive feedback; default is 0
-t | float | threshold T
-ts | float,float,... | comma-separated threshold list; this way you can set different T values for each class
-tnorm | 0 or 1 | T values are normalized; default is 1
-rand-seed | int | use specific random seed, or timer if 0 (default)
-acc-eval-train | int | currently disabled, does nothing
-acc-eval-test | int | use test dataset to evaluate accuracy: 0 - don't evaluate, -1 - evaluate once at the end of training, n - evaluate every n steps
-log-tastates |  | if specified, enable logging of the TA state spectrum
-log-status |  | if specified, enable logging of the TM status variables and events
-log-acc |  | if specified, enable logging of the accuracy evaluation results
-log-append |  | if specified, append to the existing log files; default is rewrite
-load-state | string | if specified, load previously saved state of the TM; the string value is a path format, where %d is replaced by a class index
-save-state | string | if specified, the state of the TM is saved after training; the string value is a path format, where %d is replaced by a class index
-train-mask | string | binary mask to enable training per class; default is 11111..., i.e., every class is training
-par | 0 or 1 | enable parallel execution; default is 1
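
The per-class options above (-load-state, -save-state and -train-mask) could be handled roughly as in the following C sketch. It is not taken from the repository: class_path and class_is_training are hypothetical helper names, the file name state-cls%d.bin in the usage note is made up, and the mask ordering (leftmost character is class 0, classes past the end of the mask train by default) is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Expand a per-class path format by substituting the class index for %d.
 * Returns 0 on success, -1 if the result does not fit into buf.
 * fmt comes from the command line and is expected to contain exactly
 * one %d and no other conversion specifiers. */
int class_path(char *buf, size_t size, const char *fmt, int cls)
{
    assert(buf != NULL && fmt != NULL);
    int n = snprintf(buf, size, fmt, cls);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Return 1 if class cls should be trained under the given binary mask.
 * Assumptions: mask[0] belongs to class 0, and a NULL mask or a mask
 * shorter than the class index means "train". */
int class_is_training(const char *mask, int cls)
{
    assert(cls >= 0);
    if (mask == NULL || (size_t)cls >= strlen(mask))
        return 1;
    return mask[cls] == '1';
}
```

For instance, class_path(buf, sizeof buf, "state-cls%d.bin", 3) writes state-cls3.bin into buf, and class_is_training("1101", 2) reports that class 2 is frozen.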

Example:

./tm -step-size 12000 -steps 5 -acc-eval-test 1 -log-acc

Train MNIST for 5 epochs, evaluate accuracy after each epoch (step) and log it.
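
The -acc-eval-test convention used in this example can be sketched in C as follows. should_evaluate is a hypothetical helper, not a function from the repository; the 1-based step numbering and the handling of values below -1 (treated like -1) are assumptions.

```c
#include <assert.h>

/* Decide whether test accuracy is evaluated after a given training step.
 * acc_eval follows the -acc-eval-test convention:
 *   0      never evaluate
 *   -1     evaluate once, after the last step
 *   n > 0  evaluate every n steps
 * Assumption: steps are numbered 1 .. total_steps. */
int should_evaluate(int acc_eval, int step, int total_steps)
{
    assert(step >= 1 && step <= total_steps);
    if (acc_eval == 0)
        return 0;
    if (acc_eval < 0)
        return step == total_steps;
    return step % acc_eval == 0;
}
```

With -steps 5 and -acc-eval-test 1, this returns 1 for every step from 1 to 5, which matches the "evaluate after each step" behaviour described above.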

TsetlinOptions.h

Some parameters are hard-coded in TsetlinOptions.h and require recompilation when changed.

option | type | value | description
FEATURES | int | (28*28) | number of input features; defaulted to MNIST
CLASSES | int | 10 | number of classes; defaulted to 10 for MNIST
CLAUSES | int | 200 | number of clauses per class
NUM_STATES | int | 100 | number of TA states per decision; exclude states are (-NUM_STATES+1) .. 0, include states are 1 .. NUM_STATES
LIT_LIMIT | 0 or 1 | 0 | toggle literal-limiting feedback algorithm
INPUT_DATA_PATH | char* | "pkbits" | path to input data directory
TRAIN_DATA_FMT | char* | "/mnist-train-cls%d.bin" | format of the train data input file (per-class, %d is replaced with the class index); the file is in pkbits format
TEST_DATA | char* | "/mnist-test.bin" | test data file name; the file is in pkbits format

MNIST training and test data is included in pkbits.zip.
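
The state ranges given for NUM_STATES can be illustrated with a short C sketch. It is not the repository's code: ta_includes and ta_step are hypothetical names, NUM_STATES is redefined locally only to keep the example self-contained, and saturating at the ends of the range is an assumption about how state updates behave.

```c
#include <assert.h>

/* Local copy for illustration; the real value is set in TsetlinOptions.h. */
#define NUM_STATES 100

/* A TA includes its literal when its state lies in 1 .. NUM_STATES
 * and excludes it when the state lies in (-NUM_STATES+1) .. 0. */
int ta_includes(int state)
{
    assert(state >= -NUM_STATES + 1 && state <= NUM_STATES);
    return state > 0;
}

/* Move the state by delta (+1 towards include, -1 towards exclude),
 * clamping the result to (-NUM_STATES+1) .. NUM_STATES (assumed behaviour). */
int ta_step(int state, int delta)
{
    int next = state + delta;
    if (next > NUM_STATES)
        next = NUM_STATES;
    if (next < -NUM_STATES + 1)
        next = -NUM_STATES + 1;
    return next;
}
```

Under these assumptions a TA sitting at state 0 excludes its literal, and a single step towards include (ta_step(0, +1)) moves it to state 1, where the literal is included.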

Build instructions

Using precompiled logger files

Logger headers TsetlinLogger.h and TsetlinLoggerDefs.h were pre-built for the logger.xml configuration. If you don't need to change logger functionality, you can use these precompiled files.

To build:

make quick

To clean:

make clean

Recompile logger files

Do not change TsetlinLogger.h and TsetlinLoggerDefs.h directly. If you need to modify the logger, edit logger.xml and rebuild the logger files.

To build:

make all

Requirements:

Edit the GEN_LOGGER variable in the makefile to set the path to AuxTsetlinTools.

To clean including TsetlinLogger.h and TsetlinLoggerDefs.h:

make cleanall

More information on logger generator and logger.xml specification: Logger XML specification

Plotting diagrams

Logged TM status variables can be plotted as SVG images using Jython scripts in the plots folder.

Requirements:

Download the required JAR files and put them into the plots folder. Modify the dataPath variable in the All.py script to point to the location of the acc.csv and *-status.csv files generated by the TM logging.

To plot diagrams, use:

java -jar jython-standalone-2.7.2.jar All.py > All.svg

There is a separate tool for drawing TA spectrum diagrams: TA State Spectrogram

About

Class-parallel C implementation of a Tsetlin Machine
