Library code | Paper | Original library
This repository contains the experiments for our paper on steerable PDOs. It makes use of our extension of the `e2cnn` library. Parts of the code are also reused from the experiment code for the `e2cnn` paper.
We use `poetry` to manage package dependencies, because this makes it easy to lock the precise package versions with which we tested the code. If you have `poetry` installed, just run `./install.sh`, which will create a new virtual environment and install all dependencies into it. You can activate the environment with `poetry shell`.
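Concretely, the whole setup is just these two commands:

```sh
./install.sh   # creates a fresh virtual environment and installs the locked dependencies
poetry shell   # activates that environment
```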
Note: if you want to recreate the figures from the paper using `figures.py`, you also need to install `seaborn`, which is not included in the `poetry` environment.
Of course you don't need to use `poetry`; you can also install the requirements yourself using `pip` or `conda`. You can find the list of required packages in `pyproject.toml`. In addition to the ones listed there, you will need to install the `RBF` package.
Finally, run

```sh
pip install git+https://github.com/ejnnr/steerable_pdos@pdo_econv
```

to install the version of the `e2cnn` library that we need for these experiments.
The entry point into the MNIST-rot experiments is `python main.py`. The simplest way to use this command is

```sh
python main.py +experiment=<experiment name>
```

where `<experiment name>` can be any combination of `{diffop,kernel,vanilla}_{3x3,5x5}`. For example, to reproduce our MNIST-rot result for 3x3 kernels, run `python main.py +experiment=kernel_3x3`.
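Spelled out, this pattern yields six base experiments:

```sh
python main.py +experiment=diffop_3x3
python main.py +experiment=diffop_5x5
python main.py +experiment=kernel_3x3
python main.py +experiment=kernel_5x5
python main.py +experiment=vanilla_3x3
python main.py +experiment=vanilla_5x5
```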
For differential operators, the default is FD discretization. If you want to use Gaussians, use `+model.smoothing=<standard deviation>`, e.g. `python main.py +experiment=diffop_5x5 +model.smoothing=1.3` (we used 1.3 for 5x5 kernels and 1 for 3x3 kernels). For RBF-FD, use `+model.rbffd=true` instead.
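For reference, the corresponding 3x3 Gaussian command and an RBF-FD run look like this:

```sh
python main.py +experiment=diffop_3x3 +model.smoothing=1   # Gaussian, 3x3 (std 1 as in the paper)
python main.py +experiment=diffop_5x5 +model.rbffd=true    # RBF-FD instead of FD
```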
To reproduce the restriction models (which start with D_N equivariance and restrict to C_N), use `+model.flip=true +model.restriction_layer=6` (this restricts at the 6th layer; you can change that number). For the quotient experiments, use `+model.quotient=true`.
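As a sketch (pairing these flags with the `kernel_5x5` experiment is only an illustrative choice, not a prescription):

```sh
# D_N -> C_N restriction at layer 6
python main.py +experiment=kernel_5x5 +model.flip=true +model.restriction_layer=6

# Quotient representations
python main.py +experiment=kernel_5x5 +model.quotient=true
```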
To use SO(2) irreps instead of regular representations, use `+model.group_order=-N`, where `N` is the maximum irrep frequency you want to use (`N = 3` is reasonable). Note the minus sign; without it, this would use C_N as the symmetry group.
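For instance (the base experiment here is again just an illustration):

```sh
# SO(2) irreps up to frequency 3; note the minus sign
python main.py +experiment=kernel_3x3 +model.group_order=-3
```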
Finally, our code also allows you to exactly imitate the PDO-eConv basis. To do so, add the `+model.pdo_econv=1` option, i.e.

```sh
python main.py +experiment=diffop_5x5 +model.pdo_econv=1
```

You can combine this with `+model.smoothing=...` to use Gaussian discretization.
But in general, the PDO-eConv basis is less flexible and thus cannot be combined
with all of the options described above. It also currently only supports 5x5 kernels.
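For example, combining it with the Gaussian smoothing value we used for 5x5 kernels:

```sh
python main.py +experiment=diffop_5x5 +model.pdo_econv=1 +model.smoothing=1.3
```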
`main.py` has many other options, so there are many architecture and hyperparameter choices that you can easily modify. For example, the following command illustrates a few of them:

```sh
python main.py +experiment=diffop_5x5 \
    trainer.max_epochs=50 \
    data.batch_size=32 \
    model.learning_rate=0.001 \
    model.maximum_order=2 \
    model.weight_decay=1e-4 \
    model.fc_dropout=0.2 \
    model.lr_decay=0.9 \
    model.optimizer=sgd \
    model.channels=\[20,30,40,40,50,70\]
```
In the `config/` directory, you can see some more options, as well as even more in `diffop_experiments/model.py`. Don't hesitate to contact me or file a GitHub issue if you'd like to try something not mentioned here; maybe it's already implemented.
Some further options that you will probably want to set (see the combined example after this list):

- `+trainer.gpus=1` to use the GPU
- `data.num_workers=<number of workers>` to use multiple workers and speed up training
- `seed=<random seed>`: a seed will always be used, by default it is 0. So if you want multiple runs, change the seed!
- `dir.run=<directory>` to save the logs for that run in `logs/<directory>`. For example, something like `dir.run=diffop/3x3/gaussian/<seed>` may be useful. By default, a directory based on the current date and time is created for each run.
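Putting these together, a full run might look like this (the worker count, seed, and directory layout are just illustrative values):

```sh
python main.py +experiment=diffop_5x5 \
    +trainer.gpus=1 \
    data.num_workers=4 \
    seed=1 \
    dir.run=diffop/5x5/fd/1
```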
For the STL-10 experiments, we simply reused the code for the experiments for the `e2cnn` paper, with only very small additions to support steerable PDOs. These experiments therefore have different entry points; see the original repo for details.
To run exactly those experiments that we ran for our paper, you can use `./run_stl.sh`. This will do six runs for each of the eight models we consider and will probably take on the order of 300 hours on a GPU (depending on your exact system, of course). For more flexibility, e.g. for multiple parallel runs, you can see the individual commands for each model in `./run_stl.sh`.
You can run `python figures.py` to reproduce the figures from our paper. You will have to install `seaborn` first. The figures will be saved into the `fig` folder.
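That is:

```sh
pip install seaborn   # not part of the poetry environment
python figures.py     # writes the figures into the fig/ folder
```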
The original `e2cnn` library and parts of the experiment code in this repository were developed as part of the General E(2)-Equivariant Steerable CNNs paper. The extension of the library to steerable PDOs and other parts of the experiment code were written for our steerable PDO paper. Please cite these papers if you find the code in this repository useful for your own work:
```bibtex
@inproceedings{e2cnn,
    title={{General E(2)-Equivariant Steerable CNNs}},
    author={Weiler, Maurice and Cesa, Gabriele},
    booktitle={Conference on Neural Information Processing Systems (NeurIPS)},
    year={2019},
}

@misc{jenner2021steerable,
    title={Steerable Partial Differential Operators for Equivariant Neural Networks},
    author={Erik Jenner and Maurice Weiler},
    year={2021},
    eprint={2106.10163},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```
All code in this repository is distributed under the BSD Clear license. See the LICENSE file.