Merge pull request #89 from pyt-team/topotune
Topotune Readme
gbg141 authored Oct 10, 2024
2 parents 0e5325a + d5e2ea9 commit 7d00bc4
Showing 6 changed files with 95 additions and 51 deletions.
48 changes: 46 additions & 2 deletions README.md
@@ -145,16 +145,60 @@ We list the neural networks trained and evaluated by `TopoBenchmarkX`, organized
### Combinatorial complexes
| Model | Reference |
| --- | --- |
-| GCCN | Generalized Combinatorial Complex Neural Networks |
+| GCCN | [Generalized Combinatorial Complex Neural Networks](https://arxiv.org/pdf/2410.06530) |

## :bulb: TopoTune

-We include TopoTune, a comprehensive framework for easily defining and training new, general TDL models (GCCNs, pictured below) on any domain using any (graph) neural network ω as a backbone, as well as reproducing existing models. To train and test a GCCN, it is sufficient to specify the choice of domain, neighborhood structure, and backbone model in the configuration. We provide scripts to reproduce a broad class of GCCNs in `scripts/topotune` and reproduce iterations of existing neural networks in `scripts/topotune/existing_models`, as previously reported.
+We include TopoTune, a comprehensive framework for easily defining and training new, general TDL models on any domain using any (graph) neural network ω as a backbone. The pre-print detailing this framework is [TopoTune: A Framework for Generalized Combinatorial Complex Neural Networks](https://arxiv.org/pdf/2410.06530). In a GCCN (pictured below), the input complex is represented as an ensemble of strictly augmented Hasse graphs, one per neighborhood of the complex. Each of these Hasse graphs is processed by a sub-model ω, and the outputs are aggregated rank-wise between layers.

<p align="center">
<img src="resources/gccn.jpg" width="700">
</p>
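As a rough illustration of the flow just described, one GCCN layer can be sketched as follows. This is a hypothetical minimal sketch, not TopoBenchmarkX code: `omega` stands for the backbone sub-model, per-graph features are simplified to scalars, and mean aggregation is assumed.

```python
from collections import defaultdict

def gccn_layer(hasse_graphs, omega):
    """hasse_graphs: list of (destination_rank, features) pairs, one per
    strictly augmented Hasse graph (i.e., one per neighborhood).
    omega: the backbone sub-model, applied independently to each graph."""
    per_rank = defaultdict(list)
    for destination_rank, features in hasse_graphs:
        # 1) Process each Hasse graph with the sub-model omega.
        per_rank[destination_rank].append(omega(features))
    # 2) Aggregate the sub-model outputs rank-wise (mean, as an example choice).
    return {rank: sum(outs) / len(outs) for rank, outs in per_rank.items()}
```

In the real model, each Hasse graph carries node features and edges derived from one neighborhood matrix of the complex, and ω is a full (graph) neural network rather than a scalar function.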

### Defining and training a GCCN
To implement and train a GCCN, run the following command with the desired choice of dataset, lifting domain (e.g., `cell`, `simplicial`), PyTorch Geometric backbone model (e.g., `GCN`, `GIN`, `GAT`, `GraphSAGE`) and its parameters (e.g., `model.backbone.GNN.num_layers=2`), neighborhood structure (routes), and other hyperparameters.


```shell
python -m topobenchmarkx \
dataset=graph/PROTEINS \
dataset.split_params.data_seed=1 \
model=cell/topotune \
model.tune_gnn=GCN \
model.backbone.GNN.num_layers=2 \
model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[2,1\],boundary\]\] \
model.backbone.layers=4 \
model.feature_encoder.out_channels=32 \
model.feature_encoder.proj_dropout=0.3 \
model.readout.readout_name=PropagateSignalDown \
logger.wandb.project=TopoTune_cell \
trainer.max_epochs=1000 \
callbacks.early_stopping.patience=50
```

To use a single augmented Hasse graph expansion, use `model={domain}/topotune_onehasse` instead of `model={domain}/topotune`.
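For instance, the PROTEINS command above would start as follows (illustrative fragment; the remaining overrides are unchanged):

```shell
python -m topobenchmarkx \
  dataset=graph/PROTEINS \
  model=cell/topotune_onehasse \
  model.tune_gnn=GCN
```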

To specify a set of neighborhoods (routes) on the complex, pass a list of neighborhoods, each specified as `\[\[{source_rank},{destination_rank}\],{neighborhood}\]` (the backslashes protect the brackets from the shell). Currently, the following options for `{neighborhood}` are supported:
- `up_laplacian`, from rank $r$ to $r$
- `down_laplacian`, from rank $r$ to $r$
- `boundary`, from rank $r$ to $r-1$
- `coboundary`, from rank $r$ to $r+1$
- `adjacency`, from rank $r$ to $r$ (stand-in for `up_adjacency`, as `down_adjacency` not yet supported in TopoBenchmarkX)
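To make the route syntax concrete, the following hypothetical helper (not part of TopoBenchmarkX) parses an unescaped routes string into `(source_rank, destination_rank, neighborhood)` triples:

```python
import ast
import re

def parse_routes(spec):
    """Parse a routes string like '[[[0,0],up_laplacian],[[2,1],boundary]]'
    into (source_rank, destination_rank, neighborhood) triples."""
    # Quote the bare neighborhood names so the spec is a valid Python literal.
    quoted = re.sub(r"[a-z_]+", lambda m: f"'{m.group(0)}'", spec)
    # Each route is [[source_rank, destination_rank], neighborhood].
    return [(src, dst, nbhd) for (src, dst), nbhd in ast.literal_eval(quoted)]
```

For example, the routes used in the training command above would parse to `(0, 0, 'up_laplacian')` and `(2, 1, 'boundary')`.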


### Using backbone models from any package
By default, backbone models are imported from `torch_geometric.nn.models`. To import and specify a backbone model from any other package, such as `torch.nn.Transformer` or `dgl.nn.GATConv`, it is sufficient to (1) make sure the package is installed in the environment and (2) specify it on the command line:

```shell
model.tune_gnn={backbone_model} \
model.backbone.GNN._target_={package}.{backbone_model}
```
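For instance, assuming DGL is installed, a hypothetical invocation could look like the fragment below. Note that `dgl.nn.GATConv`'s constructor arguments differ from those of PyTorch Geometric models, so additional backbone parameters may need to be overridden as well:

```shell
python -m topobenchmarkx \
  model=cell/topotune \
  model.tune_gnn=GATConv \
  model.backbone.GNN._target_=dgl.nn.GATConv \
  dataset=graph/PROTEINS
```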

### Reproducing experiments

We provide scripts to reproduce experiments on a broad class of GCCNs in [`scripts/topotune`](scripts/topotune) and reproduce iterations of existing neural networks in [`scripts/topotune/existing_models`](scripts/topotune/existing_models), as previously reported in the [TopoTune paper](https://arxiv.org/pdf/2410.06530).

We invite users interested in running extensive sweeps over new GCCNs to use the `--multirun` flag, as done in the scripts. This flag is a shortcut for running every possible combination of the comma-separated parameter values in a single command.
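For example, the following illustrative sweep (flags follow the same pattern as the scripts) launches 2 × 4 × 3 = 24 runs in one command:

```shell
python -m topobenchmarkx --multirun \
  model=cell/topotune,cell/topotune_onehasse \
  model.tune_gnn=GCN,GIN,GAT,GraphSAGE \
  dataset=graph/PROTEINS \
  dataset.split_params.data_seed=1,3,5
```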

## :rocket: Liftings

14 changes: 7 additions & 7 deletions scripts/topotune/existing_models/tune_cwn.sh
@@ -2,7 +2,7 @@ python -m topobenchmarkx \
model=cell/topotune_onehasse,cell/topotune \
model.tune_gnn=GCN,GIN,GAT,GraphSAGE \
model.backbone.GNN.num_layers=1 \
-model.backbone.routes=\[\[\[0,1\],coincidence\],\[\[1,1\],adjacency\],\[\[2,1\],incidence\]\] \
+model.backbone.routes=\[\[\[0,1\],coboundary\],\[\[1,1\],adjacency\],\[\[2,1\],boundary\]\] \
logger.wandb.project=TopoTune_CWN \
dataset=graph/MUTAG \
optimizer.parameters.lr=0.001 \
@@ -24,7 +24,7 @@ python -m topobenchmarkx \
model=cell/topotune_onehasse,cell/topotune \
model.tune_gnn=GCN,GIN,GAT,GraphSAGE \
model.backbone.GNN.num_layers=1 \
-model.backbone.routes=\[\[\[0,1\],coincidence\],\[\[1,1\],adjacency\],\[\[2,1\],incidence\]\] \
+model.backbone.routes=\[\[\[0,1\],coboundary\],\[\[1,1\],adjacency\],\[\[2,1\],boundary\]\] \
logger.wandb.project=TopoTune_CWN \
dataset=graph/NCI1 \
optimizer.parameters.lr=0.001 \
@@ -45,7 +45,7 @@ python -m topobenchmarkx \
model=cell/topotune_onehasse,cell/topotune \
model.tune_gnn=GCN,GIN,GAT,GraphSAGE \
model.backbone.GNN.num_layers=1 \
-model.backbone.routes=\[\[\[0,1\],coincidence\],\[\[1,1\],adjacency\],\[\[2,1\],incidence\]\] \
+model.backbone.routes=\[\[\[0,1\],coboundary\],\[\[1,1\],adjacency\],\[\[2,1\],boundary\]\] \
logger.wandb.project=TopoTune_CWN \
dataset=graph/NCI109 \
optimizer.parameters.lr=0.001 \
@@ -65,7 +65,7 @@ python -m topobenchmarkx \
model=cell/topotune_onehasse,cell/topotune \
model.tune_gnn=GCN,GIN,GAT,GraphSAGE \
model.backbone.GNN.num_layers=1 \
-model.backbone.routes=\[\[\[0,1\],coincidence\],\[\[1,1\],adjacency\],\[\[2,1\],incidence\]\] \
+model.backbone.routes=\[\[\[0,1\],coboundary\],\[\[1,1\],adjacency\],\[\[2,1\],boundary\]\] \
logger.wandb.project=TopoTune_CWN \
dataset=graph/ZINC \
optimizer.parameters.lr=0.001 \
@@ -90,7 +90,7 @@ python -m topobenchmarkx \
model=cell/topotune_onehasse,cell/topotune \
model.tune_gnn=GCN,GIN,GAT,GraphSAGE \
model.backbone.GNN.num_layers=1 \
-model.backbone.routes=\[\[\[0,1\],coincidence\],\[\[1,1\],adjacency\],\[\[2,1\],incidence\]\] \
+model.backbone.routes=\[\[\[0,1\],coboundary\],\[\[1,1\],adjacency\],\[\[2,1\],boundary\]\] \
logger.wandb.project=TopoTune_CWN \
dataset=graph/cocitation_citeseer \
optimizer.parameters.lr=0.001 \
@@ -111,7 +111,7 @@ python -m topobenchmarkx \
model=cell/topotune_onehasse,cell/topotune \
model.tune_gnn=GCN,GIN,GAT,GraphSAGE \
model.backbone.GNN.num_layers=1 \
-model.backbone.routes=\[\[\[0,1\],coincidence\],\[\[1,1\],adjacency\],\[\[2,1\],incidence\]\] \
+model.backbone.routes=\[\[\[0,1\],coboundary\],\[\[1,1\],adjacency\],\[\[2,1\],boundary\]\] \
logger.wandb.project=TopoTune_CWN \
dataset=graph/cocitation_pubmed \
optimizer.parameters.lr=0.01 \
@@ -134,7 +134,7 @@ python -m topobenchmarkx \
model=cell/topotune_onehasse,cell/topotune \
model.tune_gnn=GCN,GIN,GAT,GraphSAGE \
model.backbone.GNN.num_layers=1 \
-model.backbone.routes=\[\[\[0,1\],coincidence\],\[\[1,1\],adjacency\],\[\[2,1\],incidence\]\] \
+model.backbone.routes=\[\[\[0,1\],coboundary\],\[\[1,1\],adjacency\],\[\[2,1\],boundary\]\] \
logger.wandb.project=TopoTune_CWN \
dataset=graph/PROTEINS,graph/cocitation_cora \
optimizer.parameters.lr=0.001 \
16 changes: 8 additions & 8 deletions scripts/topotune/existing_models/tune_sccn.sh
@@ -5,7 +5,7 @@ python -m topobenchmarkx \
model.feature_encoder.out_channels=128 \
model.tune_gnn=GCN,GIN,GAT,GraphSAGE \
model.backbone.GNN.num_layers=1 \
-model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[0,1\],coincidence\],\[\[1,0\],incidence\],\[\[1,1\],down_laplacian\],\[\[1,1\],up_laplacian\],\[\[1,2\],coincidence\],\[\[2,1\],incidence\],\[\[2,2\],down_laplacian\]\] \
+model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[0,1\],coboundary\],\[\[1,0\],boundary\],\[\[1,1\],down_laplacian\],\[\[1,1\],up_laplacian\],\[\[1,2\],coboundary\],\[\[2,1\],boundary\],\[\[2,2\],down_laplacian\]\] \
model.backbone.layers=3 \
dataset.split_params.data_seed=1,3,5,7,9 \
model.readout.readout_name=NoReadOut \
@@ -28,7 +28,7 @@ python -m topobenchmarkx \
model.feature_encoder.out_channels=64 \
model.backbone.GNN.num_layers=1 \
model.tune_gnn=GCN,GIN,GAT,GraphSAGE \
-model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[0,1\],coincidence\],\[\[1,0\],incidence\],\[\[1,1\],down_laplacian\],\[\[1,1\],up_laplacian\],\[\[1,2\],coincidence\],\[\[2,1\],incidence\],\[\[2,2\],down_laplacian\]\] \
+model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[0,1\],coboundary\],\[\[1,0\],boundary\],\[\[1,1\],down_laplacian\],\[\[1,1\],up_laplacian\],\[\[1,2\],coboundary\],\[\[2,1\],boundary\],\[\[2,2\],down_laplacian\]\] \
model.backbone.layers=3 \
model.feature_encoder.proj_dropout=0.5 \
model.readout.readout_name=PropagateSignalDown \
@@ -51,7 +51,7 @@ python -m topobenchmarkx \
model.feature_encoder.out_channels=64 \
model.backbone.GNN.num_layers=1 \
model.tune_gnn=GCN,GIN,GAT,GraphSAGE \
-model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[0,1\],coincidence\],\[\[1,0\],incidence\],\[\[1,1\],down_laplacian\],\[\[1,1\],up_laplacian\],\[\[1,2\],coincidence\],\[\[2,1\],incidence\],\[\[2,2\],down_laplacian\]\] \
+model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[0,1\],coboundary\],\[\[1,0\],boundary\],\[\[1,1\],down_laplacian\],\[\[1,1\],up_laplacian\],\[\[1,2\],coboundary\],\[\[2,1\],boundary\],\[\[2,2\],down_laplacian\]\] \
model.backbone.layers=4 \
model.readout.readout_name=NoReadOut \
transforms.graph2simplicial_lifting.signed=True \
@@ -72,7 +72,7 @@ python -m topobenchmarkx \
python -m topobenchmarkx \
model=simplicial/topotune_onehasse,simplicial/topotune \
model.tune_gnn=GCN,GIN,GAT,GraphSAGE \
-model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[0,1\],coincidence\],\[\[1,0\],incidence\],\[\[1,1\],down_laplacian\],\[\[1,1\],up_laplacian\],\[\[1,2\],coincidence\],\[\[2,1\],incidence\],\[\[2,2\],down_laplacian\]\] \
+model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[0,1\],coboundary\],\[\[1,0\],boundary\],\[\[1,1\],down_laplacian\],\[\[1,1\],up_laplacian\],\[\[1,2\],coboundary\],\[\[2,1\],boundary\],\[\[2,2\],down_laplacian\]\] \
dataset=graph/PROTEINS \
optimizer.parameters.lr=0.01 \
model.feature_encoder.out_channels=128 \
@@ -95,7 +95,7 @@ python -m topobenchmarkx \
model=simplicial/topotune_onehasse,simplicial/topotune \
dataset=graph/ZINC \
model.tune_gnn=GCN,GIN,GAT,GraphSAGE \
-model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[0,1\],coincidence\],\[\[1,0\],incidence\],\[\[1,1\],down_laplacian\],\[\[1,1\],up_laplacian\],\[\[1,2\],coincidence\],\[\[2,1\],incidence\],\[\[2,2\],down_laplacian\]\] \
+model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[0,1\],coboundary\],\[\[1,0\],boundary\],\[\[1,1\],down_laplacian\],\[\[1,1\],up_laplacian\],\[\[1,2\],coboundary\],\[\[2,1\],boundary\],\[\[2,2\],down_laplacian\]\] \
optimizer.parameters.lr=0.001 \
model.feature_encoder.out_channels=128 \
model.backbone.layers=4 \
@@ -117,7 +117,7 @@ python -m topobenchmarkx \
python -m topobenchmarkx \
model=simplicial/topotune_onehasse,simplicial/topotune \
model.tune_gnn=GCN,GIN,GAT,GraphSAGE \
-model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[0,1\],coincidence\],\[\[1,0\],incidence\],\[\[1,1\],down_laplacian\],\[\[1,1\],up_laplacian\],\[\[1,2\],coincidence\],\[\[2,1\],incidence\],\[\[2,2\],down_laplacian\]\] \
+model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[0,1\],coboundary\],\[\[1,0\],boundary\],\[\[1,1\],down_laplacian\],\[\[1,1\],up_laplacian\],\[\[1,2\],coboundary\],\[\[2,1\],boundary\],\[\[2,2\],down_laplacian\]\] \
dataset=graph/cocitation_citeseer \
optimizer.parameters.lr=0.01 \
model.feature_encoder.out_channels=64 \
@@ -139,7 +139,7 @@ python -m topobenchmarkx \
model=simplicial/topotune_onehasse,simplicial/topotune \
model.tune_gnn=GCN,GIN,GAT,GraphSAGE \
model.backbone.GNN._target_=topobenchmarkx.nn.backbones.graph.IdentityGCN \
-model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[0,1\],coincidence\],\[\[1,0\],incidence\],\[\[1,1\],down_laplacian\],\[\[1,1\],up_laplacian\],\[\[1,2\],coincidence\],\[\[2,1\],incidence\],\[\[2,2\],down_laplacian\]\] \
+model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[0,1\],coboundary\],\[\[1,0\],boundary\],\[\[1,1\],down_laplacian\],\[\[1,1\],up_laplacian\],\[\[1,2\],coboundary\],\[\[2,1\],boundary\],\[\[2,2\],down_laplacian\]\] \
dataset=graph/cocitation_cora \
optimizer.parameters.lr=0.01 \
model.feature_encoder.out_channels=32 \
@@ -160,7 +160,7 @@ python -m topobenchmarkx \
python -m topobenchmarkx \
model=simplicial/topotune_onehasse,simplicial/topotune \
model.tune_gnn=GCN,GIN,GAT,GraphSAGE \
-model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[0,1\],coincidence\],\[\[1,0\],incidence\],\[\[1,1\],down_laplacian\],\[\[1,1\],up_laplacian\],\[\[1,2\],coincidence\],\[\[2,1\],incidence\],\[\[2,2\],down_laplacian\]\] \
+model.backbone.routes=\[\[\[0,0\],up_laplacian\],\[\[0,1\],coboundary\],\[\[1,0\],boundary\],\[\[1,1\],down_laplacian\],\[\[1,1\],up_laplacian\],\[\[1,2\],coboundary\],\[\[2,1\],boundary\],\[\[2,2\],down_laplacian\]\] \
dataset=graph/cocitation_pubmed \
optimizer.parameters.lr=0.01 \
model.feature_encoder.out_channels=64 \