A foundation model for the IceCube neutrino telescope, implementing masked modeling and transfer learning approaches.
## Features

- Memory-efficient data handling with memory-mapped datasets
- Multiple transformer architectures (Standard, Flash Attention, SwiGLU)
- Two-stage training: pretraining and finetuning
- Distributed training support with SLURM integration
- Automated checkpoint management and experiment tracking
## Installation

```bash
pip install -e .
```
## Data Preparation

We use a memory-mapped dataset, which allows the data to be loaded quickly and with a small memory footprint. The trade-off is that long sequences are subsampled to a fixed sequence length during preprocessing.
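For intuition, reading from a memory-mapped array only touches the slices you actually index, so batches can be fetched without loading the full file into RAM. This is a minimal sketch with hypothetical file names, shapes, and dtypes, not the actual dataset layout:

```python
import numpy as np

# Hypothetical layout: pulses stored as float32 with a fixed sequence length per event.
seq_len, num_features = 128, 4
pulses = np.memmap("pulses.dat", dtype=np.float32, mode="r")
pulses = pulses.reshape(-1, seq_len, num_features)

# Only the requested slice is read from disk; the rest stays on disk.
batch = np.asarray(pulses[:256])
print(batch.shape)  # (256, 128, 4)
```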
- Download the data from the Kaggle IceCube competition and save it to `<kaggle data path>`. It is convenient to use the Kaggle API to download the data (see the Kaggle API documentation for details):

  ```bash
  kaggle competitions download -c icecube-neutrinos-in-deep-ice
  ```

- Adjust the paths in the `configs/prepare_datasets.yaml` file.
- Run the preprocessing script:

  ```bash
  python scripts/prepare_memmaped_data.py --config_path configs/prepare_datasets.yaml
  ```
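The preprocessing step subsamples events that are longer than the target sequence length, as noted above. Below is a rough sketch of that idea with hypothetical names; the actual selection logic in `scripts/prepare_memmaped_data.py` may differ:

```python
import numpy as np

def subsample_event(pulses: np.ndarray, max_len: int, rng: np.random.Generator) -> np.ndarray:
    """Randomly keep at most `max_len` pulses of an event, preserving time order."""
    if len(pulses) <= max_len:
        return pulses
    keep = rng.choice(len(pulses), size=max_len, replace=False)
    return pulses[np.sort(keep)]

rng = np.random.default_rng(0)
event = rng.normal(size=(500, 4))              # hypothetical event: 500 pulses, 4 features
print(subsample_event(event, 128, rng).shape)  # (128, 4)
```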
## Configuration

Example configuration files are provided in `configs/*.example.yaml`. To use them:
- Copy the example configs to create your actual configs:

  ```bash
  cp configs/polarbert.example.yaml configs/polarbert.yaml
  cp configs/finetuning.example.yaml configs/finetuning.yaml
  ```
- Update the paths and parameters in your config files:
  - Set data directories
  - Adjust model parameters if needed
  - Configure training settings
  - Set the pretrained model path for finetuning
Note: The actual config files are excluded from git to avoid sharing system-specific paths.
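Before launching a run, it can help to sanity-check that the paths in your config actually resolve. A small sketch assuming PyYAML and a hypothetical top-level `data: {dir: ...}` section; adapt the key names to your config layout:

```python
from pathlib import Path

import yaml

with open("configs/polarbert.yaml") as f:
    cfg = yaml.safe_load(f)

# The key names below are hypothetical placeholders.
data_dir = Path(cfg.get("data", {}).get("dir", ""))
if not data_dir.is_dir():
    raise FileNotFoundError(f"Data directory not found: {data_dir}")
print("Config OK, data directory:", data_dir)
```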
## Pretraining

Pretrain the model on masked DOM prediction and charge regression:
```bash
# Local development
# --model_type options: base, flash, swiglu
python -m polarbert.pretraining \
    --config configs/polarbert.yaml \
    --model_type flash \
    --name my_experiment
```

```bash
# In a SLURM job script
srun python -m polarbert.pretraining \
    --config configs/polarbert.yaml \
    --model_type flash \
    --job_id "${SLURM_JOB_ID}"
```
Available model architectures:

- `base`: Standard Transformer
- `flash`: Flash Attention Transformer (recommended)
- `swiglu`: SwiGLU Activation Transformer
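For intuition, the pretraining objective combines a classification loss over the masked DOM IDs with a regression loss on the corresponding charges. The sketch below is illustrative only: the tensor shapes, masking rate, and loss weighting are assumptions, not the repository's implementation:

```python
import torch
import torch.nn.functional as F

def masked_pretraining_loss(dom_logits, charge_pred, dom_targets, charge_targets, mask, charge_weight=1.0):
    """Combine masked-DOM classification and charge regression on masked positions only.

    dom_logits:     (batch, seq, num_doms) predicted DOM logits
    charge_pred:    (batch, seq)           predicted charge
    dom_targets:    (batch, seq)           true DOM indices
    charge_targets: (batch, seq)           true charge
    mask:           (batch, seq) bool      True where the input was masked
    """
    dom_loss = F.cross_entropy(dom_logits[mask], dom_targets[mask])
    charge_loss = F.mse_loss(charge_pred[mask], charge_targets[mask])
    return dom_loss + charge_weight * charge_loss

# Tiny smoke test with random tensors (all shapes hypothetical).
B, S, D = 2, 16, 5160
loss = masked_pretraining_loss(
    torch.randn(B, S, D), torch.randn(B, S),
    torch.randint(D, (B, S)), torch.randn(B, S),
    torch.rand(B, S) < 0.15,
)
print(loss.item())
```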
## Finetuning

Finetune a pretrained model on directional prediction:

Update the checkpoint path in `configs/finetuning.yaml`:

```yaml
pretrained:
  checkpoint_path: '/path/to/your/checkpoint.pth'
  model_type: 'flash'      # same as pretraining
  freeze_backbone: false   # whether to freeze pretrained weights
```
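For reference, `freeze_backbone` typically amounts to disabling gradients on the pretrained parameters after the checkpoint is loaded. A hedged sketch in plain PyTorch; the `backbone` attribute name and the checkpoint layout are assumptions:

```python
import torch

def load_pretrained(model: torch.nn.Module, checkpoint_path: str, freeze_backbone: bool = False):
    """Load pretrained weights into `model` and optionally freeze them."""
    state = torch.load(checkpoint_path, map_location="cpu")
    # Some checkpoints wrap the weights in a 'state_dict' key; adjust as needed.
    if isinstance(state, dict) and "state_dict" in state:
        state = state["state_dict"]
    model.load_state_dict(state, strict=False)

    if freeze_backbone:
        # `backbone` is a hypothetical attribute name for the pretrained encoder.
        for param in model.backbone.parameters():
            param.requires_grad = False
    return model
```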
Start finetuning:

```bash
# Local development
python -m polarbert.finetuning \
    --config configs/finetuning.yaml \
    --name my_finetuning
```

```bash
# In a SLURM job script
srun python -m polarbert.finetuning \
    --config configs/finetuning.yaml \
    --job_id "${SLURM_JOB_ID}"
```
## Models and Checkpoints

Checkpoints are saved under:

- Pretraining: `checkpoints/<model_name>/`
- Finetuning: `checkpoints/finetuned_/`

Each training run saves:

- Best model based on validation loss
- Last model state
- Final model state
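As an illustration of keeping both a best (by validation loss) and a last checkpoint in plain PyTorch; the directory and file names below are hypothetical, and the repository's checkpoint management may differ:

```python
import os

import torch

class CheckpointManager:
    """Keep the best model (by validation loss) and the most recent state."""

    def __init__(self, out_dir: str = "checkpoints/my_experiment"):
        self.out_dir = out_dir
        self.best_val_loss = float("inf")
        os.makedirs(out_dir, exist_ok=True)

    def save(self, model: torch.nn.Module, val_loss: float) -> None:
        # Always overwrite the last state; keep the best state separately.
        torch.save(model.state_dict(), os.path.join(self.out_dir, "last.pth"))
        if val_loss < self.best_val_loss:
            self.best_val_loss = val_loss
            torch.save(model.state_dict(), os.path.join(self.out_dir, "best.pth"))
```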