A foundation model for the IceCube neutrino telescope, implementing masked modeling and transfer learning approaches.
## Features

- Memory-efficient data handling with memory-mapped datasets
- Multiple transformer architectures (Standard, Flash Attention, SwiGLU)
- Two-stage training: pretraining and finetuning
- Distributed training support with SLURM integration
- Automated checkpoint management and experiment tracking
## Installation

```bash
pip install -e .
```
## Data Preparation

We use a memory-mapped dataset, which allows the data to be loaded quickly and with a small memory footprint. The trade-off is that long sequences are subsampled to a fixed sequence length during preprocessing.
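For intuition, reading from a memory-mapped array only touches the slices you actually index, so batches can be fetched without loading the full file into RAM. This is a minimal sketch with hypothetical file names, shapes, and dtypes, not the actual dataset layout:

```python
import numpy as np

# Hypothetical layout: pulses stored as float32 with a fixed sequence length per event.
seq_len, num_features = 128, 4
pulses = np.memmap("pulses.dat", dtype=np.float32, mode="r")
pulses = pulses.reshape(-1, seq_len, num_features)

# Only the requested slice is read from disk; the rest stays on disk.
batch = np.asarray(pulses[:256])
print(batch.shape)  # (256, 128, 4)
```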
- Download the data from the Kaggle IceCube competition and save it to `<kaggle data path>`. It is convenient to use the Kaggle API to download the data (see the Kaggle API documentation for details):

  ```bash
  kaggle competitions download -c icecube-neutrinos-in-deep-ice
  ```

- Adjust the paths in the `configs/prepare_datasets.yaml` file.
- Run the preprocessing script:

  ```bash
  python scripts/prepare_memmaped_data.py --config_path configs/prepare_datasets.yaml
  ```
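The preprocessing step subsamples events that are longer than the target sequence length, as noted above. Below is a rough sketch of that idea with hypothetical names; the actual selection logic in `scripts/prepare_memmaped_data.py` may differ:

```python
import numpy as np

def subsample_event(pulses: np.ndarray, max_len: int, rng: np.random.Generator) -> np.ndarray:
    """Randomly keep at most `max_len` pulses of an event, preserving time order."""
    if len(pulses) <= max_len:
        return pulses
    keep = rng.choice(len(pulses), size=max_len, replace=False)
    return pulses[np.sort(keep)]

rng = np.random.default_rng(0)
event = rng.normal(size=(500, 4))              # hypothetical event: 500 pulses, 4 features
print(subsample_event(event, 128, rng).shape)  # (128, 4)
```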
## Configuration

Example configuration files are provided in `configs/*.example.yaml`. To use them:
- Copy the example configs to create your actual configs:

  ```bash
  cp configs/polarbert.example.yaml configs/polarbert.yaml
  cp configs/finetuning.example.yaml configs/finetuning.yaml
  ```
- Update the paths and parameters in your config files:
  - Set data directories
  - Adjust model parameters if needed
  - Configure training settings
  - Set the pretrained model path for finetuning
Note: The actual config files are excluded from git to avoid sharing system-specific paths.
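Before launching a run, it can help to sanity-check that the paths in your config actually resolve. A small sketch assuming PyYAML and a hypothetical top-level `data: {dir: ...}` section; adapt the key names to your config layout:

```python
from pathlib import Path

import yaml

with open("configs/polarbert.yaml") as f:
    cfg = yaml.safe_load(f)

# The key names below are hypothetical placeholders.
data_dir = Path(cfg.get("data", {}).get("dir", ""))
if not data_dir.is_dir():
    raise FileNotFoundError(f"Data directory not found: {data_dir}")
print("Config OK, data directory:", data_dir)
```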
## Pretraining

Pretrain the model on masked DOM prediction and charge regression:
```bash
# Local development
# --model_type options: base, flash, swiglu
python -m polarbert.pretraining \
    --config configs/polarbert.yaml \
    --model_type flash \
    --name my_experiment
```

```bash
# In a SLURM job script
srun python -m polarbert.pretraining \
    --config configs/polarbert.yaml \
    --model_type flash \
    --job_id "${SLURM_JOB_ID}"
```
Available model architectures:

- `base`: Standard Transformer
- `flash`: Flash Attention Transformer (recommended)
- `swiglu`: SwiGLU Activation Transformer
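For intuition, the pretraining objective combines a classification loss over the masked DOM IDs with a regression loss on the corresponding charges. The sketch below is illustrative only: the tensor shapes, masking rate, and loss weighting are assumptions, not the repository's implementation:

```python
import torch
import torch.nn.functional as F

def masked_pretraining_loss(dom_logits, charge_pred, dom_targets, charge_targets, mask, charge_weight=1.0):
    """Combine masked-DOM classification and charge regression on masked positions only.

    dom_logits:     (batch, seq, num_doms) predicted DOM logits
    charge_pred:    (batch, seq)           predicted charge
    dom_targets:    (batch, seq)           true DOM indices
    charge_targets: (batch, seq)           true charge
    mask:           (batch, seq) bool      True where the input was masked
    """
    dom_loss = F.cross_entropy(dom_logits[mask], dom_targets[mask])
    charge_loss = F.mse_loss(charge_pred[mask], charge_targets[mask])
    return dom_loss + charge_weight * charge_loss

# Tiny smoke test with random tensors (all shapes hypothetical).
B, S, D = 2, 16, 5160
loss = masked_pretraining_loss(
    torch.randn(B, S, D), torch.randn(B, S),
    torch.randint(D, (B, S)), torch.randn(B, S),
    torch.rand(B, S) < 0.15,
)
print(loss.item())
```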
## Finetuning

Finetune a pretrained model on directional prediction:

Update the checkpoint path in `configs/finetuning.yaml`:

```yaml
pretrained:
  checkpoint_path: '/path/to/your/checkpoint.pth'
  model_type: 'flash'      # same as pretraining
  freeze_backbone: false   # whether to freeze pretrained weights
```
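For reference, `freeze_backbone` typically amounts to disabling gradients on the pretrained parameters after the checkpoint is loaded. A hedged sketch in plain PyTorch; the `backbone` attribute name and the checkpoint layout are assumptions:

```python
import torch

def load_pretrained(model: torch.nn.Module, checkpoint_path: str, freeze_backbone: bool = False):
    """Load pretrained weights into `model` and optionally freeze them."""
    state = torch.load(checkpoint_path, map_location="cpu")
    # Some checkpoints wrap the weights in a 'state_dict' key; adjust as needed.
    if isinstance(state, dict) and "state_dict" in state:
        state = state["state_dict"]
    model.load_state_dict(state, strict=False)

    if freeze_backbone:
        # `backbone` is a hypothetical attribute name for the pretrained encoder.
        for param in model.backbone.parameters():
            param.requires_grad = False
    return model
```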
Start finetuning:

```bash
# Local development
python -m polarbert.finetuning \
    --config configs/finetuning.yaml \
    --name my_finetuning
```

```bash
# In a SLURM job script
srun python -m polarbert.finetuning \
    --config configs/finetuning.yaml \
    --job_id "${SLURM_JOB_ID}"
```
## Models and Checkpoints

Checkpoints are saved under:

- Pretraining: `checkpoints/<model_name>/`
- Finetuning: `checkpoints/finetuned_/`

Each training run saves:

- Best model based on validation loss
- Last model state
- Final model state
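As an illustration of keeping both a best (by validation loss) and a last checkpoint in plain PyTorch; the directory and file names below are hypothetical, and the repository's checkpoint management may differ:

```python
import os

import torch

class CheckpointManager:
    """Keep the best model (by validation loss) and the most recent state."""

    def __init__(self, out_dir: str = "checkpoints/my_experiment"):
        self.out_dir = out_dir
        self.best_val_loss = float("inf")
        os.makedirs(out_dir, exist_ok=True)

    def save(self, model: torch.nn.Module, val_loss: float) -> None:
        # Always overwrite the last state; keep the best state separately.
        torch.save(model.state_dict(), os.path.join(self.out_dir, "last.pth"))
        if val_loss < self.best_val_loss:
            self.best_val_loss = val_loss
            torch.save(model.state_dict(), os.path.join(self.out_dir, "best.pth"))
```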