Example notebooks to replicate the experiments from the Era Splitting paper. All notebooks should run top to bottom.
Four experiments:
- Shifted Sine Wave
- Synthetic Memorization (Parascandolo et al., https://arxiv.org/abs/2009.00329)
- Camelyon17 (Bandi et al., https://pubmed.ncbi.nlm.nih.gov/30716025/)
- Numerai
All experiments follow the same procedure: draw a random sample from a grid of parameters, train and evaluate every sampled configuration, and view the results.
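The procedure can be sketched in plain Python. This is only an illustrative outline, not the code used in the notebooks: the grid `PARAM_GRID`, its parameter names and values, and the `train_and_evaluate` placeholder are all hypothetical stand-ins for whatever model and metric a given notebook uses.

```python
import itertools
import random

# Hypothetical parameter grid; names and values are illustrative only
# and are not the grids used in the notebooks.
PARAM_GRID = {
    "learning_rate": [0.01, 0.1, 1.0],
    "max_depth": [3, 5, 7],
    "n_estimators": [50, 100],
}


def sample_configurations(grid, n_samples, seed=0):
    """Draw up to n_samples distinct configurations uniformly from the grid."""
    keys = sorted(grid)
    all_configs = [
        dict(zip(keys, values))
        for values in itertools.product(*(grid[k] for k in keys))
    ]
    rng = random.Random(seed)
    return rng.sample(all_configs, min(n_samples, len(all_configs)))


def train_and_evaluate(config):
    """Placeholder score (assumption): stands in for fitting a model and
    computing a validation metric. Higher is better."""
    return -abs(config["learning_rate"] - 0.1) - abs(config["max_depth"] - 5) / 10


def run_experiment(grid, n_samples, seed=0):
    """Sample configurations, score each one, and return results best-first."""
    results = [
        (train_and_evaluate(config), config)
        for config in sample_configurations(grid, n_samples, seed)
    ]
    results.sort(key=lambda pair: pair[0], reverse=True)
    return results


if __name__ == "__main__":
    for score, config in run_experiment(PARAM_GRID, n_samples=5):
        print(round(score, 3), config)
```

Sampling without replacement from the enumerated grid keeps every configuration distinct; for grids too large to enumerate, drawing each parameter independently would be the usual alternative.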
Era Splitting Paper: https://arxiv.org/abs/2309.14496
The version of Python used for the experiments was 3.8.16, but this software should work with newer versions as well.
See requirements.txt for package requirements.
Era splitting is implemented in a fork of scikit-learn, located at the following repository:
https://github.com/jefferythewind/scikit-learn-erasplit
The quick start below will load all the packages required to run the notebooks in a new conda environment.
Here is how to get started, assuming you have a version of Anaconda installed:
conda create -n erasplit python==3.8.16
conda activate erasplit
python -m pip install -r requirements.txt
To make the environment show up in Jupyter, register it as a kernel:
python -m ipykernel install --user --name=erasplit
That's it! You should be ready to run the notebooks after that.
Era splitting is part of the forked scikit-learn library, which pip installs automatically from the requirements file. To install only the scikit-learn fork with era splitting, use this command:
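Before opening the notebooks, a quick sanity check of the active environment can save some debugging. The sketch below uses only the standard library; the helper name `check_environment` is hypothetical, and the check only confirms that the `sklearn` and `ipykernel` packages are importable, not that the installed scikit-learn is the era-splitting fork.

```python
import importlib.util
import sys


def check_environment(required=("sklearn", "ipykernel")):
    """Report the running Python version and whether each package is importable."""
    report = {"python": "%d.%d.%d" % sys.version_info[:3]}
    for name in required:
        # find_spec returns None when a top-level package cannot be found.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(key, value)
```

Run it with the `erasplit` environment activated; a `False` next to a package name suggests the requirements were installed into a different environment.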
python -m pip install 'scikit-learn @ git+https://github.com/jefferythewind/scikit-learn-erasplit.git'
@misc{delise2023era,
title={Era Splitting},
author={Timothy DeLise},
year={2023},
eprint={2309.14496},
archivePrefix={arXiv},
primaryClass={cs.LG}
}