For developing and reproducing ML + HEP projects.
JetNet is an effort to increase accessibility and reproducibility in jet-based machine learning.
Currently we provide:
- Easy-to-access and standardised interfaces for jet datasets, including JetNet and TopTagging.
- Standard implementations of generative evaluation metrics (Ref. [1, 2]), including:
  - Fréchet physics distance (FPD)
  - Kernel physics distance (KPD)
  - Wasserstein-1 (W1)
  - Fréchet ParticleNet Distance (FPND)
  - coverage and minimum matching distance (MMD)
- Loss functions:
  - Differentiable implementation of the energy mover's distance [3]
- And more general jet utilities.
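To give a feel for one of the metrics listed above, here is a minimal sketch of the Wasserstein-1 distance between two equal-sized one-dimensional samples. It is an illustration only, not JetNet's implementation, and the function name `w1_1d` is hypothetical.

```python
def w1_1d(sample_a, sample_b):
    """Wasserstein-1 distance between two equal-sized 1D empirical samples."""
    if len(sample_a) != len(sample_b):
        raise ValueError("this sketch assumes equal sample sizes")
    # For equal-sized 1D samples, the optimal transport plan pairs the
    # sorted values, so W1 is the mean absolute difference of those pairs.
    a, b = sorted(sample_a), sorted(sample_b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Shifting every value by 1 gives a distance of exactly 1.
print(w1_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))  # 1.0
```

A lower value means the two distributions of the chosen feature are closer.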
Additional functionality is under development; please reach out if you're interested in contributing!
JetNet can be installed with pip:
pip install jetnet
To use the differentiable EMD loss jetnet.losses.EMDLoss, additional libraries must be installed via:
pip install "jetnet[emdloss]"
Finally, PyTorch Geometric must be installed independently for the Fréchet ParticleNet Distance metric jetnet.evaluation.fpnd (see the PyTorch Geometric installation instructions).
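After installing, it can be handy to check which optional dependencies are actually importable. The sketch below uses only the standard library. The helper `missing_modules` is hypothetical, and the import names in the list are assumptions (`torch_geometric` for PyTorch Geometric, and `qpth` and `cvxpy` as the libraries behind the EMD loss).

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# Import names assumed for the package and its optional dependencies.
optional = ["jetnet", "torch_geometric", "qpth", "cvxpy"]
print("Not importable:", missing_modules(optional))
```

An empty list means every module in `optional` was found in the current environment.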
Datasets can be downloaded and accessed quickly, for example:
from jetnet.datasets import JetNet, TopTagging

# as numpy arrays:
particle_data, jet_data = JetNet.getData(
    jet_type=["g", "q"], data_dir="./datasets/jetnet/", download=True
)

# or as a PyTorch dataset:
dataset = TopTagging(
    jet_type="all", data_dir="./datasets/toptagging/", split="train", download=True
)
Evaluation metrics can be used as follows:
import numpy as np
import jetnet
generated_jets = np.random.rand(50000, 30, 3)
fpnd_score = jetnet.evaluation.fpnd(generated_jets, jet_type="g")
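To illustrate what two of the other listed metrics measure, here is a toy sketch of coverage and minimum matching distance (MMD), following the definitions commonly used for point clouds. It is not JetNet's implementation: the function `coverage_and_mmd` is hypothetical, and the distance is a user-supplied callable on toy scalar "samples" rather than a distance between jets.

```python
def coverage_and_mmd(real, generated, distance):
    """Toy coverage and minimum matching distance for two sample lists."""
    # Coverage: match each generated sample to its nearest real sample and
    # report the fraction of real samples that were matched at least once.
    matched = set()
    for g in generated:
        nearest = min(range(len(real)), key=lambda i: distance(real[i], g))
        matched.add(nearest)
    coverage = len(matched) / len(real)

    # MMD: for each real sample, take the distance to its closest generated
    # sample, then average over the real samples.
    mmd = sum(min(distance(r, g) for g in generated) for r in real) / len(real)
    return coverage, mmd


# Both generated values sit near the first real value, so only half of the
# real samples are covered.
print(coverage_and_mmd([0, 10], [1, 2], lambda a, b: abs(a - b)))  # (0.5, 4.5)
```

Higher coverage indicates that the generated samples span more of the real distribution, while a lower MMD indicates that each real sample has a close generated counterpart.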
Loss functions can be initialized and used like PyTorch's built-in losses such as MSE:
import jetnet
# real_jets and generated_jets are assumed to be PyTorch tensors of jets with 30 particles each
emd_loss = jetnet.losses.EMDLoss(num_particles=30)
loss = emd_loss(real_jets, generated_jets)
loss.backward()
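For reference, the energy mover's distance of Ref. [3] between two events with particle energies $E_i$ and $E'_j$ (transverse momenta at a hadron collider) can be written as the optimal transport problem below. This is the definition from the paper; how `EMDLoss` parametrises or solves it is not specified here.

```latex
\mathrm{EMD}(\mathcal{E}, \mathcal{E}') =
  \min_{\{f_{ij} \ge 0\}} \sum_{i,j} f_{ij} \frac{\theta_{ij}}{R}
  + \Bigl| \sum_i E_i - \sum_j E'_j \Bigr|,
\quad \text{subject to} \quad
\sum_j f_{ij} \le E_i, \quad
\sum_i f_{ij} \le E'_j, \quad
\sum_{i,j} f_{ij} = \min\Bigl( \sum_i E_i, \sum_j E'_j \Bigr).
```

Here $f_{ij}$ is the energy moved from particle $i$ in one event to particle $j$ in the other, $\theta_{ij}$ is the angular distance between them, and $R$ sets the relative cost of moving energy versus creating or destroying it.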
The full API reference and tutorials are available at jetnet.readthedocs.io. Tutorial notebooks are in the tutorials folder, with more to come.
We welcome feedback and contributions! Please feel free to create an issue for bugs or functionality requests, or open pull requests from your forked repo to solve them.
Perform an editable installation of the package from inside your forked repo and install the pytest package for unit testing:
pip install -e .
pip install pytest
Run the test suite to ensure everything is working as expected:
pytest tests # tests all datasets
pytest tests -m "not slow" # tests only on the JetNet dataset for convenience
If you use this library for your research, please cite our article in the Journal of Open Source Software:
@article{Kansal_JetNet_2023,
author = {Kansal, Raghav and Pareja, Carlos and Hao, Zichun and Duarte, Javier},
doi = {10.21105/joss.05789},
journal = {Journal of Open Source Software},
number = {90},
pages = {5789},
title = {{JetNet: A Python package for accessing open datasets and benchmarking machine learning methods in high energy physics}},
url = {https://joss.theoj.org/papers/10.21105/joss.05789},
volume = {8},
year = {2023}
}
Please also cite the following where they are relevant to the components of the library you use.
@inproceedings{Kansal_MPGAN_2021,
author = {Kansal, Raghav and Duarte, Javier and Su, Hao and Orzari, Breno and Tomei, Thiago and Pierini, Maurizio and Touranakou, Mary and Vlimant, Jean-Roch and Gunopulos, Dimitrios},
booktitle = "{Advances in Neural Information Processing Systems}",
editor = {M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan},
pages = {23858--23871},
publisher = {Curran Associates, Inc.},
title = {Particle Cloud Generation with Message Passing Generative Adversarial Networks},
url = {https://proceedings.neurips.cc/paper_files/paper/2021/file/c8512d142a2d849725f31a9a7a361ab9-Paper.pdf},
volume = {34},
year = {2021},
eprint = {2106.11535},
archivePrefix = {arXiv},
}
@article{Kansal_Evaluating_2023,
author = {Kansal, Raghav and Li, Anni and Duarte, Javier and Chernyavskaya, Nadezda and Pierini, Maurizio and Orzari, Breno and Tomei, Thiago},
title = {Evaluating generative models in high energy physics},
reportNumber = "FERMILAB-PUB-22-872-CMS-PPD",
doi = "10.1103/PhysRevD.107.076017",
journal = "{Phys. Rev. D}",
volume = "107",
number = "7",
pages = "076017",
year = "2023",
eprint = "2211.10295",
archivePrefix = "arXiv",
}
If you use the EMD loss, please cite the respective qpth or cvxpy libraries, depending on the method used (qpth by default), as well as the original EMD paper [3].
[1] R. Kansal et al., Particle Cloud Generation with Message Passing Generative Adversarial Networks, NeurIPS 2021 [2106.11535].
[2] R. Kansal et al., Evaluating Generative Models in High Energy Physics, Phys. Rev. D 107 (2023) 076017 [2211.10295].
[3] P. T. Komiske, E. M. Metodiev, and J. Thaler, The Metric Space of Collider Events, Phys. Rev. Lett. 123 (2019) 041801 [1902.02346].