
EigenPro

EigenPro [1-3] is a GPU-enabled fast and scalable solver for training kernel machines. It applies a projected stochastic gradient method with dual preconditioning to enable major speed-ups. It is currently based on a PyTorch backend.
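
As a rough illustration of the stochastic gradient idea behind kernel-machine training, the sketch below runs plain SGD with batch size 1 on the square loss. It is not EigenPro's algorithm: it has no preconditioning and no projection, and it uses the training points themselves as centers. The function names and the toy kernel matrix are hypothetical and are not part of the EigenPro package.

```python
import random


def predict(K_rows, alpha):
    """Evaluate f(x_j) = sum_i alpha[i] * K[j][i] for each given kernel row."""
    return [sum(a * k for a, k in zip(alpha, row)) for row in K_rows]


def sgd_kernel_regression(K, y, lr=1.0, steps=500, seed=0):
    """Plain SGD (batch size 1) on the square loss of a kernel machine.

    K is the n x n kernel matrix on the training points, which are also
    used as the centers here (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    n = len(y)
    alpha = [0.0] * n
    for _ in range(steps):
        j = rng.randrange(n)  # sample one training point
        residual = sum(a * k for a, k in zip(alpha, K[j])) - y[j]
        # The functional gradient of the square loss at sample j is
        # residual * K(x_j, .), so only coefficient j changes.
        alpha[j] -= lr * residual
    return alpha


# Toy symmetric positive-definite kernel matrix with unit diagonal.
K = [[1.0, 0.5, 0.2],
     [0.5, 1.0, 0.5],
     [0.2, 0.5, 1.0]]
y = [1.0, 0.0, -1.0]

alpha = sgd_kernel_regression(K, y)
fitted = predict(K, alpha)
```

With a unit kernel diagonal and lr = 1, each step fits the sampled point exactly, so the loop reduces to a randomized Gauss-Seidel iteration on K alpha = y. Plain iterations of this kind tend to converge slowly on ill-conditioned kernel matrices, which is the motivation for the preconditioning discussed in [2, 3].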

Highlights

  • Fast: EigenPro is the fastest kernel method at large scale.
  • Plug-and-play: Our method learns a quality model with little hyper-parameter tuning in most cases.
  • Scalable: The training time of one epoch is nearly linear in both model size and data size. This is the first kernel method that achieves such scalability without any compromise on testing performance.

Coming Soon

  • Multi-GPU and model parallelism: We are adding support for training across multiple GPUs and for model parallelism.

Usage

Installation

pip install git+ssh://git@github.com/EigenPro/EigenPro.git@main

Run Example

Linux:

bash examples/run_fmnist.sh

Windows:

examples\run_fmnist.bat

Jupyter Notebook: examples/notebook.ipynb

See files under examples/ for more details.

Empirical Results

In the experiments described below, P denotes the number of centers (the model size) and d denotes the ambient dimension. All experiments use a Laplacian kernel with a bandwidth of 20.0.
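
To make the notation concrete, here is a minimal dependency-free sketch of a Laplacian kernel and of the n x P kernel matrix between n samples and P centers in d dimensions. The parameterization exp(-||x - z||_2 / bandwidth) is an assumption about how the bandwidth enters, and the function names are hypothetical; the package itself runs on PyTorch and may define the kernel differently.

```python
import math


def laplacian_kernel(x, z, bandwidth=20.0):
    """k(x, z) = exp(-||x - z||_2 / bandwidth) (assumed parameterization)."""
    return math.exp(-math.dist(x, z) / bandwidth)


def kernel_matrix(samples, centers, bandwidth=20.0):
    """Return the n x P matrix of kernel values between samples and centers."""
    return [[laplacian_kernel(x, z, bandwidth) for z in centers]
            for x in samples]


# P = 2 centers and n = 1 sample in d = 3 dimensions.
centers = [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]]
samples = [[0.0, 0.0, 0.0]]

K = kernel_matrix(samples, centers)
```

The sample coincides with the first center, so that entry is 1; the second center lies at Euclidean distance 5, so that entry is exp(-5 / 20).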

1. CIFAR5M Extracted Features on single GPU

We used features extracted with the pretrained 'mobilenet-2' network available in the timm library. Each benchmark ran one epoch over the full 5 million samples of CIFAR5M (d = 1280), comparing two versions of EigenPro with FALKON [4-6]. All of these experiments were run on a single A100 GPU. The maximum RAM we had access to was 1.2TB, which was not sufficient to run FALKON with 1M centers.

Example Image

2. LibriSpeech Extracted Features on single GPU

We used 10 million samples with d = 1024, running one epoch for two versions of EigenPro and for FALKON. The features were extracted using an acoustic model (a VGG+BLSTM architecture in [7]) to align the length of audio and text. All of these experiments were run on a single V100 GPU. The maximum RAM available for this experiment was 300GB, which was not sufficient to run FALKON with more than 128K centers.

Example Image


References

  1. Amirhesam Abedsoltan, Mikhail Belkin, Parthe Pandit, "Toward Large Kernel Models," Proceedings of the 40th International Conference on Machine Learning, ICML'23, JMLR.org, 2023. Link
  2. Siyuan Ma, Mikhail Belkin, "Kernel machines that adapt to GPUs for effective large batch training," Proceedings of the 2nd SysML Conference, 2019. Link
  3. Siyuan Ma, Mikhail Belkin, "Diving into the shallows: a computational perspective on large-scale shallow learning," Advances in Neural Information Processing Systems 30 (NeurIPS 2017). Link
  4. Giacomo Meanti, Luigi Carratino, Lorenzo Rosasco, Alessandro Rudi, “Kernel methods through the roof: handling billions of points efficiently,” Advances in Neural Information Processing Systems, 2020. Link
  5. Alessandro Rudi, Luigi Carratino, Lorenzo Rosasco, “FALKON: An optimal large scale kernel method,” Advances in Neural Information Processing Systems, 2017. Link
  6. Ulysse Marteau-Ferey, Francis Bach, Alessandro Rudi, “Globally Convergent Newton Methods for Ill-conditioned Generalized Self-concordant Losses,” Advances in Neural Information Processing Systems, 2019. Link
  7. Hui, L. and Belkin, M. "Evaluation of Neural Architectures Trained with Square Loss vs Cross-Entropy in Classification Tasks." In International Conference on Learning Representations, 2021. Link

Cite us
