
NTK-approximating MLP Fusion for Efficient Language Model Fine-tuning

This is the code repository for the paper "NTK-approximating MLP Fusion for Efficient Language Model Fine-tuning" (ICML 2023).

Main Idea

We propose to create a lightweight pretrained language model by clustering sub-MLPs into centroids that can be restored as a compressed MLP, which closely approximates the neural tangent kernel (NTK) of the original MLP. We validate our method on both natural language understanding (NLU) and natural language generation (NLG) tasks.


Run Code

NLU Tasks

Navigate to "scripts/run_glue.sh", and edit the parameters, including but limited to:

  • "model_name_or_path": select a model

  • "task_name": NLU benchmark's name (e.g. "sst2", "mnli").

  • "distill": whether to enable distillation.

  • "max_seq_length": maximum sequence length(to be padded/truncated to).

  • "ffn_mode": chosen reduction/sketching method (eg. "sketch", "cluster", "mmd").

  • "sketch_layers": MLP layers that the sketching/reduction methods applied to.

  • "ffn_bn": feedforward network's bottleneck layer's dimension.

  • "mid_dim": intermediate dimension.

  • "re_init_layers": chosen re-initialize layers.

  • "seed": chosen random seed.

  • "num_train_epochs": number of training epochs.

  • "learning_rate": chosen learning rate.

  • "metric_for_best_model": metric for the chosen NLU benchmark.

Then, run in the command prompt:

sh run_glue.sh

It will run the "run_glue.py" with the set of modified configurations, which validates the chosen reduction/sketching method on the selected NLU benchmark.

NLG Tasks

To choose a set of configurations for the task, navigate to "nlg/scripts/run_nlg.sh". Within this file, you can use any of the configurations available in "nlg/configs" by modifying the parameters. To create your own set of configurations, add a new .json file to the "nlg/configs" directory and customize it to your specific requirements.

Some important parameters in a configuration (a sketch of a possible configuration file follows the list):

  • "seed": random seed.

  • "task": task to be executed.

  • "model_name": model's name.

  • "n_epochs": number of training epochs.

  • "train_batch_size/valid_batch_size": batch size for training/validation.

  • "lr": learning rate.

  • "ffn_mode": chosen reduction/sketching method (e.g. "sketch", "cluster", "mmd").

  • "sketch_layers": MLP layers that the sketching/reduction methods applied to.

  • "ffn_bn": feedforward network's bottleneck layer's dimension.

  • "mid_dim": intermediate dimension.

After adding your configuration, execute the task by running in the command prompt:

sh run_nlg.sh

It will run "train.py" and "evaluate.py" with whatever configurations assigned to trainings and validations.

Citation

Please cite our paper if you find the code or paper useful.

@InProceedings{pmlr-v202-wei23b,
  title = 	 {NTK-approximating MLP Fusion for Efficient Language Model Fine-tuning},
  author =       {Wei, Tianxin and Guo, Zeming and Chen, Yifan and He, Jingrui},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {36821--36838},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR}
}
