MicroAdam

This repository contains the code to reproduce the results of the paper "MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence".

We provide code to reproduce the following experiments:

- BERT-Base/Large and OPT-1.3B on GLUE/MNLI, using the HuggingFace repository
- Llama-2 7B on GSM8k, using llm-foundry from MosaicML

Installation

The MicroAdam optimizer is implemented in the ISTA-DASLab-Optimizers repository, along with other optimizers. It can be installed with pip install ista-daslab-optimizers (the install.sh script does this automatically). Follow the steps below to set up the environment for MicroAdam:

cd ~
git clone git@github.com:IST-DASLab/MicroAdam.git
cd ~/MicroAdam
source install.sh
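
After installation, it can be useful to confirm that the optimizer package is visible to pip. The following is a minimal sketch, assuming the environment prepared by install.sh is the active one; pkg_installed is a hypothetical helper name, not something the repository provides.

```shell
# Hypothetical helper (not part of the repository): succeeds only if
# pip reports the named package as installed in the active environment.
pkg_installed() {
    python3 -m pip show "$1" > /dev/null 2>&1
}

# Report whether the package installed by install.sh is available.
if pkg_installed ista-daslab-optimizers; then
    echo "ista-daslab-optimizers is installed"
else
    echo "ista-daslab-optimizers is missing; re-run: source install.sh"
fi
```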

Reproduce experiments for GLUE/MNLI

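We provide one script per optimizer, named run_hf_glue_mnli_OPTIM.sh, where OPTIM is one of: microadam, adamw, galore, came, adamw8b. In the commands below only the MicroAdam run is active; uncomment another line to run that optimizer instead.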

cd ~/MicroAdam/huggingface_glue_mnli
# bash run_hf_glue_mnli_adamw.sh
# bash run_hf_glue_mnli_adamw8b.sh
# bash run_hf_glue_mnli_came.sh
# bash run_hf_glue_mnli_galore.sh
bash run_hf_glue_mnli_microadam.sh
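
To run all optimizers one after another instead of uncommenting lines by hand, a small loop over the optimizer names is enough. The sketch below assumes the script naming pattern described above and the default checkout location ~/MicroAdam; run_all_glue_mnli is a hypothetical helper, and it skips any script it cannot find rather than failing.

```shell
# Hypothetical helper (not part of the repository): run the GLUE/MNLI
# script of every optimizer in sequence, skipping scripts that are missing.
# The optional first argument overrides the script directory.
run_all_glue_mnli() {
    local dir="${1:-$HOME/MicroAdam/huggingface_glue_mnli}"
    local optim script
    for optim in microadam adamw galore came adamw8b; do
        script="run_hf_glue_mnli_${optim}.sh"
        if [ -f "$dir/$script" ]; then
            # Run inside a subshell so the caller's working directory is unchanged.
            (cd "$dir" && bash "$script")
        else
            echo "skipping $optim: $dir/$script not found" >&2
        fi
    done
}
```

Calling run_all_glue_mnli with no arguments uses the default directory. The runs are sequential, so the total time is the sum of the individual runs.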

Reproduce experiments for Llama-2 7B on GSM8k

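Run the experiments with the following commands. The AdamW-8bit and DecoupledAdamW runs reuse the MicroAdam YAML file and override the optimizer on the command line; since the YAML path is relative, they are presumably run from the same llm-foundry/scripts/train directory as the MicroAdam script.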

Run MicroAdam

cd ~/MicroAdam/llm-foundry/scripts/train
bash run_llama2-7b_gsm8k_microadam.sh

Run AdamW-8bit

python3 train.py yamls/finetune/llama2-7b_microadam_gsm8k.yaml \
        task=gsm8k \
        optimizer.name=adamw8b \
        optimizer.defaults.lr=5e-5 \
        save_folder=./llama2_7b_gsm8k_adamw8b \
        seed=42

Run DecoupledAdamW

python3 train.py yamls/finetune/llama2-7b_microadam_gsm8k.yaml \
        task=gsm8k \
        optimizer.name=decoupled_adamw \
        optimizer.defaults.lr=5e-5 \
        save_folder=./llama2_7b_gsm8k_decoupled_adamw \
        seed=42
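
The two baseline commands above differ only in the optimizer name and the matching save_folder suffix, so they can be generated from one template. The following sketch only prints the commands, it does not launch training; finetune_cmd is a hypothetical helper, and the default learning rate and seed are simply the values used above.

```shell
# Hypothetical helper (not part of the repository): print the train.py
# command for a given optimizer name, learning rate, and seed.
finetune_cmd() {
    local optim="$1"
    local lr="${2:-5e-5}"
    local seed="${3:-42}"
    printf 'python3 train.py yamls/finetune/llama2-7b_microadam_gsm8k.yaml task=gsm8k optimizer.name=%s optimizer.defaults.lr=%s save_folder=./llama2_7b_gsm8k_%s seed=%s\n' \
        "$optim" "$lr" "$optim" "$seed"
}

# Print (do not execute) the command for each baseline optimizer.
for optim in adamw8b decoupled_adamw; do
    finetune_cmd "$optim"
done
```

Printing first makes it easy to inspect the overrides before starting a long finetuning run. Each printed line can then be pasted into a shell opened in the training directory.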

Changes compared to the original `llm-foundry` repository:

- method build_optimizer
- changes in llm-foundry/scripts/train/train.py:
  - set run_name and save_folder depending on the wandb group, job_type and name
  - added logging of evaluation results and elapsed time to wandb
  - added wandb_groups_config to the finetuning yaml
- changes in the finetuning yaml file:
  - added the task variable
  - added the wandb_groups section

Citing

If you find our work useful, please consider citing:

@misc{modoranu2024microadam,
      title={MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence}, 
      author={Ionut-Vlad Modoranu and Mher Safaryan and Grigory Malinovsky and Eldar Kurtic and Thomas Robert and Peter Richtarik and Dan Alistarh},
      year={2024},
      eprint={2405.15593},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
