Code for the ICML 2024 paper: "MADA: Meta-Adaptive Optimizers through hyper-gradient Descent"
machine-learning deep-neural-networks optimization machine-learning-algorithms optimization-algorithms adam-optimizer gpt-2 meta-optimizer large-language-models
Updated Jul 3, 2024 - Python