Theory of Machine Learning, EPFL

12 repositories

llm-past-tense
Public
Does Refusal Training in LLMs Generalize to the Past Tense? [NeurIPS 2024 Safe Generative AI Workshop (Oral)]
jailbreaking robustness generalization llms
Python
•6•55•0•0•Updated Oct 13, 2024
why-weight-decay
Public
Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024]
Python
•
Other
•0•49•0•0•Updated Sep 25, 2024
llm-adaptive-attacks
Public
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [arXiv, Apr 2024]
Shell
•
MIT License
•23•209•1•0•Updated Sep 20, 2024
icl-alignment
Public
Is In-Context Learning Sufficient for Instruction Following in LLMs?
alignment instruction-following in-context-learning
Python
•
Apache License 2.0
•3•23•0•0•Updated May 31, 2024
long-is-more-for-alignment
Public
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning [ICML 2024]
Python
•1•14•1•0•Updated May 2, 2024
sam-low-rank-features
Public
Sharpness-Aware Minimization Leads to Low-Rank Features [NeurIPS 2023]
Jupyter Notebook
•1•24•1•0•Updated Sep 22, 2023
sharpness-vs-generalization
Public
A modern look at the relationship between sharpness and generalization [ICML 2023]
Jupyter Notebook
•3•42•0•0•Updated Sep 11, 2023
sgd-sparse-features
Public
SGD with large step sizes learns sparse features [ICML 2023]
sgd implicit-bias generalization large-step-sizes
Jupyter Notebook
•5•32•0•0•Updated Apr 24, 2023
tml-epfl.github.io
Public
Repository for all information related to the weekly TML group meetings.
HTML
•
MIT License
•0•0•0•0•Updated Nov 16, 2022
understanding-sam
Public
Towards Understanding Sharpness-Aware Minimization [ICML 2022]
generalization sharpness flatness sharpness-aware-minimization understanding-deep-learning
Jupyter Notebook
•3•35•0•0•Updated Jun 14, 2022
adv-training-corruptions
Public
On the effectiveness of adversarial training against common corruptions [UAI 2022]
Python
•1•30•2•0•Updated May 16, 2022
understanding-fast-adv-training
Public
Understanding and Improving Fast Adversarial Training [NeurIPS 2020]
robust-optimization robustness adversarial-examples adversarial-training
Python
•12•94•0•0•Updated Sep 23, 2021