📄 Paper | 🌐 Project Page | ⚙️ Installation | 🚀 Example Usage | ⚡ Demo
🏆 Leaderboard | 📊 Datasets | 📊 Add A New Dataset | 🤗 Base Models
MU-Bench addresses the following limitations in the evaluation of Machine Unlearning (MU):
- **Inconsistency.** Prior work employs different trained models, architectures, and sample removal strategies, which hampers accurate comparison.
- **Limited coverage.** Prior MU approaches have mainly focused on single tasks or modalities.

To address these limitations, MU-Bench is the first comprehensive benchmark for MU that
- unifies the sets of deleted samples and trained models, and
- provides broad coverage of tasks and data modalities, including previously unexplored domains such as speech and video classification.
To add a new dataset to mubench, please fill out this Google Form or contact the authors.
| Dataset | Task | Domain | Modality | \|D\| |
|---|---|---|---|---|
| **Discriminative Tasks** | | | | |
| CIFAR-100 | Image Classification | General | Image | 50K |
| IMDB | Sentiment Classification | Movie Review | Text | 25K |
| DDI-2013 | Relation Extraction | Biomedical | Text | 25K |
| NLVR² | Visual Reasoning | General | Image-Image-Text | 62K |
| Speech Commands | Keyword Spotting | Commands | Speech | 85K |
| UCF101 | Action Classification | General | Video | 9.3K |
| **Generative Tasks** | | | | |
| SAMSum | Text Summarization | Chat Dialogue | Text | 14K |
| Celeb Profile | Text Generation | Biography | Text | 183 |
| Tiny ImageNet | Text-to-Image Generation | General | Image-Text | 20K |
Bold datasets are ones that have never been evaluated in unlearning.
```bash
pip install mubench
```
OR
```bash
git clone https://github.com/CLU-UML/MU-Bench
```
```python
import mubench
from mubench import UnlearningArguments, get_base_model, load_unlearn_data

unlearn_config = UnlearningArguments(
    unlearn_method="multi_delete",  # MU method, MultiDelete ECCV'24
    backbone="vilt",                # Network architecture
    data_name="nlvr2",              # Dataset
    del_ratio=5                     # Standardized splits
)

model, tokenizer = get_base_model(unlearn_config)
raw_datasets = load_unlearn_data(unlearn_config)
print(raw_datasets.keys())
```
By default, `load_unlearn_data` creates the training set `train` based on the unlearning method, as well as `df_eval` and `dr_eval` for evaluation. The original training set is `orig_train`.
```
['train', 'validation', 'test', 'df_eval', 'dr_eval', 'orig_train']
```
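Continuing from the snippet above, the splits can be inspected directly. A minimal sketch, assuming each split behaves like a standard HuggingFace `datasets.Dataset` with a length (this is an assumption, not stated above):

```python
# Print the number of examples in each split (sketch; assumes each split
# supports len(), e.g. a HuggingFace datasets.Dataset).
for split_name, split in raw_datasets.items():
    print(split_name, len(split))
```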
```python
# Standard HuggingFace code
from transformers import TrainingArguments
args = TrainingArguments(output_dir="tmp")

# Additional code for unlearning
from mubench import UnlearningArguments, unlearn_trainer

unlearn_config = UnlearningArguments(
    unlearn_method="multi_delete",  # MU method, MultiDelete ECCV'24
    backbone="vilt",                # Network architecture
    data_name="nlvr2",              # Dataset
    del_ratio=5                     # Standardized splits
)

trainer = unlearn_trainer(unlearn_config.unlearn_method)(
    args=args,
    unlearn_config=unlearn_config
)
trainer.unlearn()  # Start unlearning and evaluation!
```
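Because the deletion splits are standardized, the same call can be swept over several deletion ratios. A minimal sketch, assuming ratios other than 5 (e.g. 2 and 10) are also provided by the benchmark; those values are illustrative and not confirmed above:

```python
from transformers import TrainingArguments
from mubench import UnlearningArguments, unlearn_trainer

# Sweep over several deletion ratios with the same MU method.
for ratio in [2, 5, 10]:  # values other than 5 are assumed, not confirmed above
    cfg = UnlearningArguments(
        unlearn_method="multi_delete",
        backbone="vilt",
        data_name="nlvr2",
        del_ratio=ratio,
    )
    trainer = unlearn_trainer(cfg.unlearn_method)(
        args=TrainingArguments(output_dir=f"tmp/del_{ratio}"),
        unlearn_config=cfg,
    )
    trainer.unlearn()
```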
```python
# Standard HuggingFace code
from transformers import TrainingArguments
args = TrainingArguments(output_dir="tmp")

# Additional code for unlearning
from mubench import UnlearningArguments, load_unlearn_data, unlearn_trainer

unlearn_config = UnlearningArguments(
    unlearn_method="multi_delete",  # MU method, MultiDelete ECCV'24
    backbone="vilt",                # Network architecture
    data_name="nlvr2",              # Dataset
    del_ratio=5                     # Standardized splits
)

model = ...                         # Define your own model
raw_datasets = load_unlearn_data(unlearn_config)
raw_datasets['train'] = ...         # Customize unlearning data

trainer = unlearn_trainer(unlearn_config.unlearn_method)(
    args=args,
    unlearn_config=unlearn_config,
    model=model,                    # Overwrite the standard model
    raw_datasets=raw_datasets,      # Overwrite the standard data
)
trainer.unlearn()  # Start unlearning and evaluation!
```
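For example, the unlearning (forget) set could be restricted to a subset of examples before calling the trainer. A minimal sketch, assuming `raw_datasets['train']` behaves like a HuggingFace `datasets.Dataset` supporting `select` (an assumption, since the exact data structure is not documented above):

```python
from mubench import UnlearningArguments, load_unlearn_data

cfg = UnlearningArguments(
    unlearn_method="multi_delete",
    backbone="vilt",
    data_name="nlvr2",
    del_ratio=5,
)
raw_datasets = load_unlearn_data(cfg)

# Assumption: each split is a HuggingFace datasets.Dataset with .select().
# Keep only the first 100 training examples as a custom unlearning set.
raw_datasets['train'] = raw_datasets['train'].select(range(100))
print(len(raw_datasets['train']))
```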
Stay informed about the latest developments and enhancements to this project. Below is a summary of planned updates:
- [Feature]: Implement more unlearning algorithms. Estimated release in [month/year].
- [Improvement]: Include more base models / architectures. Estimated release in [month/year].