To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models

KnowUnDo

To Forget or Not? Towards Practical Knowledge Unlearning for LLMs

📃 arXiv • 🤗 Dataset

🔔 Overview

We provide KnowUnDo, a benchmark covering the copyrighted content and user privacy domains, to evaluate whether the unlearning process inadvertently erases essential knowledge. KnowUnDo is available directly on Hugging Face.

To address this, we propose a simple yet effective method, MemFlex, which utilizes gradient information to precisely target and unlearn sensitive parameters.
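
As a rough illustration of gradient-guided targeting, the sketch below selects parameters whose gradient on the unlearn data is large while their gradient on the retention data is small. It is a toy heuristic under that assumption, not the MemFlex implementation; the function name `select_parameters`, the parameter names, and both thresholds are hypothetical.

```python
def select_parameters(unlearn_grads, retain_grads, unlearn_min=0.5, retain_max=0.1):
    """Return names of parameters to update (toy heuristic, not MemFlex itself).

    A parameter is selected when its gradient magnitude on the unlearn data is
    at least `unlearn_min` and its magnitude on the retention data is at most
    `retain_max`. Both thresholds are hypothetical.
    """
    selected = []
    for name, g_unlearn in unlearn_grads.items():
        g_retain = retain_grads.get(name, 0.0)
        if abs(g_unlearn) >= unlearn_min and abs(g_retain) <= retain_max:
            selected.append(name)
    return sorted(selected)


# Hypothetical per-parameter gradient magnitudes, made up for this example.
unlearn_grads = {"layer0.q_proj": 0.9, "layer0.v_proj": 0.8, "layer1.q_proj": 0.05}
retain_grads = {"layer0.q_proj": 0.02, "layer0.v_proj": 0.7, "layer1.q_proj": 0.01}

targets = select_parameters(unlearn_grads, retain_grads)
print(targets)
```

Here only the parameter that matters for the unlearn data and barely matters for the retention data is selected, which is the intuition behind restricting updates to a sensitive region.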

📊 Load Datasets

You can load the datasets as follows.

from datasets import load_dataset

dataset = load_dataset("zjunlp/KnowUnDo", name='copyright', split='unlearn')
  • Available configuration names and corresponding splits:
    • copyright: unlearn, retention;
    • privacy: unlearn, retention;
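
To fetch every configuration and split in one go, a small helper along these lines may be convenient. It is a sketch that assumes the configuration and split names listed above; `config_split_pairs` and `load_all` are hypothetical helper names, and `load_all` needs network access.

```python
# Configuration names and splits as listed above.
CONFIGS = {
    "copyright": ["unlearn", "retention"],
    "privacy": ["unlearn", "retention"],
}


def config_split_pairs(configs=CONFIGS):
    """Return every (configuration name, split) pair in listing order."""
    return [(name, split) for name, splits in configs.items() for split in splits]


def load_all(repo_id="zjunlp/KnowUnDo"):
    """Download each configuration/split pair; requires network access."""
    # Imported lazily so the pure helper above works without `datasets`.
    from datasets import load_dataset

    return {
        (name, split): load_dataset(repo_id, name=name, split=split)
        for name, split in config_split_pairs()
    }


pairs = config_split_pairs()
print(pairs)
```

Calling `load_all()` should then return a dictionary keyed by `(name, split)`, for example `("privacy", "retention")`.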

🚀 How to run

Environment Setup

git clone https://github.com/zjunlp/KnowUnDo.git
cd KnowUnDo
conda create -n KnowUnDo python==3.10

conda activate KnowUnDo
pip install -e .
pip install -r requirements.txt

cd llm_unlearn/apex
pip install -v --no-cache-dir ./

Download Large Language Models (LLMs)

# directory: KnowUnDo
mkdir models
cd models
git lfs install
git clone https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
git clone https://huggingface.co/Qwen/Qwen1.5-7B-Chat
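
If you prefer not to use git lfs, the same checkpoints can probably be fetched from Python with `huggingface_hub.snapshot_download`. The sketch below assumes that package is installed and that you have been granted access to gated repositories such as Llama 2; `local_dir_for` and `download_models` are hypothetical helper names.

```python
from pathlib import Path

# Repositories named in the git clone commands above.
MODEL_REPOS = ["meta-llama/Llama-2-7b-chat-hf", "Qwen/Qwen1.5-7B-Chat"]


def local_dir_for(repo_id, root="models"):
    """Mirror the git clone layout: <root>/<repository name>."""
    return Path(root) / repo_id.split("/")[-1]


def download_models(repo_ids=MODEL_REPOS, root="models"):
    """Download each repository snapshot; requires network access."""
    # Imported lazily so the path helper works without huggingface_hub.
    from huggingface_hub import snapshot_download

    for repo_id in repo_ids:
        snapshot_download(repo_id=repo_id, local_dir=str(local_dir_for(repo_id, root)))


for repo in MODEL_REPOS:
    print(repo, "->", local_dir_for(repo))
```

Run `download_models()` from the KnowUnDo directory so that the files land in the same `models` folder the git commands would create.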

Pretrain LLMs in Our Setting

# directory: pretrain
bash run_finetune_lora.sh

Knowledge Localization (Optional)

We have released the localized knowledge region. You can perform the localization yourself as follows.

# directory: pretrain
bash run_localization.sh

Prepare tokenized datasets

# directory: llm_unlearn
cd utils
bash tokenize_datasets.sh
  • --val selects the val split of the dataset.
  • --prompt concatenates direct_prompt before each question in the dataset.
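
The effect of `--prompt` can be pictured with the following sketch. It only illustrates the idea of prepending a prompt to each question; the function name `build_input`, the newline separator, and the example prompt text are assumptions, not taken from the repository's tokenization script.

```python
def build_input(question, direct_prompt=None):
    """Prepend `direct_prompt` to the question when one is given.

    The newline separator is an assumption made for this example.
    """
    if direct_prompt:
        return f"{direct_prompt}\n{question}"
    return question


# Hypothetical prompt and question, made up for illustration.
example_prompt = "Answer the question directly."
example_question = "Who wrote the novel?"

with_prompt = build_input(example_question, example_prompt)
without_prompt = build_input(example_question)
print(with_prompt)
```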

Unlearning experiments

# directory: llm_unlearn
bash run_baselines_lora.sh
bash run_ours_lora.sh
  • Available methods with corresponding arguments:
    • --unlearn_method gradient_ascent
    • --unlearn_method random_label --completely_random True (named Fine-tuning with Random Labels in the paper)
    • --unlearn_method random_label --top_k 1 --rm_groundtruth True (named Unlearning with Adversarial Samples in the paper)
    • --unlearn_method ascent_plus_descent
    • --unlearn_method ascent_plus_kl_divergence
    • --unlearn_method ascent_plus_descent --general True
    • --unlearn_method ascent_plus_kl_divergence --general True
    • --unlearn_method memflex (the strong baseline proposed by us)
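
The ascent-based options above can be read as different ways of combining loss terms. The sketch below assumes the formulation common in the unlearning literature: gradient ascent maximizes the language-modeling loss on the unlearn data, the descent term minimizes it on retention data, and the KL term keeps outputs close to a reference model. The function `unlearning_objective` is hypothetical, and the repository's scripts may weight or compute these terms differently.

```python
def unlearning_objective(method, forget_loss, retain_loss=0.0, kl_to_reference=0.0):
    """Return the scalar to minimize for a method name (illustrative only).

    forget_loss:     language-modeling loss on the unlearn data
    retain_loss:     language-modeling loss on the retention data
    kl_to_reference: KL divergence from a reference model's outputs
    """
    if method == "gradient_ascent":
        # Minimizing the negated loss is gradient ascent on the unlearn data.
        return -forget_loss
    if method == "ascent_plus_descent":
        return -forget_loss + retain_loss
    if method == "ascent_plus_kl_divergence":
        return -forget_loss + kl_to_reference
    raise ValueError(f"unsupported method in this sketch: {method}")


ga = unlearning_objective("gradient_ascent", forget_loss=2.0)
gd = unlearning_objective("ascent_plus_descent", forget_loss=2.0, retain_loss=0.5)
kl = unlearning_objective("ascent_plus_kl_divergence", forget_loss=2.0, kl_to_reference=0.25)
print(ga, gd, kl)
```

The random-label and memflex options are not covered by this sketch, since they change the training targets or the set of updated parameters rather than only the loss combination.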

Evaluate Unlearned Models

You can evaluate multiple unlearned models together by running our script only once.

# directory: llm_unlearn
bash run_eval_baselines_lora.sh
  • --direct_prompt=True concatenates direct_prompt before each question in the dataset.

🎉 Acknowledgement

We would like to express our sincere gratitude to the following excellent projects: Unlearning LLM, TOFU, LLaMA, and Qwen.

📖 Citation

If you use or extend our work, please cite the paper as follows:

@misc{tian2024forgetnotpracticalknowledge,
      title={To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models}, 
      author={Bozhong Tian and Xiaozhuan Liang and Siyuan Cheng and Qingbin Liu and Mengru Wang and Dianbo Sui and Xi Chen and Huajun Chen and Ningyu Zhang},
      year={2024},
      eprint={2407.01920},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2407.01920}, 
}