ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates

Fengqing Jiang¹*, Zhangchen Xu¹*, Luyao Niu¹*,
Bill Yuchen Lin², Radha Poovendran¹

¹University of Washington ²Allen Institute for AI
*Equal Contribution

Warning: This project contains model outputs that may be considered offensive.

[arXiv]

Overview

Usage

Setup Environment

bash build_env.sh chatbug

Run ChatBug

python chatbug.py

To configure the experiments, either edit attack.yaml or pass command-line arguments to chatbug.py.
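As a rough illustration of how a YAML file and command-line arguments can be combined, the sketch below overlays flags on top of values read from a file. It is not taken from chatbug.py: the keys `model` and `attack`, their values, and the helper names `load_yaml_config` and `merge_config` are all hypothetical, so check attack.yaml and `python chatbug.py --help` for the options the script really accepts.

```python
import argparse

# Hypothetical settings; the real keys live in attack.yaml and may differ.
EXAMPLE_FILE_CFG = {"model": "example-model", "attack": "example-attack"}


def load_yaml_config(path):
    """Read a YAML file into a dict (requires PyYAML)."""
    import yaml  # imported lazily so the rest of the sketch runs without it

    with open(path) as fh:
        return yaml.safe_load(fh) or {}


def merge_config(file_cfg, argv):
    """Overlay command-line values on top of values read from the file."""
    parser = argparse.ArgumentParser()
    for key in file_cfg:
        # default=None lets us tell "flag not given" apart from a real value.
        parser.add_argument(f"--{key}", dest=key, default=None)
    cli = vars(parser.parse_args(argv))

    merged = dict(file_cfg)
    # Only flags that were actually passed override the file values.
    merged.update({k: v for k, v in cli.items() if v is not None})
    return merged


if __name__ == "__main__":
    # In real use: merge_config(load_yaml_config("attack.yaml"), sys.argv[1:])
    print(merge_config(EXAMPLE_FILE_CFG, ["--model", "another-model"]))
```

In this sketch the file supplies the baseline and any flag given on the command line takes precedence, which keeps one-off runs from requiring edits to the file. Values passed as flags arrive as strings, so numeric or boolean settings would need explicit conversion.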

Citation

If you find our project useful in your research, please consider citing:

@misc{jiang2024chatbug,
      title={ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates}, 
      author={Fengqing Jiang and Zhangchen Xu and Luyao Niu and Bill Yuchen Lin and Radha Poovendran},
      year={2024},
      eprint={2406.12935},
      archivePrefix={arXiv}
}
