Fengqing Jiang¹*,
Zhangchen Xu¹*,
Luyao Niu¹*,
Bill Yuchen Lin²,
Radha Poovendran¹
¹University of Washington, ²Allen Institute for AI
*Equal contribution
Warning: This project contains model outputs that may be considered offensive.
bash build_env.sh chatbug
python chatbug.py
To configure the experiments, edit attack.yaml
or pass command-line arguments when running the script.
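The two configuration routes above typically combine so that values from the file act as defaults and command-line arguments override them. The following is a minimal sketch of that general pattern, not the project's actual implementation: the key names (`model`, `num_samples`), the `FILE_CONFIG` dictionary, and the `resolve_config` helper are all hypothetical, and the dictionary stands in for whatever a YAML parser would return after reading attack.yaml.

```python
import argparse

# Hypothetical defaults standing in for values read from attack.yaml;
# these key names are illustrative and are not taken from the project.
FILE_CONFIG = {"model": "example-model", "num_samples": 10}


def resolve_config(file_config, argv):
    """Return file_config with any command-line overrides applied."""
    parser = argparse.ArgumentParser()
    for key, default in file_config.items():
        # default=None lets us tell "flag omitted" apart from "flag given".
        parser.add_argument(f"--{key}", type=type(default), default=None)
    args = vars(parser.parse_args(argv))
    overrides = {k: v for k, v in args.items() if v is not None}
    # Later entries win, so command-line values replace file values.
    return {**file_config, **overrides}


if __name__ == "__main__":
    print(resolve_config(FILE_CONFIG, ["--num_samples", "25"]))
```

In this sketch, a flag that is omitted keeps the file's value, so the file can hold a full experiment description while a single run changes one setting. Check chatbug.py and attack.yaml for the real option names before relying on any of the names used here.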
If you find our project useful in your research, please consider citing:
@misc{jiang2024chatbug,
title={ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates},
author={Fengqing Jiang and Zhangchen Xu and Luyao Niu and Bill Yuchen Lin and Radha Poovendran},
year={2024},
eprint={2406.12935},
archivePrefix={arXiv}
}