
Official Repo of Paper `ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates`


uw-nsl/ChatBug


ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates

Fengqing Jiang¹*, Zhangchen Xu¹*, Luyao Niu¹*, Bill Yuchen Lin², Radha Poovendran¹

¹University of Washington   ²Allen Institute for AI
*Equal contribution

Warning: This project contains model outputs that may be considered offensive.

[arXiv]

Overview

Usage

Setup Environment

bash build_env.sh chatbug

Run ChatBug

python chatbug.py

To configure the experiments, either edit attack.yaml or pass command-line arguments to chatbug.py.
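One common way to combine a YAML file with command-line arguments is to treat the file as the set of defaults and let any flag that is passed override the matching key. The sketch below illustrates that pattern only; it is not taken from chatbug.py. The keys `model` and `num_samples` and the helper `merge_config` are hypothetical, and the dictionary stands in for whatever a YAML loader would return for attack.yaml.

```python
import argparse

# Hypothetical defaults standing in for the parsed contents of attack.yaml.
# The real keys used by the repository may differ.
YAML_DEFAULTS = {"model": "example-model", "num_samples": 10}


def merge_config(file_cfg, argv):
    """Return a copy of file_cfg with values overridden by flags found in argv."""
    parser = argparse.ArgumentParser()
    for key, value in file_cfg.items():
        # default=None lets us distinguish "flag not given" from "flag given".
        parser.add_argument(f"--{key}", type=type(value), default=None)
    args = vars(parser.parse_args(argv))

    merged = dict(file_cfg)
    # Only flags that were actually supplied replace the file values.
    merged.update({k: v for k, v in args.items() if v is not None})
    return merged
```

Under these assumptions, `merge_config(YAML_DEFAULTS, ["--num_samples", "3"])` keeps `model` from the file and replaces only `num_samples`. Check attack.yaml and the argument parser in chatbug.py for the option names the project actually accepts.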

Citation

If you find our project useful in your research, please consider citing:

@misc{jiang2024chatbug,
      title={ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates}, 
      author={Fengqing Jiang and Zhangchen Xu and Luyao Niu and Bill Yuchen Lin and Radha Poovendran},
      year={2024},
      eprint={2406.12935},
      archivePrefix={arXiv}
}
