This is the repository for the paper "Camouflaged Chinese Spam Content Detection with Semi-supervised Generative Active Learning", which was accepted at ACL 2020. For data privacy reasons, we release only the code related to data augmentation. The original spam dataset is the same one used in our other paper, "Detect Camouflaged Spam Content via StoneSkipping: Graph and Text Joint Embedding for Chinese Character Variation Representation".
@inproceedings{jiang2020camouflaged,
title={Camouflaged Chinese Spam Content Detection with Semi-supervised Generative Active Learning},
author={Jiang, Zhuoren and Gao, Zhe and Duan, Yuguang and Kang, Yangyang and Sun, Changlong and Zhang, Qiong and Liu, Xiaozhong},
booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
pages={3080--3085},
year={2020}
}