Xianjun-Yang/Awesome_papers_on_LLMs_detection

Awesome papers on LLMs detection

This repo is a curated list of papers on detecting LLM-generated content. It covers the latest papers on detection methods, datasets, attacks, and related topics. We will keep updating this repo to include the most recent papers.

Contents

Training-based

Black-box

2024

  • Distinguishing LLM-generated from Human-written Code by Contrastive Learning [pdf] 11/07/2024
  • EAGLE: A Domain Generalization Framework for AI-generated Text Detection [pdf] 03/25/2024

2023

  • DETECTING MACHINE-GENERATED TEXTS BY MULTI-POPULATION AWARE OPTIMIZATION FOR MAXIMUM MEAN DISCREPANCY [pdf] 02/27/2024
  • Threads of Subtlety: Detecting Machine-Generated Texts Through Discourse Motifs [pdf] 02/19/2024
  • LLM-Detector: Improving AI-Generated Chinese Text Detection with Open-Source LLM Instruction Tuning [pdf] 02/04/2024
  • FEW-SHOT DETECTION OF MACHINE-GENERATED TEXT USING STYLE REPRESENTATIONS [pdf] 01/12/2024
  • Token Prediction as Implicit Classification to Identify LLM-Generated Text [pdf] Nov. 15, 2023
  • AuthentiGPT: Detecting Machine-Generated Text via Black-Box Language Models Denoising [pdf] Nov. 14, 2023
  • G3Detector: General GPT-Generated Text Detector [pdf]
  • GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content [pdf]
  • GPT Paternity Test: GPT Generated Text Detection with GPT Genetic Inheritance [pdf]

2022

  • OpenAI Text Classifier [link]
  • GPTZero [link]
  • CoCo: Coherence-Enhanced Machine-Generated Text Detection Under Data Limitation With Contrastive Learning [pdf]
  • LLMDet: A Large Language Models Detection Tool [pdf]
  • Multiscale Positive-Unlabeled Detection of AI-Generated Texts [pdf]
  • RADAR: Robust AI-Text Detection via Adversarial Learning [pdf]
  • On the Zero-Shot Generalization of Machine-Generated Text Detectors [pdf]
  • ConDA: Contrastive Domain Adaptation for AI-generated Text Detection [pdf]
  • From Text to Source: Results in Detecting Large Language Model-Generated Content [pdf]
  • Ghostbuster: Detecting Text Ghostwritten by Large Language Models [pdf]
  • Deepfake Text Detection in the Wild [pdf]

2020

  • Automatic Detection of Generated Text is Easiest when Humans are Fooled [pdf] 11/10/2024
  • DPIC: Decoupling Prompt and Intrinsic Characteristics for LLM Generated Text Detection [pdf] 11/10/2024

White-box

2024

  • Text Fluoroscopy: Detecting LLM-Generated Text through Intrinsic Features [pdf] 11/2024

2023

  • SeqXGPT: Sentence-Level AI-Generated Text Detection [pdf]
  • Origin Tracing and Detecting of LLMs [pdf]

2019

  • GLTR: Statistical Detection and Visualization of Generated Text [pdf]
  • Release strategies and the social impacts of language models [pdf]

Zero-shot

Black-box

2024

  • SimLLM: Detecting Sentences Generated by Large Language Models Using Similarity between the Generation and its Re-generation [pdf] 10/11/2024
  • Learning to Rewrite: Generalized LLM-Generated Text Detection [pdf] 08/11/2024
  • Improving Logits-based Detector without Logits from Black-box LLMs [pdf] 06/11/2024

2023

  • Ten Words Only Still Help: Improving Black-Box AI-Generated Text Detection via Proxy-Guided Efficient Re-Sampling [link] 15/02/2024
  • Raidar: geneRative AI Detection viA Rewriting [link] 23/01/2024
  • SPOTTING LLMS WITH BINOCULARS: ZERO-SHOT DETECTION OF MACHINE-GENERATED TEXT [link]
  • DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature [pdf]
  • DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text [pdf]
  • Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense [pdf]
  • Smaller Language Models are Better Black-box Machine-Generated Text Detectors [pdf]
  • Intrinsic Dimension Estimation for Robust Detection of AI-Generated Texts [pdf]

White-box

2024

  • Detecting Machine-Generated Long-Form Content with Latent-Space Variables [pdf] 10/04/2024
  • Zero-Shot Detection of LLM-Generated Text using Token Cohesiveness [pdf] 09/25/2024

2023

  • Does DETECTGPT Fully Utilize Perturbation? Selective Perturbation on Model-Based Contrastive Learning Detector would be Better [pdf] 02/03/2024
  • DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text [pdf]
  • DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text [pdf]
  • Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature [pdf]
  • GPT-who: An Information Density-based Machine-Generated Text Detector [pdf]
  • Efficient Detection of LLM-generated Texts with a Bayesian Surrogate Model [pdf]

Before 2020

  • Detecting Fake Content with Relative Entropy Scoring [pdf]
  • Computer-generated text detection using machine learning: A systematic review [pdf]
  • GLTR: Statistical Detection and Visualization of Generated Text [pdf]

Watermarking

Black-box

2024

  • POSTMARK: A Robust Blackbox Watermark for Large Language Models [pdf] 06/20/2024

2023

  • Watermarking Text Generated by Black-Box Language Models [pdf]

2022

  • Tracing text provenance via context-aware lexical substitution [pdf]

Before 2020

  • Natural language watermarking and tamperproofing [pdf]
  • Natural language watermarking [pdf]
  • Natural language watermarking via morphosyntactic alterations [pdf]
  • The hiding virtues of ambiguity: quantifiably resilient watermarking of natural language text through synonym substitutions [pdf]

White-box

2024

  • Downstream Trade-offs of a Family of Text Watermarks [pdf] 11/21/2024
  • Less is More: Sparse Watermarking in LLMs with Enhanced Text Quality [pdf] 07/21/2024
  • Waterfall: Framework for Robust and Scalable Text Watermarking [pdf] 07/08/2024
  • A Resilient and Accessible Distribution-Preserving Watermark for Large Language Models [pdf] 06/27/2024
  • CODEIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code [pdf] 04/25/2024
  • Watermark-based Detection and Attribution of AI-Generated Content [pdf] 04/07/2024
  • A Statistical Framework of Watermarks for Large Language Models: Pivot, Detection Efficiency and Optimal Rules [pdf] 04/01/2024
  • WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models [pdf] 03/29/2024
  • Duwak: Dual Watermarks in Large Language Models [pdf] 03/20/2024
  • WaterMax: breaking the LLM watermark detectability-robustness-quality trade-off [pdf] 03/11/2024
  • Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models [link] 02/28/2024
  • EmMark: Robust Watermarks for IP Protection of Embedded Quantized Large Language Models [link] 02/28/2024
  • Multi-Bit Distortion-Free Watermarking for Large Language Models [link] 02/27/2024
  • Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models [pdf] 02/22/2024
  • GumbelSoft: Diversified Language Model Watermarking via the GumbelMax-trick [link] 20/02/2024
  • k-SEMSTAMP: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text [link] 19/02/2024
  • Permute-and-Flip: An optimally robust and watermarkable decoder for LLMs [link] 08/02/2024
  • Provably Robust Multi-bit Watermarking for AI-generated Text via Error Correction Code [link] 30/01/2024
  • Adaptive Text Watermark for Large Language Models [pdf] 26/01/2024

2023

  • WatME: Towards Lossless Watermarking Through Lexical Redundancy [pdf] 16/11/2023
  • Optimizing watermarks for large language models [pdf] 31/12/2023
  • Hey That’s Mine! Imperceptible Watermarks are Preserved in Diffusion Generated Outputs [pdf] 11/09/2023
  • Towards Optimal Statistical Watermarking [pdf] 13/12/2023
  • ON THE LEARNABILITY OF WATERMARKS FOR LANGUAGE MODELS [pdf] 7/12/2023
  • Mark My Words: Analyzing and Evaluating Language Model Watermarks [pdf] 3/12/2023
  • I Know You Did Not Write That! A Sampling-Based Watermarking Method for Identifying Machine Generated Text [pdf] 30/11/2023
  • TOWARDS CODABLE WATERMARKING FOR INJECTING MULTI-BIT INFORMATION TO LLM [pdf] 27/11/2023
  • Improving the Generation Quality of Watermarked Large Language Models via Word Importance Scoring [pdf] 16/11/2023
  • Performance Trade-offs of Watermarking Large Language Models [pdf] 16/11/2023
  • WaterBench: Towards Holistic Evaluation of Watermarks for Large Language Models [pdf] 13/11/2023
  • Publicly Detectable Watermarking for Language Models [pdf] 25/10/2023
  • Unbiased Watermark for Large Language Models [pdf] 18/10/2023
  • A watermark for large language models [pdf]
  • Undetectable Watermarks for Language Models [pdf]
  • Provable Robust Watermarking for AI-Generated Text [pdf]
  • Robust Distortion-free Watermarks for Language Models [pdf]
  • SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation [pdf]
  • DiPmark: A Stealthy, Efficient and Resilient Watermark for Large Language Models [pdf]
  • Watermarking Conditional Text Generation for AI Detection: Unveiling Challenges and a Semantic-Aware Watermark Remedy [pdf]
  • A Semantic Invariant Robust Watermark for Large Language Models [pdf]
  • REMARK-LLM: A Robust and Efficient Watermarking Framework for Generative Large Language Models [pdf]
  • Robust Multi-bit Natural Language Watermarking through Invariant Features [pdf]
  • Advancing Beyond Identification: Multi-bit Watermark for Language Models [pdf]
  • Three Bricks to Consolidate Watermarks for Large Language Models [pdf]

2022

  • My AI safety lecture for UT Effective Altruism [link]

Fingerprinting

  • Instructional Fingerprinting of Large Language Models [pdf] 21/01/2024
  • TRAP: Targeted Random Adversarial Prompt Honeypot for Black-Box Identification [pdf] 20/02/2024
  • LLMmap: Fingerprinting For Large Language Models [pdf] 22/07/2024

Code-detection

  • Zero-Shot Detection of Machine-Generated Codes [pdf]
  • Who Wrote this Code? Watermarking for Code Generation [pdf]

Attack

  • Beemo: Benchmark of Expert-edited Machine-generated Outputs [pdf] 11/06/2024
  • DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios [pdf] 10/31/2024
  • HUMANIZING THE MACHINE: PROXY ATTACKS TO MISLEAD LLM DETECTORS [pdf] 10/25/2024
  • RAFT: Realistic Attacks to Fool Text Detectors [pdf] 10/04/2024
  • AI-generated text boundary detection with RoFT [pdf] 09/04/2024
  • A Transfer Attack to Image Watermarks [pdf] 09/12/2024
  • LLM Detectors Still Fall Short of Real World: Case of LLM-Generated Short News-Like Posts [pdf] 09/05/2024
  • Watermark Smoothing Attacks against Language Models [pdf] 07/19/2024
  • Investigating the Influence of Prompt-Specific Shortcuts in AI Generated Text Detection [pdf] 06/20/2024
  • Black-Box Detection of Language Model Watermarks [pdf] 05/28/2024
  • Humanizing Machine-Generated Content: Evading AI-Text Detection through Adversarial Attack [pdf] 04/01/2024
  • Bypassing LLM Watermarks with Color-Aware Substitutions [pdf] 03/19/2024
  • Watermark Stealing in Large Language Models [pdf] 02/29/2024
  • Attacking LLM Watermarks by Exploiting Their Strengths [pdf] 02/27/2024
  • Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models [pdf] 02/22/2024
  • Machine-generated Text Localization [pdf] 02/19/2024
  • Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks [pdf] 02/19/2024
  • Authorship Obfuscation in Multilingual Machine-Generated Text Detection [pdf] 01/17/2024
  • LANGUAGE MODEL DETECTORS ARE EASILY OPTIMIZED AGAINST [pdf] 11/28/2023
  • A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts [pdf] 11/14/2023
  • Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models [pdf] 11/8/2023
  • Does Human Collaboration Enhance the Accuracy of Identifying LLM-Generated Deepfake Texts? [pdf]
  • Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense [pdf]
  • Red Teaming Language Model Detectors with Language Models [pdf]
  • Paraphrase Detection: Human vs. Machine Content [pdf]
  • Large Language Models can be Guided to Evade AI-Generated Text Detection [pdf]
  • Counter Turing Test CT^2: AI-Generated Text Detection is Not as Easy as You May Think -- Introducing AI Detectability Index [pdf]
  • How Reliable Are AI-Generated-Text Detectors? An Assessment Framework Using Evasive Soft Prompts [pdf]
  • On the Reliability of Watermarks for Large Language Models [pdf]

Datasets

2024

  • CoAT: Corpus of artificial texts [pdf] 09/07/2024
  • Detecting Machine-Generated Texts: Not Just “AI vs Humans” and Explainability is Complicated [pdf] 06/27/2024
  • Spotting AI’s Touch: Identifying LLM-Paraphrased Spans in Text [pdf] 05/22/2024
  • RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors [pdf] 05/16/2024
  • M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection [pdf] 02/19/2024

2023

  • How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection [pdf]
  • CHEAT: A Large-scale Dataset for Detecting ChatGPT-writtEn AbsTracts [pdf]
  • Ghostbuster: Detecting Text Ghostwritten by Large Language Models [pdf]
  • M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection [pdf]
  • GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content [pdf]
  • MGTBench: Benchmarking Machine-Generated Text Detection [pdf]
  • HC3 Plus: A Semantic-Invariant Human ChatGPT Comparison Corpus [pdf]
  • MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark [pdf]

2022 and before

  • TURINGBENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation [pdf]

Misc

  • Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data? [pdf] 07/24/2024
  • Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews [pdf] 03/13/2024
  • A Survey of AI-generated Text Forensic Systems: Detection, Attribution, and Characterization [pdf] 03/05/2024
  • Hidding the Ghostwriters: An Adversarial Evaluation of AI-Generated Student Essay Detection [pdf] 02/01/2024
  • LLM - Detect AI Generated Text (Kaggle competition) [link]
  • Can AI-Generated Text be Reliably Detected? [pdf]
  • On the Possibilities of AI-Generated Text Detection [pdf]
  • GPT detectors are biased against non-native English writers [pdf]
  • ChatLog: Recording and Analyzing ChatGPT Across Time [pdf]
  • On the Zero-Shot Generalization of Machine-Generated Text Detectors [pdf]

If you find this repo useful, please cite our work.

@article{yang2023survey,
  title={A Survey on Detection of LLMs-Generated Content},
  author={Yang, Xianjun and Pan, Liangming and Zhao, Xuandong and Chen, Haifeng and Petzold, Linda and Wang, William Yang and Cheng, Wei},
  journal={arXiv preprint arXiv:2310.15654},
  year={2023}
}

@inproceedings{yangdna,
  title={DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text},
  author={Yang, Xianjun and Cheng, Wei and Wu, Yue and Petzold, Linda Ruth and Wang, William Yang and Chen, Haifeng},
  booktitle={The Twelfth International Conference on Learning Representations}
}

@article{yang2023zero,
  title={Zero-shot detection of machine-generated codes},
  author={Yang, Xianjun and Zhang, Kexun and Chen, Haifeng and Petzold, Linda and Wang, William Yang and Cheng, Wei},
  journal={arXiv preprint arXiv:2310.05103},
  year={2023}
}

@article{zeng2024improving,
  title={Improving Logits-based Detector without Logits from Black-box LLMs},
  author={Zeng, Cong and Tang, Shengkun and Yang, Xianjun and Chen, Yuanzhou and Sun, Yiyou and Li, Yao and Chen, Haifeng and Cheng, Wei and Xu, Dongkuan and others},
  journal={arXiv preprint arXiv:2406.05232},
  year={2024}
}
