Unveiling and Manipulating Prompt Influence in Large Language Models (ICLR 2024) Link
TDD explores using token distributions to explain autoregressive LLMs. Our related work, PromptExplainer, uses token distributions to explain masked language models such as BERT and RoBERTa. Feel free to check out PromptExplainer!
There are two steps to reproduce our results.
- Step 1: Generate saliency scores with TDD_step1.py. You can choose different datasets and LLMs for this step.
- Step 2: Evaluate with TDD_step2.py, which computes AOPC and Sufficiency (Suff) scores from the saliency scores produced in Step 1 (see the illustrative sketch after this list).
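Below is a minimal, self-contained sketch of the pipeline's intuition. It is not the TDD_step1.py / TDD_step2.py code: the model (GPT-2 instead of LLaMA), the prompt, the deletion-based perturbation, the ratio grid, and all helper names are illustrative assumptions. The saliency function only illustrates the token-distribution idea (reading the target token's probability from each prompt position's vocabulary distribution); the actual TDD variants differ in their details.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_NAME = "gpt2"  # assumption: a small causal LM stands in for LLaMA here
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()


@torch.no_grad()
def token_distribution_saliency(input_ids, target_id):
    """Step-1 intuition: project every prompt position into the vocabulary space
    via the LM head and use the probability assigned to the target token as
    that position's saliency."""
    logits = model(input_ids.unsqueeze(0)).logits[0]   # (seq_len, vocab_size)
    dists = torch.softmax(logits, dim=-1)
    return dists[:, target_id]                         # one score per prompt token


@torch.no_grad()
def target_prob(input_ids, target_id):
    """Probability of the target token given a (possibly perturbed) prompt."""
    logits = model(input_ids.unsqueeze(0)).logits[0, -1]
    return torch.softmax(logits, dim=-1)[target_id].item()


@torch.no_grad()
def aopc_and_sufficiency(input_ids, target_id, saliency,
                         ratios=(0.1, 0.2, 0.3, 0.4, 0.5)):
    """Step-2 metrics: AOPC averages the probability drop after deleting the
    top-k% most salient tokens; Sufficiency does the same while keeping only
    the top-k% tokens. (A real evaluation may mask rather than delete.)"""
    base = target_prob(input_ids, target_id)
    order = torch.argsort(saliency, descending=True)   # most salient first
    aopc, suff = [], []
    for r in ratios:
        k = max(1, int(r * len(input_ids)))
        keep = torch.ones(len(input_ids), dtype=torch.bool)
        keep[order[:k]] = False                        # drop the top-k salient tokens
        aopc.append(base - target_prob(input_ids[keep], target_id))
        suff.append(base - target_prob(input_ids[~keep], target_id))
    return sum(aopc) / len(aopc), sum(suff) / len(suff)


# Toy usage: which prompt tokens drive the prediction of " Paris"?
prompt = "The capital of France is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids[0]
target_id = tokenizer(" Paris", add_special_tokens=False).input_ids[0]
scores = token_distribution_saliency(input_ids, target_id)
print(aopc_and_sufficiency(input_ids, target_id, scores))
```

Higher AOPC and lower Sufficiency indicate that the saliency scores better identify the prompt tokens the model actually relies on.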
Please use your own LLaMA access token when running the experiments; an illustrative example is shown below.
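For example, a gated LLaMA checkpoint on the Hugging Face Hub can be loaded with your own token as sketched below. The model id and token value are placeholders, and the `token` keyword assumes a reasonably recent `transformers` release.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

ACCESS_TOKEN = "hf_..."                    # replace with your own Hugging Face access token
MODEL_ID = "meta-llama/Llama-2-7b-hf"      # example gated LLaMA checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, token=ACCESS_TOKEN)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, token=ACCESS_TOKEN)
```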
The code for contrastive explanation baselines is from interpret-lm. The dataset is from BLiMP. We thank the authors for their excellent contributions!
If you find our work useful, please consider citing TDD:
@inproceedings{feng2024tdd,
title={Unveiling and Manipulating Prompt Influence in Large Language Models},
author={Feng, Zijian and Zhou, Hanzhang and Zhu, Zixiao and Qian, Junlang and Mao, Kezhi},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024}
}