By training retrieval-augmented language models (RALMs) on only 1K examples, we can make them robust to irrelevant retrieved context and improve their QA performance [Paper].
Our models and data are available at the RetRobust HuggingFace Collection.
Llama-2 inference servers were set up using lm-sys/FastChat. Experiments were run with the framework from reasoning-on-cots. To run these experiments, see here.
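As a rough sketch, a FastChat serving stack is typically brought up as three processes (controller, model worker, and an OpenAI-compatible API server); the model path and ports below are placeholder assumptions, not values from this repo:

```shell
# Minimal FastChat serving sketch (assumed defaults; adjust model path and ports).
# 1. Start the controller that coordinates model workers.
python3 -m fastchat.serve.controller &

# 2. Start a worker serving a Llama-2 model (placeholder model path).
python3 -m fastchat.serve.model_worker --model-path meta-llama/Llama-2-13b-hf &

# 3. Expose an OpenAI-compatible REST endpoint for the experiment framework.
python3 -m fastchat.serve.openai_api_server --host localhost --port 8000 &
```

Once the API server is up, experiment code can talk to the model through the OpenAI-compatible endpoint at `http://localhost:8000`.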
```bibtex
@misc{yoran2023making,
      title={Making Retrieval-Augmented Language Models Robust to Irrelevant Context},
      author={Ori Yoran and Tomer Wolfson and Ori Ram and Jonathan Berant},
      year={2023},
      eprint={2310.01558},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```