Code and data for the paper "Reading Subtext: Evaluating Large Language Models on Short Story Summarization with Writers".
Annotations of faithfulness errors, along with the errors themselves, are in 'error_annotations'.
Scripts to have models generate summaries, score summaries, and label faithfulness errors are in 'model_scripts'.
The interface for writers to evaluate summaries is in 'streamlit_interface'.
The writer-assigned scores and feedback are in 'writer_ratings_comments.tsv'.
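
As a starting point for exploring the ratings file, here is a minimal sketch that reads a tab-separated file using only the Python standard library. It assumes the file has a header row and is UTF-8 encoded; neither is stated above, so the sketch prints whatever column names it finds rather than relying on specific ones. The helper name `load_tsv` is hypothetical and not part of this repository.

```python
import csv
from pathlib import Path


def load_tsv(path):
    """Read a tab-separated file into a list of dicts keyed by its header row.

    Assumption: the first line of the file is a header row.
    """
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f, delimiter="\t"))


if __name__ == "__main__":
    # File name taken from the description above; run from the repository root.
    ratings_path = Path("writer_ratings_comments.tsv")
    if ratings_path.exists():
        rows = load_tsv(ratings_path)
        print(f"{len(rows)} rows")
        if rows:
            # Inspect the actual column names instead of assuming them.
            print("columns:", list(rows[0].keys()))
```

All values are read as strings, so any numeric scores would need to be converted after checking which columns hold them.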