The dataset contains 6270 labeled examples for news headline hallucination detection used in the WebConf 2023 paper ""Why is this misleading?": Detecting News Headline Hallucinations with Explanations". News article titles and passages are included in this dataset, they are extracted by the public news-please library based on the released article URLs.
In particular, each row corresponds to one example and it contains the following six fields:
- "ExampleId" field contains a unique identifier for this example.
- "SplitID" field indicates which dataset split this example belongs to (one of train, validation, and test).
- "ArticleURL" field includes the news article URL.
- "Generated Headline" field contains the generated news headline.
- "ArticalTitle" field contains news titles extracted by the news-please library.
- "ArticalPassage" field contains news passages extracted by the news-please library.
- "Explanation" field contains the human written explanation and could simply be an empty string.
- "Label" field indicates if the headline is supported/entailed by the news article.
If you find this dataset useful for your research, please cite the following paper in your publication:
@inproceedings{Shen2023ExHalder,
title={"Why is this misleading?": Detecting News Headline Hallucinations with Explanations},
author={Jiaming Shen and Jialu Liu and Dan Finnie and Negar Rahmati and Michael Bendersky and Marc Najork},
booktitle={Proceedings of The 34th International Conference in the Web (WWW 2023)},
year={2023}
}