Experiments in LLM summarization
Can extracting claims from source text reduce hallucinations in summaries?
We first tested our hallucination detection process on the HaluEval summarization dataset and found it to be 74% accurate on 1,000 samples (an equal split of positive and negative examples).
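As a rough illustration, a check like this might look something like the sketch below. The prompt wording, the `detect_hallucination` helper, and the field names for the HaluEval records are our own assumptions, not the exact setup.

```python
# Minimal sketch of the detection-accuracy check (prompt and helper are illustrative).
import json
import random
from openai import OpenAI

client = OpenAI()

def detect_hallucination(document: str, summary: str) -> bool:
    """Ask the model whether the summary contains claims unsupported by the document."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer Yes or No only."},
            {"role": "user", "content": (
                "Does the summary contain any claim not supported by the document?\n\n"
                f"Document:\n{document}\n\nSummary:\n{summary}"
            )},
        ],
        temperature=0,
    )
    return response.choices[0].message.content.strip().lower().startswith("yes")

# HaluEval's summarization split pairs each document with a correct and a
# hallucinated summary (file layout and field names assumed here).
with open("summarization_data.json") as f:
    data = [json.loads(line) for line in f]

samples = random.sample(data, 500)  # 500 positive + 500 negative = 1,000 judgments
correct = 0
for item in samples:
    correct += detect_hallucination(item["document"], item["hallucinated_summary"]) is True
    correct += detect_hallucination(item["document"], item["right_summary"]) is False

print(f"Accuracy: {correct / (2 * len(samples)):.1%}")
```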
Next, we performed abstractive summarization on the full FaVe dataset (200 samples) and evaluated the hallucination rate (the percentage of summaries containing a detected hallucination).
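A minimal sketch of this step, reusing the `client` and `detect_hallucination` helper from above; the summarization prompt and the way FaVe documents are loaded are assumptions for illustration.

```python
# Abstractive baseline: summarize each source directly, then measure the
# fraction of summaries the detector flags as containing a hallucination.
def summarize_abstractive(document: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Summarize the following text:\n\n{document}"}],
        temperature=0,
    )
    return response.choices[0].message.content

def hallucination_rate(documents: list[str], summarize) -> float:
    """Share of documents whose generated summary is flagged by the detector."""
    flagged = sum(detect_hallucination(doc, summarize(doc)) for doc in documents)
    return flagged / len(documents)

# e.g. hallucination_rate(fave_documents, summarize_abstractive)
```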
Finally, we performed extractive summarization on FaVe, followed by the same evaluation. The extractive process consists of first extracting claims from the source material and then summarizing those claims.
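The extractive variant can be sketched the same way: list the claims first, then summarize only the claim list. Again, the prompts here are our own approximations rather than the exact ones used.

```python
# Extractive variant: pull out the claims stated in the source, then
# summarize the claim list instead of the raw document (sketch).
def extract_claims(document: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": (
            "List, one per line, the factual claims made in the following text:\n\n" + document
        )}],
        temperature=0,
    )
    return response.choices[0].message.content

def summarize_extractive(document: str) -> str:
    claims = extract_claims(document)
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Summarize these claims:\n\n{claims}"}],
        temperature=0,
    )
    return response.choices[0].message.content

# Same evaluation as the abstractive run, still checked against the source document:
# hallucination_rate(fave_documents, summarize_extractive)
```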
We found that the abstractive summarization process had a detected hallucination rate of 18%, while the extractive process had a rate of 19%.
All steps used GPT-4o mini.
HaluEval dataset: https://github.com/RUCAIBox/HaluEval
FaVe dataset: https://github.com/elicit/fave-dataset