Non-authentic Hadith Corpus
-
Arabic Hadith corpus
-
The corpus contains 452,624 words drawn from a range of lesser-known Hadith books.
-
It also includes several annotated Hadith books. Their annotations mark the switch points between the Isnad, the Matan, and the author's comment, and so provide a ground truth.
-
Some of these books contain both authentic and non-authentic Hadiths (NAH), while others contain only NAH.
-
The file NAH_Contents.csv lists all Hadith books in this corpus.
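As a minimal sketch, the book list could be read with Python's standard `csv` module. The columns of NAH_Contents.csv are not documented here, so the code below makes no assumption about their names and returns each row as a dictionary keyed by whatever header the file provides. The UTF-8 encoding, the presence of a header row, and the function name `load_book_list` are assumptions made for illustration.

```python
import csv


def load_book_list(path="NAH_Contents.csv"):
    """Return the rows of the contents file as a list of dicts.

    Assumes (not confirmed by this document) that the file is UTF-8
    encoded and that its first row is a header.
    """
    # "utf-8-sig" also strips a byte-order mark if the file starts with one.
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return list(csv.DictReader(handle))
```

Calling `load_book_list()` from the directory that holds the file would return one dictionary per listed book; inspecting the keys of the first row shows which columns are actually available.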
-
Each Hadith in this corpus was annotated with eight primary features:
-
No.: The Hadith reference number.
-
Full Hadith: The Hadith as it appears in the book, without annotations.
-
Isnad: The chain of narrators.
-
Matan: The text of the Hadith, reporting a saying or act of the Prophet Muhammad.
-
Authors Comments: The book author's remarks on the authenticity of the Hadith.
-
Hadith Type: The type of the Hadith (Maqtu` مقطوع, Mawquf موقوف, or Marfoʻ مرفوع) or its degree (for example ضعيف, weak, or موضوع, fabricated).
-
Authenticity: Whether this Hadith is authentic or non-authentic.
-
Topic: The chapter title.
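The eight features above can be pictured as one record per Hadith. The sketch below is only an illustration: the Python field names, the `HadithRecord` class, and the `record_from_row` helper are hypothetical, and it is an assumption that the corpus files use the feature labels exactly as written above for their column headers.

```python
from dataclasses import dataclass

# Feature labels as listed above, mapped to hypothetical field names.
# Whether the corpus files use exactly these strings as column headers
# is an assumption.
FEATURES = {
    "No.": "number",
    "Full Hadith": "full_hadith",
    "Isnad": "isnad",
    "Matan": "matan",
    "Authors Comments": "authors_comments",
    "Hadith Type": "hadith_type",
    "Authenticity": "authenticity",
    "Topic": "topic",
}


@dataclass
class HadithRecord:
    number: str
    full_hadith: str
    isnad: str
    matan: str
    authors_comments: str
    hadith_type: str
    authenticity: str
    topic: str


def record_from_row(row):
    """Build a HadithRecord from a dict keyed by the feature labels.

    Features that are missing or empty in the row become empty strings.
    """
    values = {
        field: (row.get(label) or "").strip()
        for label, field in FEATURES.items()
    }
    return HadithRecord(**values)
```

Keeping every field as a string avoids guessing at value formats, for example how Authenticity is encoded, which this description does not specify.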
-
Tarmom T, Atwell E, Alsalka MA. 2020. Non-authentic Hadith Corpus: Design and Methodology. International Journal on Islamic Applications in Computer Science And Technology, 8(3), 13-19. http://eprints.whiterose.ac.uk/155642/