d7miiZ/Dialect-Graduation-Project

Graduation-Project

This is our bachelor's graduation project in computer science. The goal is to classify Arabic text into five dialect groups (GLF, EGY, IRQ, LEV, NOR).
You can find the research paper at research/Graduation_project.pdf.
Here is a link to test the model: https://arabic.hawzen.me/

Credits

Data

Dataset | Source
SMADC | Areej Alshutayri and Eric Atwell. Classifying Arabic dialect text in the Social Media Arabic Dialect Corpus (SMADC). January 2021.
AOC-dialectal-annotations | Ryan Cotterell and Chris Callison-Burch. A multi-dialect, multigenre corpus of informal written Arabic. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), pages 241–245, Reykjavik, Iceland, May 2014. European Language Resources Association (ELRA).
annotated_data | Omar F. Zaidan and Chris Callison-Burch. The Arabic online commentary dataset: an annotated dataset of informal Arabic with high dialectal content. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 37–41, Portland, Oregon, USA, June 2011. Association for Computational Linguistics.
DART | Israa Alsarsour, Esraa Mohamed, Reem Suwaileh, and Tamer Elsayed. DART: A large dataset of dialectal Arabic tweets. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, May 2018. European Language Resources Association (ELRA).
extra_data | The project team

Supervisor

Dr. Nasser A. AlSadhan

Researchers

Abdulrahman Al-Shawi
Musaad Al-Qubayl
Khaled Al-Bader
Abdullah Al-Suwailem
Mohand Al-Rasheed

