https://www.kaggle.com/code/abedkhooli/ar-reviews-100k/data
Cleaning the Arabic dataset:
- Remove punctuation with a custom function built on Python's re library.
- Remove stop words in two different ways:
  - with the arabic-stopwords package (!pip install arabic-stopwords)
  - with an external stop-word file (https://github.com/shimaa83/arabic-stop-words)
- Remove diacritics using the camel-tools library (pip install camel-tools).
- Normalize the text using the camel-tools library.
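
The cleaning steps above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the notebook's actual code: the regexes stand in for the camel-tools helpers (camel-tools ships de-diacritization and normalization utilities in `camel_tools.utils.dediac` and `camel_tools.utils.normalize`), and `STOP_WORDS` is a hypothetical toy list in place of the arabic-stopwords package or the external file. The function names and the exact character sets are assumptions.

```python
import re
import string

# ASCII punctuation plus common Arabic marks:
# comma, semicolon, question mark, and guillemets (illustrative, not exhaustive).
PUNCT_RE = re.compile(
    "[" + re.escape(string.punctuation + "\u060C\u061B\u061F\u00AB\u00BB") + "]"
)
# Tanween, short vowels, shadda, sukun (U+064B-U+0652), dagger alef, tatweel.
DIAC_RE = re.compile("[\u064B-\u0652\u0670\u0640]")


def remove_punctuation(text):
    # Replace each mark with a space so neighbouring words do not fuse.
    return PUNCT_RE.sub(" ", text)


def remove_diacritics(text):
    return DIAC_RE.sub("", text)


def normalize(text):
    # Assumed normalization rules; adjust to match the ones actually used.
    text = re.sub("[\u0622\u0623\u0625\u0671]", "\u0627", text)  # alef variants -> bare alef
    text = text.replace("\u0649", "\u064A")  # alef maksura -> yeh
    text = text.replace("\u0629", "\u0647")  # teh marbuta -> heh
    return text


def remove_stop_words(text, stop_words):
    return " ".join(word for word in text.split() if word not in stop_words)


def clean(text, stop_words):
    text = remove_punctuation(text)
    text = remove_diacritics(text)
    text = normalize(text)
    return remove_stop_words(text, stop_words)


# Hypothetical toy stop-word list, normalized the same way as the text.
STOP_WORDS = {normalize(w) for w in ["في", "من", "على", "إلى"]}

print(clean("الفندقُ جميلٌ جداً، في المدينة!", STOP_WORDS))
```

In this sketch the stop words are filtered last, after normalization, so that a stop word and its occurrence in the text are compared in the same normalized form. The list above gives the steps in a different order, so treat this ordering as a design choice of the sketch.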
Classification using classic machine learning: first convert the words to TF-IDF features with TfidfTransformer, then apply some classification algorithms.
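
A minimal sketch of that classification step with scikit-learn is shown below. `TfidfTransformer` works on a matrix of token counts, so the sketch assumes a `CountVectorizer` in front of it. The choice of classifiers (logistic regression and multinomial naive Bayes), the `build_model` helper, and the tiny example reviews and labels are all hypothetical and are not taken from the notebook.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline


def build_model(classifier):
    # Counts -> TF-IDF weights -> classifier, chained in one pipeline.
    return Pipeline([
        ("counts", CountVectorizer()),
        ("tfidf", TfidfTransformer()),
        ("clf", classifier),
    ])


# Hypothetical, already-cleaned toy reviews with made-up labels.
texts = [
    "الفندق جميل ونظيف",
    "الخدمه ممتازه",
    "الغرفه سيئه جدا",
    "الطعام سيء",
]
labels = ["positive", "positive", "negative", "negative"]

# Try more than one classic algorithm on the same features.
for classifier in (LogisticRegression(), MultinomialNB()):
    model = build_model(classifier)
    model.fit(texts, labels)
    print(type(classifier).__name__, model.predict(["الفندق نظيف"]))
```

On the real dataset the texts would first be split into training and test sets, and each model would be scored on the held-out part rather than on the training reviews.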