A Natural Language Processing (NLP) engine for Arabic text analysis and tokenization. The engine extracts the content of a folder of text files, tokenizes it, and identifies the words or sequences that most frequently follow each token. It also checks the syntax of the extracted tokens against a provided Arabic dictionary.
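
The pipeline described above can be sketched roughly as follows. This is a minimal Python illustration, not the project's actual code: the function names (`tokenize`, `read_folder`, `next_word_counts`, `unknown_tokens`), the `*.txt` file pattern, the regex-based tokenization, and the reading of the dictionary check as simple set membership are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict
from pathlib import Path

# Assumption: diacritics (U+064B-U+0652) and tatweel (U+0640) are stripped
# before tokenizing, and a token is a run of Arabic letters (U+0621-U+064A).
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")
ARABIC_WORD = re.compile(r"[\u0621-\u064A]+")


def tokenize(text):
    """Return the Arabic word tokens found in text, without diacritics."""
    return ARABIC_WORD.findall(DIACRITICS.sub("", text))


def read_folder(folder):
    """Return one token list per .txt file in folder (hypothetical layout)."""
    return [
        tokenize(path.read_text(encoding="utf-8"))
        for path in sorted(Path(folder).glob("*.txt"))
    ]


def next_word_counts(documents):
    """Count, for every token, which tokens follow it and how often."""
    following = defaultdict(Counter)
    for tokens in documents:
        # Pair each token with its successor; pairs never span two files.
        for current, nxt in zip(tokens, tokens[1:]):
            following[current][nxt] += 1
    return following


def unknown_tokens(documents, dictionary):
    """Return the sorted tokens that do not appear in the dictionary set."""
    seen = {token for tokens in documents for token in tokens}
    return sorted(seen - set(dictionary))
```

With these helpers, `next_word_counts(read_folder("texts"))["ذهب"].most_common(3)` would list the three words that most often follow `ذهب` in a hypothetical `texts` folder, and `unknown_tokens(...)` would list tokens missing from a dictionary loaded as a Python set.
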
Clone the repository to your local machine: