Rafiq7M/CNN-Arabic-22-Letter-HMBD-v1

CNN Arabic 22 Letter HMBD-v1

Image Classification Using CNN (Convolutional Neural Networks)

Authors

Project Overview

This project implements a Convolutional Neural Network (CNN) for classifying 22 isolated Arabic letters. The model is trained on a custom dataset of handwritten Arabic letters, demonstrating the application of deep learning techniques in Arabic character recognition.

Dataset

The dataset consists of 22 classes of isolated Arabic letters. It is a subset of a much larger database that covers all the handwritten letters of the Arabic alphabet, available at HossamBalaha/HMBD-v1. We thank everyone who prepared and published that database, which this project builds on.

Dataset Information

#    Directory Name                      Number of Images
0    Dataset\Ain_Isolated                462
1    Dataset\Alf_Hamza_Above_Isolated    476
2    Dataset\Alf_Hamza_Under_Isolated    474
3    Dataset\Alf_Isolated                480
4    Dataset\Baa_Isolated                468
5    Dataset\Baa_Middle                  460
6    Dataset\Daad_Isolated               455
7    Dataset\Dal_Isolated                472
8    Dataset\Faa_Isolated                464
9    Dataset\Gem_Isolated                472
10   Dataset\Gem_Start                   472
11   Dataset\Gen_Isolated                920
12   Dataset\Hamza_Isolated              466
13   Dataset\Kaf_Isolated                464
14   Dataset\Lam_Alf_Hamza_Isolated      459
15   Dataset\Mem_Isolated                468
16   Dataset\Qaf_Isolated                467
17   Dataset\Raa_Isolated                476
18   Dataset\Saad_Isolated               459
19   Dataset\Sin_Isolated                468
20   Dataset\Taa_Isolated                467
21   Dataset\Taa_Middle                  462
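
The per-class counts in the table can be re-derived from the folder layout. The following is a minimal sketch, assuming each class lives in its own subdirectory of Dataset and that the images use common file extensions. The function name `count_images_per_class` and the extension set are illustrative and are not taken from the notebook.

```python
from pathlib import Path

# Assumed image extensions; adjust to whatever the dataset actually contains.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp"}


def count_images_per_class(dataset_dir):
    """Return {class_directory_name: image_count}, in sorted directory order."""
    counts = {}
    for class_dir in sorted(Path(dataset_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # ignore stray files next to the class folders
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


if __name__ == "__main__":
    dataset = Path("Dataset")
    if dataset.is_dir():
        for index, (name, n) in enumerate(count_images_per_class(dataset).items()):
            print(index, name, n)
```

Sorting the directory names gives each class a stable index, which is one plausible way to arrive at the numbering shown in the first column.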

Project Structure

  • Important Note: Due to their size, the files "X_Arabic_22_letter_64.pickle" and "y_Arabic_22_letter_64.pickle" have not been uploaded to the repository. Running the project creates them, after which you can train the model from these files.
CNN ARABIC 22 LETTER HMBD -V1/
├── datasave/
│   ├── checkpoints/
│   ├── model_logs/
│   │   ├── train/
│   │   └── validation/
│   ├── model_acc_Arabic_22_letter_64.h5
│   ├── weights_model_acc_Arabic_22_letter_64.h5
│   ├── X_Arabic_22_letter_64.pickle
│   └── y_Arabic_22_letter_64.pickle
├── Dataset/
│   ├── Ain_Isolated/
│   ├── Alf_Hamza_Above_Isolated/
│   ├── ...
│   └── Taa_Middle/
└── CNN Arabic 22 Letter HMBD .ipynb
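
Because the two pickle files are generated on first run, the notebook presumably follows a build-then-cache pattern. The sketch below shows one way that could look. It is not the notebook's code: `build_arrays` and `load_or_build` are hypothetical names, the 64-pixel image size is inferred from the `_64` suffix in the file names, and labelling classes by sorted directory order is an assumption.

```python
import pickle
from pathlib import Path


def build_arrays(dataset_dir, size=64):
    """Read every image, resize it to size x size, and label it by class index.

    Assumption: labels follow the sorted order of the class directories.
    """
    # Imported here so the caching helper below also works without OpenCV.
    import cv2
    import numpy as np

    images, labels = [], []
    class_dirs = sorted(p for p in Path(dataset_dir).iterdir() if p.is_dir())
    for label, class_dir in enumerate(class_dirs):
        for image_path in sorted(class_dir.iterdir()):
            image = cv2.imread(str(image_path))  # BGR array, or None if unreadable
            if image is None:
                continue
            images.append(cv2.resize(image, (size, size)))
            labels.append(label)
    return np.array(images), np.array(labels)


def load_or_build(x_path, y_path, builder):
    """Load cached (X, y) if both files exist; otherwise build and cache them."""
    x_path, y_path = Path(x_path), Path(y_path)
    if x_path.exists() and y_path.exists():
        with x_path.open("rb") as fx, y_path.open("rb") as fy:
            return pickle.load(fx), pickle.load(fy)
    X, y = builder()
    x_path.parent.mkdir(parents=True, exist_ok=True)
    y_path.parent.mkdir(parents=True, exist_ok=True)
    with x_path.open("wb") as fx, y_path.open("wb") as fy:
        pickle.dump(X, fx)
        pickle.dump(y, fy)
    return X, y
```

A call such as `load_or_build("datasave/X_Arabic_22_letter_64.pickle", "datasave/y_Arabic_22_letter_64.pickle", lambda: build_arrays("Dataset"))` would then read the images only on the first run and reuse the cached arrays afterwards.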

Dependencies

  • pandas
  • numpy
  • matplotlib
  • seaborn
  • opencv-python (cv2)
  • tensorflow
  • keras
  • scikit-learn
  • pickle (part of the Python standard library, no installation needed)
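
Before opening the notebook, it can help to check which of these packages are importable. The sketch below uses only the standard library. The mapping from package name to import name (for example `opencv-python` to `cv2` and `scikit-learn` to `sklearn`) is standard, while the helper name `find_missing` is illustrative.

```python
from importlib.util import find_spec

# Package name as installed with pip -> module name used in `import`.
REQUIRED_MODULES = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "opencv-python": "cv2",
    "tensorflow": "tensorflow",
    "keras": "keras",
    "scikit-learn": "sklearn",
}


def find_missing(modules):
    """Return the package names whose import module cannot be found."""
    return [pkg for pkg, mod in modules.items() if find_spec(mod) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```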

Model Architecture

The CNN model architecture is as follows:

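from tensorflow import keras

s = 64  # input image side length; assumed from the "_64" suffix in the saved file names

KerasModel = keras.models.Sequential([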
    keras.layers.Conv2D(8, kernel_size=(5, 5), activation='relu', input_shape=(s, s, 3)),
    keras.layers.Conv2D(16, kernel_size=(5, 5), activation='relu'),
    keras.layers.Conv2D(16, kernel_size=(3, 3), activation='relu'),
    keras.layers.MaxPool2D(pool_size=(2, 2), strides=2),
    keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu'),
    keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu'),
    keras.layers.MaxPool2D(pool_size=(2, 2), strides=2),
    keras.layers.Dropout(0.2),
    keras.layers.Conv2D(64, kernel_size=(3, 3), activation='relu'),
    keras.layers.MaxPool2D(pool_size=(2, 2), strides=2),
    keras.layers.Conv2D(64, kernel_size=(3, 3), activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Dropout(0.2),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(22, activation='softmax')
])
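
To see how the feature maps shrink through this stack, the layer arithmetic can be traced by hand. The sketch below assumes s = 64 (inferred from the file names, not stated in the text) and the Keras defaults for the layers as written: no padding ("valid") and a convolution stride of 1. Dropout and BatchNormalization do not change shapes, so they are omitted. The helper names are illustrative.

```python
def conv_out(size, kernel):
    """Output side length of an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2, stride=2):
    """Output side length of an unpadded max pooling layer."""
    return (size - pool) // stride + 1


def trace_shapes(s=64):
    """Return the output shape after each conv/pool layer and the Flatten size."""
    # (kind, kernel or pool size, filters) in the order used by the model.
    layers = [
        ("conv", 5, 8), ("conv", 5, 16), ("conv", 3, 16), ("pool", 2, None),
        ("conv", 3, 32), ("conv", 3, 32), ("pool", 2, None),
        ("conv", 3, 64), ("pool", 2, None), ("conv", 3, 64),
    ]
    size, channels = s, 3
    shapes = []
    for kind, k, filters in layers:
        if kind == "conv":
            size, channels = conv_out(size, k), filters
        else:
            size = pool_out(size, k, 2)
        shapes.append((size, size, channels))
    return shapes, size * size * channels


if __name__ == "__main__":
    shapes, flat = trace_shapes(64)
    for shape in shapes:
        print(shape)
    print("Flatten size:", flat)
```

Under these assumptions the last convolution outputs a 2 x 2 x 64 map, so Flatten feeds 256 values into the first Dense layer.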

Training

The model is compiled using the Adam optimizer and sparse categorical crossentropy loss function:

KerasModel.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
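
The "sparse" variant of the loss takes integer class labels (0 to 21 here) rather than one-hot vectors, and both forms give the same value. The sketch below illustrates this for a single sample in plain Python. It is a simplified illustration rather than Keras's implementation, which also clips probabilities and averages over the batch. The three-class probabilities are toy values.

```python
import math


def sparse_categorical_crossentropy(label, probabilities):
    """Loss for one sample: minus the log-probability of the true class index."""
    return -math.log(probabilities[label])


def one_hot_crossentropy(one_hot, probabilities):
    """Same loss computed from a one-hot target vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probabilities) if t)


if __name__ == "__main__":
    probs = [0.7, 0.2, 0.1]  # toy softmax output for three classes
    print(sparse_categorical_crossentropy(0, probs))  # label given as an index
    print(one_hot_crossentropy([1, 0, 0], probs))     # same label, one-hot
```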

Results

The notebook plots the training and validation accuracy curves together with the loss curves. These plots show how the model performs as training progresses and help identify overfitting or underfitting. All results and detailed plots are in the CNN Arabic 22 Letter HMBD-v1 notebook. The following image shows the model's results and its accuracy in classifying the letters.
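
One way to read such curves numerically is to find the epoch with the lowest validation loss and the final gap between training and validation accuracy. The sketch below works on a dictionary shaped like the `history` attribute that Keras `fit` returns when compiled with `metrics=['accuracy']`. The numbers in `toy_history` are made up for illustration and are not results from this project, and `summarize_history` is a hypothetical helper.

```python
def summarize_history(history):
    """Report the best validation-loss epoch and the final accuracy gap."""
    val_loss = history["val_loss"]
    best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
    gap = history["accuracy"][-1] - history["val_accuracy"][-1]
    return {"best_epoch": best_index + 1, "final_accuracy_gap": gap}


# Made-up values, for illustration only (not this project's results).
toy_history = {
    "loss": [1.0, 0.6, 0.4, 0.3],
    "val_loss": [1.1, 0.7, 0.6, 0.65],
    "accuracy": [0.60, 0.80, 0.90, 0.95],
    "val_accuracy": [0.55, 0.75, 0.80, 0.79],
}

if __name__ == "__main__":
    print(summarize_history(toy_history))
```

A validation loss that starts rising while the training loss keeps falling, together with a widening accuracy gap, is the usual sign of overfitting.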

Usage

To use this project:

  1. Clone the repository
  2. Install the required dependencies
  3. Run the Jupyter notebook CNN Arabic 22 Letter HMBD .ipynb

Future Work

  • Experiment with different model architectures
  • Implement data augmentation techniques to improve model generalization
  • Explore transfer learning approaches using pre-trained models
  • Create models to recognize handwritten Arabic words

Acknowledgements

This project was developed as graded practical coursework for the Neural Networks course, part of the Bachelor's degree in Software Engineering at Taiz University.

We would like to thank our teachers and colleagues for their support and feedback throughout the development process. We also thank everyone who contributed to preparing and publishing the HMBD-v1 dataset.

License

MIT License
