CNN Arabic 22 Letter HMBD-v1

Image Classification Using CNN (Convolutional Neural Networks)

Authors

Project Overview

This project implements a Convolutional Neural Network (CNN) for classifying 22 isolated Arabic letters. The model is trained on a custom dataset of handwritten Arabic letters, demonstrating the application of deep learning techniques in Arabic character recognition.

Dataset

The dataset consists of 22 classes of isolated Arabic letters. It is a subset of a much larger database of handwritten letters covering the whole Arabic alphabet. The link to this database is HossamBalaha/HMBD-v1. We thank everyone who prepared and published this database, which made this project possible.

Dataset Information

#	Directory Name	Number of Images
0	Dataset\Ain_Isolated	462
1	Dataset\Alf_Hamza_Above_Isolated	476
2	Dataset\Alf_Hamza_Under_Isolated	474
3	Dataset\Alf_Isolated	480
4	Dataset\Baa_Isolated	468
5	Dataset\Baa_Middle	460
6	Dataset\Daad_Isolated	455
7	Dataset\Dal_Isolated	472
8	Dataset\Faa_Isolated	464
9	Dataset\Gem_Isolated	472
10	Dataset\Gem_Start	472
11	Dataset\Gen_Isolated	920
12	Dataset\Hamza_Isolated	466
13	Dataset\Kaf_Isolated	464
14	Dataset\Lam_Alf_Hamza_Isolated	459
15	Dataset\Mem_Isolated	468
16	Dataset\Qaf_Isolated	467
17	Dataset\Raa_Isolated	476
18	Dataset\Saad_Isolated	459
19	Dataset\Sin_Isolated	468
20	Dataset\Taa_Isolated	467
21	Dataset\Taa_Middle	462
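A table like the one above can be regenerated by counting the image files in each class directory. The following is a minimal sketch, not the code from GeneratesDatasetInfo.ipynb: the function name `count_images` and the set of accepted file extensions are assumptions, since the image format is not stated here.

```python
from pathlib import Path

# Assumed set of image extensions; the actual format of the dataset files is not stated.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp"}


def count_images(dataset_root):
    """Return a list of (class_directory_name, image_count), sorted by directory name."""
    root = Path(dataset_root)
    counts = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        n = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
        counts.append((class_dir.name, n))
    return counts


# Print the table only if the Dataset directory is present next to this script.
dataset_root = Path("Dataset")
if dataset_root.is_dir():
    for index, (name, n) in enumerate(count_images(dataset_root)):
        print(f"{index}\t{name}\t{n}")
```

Counting per class also makes imbalance visible: in the table above, Gen_Isolated has roughly twice as many images as any other class.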

Project Structure

Important Note: Due to their size, the files "X_Arabic_22_letter_64.pickle" and "y_Arabic_22_letter_64.pickle" have not been uploaded to the repository. Running the project creates these files, after which you can train the model from them.

CNN ARABIC 22 LETTER HMBD -V1/
├── datasave/
│   ├── checkpoints/
│   ├── model_logs/
│   │   ├── train/
│   │   └── validation/
│   ├── model_acc_Arabic_22_letter_64.h5
│   ├── weights_model_acc_Arabic_22_letter_64.h5
│   ├── X_Arabic_22_letter_64.pickle
│   └── y_Arabic_22_letter_64.pickle
├── Dataset/
│   ├── Ain_Isolated/
│   ├── Alf_Hamza_Above_Isolated/
│   ├── ...
│   └── Taa_Middle/
└── CNN Arabic 22 Letter HMBD .ipynb
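The pickle files in the tree act as a cache: they are created on the first run and reused afterwards. A minimal sketch of that load-or-build pattern follows. The helper `load_or_build` and the builder it receives are hypothetical names for illustration; the notebook may organise this step differently.

```python
import pickle
from pathlib import Path


def load_or_build(path, build_fn):
    """Load a pickled object from `path`; if it is missing, build it, save it, and return it."""
    path = Path(path)
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    data = build_fn()  # e.g. read and resize every image in Dataset/
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(data, f)
    return data


# Hypothetical usage, where build_features is a function you would write yourself:
# X = load_or_build("datasave/X_Arabic_22_letter_64.pickle", build_features)
```

The first call pays the cost of reading the images; later calls only deserialize the saved file.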

Dependencies

pandas
numpy
matplotlib
seaborn
opencv-python (cv2)
tensorflow
keras
scikit-learn
pickle (part of the Python standard library; no installation needed)

Model Architecture

The CNN model architecture is as follows:

from tensorflow import keras

# Side length of the square input images. 64 is an assumption based on the
# "_64" suffix in the saved file names; adjust it to match your data.
s = 64

KerasModel = keras.models.Sequential([
    # Feature extraction: stacked convolutions with periodic max pooling
    keras.layers.Conv2D(8, kernel_size=(5, 5), activation='relu', input_shape=(s, s, 3)),
    keras.layers.Conv2D(16, kernel_size=(5, 5), activation='relu'),
    keras.layers.Conv2D(16, kernel_size=(3, 3), activation='relu'),
    keras.layers.MaxPool2D(pool_size=(2, 2), strides=2),
    keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu'),
    keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu'),
    keras.layers.MaxPool2D(pool_size=(2, 2), strides=2),
    keras.layers.Dropout(0.2),
    keras.layers.Conv2D(64, kernel_size=(3, 3), activation='relu'),
    keras.layers.MaxPool2D(pool_size=(2, 2), strides=2),
    keras.layers.Conv2D(64, kernel_size=(3, 3), activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Dropout(0.2),
    # Classifier head: one output probability per letter class
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(22, activation='softmax')
])
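To see how the spatial size shrinks through this stack, the arithmetic can be traced by hand. The sketch below relies on the Keras defaults used above (no padding, convolution stride 1) and assumes an input side length of 64, inferred from the "_64" suffix in the file names. The helper names are illustrative only.

```python
def conv_out(size, kernel):
    """Spatial size after an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2, stride=2):
    """Spatial size after unpadded max pooling."""
    return (size - pool) // stride + 1


# (layer type, kernel size, output channels) in the order used by the model.
# Dropout and BatchNormalization are omitted because they do not change the shape.
LAYERS = [
    ("conv", 5, 8), ("conv", 5, 16), ("conv", 3, 16), ("pool",),
    ("conv", 3, 32), ("conv", 3, 32), ("pool",),
    ("conv", 3, 64), ("pool",),
    ("conv", 3, 64),
]


def flatten_size(s, channels=3):
    """Return (side, channels, flattened_length) of the feature map before Flatten."""
    size = s
    for layer in LAYERS:
        if layer[0] == "conv":
            size = conv_out(size, layer[1])
            channels = layer[2]
        else:
            size = pool_out(size)
    return size, channels, size * size * channels


print(flatten_size(64))  # (2, 64, 256)
```

For s = 64 the last convolution produces a 2 x 2 x 64 feature map, so Flatten passes 256 values to the first Dense layer.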

Training

The model is compiled using the Adam optimizer and the sparse categorical crossentropy loss function, which expects integer class labels (0 to 21) rather than one-hot vectors:

KerasModel.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
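As a rough illustration of what this loss measures, the sketch below computes sparse categorical crossentropy by hand for a toy batch with three classes. It is a simplified stand-in, not the Keras implementation: it omits the clipping Keras applies to avoid log(0), and the numbers are made up for the example.

```python
import math


def sparse_categorical_crossentropy(labels, probabilities):
    """Mean of -log(probability assigned to the true class) over the batch."""
    losses = [-math.log(p[y]) for y, p in zip(labels, probabilities)]
    return sum(losses) / len(losses)


# Toy batch: integer labels and softmax-like outputs (each row sums to 1).
labels = [0, 2]
probabilities = [
    [0.7, 0.2, 0.1],  # true class 0 receives probability 0.7
    [0.1, 0.1, 0.8],  # true class 2 receives probability 0.8
]

loss = sparse_categorical_crossentropy(labels, probabilities)
print(round(loss, 4))  # 0.2899
```

The loss falls toward zero as the probability assigned to the correct class approaches 1, which is what training tries to achieve for all 22 letter classes.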

Results

The training and validation accuracy curves, as well as the loss curves, are provided in the notebook. These visualizations help in understanding the model's performance and in identifying potential overfitting or underfitting. All results and detailed plots can be found in the CNN Arabic 22 Letter HMBD-v1 file.

[Image: model results and letter classification accuracy]
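One way to read such curves numerically is to find the epoch with the lowest validation loss and compare training and validation accuracy at that point. The sketch below assumes a dictionary shaped like the `history` attribute of the object returned by Keras `fit` when validation data is supplied. The function name is hypothetical and the values are toy numbers, not results from this project.

```python
def summarize_history(history):
    """Report the best epoch by validation loss and the accuracy gap at that epoch."""
    val_loss = history["val_loss"]
    best = min(range(len(val_loss)), key=val_loss.__getitem__)
    return {
        "best_epoch": best + 1,  # epochs are reported 1-based
        "accuracy_gap": history["accuracy"][best] - history["val_accuracy"][best],
        # Validation loss rising after its minimum is a common sign of overfitting.
        "val_loss_rose_after_best": any(v > val_loss[best] for v in val_loss[best + 1:]),
    }


# Toy values for illustration only.
toy_history = {
    "loss": [1.2, 0.6, 0.3, 0.15],
    "val_loss": [1.1, 0.7, 0.5, 0.6],
    "accuracy": [0.60, 0.80, 0.90, 0.96],
    "val_accuracy": [0.62, 0.78, 0.85, 0.84],
}
summary = summarize_history(toy_history)
print(summary["best_epoch"], summary["val_loss_rose_after_best"])
```

A large accuracy gap together with rising validation loss suggests overfitting, while low accuracy on both sets suggests underfitting.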

Usage

To use this project:

Clone the repository
Install the required dependencies
Run the Jupyter notebook CNN Arabic 22 Letter HMBD .ipynb

Future Work

Experiment with different model architectures
Implement data augmentation techniques to improve model generalization
Explore transfer learning approaches using pre-trained models
Create models that recognize handwritten Arabic words

Acknowledgements

This project was developed as graded practical work for the Neural Networks course, part of the Bachelor's degree in Software Engineering at Taiz University.

We would like to thank our teachers and colleagues for their support and feedback throughout the development process. We also thank everyone who contributed to the preparation and publication of the HMBD-v1 dataset.

License

MIT License

Repository: Rafiq7M/CNN-Arabic-22-Letter-HMBD-v1
