This project uses neural networks for text classification: it assigns each sentence to a class based on its content. The code is implemented in a Jupyter notebook.
The goal is to demonstrate neural networks on a natural language processing task. The notebook contains code for data preprocessing, model training, cross-validation, and the calculation of evaluation metrics.
To run the code in this notebook, the following libraries are required:
- nltk: Natural Language Toolkit for text processing
- sklearn: Scikit-learn library for machine learning algorithms
- numpy: Numerical computing library for array operations
- json: Library for JSON data manipulation
- datetime: Library for handling date and time operations
The training data consists of sentences categorized into four classes:
- Performance
- Usability
- Security
- Operability
Each sentence is associated with a class label and a class name.
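As a rough illustration of the data layout described above, the sketch below stores each sentence with a numeric class label and a class name. The field names (`class`, `class_name`, `sentence`), the numeric label assignment, and the example sentences are assumptions made for illustration; the notebook's actual records may be structured differently.

```python
import json
from collections import Counter

# Assumed mapping from numeric label to class name (illustrative only).
CLASS_NAMES = {0: "Performance", 1: "Usability", 2: "Security", 3: "Operability"}

# Hypothetical records: one made-up sentence per class.
training_data = [
    {"class": 0, "class_name": "Performance",
     "sentence": "The page loads slowly under heavy traffic"},
    {"class": 1, "class_name": "Usability",
     "sentence": "The menu layout is confusing for new users"},
    {"class": 2, "class_name": "Security",
     "sentence": "Passwords must be encrypted before storage"},
    {"class": 3, "class_name": "Operability",
     "sentence": "Operators need logs to monitor the service"},
]


def validate(records):
    """Check that each label matches its class name and count records per class."""
    for record in records:
        if CLASS_NAMES.get(record["class"]) != record["class_name"]:
            raise ValueError(f"Label/name mismatch in record: {record}")
    return Counter(record["class_name"] for record in records)


class_counts = validate(training_data)

# The records contain only strings and integers, so they survive a JSON round trip.
round_tripped = json.loads(json.dumps(training_data))
```

Keeping both the numeric label and the readable name in each record makes it easy to train on the label while reporting results by name.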
The notebook is divided into sections covering different aspects of the implementation:
- Data Preprocessing: Tokenization, stemming, and feature extraction from the text data.
- Model Training: Training a neural network model using backpropagation and gradient descent.
- Cross-Validation: Implementing stratified k-fold cross-validation for model evaluation.
- Evaluation Metrics: Calculating accuracy, precision, recall, and F1-score for model performance assessment.
- Mean Score Calculation: Computing the mean of each model score across the cross-validation folds.
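The preprocessing and training steps listed above can be sketched as follows. This is a minimal example under stated assumptions, not the notebook's code: a regular-expression tokenizer stands in for nltk tokenization and stemming, the features are a binary bag of words, and the network is a small two-layer sigmoid model trained with full-batch gradient descent on squared error. The function names, hyperparameters, and example sentences are all hypothetical.

```python
import re

import numpy as np


def tokenize(sentence):
    # Lowercased word tokens; a simple stand-in for nltk tokenization plus stemming.
    return re.findall(r"[a-z']+", sentence.lower())


def bag_of_words(sentence, vocabulary):
    # Binary vector: 1.0 where the vocabulary word occurs in the sentence.
    tokens = set(tokenize(sentence))
    return np.array([1.0 if word in tokens else 0.0 for word in vocabulary])


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train(X, Y, hidden=8, epochs=3000, lr=0.5, seed=0):
    """Train a two-layer network with backpropagation and gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.uniform(-1, 1, (X.shape[1], hidden))
    W2 = rng.uniform(-1, 1, (hidden, Y.shape[1]))
    for _ in range(epochs):
        # Forward pass.
        h = sigmoid(X @ W1)
        out = sigmoid(h @ W2)
        # Backward pass: squared-error gradient pushed through each sigmoid.
        out_delta = (out - Y) * out * (1 - out)
        h_delta = (out_delta @ W2.T) * h * (1 - h)
        # Gradient descent update.
        W2 -= lr * (h.T @ out_delta)
        W1 -= lr * (X.T @ h_delta)
    return W1, W2


def predict(sentence, vocabulary, W1, W2):
    # Index of the output unit with the highest activation.
    x = bag_of_words(sentence, vocabulary)
    return int(np.argmax(sigmoid(sigmoid(x @ W1) @ W2)))


# Made-up sentences, labeled 0=Performance, 1=Usability, 2=Security, 3=Operability.
sentences = [
    ("The page loads slowly under heavy traffic", 0),
    ("The menu layout is confusing for new users", 1),
    ("Passwords must be encrypted before storage", 2),
    ("Operators need logs to monitor the service", 3),
]
vocabulary = sorted({tok for text, _ in sentences for tok in tokenize(text)})
X = np.array([bag_of_words(text, vocabulary) for text, _ in sentences])
Y = np.eye(4)[[label for _, label in sentences]]  # one-hot targets
W1, W2 = train(X, Y)
```

With one sentence per class this only shows the mechanics; a realistic run needs many sentences per class and a held-out set for evaluation.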
To use this notebook:
- Install the third-party libraries listed above (nltk, scikit-learn, and numpy); json and datetime are part of the Python standard library.
- Load the notebook in a Jupyter environment.
- Execute the code cells sequentially.
The notebook reports the model's performance as accuracy, precision, recall, and F1-score. It also averages each metric across the cross-validation folds, which gives a more stable estimate of the model's effectiveness than any single train/test split.
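The evaluation loop described above might look roughly like the sketch below. It makes several assumptions: scikit-learn's `MLPClassifier` stands in for the notebook's own network, the features are synthetic random data rather than real sentence vectors, precision, recall, and F1 are macro-averaged, and the helper name `cross_validate` and the choice of three folds are hypothetical.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier


def cross_validate(X, y, n_splits=3, seed=0):
    """Score a model on each stratified fold, then average every metric."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    fold_scores = []
    for train_idx, test_idx in skf.split(X, y):
        # Stand-in classifier; the notebook trains its own network instead.
        model = MLPClassifier(hidden_layer_sizes=(8,), max_iter=500,
                              random_state=seed)
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        precision, recall, f1, _ = precision_recall_fscore_support(
            y[test_idx], pred, average="macro", zero_division=0)
        fold_scores.append({
            "accuracy": accuracy_score(y[test_idx], pred),
            "precision": precision,
            "recall": recall,
            "f1": f1,
        })
    # Mean of each metric across folds.
    mean_scores = {name: float(np.mean([fold[name] for fold in fold_scores]))
                   for name in fold_scores[0]}
    return fold_scores, mean_scores


# Synthetic data: 4 classes, 6 samples each, with one boosted feature per class.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(4), 6)
X = rng.random((24, 10))
X[np.arange(24), y] += 2.0

fold_scores, mean_scores = cross_validate(X, y)
```

Stratification keeps the class proportions roughly equal in every fold, which matters when some classes have few sentences.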
This project demonstrates neural networks applied to text classification. By combining data preprocessing, model training, and cross-validation, it shows a systematic approach to building and evaluating NLP models.
The neural network implementation in this project draws inspiration from a blog post titled "A Neural Network in Python, Part 2" by Andrej Karpathy. The blog post provides insights into building neural networks using Python and serves as a valuable resource for understanding the underlying concepts. We express our gratitude to Andrej Karpathy for sharing this knowledge and contributing to the development of this project.