This project focuses on the classification of items based on their descriptions using Natural Language Processing (NLP) techniques. The goal is to leverage machine learning classifiers to automatically categorize items into predefined classes, including real estate, machinery, services, information technology, and furniture.
Sorting diverse items into distinct categories is a common task across many industries. This project addresses it with text classification: the primary objective is a model that accurately assigns each item to one of the predefined classes based on its textual description.
- data/: Stores the dataset used for training and evaluation.
- notebooks/: Jupyter notebooks for experimentation and analysis.
- Python 3.9
- Required Python packages can be installed using:
pip install -r requirements.txt
- Clone the repository:
git clone https://github.com/JoaoAssalim/Class-by-Description-Classifier-with-NLP.git
This repository includes three different approaches for implementing the model:
- Sklearn: Implementation using traditional machine learning libraries.
- TensorFlow: Implementation using TensorFlow for building and training neural network models.
- Fine-tuning BERT: Implementation that fine-tunes the pre-trained model neuralmind/bert-base-portuguese-cased for Portuguese text.
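As a rough illustration of the Sklearn approach, the sketch below trains a TF-IDF plus logistic regression pipeline on a handful of made-up descriptions. It is not the code from the notebooks: the sample descriptions, the English wording, the `build_classifier` helper, and the choice of `LogisticRegression` are all assumptions made for the example, and the real dataset under data/ may use a different schema and language.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data, two descriptions per class.
# The actual dataset and its column names are not shown in this README.
descriptions = [
    "two bedroom apartment for sale downtown",
    "commercial land plot with warehouse building",
    "industrial lathe and hydraulic press",
    "diesel excavator with bucket attachment",
    "building cleaning and security services",
    "legal consulting and accounting services",
    "laptop computers and network switches",
    "software licenses and server hardware",
    "oak dining table with six chairs",
    "leather sofa and wooden bookshelf",
]
labels = [
    "real estate", "real estate",
    "machinery", "machinery",
    "services", "services",
    "information technology", "information technology",
    "furniture", "furniture",
]


def build_classifier():
    """Return an untrained pipeline: TF-IDF features fed to a linear classifier."""
    return make_pipeline(
        TfidfVectorizer(lowercase=True),    # turn raw text into weighted term vectors
        LogisticRegression(max_iter=1000),  # multiclass linear model over those vectors
    )


model = build_classifier()
model.fit(descriptions, labels)

# Classify an unseen description.
prediction = model.predict(["wooden dining chairs"])[0]
print(prediction)
```

Wrapping the vectorizer and the classifier in one pipeline means the same text preprocessing is applied at training and prediction time, which makes it easier to swap in a different classifier when comparing approaches.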
Contributions are welcome! If you'd like to contribute to this project, please fork the repository and submit a pull request.