- This project fine-tunes the distilbert-base-uncased model, a lightweight and efficient variant of BERT, for sentiment analysis. The goal is to classify review text as positive or negative by adapting the pre-trained DistilBERT model to the nuances of sentiment classification.
- The preprocessing of the review text and the initial sentiment analysis were performed in separate projects. This involved extensive data cleaning, text normalization, and exploratory analysis. Details of these steps and the methodologies used can be found in my previous notebooks, available in my GitHub repositories: "preprocessing and analysis" and "sentiment classification using ML".
- The dataset used in this project consists of preprocessed review data, encompassing a diverse range of sentiments.
- The DistilBERT model was fine-tuned using the Hugging Face Transformers library. The training process involved tuning hyperparameters, including the learning rate and batch size, to optimize performance for sentiment classification (see the training sketch after this list).
- Post-training, the model was used to perform inference on a separate set of data samples. This step tested the model's ability to generalize and make accurate sentiment predictions on unseen data.
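For reference, below is a minimal fine-tuning sketch using the Hugging Face `Trainer` API. The file names (`reviews_train.csv`, `reviews_test.csv`), column names (`text`, `label`), and hyperparameter values are illustrative assumptions, not the exact contents of `training_script.py`.

```python
# Minimal fine-tuning sketch (assumed file/column names and hyperparameters).
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer,
)

model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Assumes CSVs with a "text" column and an integer "label" column (0 = negative, 1 = positive).
dataset = load_dataset(
    "csv",
    data_files={"train": "reviews_train.csv", "test": "reviews_test.csv"},
)

def tokenize(batch):
    # Truncate/pad reviews to a fixed length for DistilBERT.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": (preds == labels).mean()}

training_args = TrainingArguments(
    output_dir="distilbert-sentiment",
    learning_rate=2e-5,               # example value; tuned during training
    per_device_train_batch_size=16,   # example value; tuned during training
    per_device_eval_batch_size=32,
    num_train_epochs=3,
    weight_decay=0.01,
    evaluation_strategy="epoch",      # renamed to eval_strategy in newer Transformers releases
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    compute_metrics=compute_metrics,
)

trainer.train()
trainer.save_model("distilbert-sentiment")
tokenizer.save_pretrained("distilbert-sentiment")
```

The learning rate and batch size shown are common starting points for DistilBERT fine-tuning and correspond to the kinds of parameters adjusted during training.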
- Install Dependencies: Ensure that Python, PyTorch, and the Hugging Face Transformers library are installed.
- Model Training: Run training_script.py to train the DistilBERT model on the provided dataset.
- Perform Inference: Execute inference_script.py to use the trained model to make sentiment predictions on new data samples (see the sketch below).
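For reference, a minimal inference sketch is shown below. It assumes the fine-tuned model was saved to a local directory named `distilbert-sentiment` (as in the training sketch above) and that label 1 corresponds to positive sentiment; the actual `inference_script.py` may load and post-process the model differently.

```python
# Minimal inference sketch (assumed model directory, label mapping, and example reviews).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_dir = "distilbert-sentiment"  # directory produced by fine-tuning
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSequenceClassification.from_pretrained(model_dir)
model.eval()

reviews = [
    "The product exceeded my expectations, absolutely worth the price.",
    "Terrible experience, it broke after two days.",
]

inputs = tokenizer(reviews, truncation=True, padding=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1)
preds = probs.argmax(dim=-1)

labels = {0: "negative", 1: "positive"}  # assumed label mapping from training
for review, pred, prob in zip(reviews, preds, probs):
    print(f"{labels[pred.item()]} ({prob[pred].item():.2f}): {review}")
```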
- Contributions to this project are welcome. Please submit a pull request or open an issue to propose changes or additions.