An SMS Spam Classifier application built using Python, NLTK, and Streamlit. This project classifies SMS messages as "Spam" or "Ham" (not spam) using Natural Language Processing techniques. It provides a clean and interactive web interface to demonstrate the model's predictions.
- Real-Time Classification: Input any SMS message to see instant classification as Spam or Ham.
- Interactive Web Interface: Built with Streamlit, offering a user-friendly and interactive experience.
- Data Preprocessing: Uses NLTK for tokenization, stopword removal, and other text transformations.
- Deployable: Easily deployable on platforms like Heroku.
- Python: Core programming language used for backend logic.
- NLTK: Natural Language Toolkit for preprocessing text data.
- Streamlit: Framework to build an interactive web application.
- Heroku: For deploying the application to the web.
- Data Preprocessing: The input SMS message is tokenized, and stopwords are removed using NLTK.
- Feature Extraction: The processed text is transformed into features using text vectorization techniques.
- Classification: A pre-trained model classifies the SMS message as Spam or Ham.
- Result Display: The classification result is displayed on the web interface in real time.
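The four steps above can be sketched end to end in plain Python. This is a minimal illustration, not the code in `app.py`: the small `STOPWORDS` set stands in for NLTK's English stopword list, the regex tokenizer stands in for NLTK tokenization, and the multinomial Naive Bayes model and the four training messages are assumptions made up for the example, since this README does not name the model or the vectorizer.

```python
import math
import re
from collections import Counter

# Assumption: a tiny hand-written list standing in for
# nltk.corpus.stopwords.words("english").
STOPWORDS = {"a", "an", "the", "is", "are", "to", "you", "your",
             "for", "at", "on", "in", "and", "of", "i", "we", "now", "me"}


def preprocess(message):
    """Step 1: lowercase, tokenize on alphanumeric runs, drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", message.lower())
    return [t for t in tokens if t not in STOPWORDS]


class NaiveBayes:
    """Multinomial Naive Bayes over word counts (illustrative stand-in)."""

    def fit(self, messages, labels):
        self.class_counts = Counter(labels)   # messages per class
        self.word_counts = {}                 # label -> Counter of words
        self.vocab = set()
        for message, label in zip(messages, labels):
            tokens = preprocess(message)
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, message):
        # Step 2: feature extraction as a bag-of-words count vector.
        features = Counter(preprocess(message))
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        # Step 3: pick the class with the highest log-probability.
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for word, n in features.items():
                if word in self.vocab:  # ignore words never seen in training
                    # Laplace (add-one) smoothing avoids log(0).
                    score += n * math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training messages, for illustration only.
TRAIN = [
    ("Win a free prize now, claim your cash", "spam"),
    ("Free entry, text WIN to claim", "spam"),
    ("Are we still meeting for lunch today", "ham"),
    ("I will call you after the meeting", "ham"),
]
model = NaiveBayes().fit([m for m, _ in TRAIN], [l for _, l in TRAIN])

# Step 4: the web interface would display this label to the user.
label = model.predict("Claim your free cash prize")
```

In the real application the model is loaded pre-trained rather than fitted at startup, so only `preprocess` and `predict`-style logic would run per request.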
SMS-Spam-Classifier/
│
├── app.py # Main application code
├── nltk_data/ # Pre-downloaded NLTK data (stopwords, Punkt)
├── requirements.txt # Python dependencies
├── Procfile # Heroku process file
└── README.md # Project documentation (this file)
To run this project locally, follow these steps:
- Clone the Repository:

      git clone https://github.com/MUDITJAINN/SMS-Spam-Classifier.git
      cd SMS-Spam-Classifier

- Install Dependencies:

      pip install -r requirements.txt

- Download NLTK Data (if not included). Run the following in a Python session from the project root:

      import nltk
      nltk.download('stopwords', download_dir='./nltk_data')
      nltk.download('punkt', download_dir='./nltk_data')

- Run the Application:

      streamlit run app.py

- Deploy on Heroku (Optional). This assumes the Heroku CLI is set up and a `heroku` git remote already exists for the app:

      git push heroku main
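The NLTK download step only matters when the data is not already bundled in `nltk_data/`. A small check like the one below can tell which packages are missing. It is a sketch that assumes the usual layout `nltk.download()` produces inside its download directory (`corpora/stopwords` and `tokenizers/punkt`, each either unpacked or as a `.zip`); the helper name `missing_nltk_resources` is hypothetical and not part of this project.

```python
from pathlib import Path

# Assumed locations of each package relative to the NLTK data directory.
REQUIRED_RESOURCES = {
    "stopwords": Path("corpora") / "stopwords",
    "punkt": Path("tokenizers") / "punkt",
}


def missing_nltk_resources(data_dir="./nltk_data"):
    """Return the names of required packages not found under data_dir."""
    base = Path(data_dir)
    missing = []
    for name, rel in REQUIRED_RESOURCES.items():
        unpacked = (base / rel).is_dir()
        zipped = (base / rel).with_suffix(".zip").is_file()
        if not (unpacked or zipped):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # An empty list means the download step can be skipped.
    print(missing_nltk_resources())
```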
The model is trained on a public SMS spam dataset and evaluated with standard metrics: accuracy, precision, recall, and F1-score. Classifying a single message is fast enough for interactive use, which makes the classifier a good fit for SMS filtering applications.
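For reference, the four metrics named above can be computed from true and predicted labels as in the sketch below, with "spam" treated as the positive class. The function name `classification_metrics` and the eight example labels are made up for illustration; they are not this project's evaluation code or results.

```python
def classification_metrics(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, recall, and F1 for a binary task."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # spam caught
    fp = sum(t != positive and p == positive for t, p in pairs)  # ham flagged
    fn = sum(t == positive and p != positive for t, p in pairs)  # spam missed
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, for illustration only.
y_true = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]
y_pred = ["spam", "spam", "ham", "ham", "ham", "ham", "ham", "spam"]
metrics = classification_metrics(y_true, y_pred)
```

For spam filtering, precision is often the metric to watch most closely, because a false positive means a legitimate message is hidden from the user.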
This project showcases my expertise in Natural Language Processing (NLP) and web deployment, demonstrating my ability to build and deploy machine learning applications. It reflects my hands-on experience with Python, NLTK, and cloud platforms like Heroku.
Contributions, issues, and feature requests are welcome! Feel free to check the issues page.
This open-source project is available under the MIT License.