This project provides a text summarization tool that uses a Transformer-based summarizer to condense lengthy texts into concise summaries. It also uses the YAKE algorithm for keyword extraction and a Naive Bayes classifier to categorize articles into topics such as business, sports, and politics.
- Text Summarization: Uses Transformer-based summarization to generate summaries of input text.
- Keyword Extraction: Uses the YAKE algorithm to extract keywords from text.
- Article Classification: Uses a Naive Bayes classifier, trained on articles from various topics, that classifies articles with an average accuracy of 94%.
- Recommendation System: Recommends similar articles using the News API.
- User Interface: Users can interact with the application through a Streamlit-powered web interface by running `home.py`.
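
The summarization and keyword extraction features listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the `max_words` chunk size, the length limits passed to the summarizer, and the choice of the default summarization model are all assumptions. Long inputs are split into word chunks because summarization models accept only a limited input length.

```python
def chunk_words(text, max_words=400):
    """Split text into chunks of at most max_words whitespace-separated words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def summarize(text, max_words=400):
    """Summarize each chunk with a Hugging Face pipeline and join the results."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    # Assumption: the library's default summarization model is used.
    summarizer = pipeline("summarization")
    summaries = []
    for chunk in chunk_words(text, max_words):
        # The length limits here are illustrative values, not project settings.
        result = summarizer(chunk, max_length=130, min_length=30, do_sample=False)
        summaries.append(result[0]["summary_text"])
    return " ".join(summaries)


def extract_keywords(text, top=5):
    """Return the top keywords found by YAKE (a lower score means more relevant)."""
    import yake

    # Assumption: English text and keyphrases of up to two words.
    extractor = yake.KeywordExtractor(lan="en", n=2, top=top)
    return [keyword for keyword, _score in extractor.extract_keywords(text)]
```

With `transformers` and `yake` installed, calling `summarize(article_text)` and `extract_keywords(article_text)` on a string would return a summary string and a list of keyword strings, respectively. The first call to `summarize` downloads model weights, so it can take a while.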
- Clone the repository to your local machine: `git clone https://github.com/yourusername/text-summarizer.git`
- Install the required dependencies: `pip install -r requirements.txt`
- Run the Streamlit app: `streamlit run home.py`
- Follow the instructions provided in the terminal to access the application in your web browser.
This project has not been deployed yet. Users can run the `home.py` file locally to access the functionality of the text summarizer.
Contributions are welcome! If you'd like to contribute to this project, please fork the repository and submit a pull request with your changes.
- This project uses the following libraries and data:
  - Hugging Face Transformers for text summarization.
  - YAKE for keyword extraction.
  - Streamlit for the web interface.
  - Dataset for training the classifier model:
    1. Misra, Rishabh. "News Category Dataset." arXiv preprint arXiv:2209.11429 (2022).
    2. Misra, Rishabh and Jigyasa Grover. "Sculpting Data for ML: The first act of Machine Learning." ISBN 9798585463570 (2021).
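
To illustrate how a Naive Bayes classifier can be trained on labeled text such as the dataset cited above, here is a small standard-library sketch of multinomial Naive Bayes with add-one smoothing. It is not the project's training code: the `NaiveBayesClassifier` class is hypothetical, the whitespace tokenizer is a simplification, and the training sentences are invented toy examples rather than rows from the News Category Dataset.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()            # documents seen per category
        self.word_counts = defaultdict(Counter)  # word frequencies per category
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        # Simplification: lowercase and split on whitespace.
        return text.lower().split()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for word in self.tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float("-inf")
        for label, doc_count in self.class_counts.items():
            # log P(category) + sum of log P(word | category)
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                if word not in self.vocabulary:
                    continue  # skip words never seen during training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy examples, not taken from the News Category Dataset.
train_texts = [
    "stock markets rally as profits rise",
    "company shares fall after earnings report",
    "team wins the championship final",
    "striker scores twice in cup match",
    "parliament votes on new election law",
    "senator announces presidential campaign",
]
train_labels = ["business", "business", "sports", "sports", "politics", "politics"]

classifier = NaiveBayesClassifier().fit(train_texts, train_labels)
print(classifier.predict("shares rise after strong earnings"))  # business
print(classifier.predict("the team scores in the final"))       # sports
```

Working in log space avoids numeric underflow when many word probabilities are multiplied, and the add-one smoothing keeps a word that is absent from one category from driving that category's probability to zero.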