With this machine learning NLP summarization project, you can shorten and sort a large speech-to-text file and get its important topics as an understandable summary. Instead of listening to an entire long call, you get the main points in the summary, along with the sentiment of the call or of the summary.
NLP Project :- uses a fine-tuned Hugging Face pre-trained transformer model and the PyTorch library for summarization and sentiment analysis.
- After speech-to-text conversion, you can use this model to summarize the long text file.
- This model works best for summarizing a call or meeting corpus.
- You can also use this model on a group discussion text corpus.
Download the pre-trained model: 'facebook/bart-large-xsum' fine-tuned with the BART-LARGE-XSUM-SAMSUM-DIALOGSUM-AMI dataset.
- Open a terminal in the directory where you want to download the model.
- Paste these commands into the terminal:
$ git lfs install
$ git clone https://huggingface.co/knkarthick/MEETING-SUMMARY-BART-LARGE-XSUM-SAMSUM-DIALOGSUM-AMI
- Open a terminal in the directory where you want to save this project:
$ git clone https://github.com/Yogesh0823/summarization_sentiment-for-call-analysis.git
$ cd summarization_sentiment-for-call-analysis
- Copy the downloaded model folder here.
- Create a virtual environment in summarization_sentiment-for-call-analysis:
$ python3 -m venv 'venv-name'
- Activate the venv:
$ source 'venv-name'/bin/activate
- Install requirement.txt in the venv:
$ pip install -r requirement.txt
- Run the app with FastAPI to get the summary output:
$ uvicorn main:app --reload
- After running this command, click the link shown in the terminal, http://127.0.0.1:8000, and you will see this screen.
- Then add '/docs' to the link, giving http://127.0.0.1:8000/docs, and you will see this screen.
- After this, click on the POST summarization text entry, and you will see this screen.
- Then click "Try it out" to get the input text box, and click "Execute". This run uses only the default placeholder "string" and shows the summary and sentiment that the model returns for it. Note :- You just have to replace "string" with your corpus.
- Replace "string" with a conversation corpus and see what summary you get. Both the summary and the sentiment appear in the response body in FastAPI.
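The same request that the '/docs' page sends can also be made from a script. The following is only a minimal sketch using the Python standard library: the endpoint path `/summarization` and the JSON field name `text` are hypothetical placeholders, not taken from this project, so copy the real path and request schema from the '/docs' page before using it.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # address printed by uvicorn
ENDPOINT = "/summarization"         # hypothetical path: copy the real one from /docs
FIELD = "text"                      # hypothetical field name: check the schema in /docs


def build_request(corpus):
    """Build a JSON POST request that carries the text corpus."""
    body = json.dumps({FIELD: corpus}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(corpus):
    """Send the corpus to the running server and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(corpus)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the uvicorn server running, calling `summarize("your corpus here")` should return the same response body that the '/docs' page displays, provided the path and field name match the real API.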
- In this project I'm using FastAPI for the GUI output. You can run it without the GUI and get the summary in the terminal, but you have to modify the summarization.py file for this.
- To increase the length of the output summary, your input corpus must contain at least 1000 words (the word count of the text corpus). If the input text has fewer than 1000 words, you get the default output length.
- If your corpus has 1000 words or more, you can increase or decrease the summary length. To do this, change "num==500" in the clean function in the summarization.py file.
- Increasing num gives a shorter summary, and decreasing num gives a longer summary.
- num=500 is tested and works well for a corpus of more than 1000 words.
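The notes above can be sketched in a few lines of Python. This is an illustration under assumptions, not the code in summarization.py: it assumes words are counted by splitting on whitespace, the helper names `word_count`, `length_is_tunable`, and `summarize_locally` are hypothetical, and the terminal summary uses the generic Hugging Face `pipeline` API with input truncation rather than the project's own clean function and num setting.

```python
def word_count(corpus: str) -> int:
    """Count whitespace-separated words (assumed to match the word count in the notes)."""
    return len(corpus.split())


def length_is_tunable(corpus: str, threshold: int = 1000) -> bool:
    """Return True when the corpus has at least `threshold` words."""
    return word_count(corpus) >= threshold


def summarize_locally(corpus: str, model_dir: str) -> str:
    """Summarize in the terminal, loading the model from a local folder."""
    # Imported lazily: requires the transformers and torch packages.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_dir)
    # truncation=True clips inputs longer than the model's maximum input length.
    result = summarizer(corpus, truncation=True)
    return result[0]["summary_text"]
```

For example, `length_is_tunable(text)` tells you whether changing num is expected to have any effect, and `summarize_locally(text, "MEETING-SUMMARY-BART-LARGE-XSUM-SAMSUM-DIALOGSUM-AMI")` prints a summary without starting the FastAPI server, assuming the cloned model folder sits in the current directory.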