CS-Ponkoj/Fake-News-Detection-NLP
For CountVectorizer:
1. Take the dataset of 20,800 news items from Kaggle.
2. Preprocessing: clean the text with regular expressions, remove special characters and stop words, and convert everything to lower case.
3. Use the bag-of-words method to build the feature matrix, with a maximum of 5,000 features and n-grams of up to 3 consecutive words.
4. Train/test split with a 33% test size.
5. Train seven different ML algorithms on the processed dataset.

For TF-IDF:
1. Take the train (20,800 items) and test (5,200 items) data from Kaggle.
2. Preprocessing: make a new column by combining the news title, the full news text, and the news author.
3. Use the TF-IDF transformer to convert the train and test data into feature matrices.
4. Default train/test split.
5. Train six different ML algorithms on the processed dataset.

LSTM:
A sequential deep learning model has been implemented using an LSTM architecture for binary text classification: predicting whether a news item is fake or real. The dataset was collected from Kaggle and contains 20,800 items. Pretrained GloVe embeddings are used as the text vectorization technique. Several classical models have also been implemented with BOW and TF-IDF vectorization; the LSTM-based model classifies news better than these, with around 99% accuracy.
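To make the CountVectorizer preprocessing and bag-of-words steps concrete, here is a minimal standard-library sketch of what they compute. It is not the repository's code: the stop-word list, the function names, and the two sample headlines are all hypothetical, chosen only for illustration. In practice the same result would more likely come from scikit-learn's `CountVectorizer` with `max_features=5000` and `ngram_range=(1, 3)`.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the project's actual list is not given).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "on"}


def preprocess(text):
    """Lowercase, replace non-letter characters with spaces, and drop stop words."""
    letters_only = re.sub(r"[^a-zA-Z]", " ", text).lower()
    return [tok for tok in letters_only.split() if tok not in STOP_WORDS]


def ngrams(tokens, max_n=3):
    """Return all word n-grams of length 1..max_n as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def build_vocabulary(docs, max_features=5000, max_n=3):
    """Keep the max_features most frequent n-grams across the corpus."""
    counts = Counter()
    for doc in docs:
        counts.update(ngrams(preprocess(doc), max_n))
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {term: idx for idx, (term, _) in enumerate(ranked[:max_features])}


def vectorize(doc, vocabulary, max_n=3):
    """Turn one document into a count vector over the vocabulary."""
    row = [0] * len(vocabulary)
    for term in ngrams(preprocess(doc), max_n):
        if term in vocabulary:
            row[vocabulary[term]] += 1
    return row


# Hypothetical sample headlines, used only to exercise the functions.
headlines = [
    "Scientists confirm the moon is made of cheese!",
    "Government confirms new budget for 2024",
]
vocab = build_vocabulary(headlines)
matrix = [vectorize(h, vocab) for h in headlines]
```

Each row of `matrix` is one document's feature vector. With a real corpus, the `max_features` cap is what keeps the matrix width at 5,000 even though the number of distinct 1- to 3-word n-grams is far larger.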
About
A news dataset with 20,800 training and 5,200 test items is used to classify fake and real news using Count Vectorizer and TF-IDF. Seven ML algorithms are applied to find the best model for the dataset.