Combating fake news and detecting false information is crucial because of its huge potential for manipulation. It is not surprising, then, that this area is the subject of research for many scientists.
The goal of this project was to build a binary classifier that indicates whether a piece of news is fake or genuine. Fake news can be detected based on article content or its social context; in this project we focused on prediction from news content. Our solution was based on contextual word embeddings from Flair, one of the best language models currently available.
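As a rough illustration of the embedding step (a minimal sketch using Flair's public API; the `news-forward`/`news-backward` model names and mean pooling are assumptions here, not necessarily the exact project configuration):

```python
import torch
from flair.data import Sentence
from flair.embeddings import FlairEmbeddings, DocumentPoolEmbeddings

# Pool the forward and backward Flair character-level language models
# into a single fixed-size document vector (mean pooling by default).
embeddings = DocumentPoolEmbeddings(
    [FlairEmbeddings("news-forward"), FlairEmbeddings("news-backward")]
)

def embed_text(text: str) -> torch.Tensor:
    """Return a fixed-size contextual embedding for one news title or URL."""
    sentence = Sentence(text)
    embeddings.embed(sentence)
    return sentence.embedding.detach()

# The resulting vectors can be fed to any binary classifier,
# e.g. a small feed-forward network or logistic regression.
print(embed_text("Example headline to classify").shape)
```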
The presented mechanism was evaluated on the FakeNewsNet dataset, and the results were compared against a number of other published results. FakeNewsNet contains real and fake news from the political and gossip domains; in this project we focused only on political news from PolitiFact.
Project created by Radomir Krawczykiewicz and Grzegorz Wątor
We achieved very good results on PolitiFact from FakeNewsNet:
Model | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|
Flair Title | 0.816 | 0.761 | 0.805 | 0.782 |
Flair Url | 0.804 | 0.733 | 0.860 | 0.791 |
Flair Mix | 0.759 | 0.648 | 0.908 | 0.756 |
AutoKeras Mix | 0.863 | 0.811 | 0.918 | 0.861 |
More information can be found in the short presentation (in English) or the full documentation (in Polish).
You need Python installed on your computer; then you can install the dependencies using pip and launch the notebooks:
```
pip install -r requirements.txt
jupyter notebook
```
We have used FakeNewsNet as our dataset.
To evaluate on the dataset, you first need to download it.
For the full data, use the [script](https://github.com/KaiDMML/FakeNewsNet/blob/master/code/main.py) provided by FakeNewsNet.
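For example (a hypothetical invocation; the script additionally requires API credentials to be configured as described in the FakeNewsNet repository):

```
git clone https://github.com/KaiDMML/FakeNewsNet.git
cd FakeNewsNet/code
python main.py
```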
If you want to use just the CSV data, you can download it from GitHub, separately for the fake and real news.
For the Jupyter notebooks to work properly (without changing the paths inside them), you need to put:
- the CSV files into the `fakenewsnet_dataset/dataset` folder
- the full dataset into `fakenewsnet_dataset/politifact`, with separate folders for fake and real news
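With the files in place, the data can be loaded along these lines (a minimal sketch; the file names follow the FakeNewsNet repository, and the 1/0 label convention is an assumption of this example):

```python
import pandas as pd

# File names as distributed in the FakeNewsNet repository's dataset folder.
fake = pd.read_csv("fakenewsnet_dataset/dataset/politifact_fake.csv")
real = pd.read_csv("fakenewsnet_dataset/dataset/politifact_real.csv")

# Label fake news as 1 and real news as 0 (a convention assumed for this sketch).
fake["label"] = 1
real["label"] = 0
data = pd.concat([fake, real], ignore_index=True)
print(data[["title", "label"]].head())
```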
The Colab notebook has a cell that downloads the CSV files from GitHub.
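Such a cell can be as simple as reading the raw file straight from GitHub (a sketch; the URL assumes the layout of the FakeNewsNet repository):

```python
import pandas as pd

# Hypothetical direct download of one CSV from the FakeNewsNet repository.
url = "https://raw.githubusercontent.com/KaiDMML/FakeNewsNet/master/dataset/politifact_fake.csv"
fake = pd.read_csv(url)
print(fake.head())
```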
- Start a PostgreSQL database instance. Depending on your needs, it can run locally or on a remote server (in this project the database ran on an AWS instance).
- Execute the SQL commands from the table_create.sql file. Three tables should be created: one each for articles, users, and tweets.
- Open data_extractor.py
- Set the PATH_TO_DATA_DIRECTORY variable to the folder containing the data from the PolitiFact portal (the path should point exactly to the politifact folder!). A guide on how to download the data can be found in another chapter of this manual.
- Set the CONN variable with the access details of the database you are running (host, dbname, user, and password); see the sketch after this list.
- Run the script.
- When the script finishes, the data should have been uploaded to the database.
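For illustration, the two variables in data_extractor.py could be set like this (a sketch with placeholder values; the exact type expected for CONN depends on the script, here assumed to be a psycopg2 connection):

```python
import psycopg2

# Path must point exactly to the politifact folder of the downloaded dataset.
PATH_TO_DATA_DIRECTORY = "fakenewsnet_dataset/politifact"

# Placeholder credentials -- replace with your own host, dbname, user and password.
CONN = psycopg2.connect(
    host="localhost", dbname="fakenews", user="postgres", password="secret"
)
```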