What's the topic?

What are people talking about in my post(s)?

What is it?

A simple tool that runs a very basic topic analysis on Facebook posts. It also uses spaCy's default models to extract named entities from comments. It's pretty much a word counter that employs standard NLP preprocessing, plus the NER part performed by spaCy.

How does it do it?

It fetches the data of the given posts by calling the Facebook Graph API. It then performs text preprocessing (tokenization, stopword filtering, stemming) and produces two plots of the N most important words: a word cloud, using the awesome word_cloud library, and a bar plot, using seaborn. The tool can be configured by page ID, single post ID, number of posts, etc. It can be run on the latest N posts or on a given post ID.
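As a rough illustration of the preprocessing and counting steps described above, here is a minimal sketch that relies only on the standard library. The stoplist is a tiny illustrative one and naive_stem is a deliberately crude stand-in for a real stemmer; the function names are hypothetical and are not taken from the tool's code.

```python
import re
from collections import Counter

# Tiny illustrative stoplist (an assumption, not the tool's own list).
STOPWORDS = {"the", "a", "an", "and", "or", "is", "are", "to", "of", "in", "it", "this"}


def tokenize(text):
    """Lowercase the text and return its runs of letters as tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def naive_stem(token):
    """Strip a few common English suffixes (a crude stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def top_words(comments, n=20):
    """Return the n most frequent stems across all comments."""
    counts = Counter()
    for comment in comments:
        tokens = (t for t in tokenize(comment) if t not in STOPWORDS)
        counts.update(naive_stem(t) for t in tokens)
    return counts.most_common(n)
```

The list of (word, count) pairs returned by top_words is the kind of data a bar plot or word cloud would be drawn from.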

How to install

This tool was developed on Ubuntu 18.04 and macOS High Sierra, but has never been seriously tested. It requires Python 3 and virtualenv. With these two installed, simply clone the repo and run source install.sh

Requirements

A Facebook API token associated with an active app is essential (see the Facebook for Developers documentation). The file requirements.txt lists all the required Python packages and spaCy models.

Ubuntu 18.04

It should come with Python 3 installed, so just run source install.sh and Bob's your uncle.

macOS

Follow these steps:

Open a Terminal and run xcode-select --install
Log out and back in.
Get Homebrew here: copy and paste the command they provide into a terminal.

At this stage, if you get an error that says

git: error: unable to locate xcodebuild, please make sure the path to the Xcode folder is set correctly!
git: error: You can set the path to the Xcode folder using /usr/bin/xcode-select -switch

follow the advice given here and run the following in a terminal:

sudo xcode-select -switch /Library/Developer/CommandLineTools

Once Homebrew has been downloaded and installed, you can install Python 3 with:

brew install python

Once Python has been brewed (a.k.a. the Python installation has finished successfully), you should be able to run pip install virtualenv and finally source install.sh.

Windows

Sorry, I have no clue. I don't even care.

How to run

The file settings.conf contains a number of parameters, including the access token and the Facebook profile/page ID, that must be edited before the tool can run.
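The layout of settings.conf is not shown in this README, so the following is only a sketch of how such a file could be read and sanity-checked, assuming an INI-style format. The section name, the key names and the placeholder values are all hypothetical.

```python
import configparser

# Hypothetical contents: the section and key names are placeholders,
# not the tool's real ones.
EXAMPLE_CONF = """
[facebook]
access_token = YOUR_ACCESS_TOKEN
page_id = YOUR_PAGE_ID
n_posts = 5
"""


def load_settings(text):
    """Parse INI-style settings and refuse to continue with an unedited token."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["facebook"]
    token = section.get("access_token", fallback="")
    if not token or token.startswith("YOUR_"):
        raise ValueError("access_token has not been set")
    return {
        "access_token": token,
        "page_id": section.get("page_id"),
        "n_posts": section.getint("n_posts", fallback=1),
    }
```

Failing early on a placeholder token gives a clearer message than a rejected API call later on.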

Fancy word count

Two modes are available (remember to edit settings.conf first):

Single-post using post ID

source wc_by_id.sh settings.conf

Latest N posts

source wc_latest.sh settings.conf

Named-Entity Recognition using spaCy

It is also possible to run Named-Entity Recognition using the default spaCy models (supported languages: en, it). No word cloud is produced in this case.

Single-post using post ID

source ner_by_id.sh settings.conf

Latest N posts

source ner_latest.sh settings.conf
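For readers curious about what the entity counting might involve, here is a minimal sketch, not the tool's actual code. It assumes spaCy's small default pipelines (en_core_web_sm and it_core_news_sm) for the two supported languages; the function names are hypothetical, and nlp can be any callable that returns an object with an ents attribute, as a loaded spaCy pipeline does.

```python
from collections import Counter

# Assumed mapping from language code to a spaCy pipeline name; the models the
# tool actually installs may differ.
DEFAULT_MODELS = {"en": "en_core_web_sm", "it": "it_core_news_sm"}


def load_model(lang):
    """Load the spaCy pipeline for lang (spaCy and the model must be installed)."""
    import spacy  # imported lazily so the counting logic works without spaCy

    return spacy.load(DEFAULT_MODELS[lang])


def top_entities(nlp, comments, n=12):
    """Return the n most frequent (entity text, entity label) pairs."""
    counts = Counter()
    for comment in comments:
        doc = nlp(comment)
        counts.update((ent.text, ent.label_) for ent in doc.ents)
    return counts.most_common(n)
```

With spaCy installed, top_entities(load_model("en"), comments) would yield the pairs that a bar plot of the top entities could be built from.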

Considerations

The tool is designed to run until the conditions on the variables in settings.conf are met or, failing that, until the maximum request rate enforced by Facebook is reached.
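The stopping behaviour described above can be sketched as a paging loop. This is an illustration under assumptions rather than the tool's implementation: graph is expected to expose get_connections(id, connection_name, **params) as the facebook-sdk GraphAPI object does, while collect_comments, max_comments, page_size and api_error are hypothetical names.

```python
def collect_comments(graph, post_id, max_comments, page_size=100, api_error=Exception):
    """Page through a post's comments until enough are collected, the pages
    run out, or the API raises an error (for example a rate limit)."""
    comments, after = [], None
    while len(comments) < max_comments:
        params = {"limit": page_size}
        if after:
            params["after"] = after  # cursor returned by the previous page
        try:
            page = graph.get_connections(id=post_id, connection_name="comments", **params)
        except api_error:
            break  # keep whatever was collected before the error
        comments.extend(item.get("message", "") for item in page.get("data", []))
        paging = page.get("paging", {})
        after = paging.get("cursors", {}).get("after")
        if "next" not in paging or not after:
            break  # no further pages
    return comments[:max_comments]
```

With facebook-sdk, graph would typically be facebook.GraphAPI(access_token=...) and api_error could be narrowed to facebook.GraphAPIError so that unrelated exceptions are not swallowed.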

That's it!

Results

Here are images of the plots produced by running the tool on this post: https://www.facebook.com/GiveToTheNext/posts/477277113022512

Bar plot using the top 20 words

Word cloud with no stemming

Bar plot using the top 12 entities

Acknowledgements

Thanks to the people at spaCy for the NER part, to the people who produced facebook-sdk for the ease of access to the data, and finally to the guys who made word_cloud for the awesome word-cloud images it can produce.
