What are people talking about in my post(s)?
A simple tool to run a very basic topic analysis on Facebook posts. Additionally, it uses spaCy's default models to extract named entities from comments. It is essentially a word counter that applies standard NLP preprocessing, plus a NER step performed by spaCy.
It gets the data for the given posts by calling
the Facebook Graph API.
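As a rough illustration of that call, the sketch below builds the Graph API URL for the comments of a post using only the standard library. The function name `comments_url`, the API version string, the `limit` value, and the placeholder id and token are assumptions made for illustration; the tool itself goes through facebook-sdk, and its actual request parameters may differ.

```python
from urllib.parse import urlencode

GRAPH_URL = "https://graph.facebook.com"


def comments_url(post_id, access_token, version="v3.1", limit=100):
    """Build the Graph API URL for the comments edge of a post.

    The default version and limit are illustrative assumptions.
    """
    query = urlencode({"access_token": access_token, "limit": limit})
    return f"{GRAPH_URL}/{version}/{post_id}/comments?{query}"


if __name__ == "__main__":
    # Placeholder values: replace with a real post id and token.
    print(comments_url("PAGEID_POSTID", "YOUR_ACCESS_TOKEN"))
```

The resulting URL can be fetched with any HTTP client; the response is JSON, with the comments listed under a `data` key.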
It performs text preprocessing
(tokenization, stopword filtering, stemming) and makes two plots
of the N most important words:
a word cloud, using the awesome word_cloud library,
and a bar plot, using seaborn.
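The preprocessing and counting steps can be sketched roughly as follows. This is not the tool's actual code: the stopword list is a tiny illustrative one, `naive_stem` is a crude stand-in for a real stemmer, and the function names are made up for this example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (assumption); real lists are much longer.
STOPWORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in", "it", "this", "i"}


def naive_stem(word):
    """Strip a few common English suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def top_words(comments, n=10):
    """Return the n most frequent stems across all comments."""
    counts = Counter()
    for comment in comments:
        tokens = re.findall(r"[a-z']+", comment.lower())  # tokenization
        kept = [t for t in tokens if t not in STOPWORDS]  # stopword filtering
        counts.update(naive_stem(t) for t in kept)        # stemming + counting
    return counts.most_common(n)


if __name__ == "__main__":
    print(top_words(["The cats are sleeping", "A cat sleeps", "cats!"], n=2))
```

The list of `(word, count)` pairs returned by `top_words` is the kind of input that both a word cloud and a bar plot can be drawn from.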
The tool can be configured by page id, single post id, number of
posts, etc.
It can be run on the last n posts, or on a given post id.
This tool has been developed on Ubuntu 18.04 and macOS High Sierra, but
has never been seriously tested.
It requires Python3+ and virtualenv.
With these two installed, simply clone the repo
and run source install.sh
A Facebook API token associated with an active app is an essential requirement.
(See the Facebook for Developers documentation.)
The file requirements.txt
contains all the required Python packages and spaCy models.
Ubuntu should come with Python3+ installed, so just run
source install.sh
and Bob's your uncle.
On macOS, follow these steps:
- Open a Terminal and run
xcode-select --install
- log out and back in
- get Homebrew here: copy/paste the link they provide in a terminal.
At this stage, if you get an error that says
git: error: unable to locate xcodebuild, please make sure the path to the Xcode folder is set correctly!
git: error: You can set the path to the Xcode folder using /usr/bin/xcode-select -switch
follow what's been said here, and run the following in a terminal:
sudo xcode-select -switch /Library/Developer/CommandLineTools
Once Homebrew has been downloaded and installed, you can install Python3 with:
brew install python
Once Python has been brewed
(a.k.a. the Python installation finished successfully),
you should be able to run pip install virtualenv
and finally source install.sh.
Sorry, I have no clue. I don't even care.
The file settings.conf
contains a number of parameters,
including the access token and the Facebook profile/page id,
that have to be edited in order for the tool to run.
Two modes are available, by post id or on the latest posts (remember to edit settings.conf first):
source wc_by_id.sh settings.conf
source wc_latest.sh settings.conf
Additionally, it is possible to run Named-Entity Recognition using default spaCy models (supported: en, it). No Word Cloud will be produced in this case.
source ner_by_id.sh settings.conf
source ner_latest.sh settings.conf
The tool is designed to run until the conditions on the variables
in settings.conf
are met or, should this not happen,
until the maximum request rate that Facebook applies is reached.
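That stopping behaviour can be sketched as a simple loop. Everything named here is hypothetical: `fetch_page` stands for whatever function requests one page of results, `RateLimitReached` stands for the error raised when Facebook refuses further requests, and `max_posts` stands for one of the conditions set in settings.conf. It is only meant to show the idea, not the tool's implementation.

```python
class RateLimitReached(Exception):
    """Hypothetical error raised when the API refuses further requests."""


def collect_posts(fetch_page, max_posts):
    """Collect posts page by page until max_posts is reached,
    the data runs out, or the request rate limit is hit.

    fetch_page(cursor) is a hypothetical callable returning
    (list_of_posts, next_cursor); next_cursor is None on the last page.
    """
    posts, cursor = [], None
    while len(posts) < max_posts:
        try:
            page, cursor = fetch_page(cursor)
        except RateLimitReached:
            break  # keep what has been collected so far
        posts.extend(page)
        if cursor is None:
            break  # no more pages
    return posts[:max_posts]
```

Breaking out of the loop on the rate-limit error, rather than re-raising it, means a partial result can still be analysed.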
That's it!
Here are two images of the plots produced by running the tool on this post: https://www.facebook.com/GiveToTheNext/posts/477277113022512
Thanks to the people at spaCy for the NE part, to the people who produced facebook-sdk for the ease of access to the data, and finally to the guys who made word_cloud for the awesome word-cloud images that can be produced.