This project is a data-driven solution that uses Twitter data to analyse the performance of the Brazilian presidential candidates' profiles.
Note: This dashboard has been discontinued since Oct 30th, 2022 (full last screenshot).
Resource | Function | Description |
---|---|---|
EventBridge | Trigger the ETL | Build event-driven applications at scale across AWS, existing systems, or SaaS apps |
ECR | Store the container with the ETL | Easily store, share, and deploy your container software anywhere |
Lambda | Run the ECR with the ETL | Run code without thinking about servers or clusters |
RDS | Operate MySQL Database | Set up, operate, and scale a relational database in the cloud with just a few clicks |
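
To make the flow in the table concrete, here is a minimal sketch of what a Lambda entry point triggered by an EventBridge schedule could look like. The names `handler` and `run_etl` are hypothetical and are not taken from this repository; `run_etl` is only a placeholder for the ETL steps, and the container image's command would need to point at the handler function.

```python
import json
import logging

logger = logging.getLogger(__name__)


def run_etl():
    """Hypothetical placeholder for the extract, transform and load steps."""
    return {"rows_loaded": 0}


def handler(event, context):
    """Entry point that Lambda invokes when the EventBridge rule fires."""
    # Scheduled EventBridge events carry the trigger time in the "time" field.
    logger.info("ETL triggered at %s", event.get("time"))
    result = run_etl()
    # Return a small JSON summary so each run is easy to inspect in the logs.
    return {"statusCode": 200, "body": json.dumps(result)}
```

Keeping the handler this thin means the ETL logic can be run and tested locally without any AWS dependency.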
- Extraction: Tweepy is a Python package that makes it easier to access the Twitter API. The functions used here are:
  - `get_recent_tweets_count`: gets the number of Tweets that mention the query words.
  - `get_user`: gets information about the user, e.g. followers, posts, and screen name.
- Transformation: Pandas was the main package used to transform and manipulate data, mainly to convert the data returned by the Twitter API into pandas DataFrames.
- Loading: SQLAlchemy was used to create the connection (engine) between the Python code and the MySQL database. This connection can be combined with the pandas loading function (for example, `DataFrame.to_sql`). The data is stored in:
  - Profile Mentions Table: Stores the number of total mentions, the number of mentions without retweets, and the respective date and time. It is an hourly table.
  - Profile Info Table: Stores some important information about the users and the respective date. It is a daily table.
  - Last Updated View: Stores the date and time of the last ETL run.
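
The three ETL steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the query strings, the choice of the latest hourly bucket, the column and table names (`profile_mentions`, `profile_info`), the handle, and the environment variable names are all hypothetical. Only `get_recent_tweets_count`, `get_user`, `create_engine`, and `DataFrame.to_sql` are real library calls.

```python
from datetime import datetime, timezone

import pandas as pd
from sqlalchemy import create_engine


def extract_mentions(client, handle):
    """Return mention counts for the latest hourly bucket of a profile."""
    # `client` is assumed to be a tweepy.Client; the query strings are illustrative.
    total = client.get_recent_tweets_count(f"@{handle}", granularity="hour")
    no_rt = client.get_recent_tweets_count(f"@{handle} -is:retweet", granularity="hour")
    return {
        "profile": handle,
        "mentions": total.data[-1]["tweet_count"],
        "mentions_no_rt": no_rt.data[-1]["tweet_count"],
    }


def extract_profile(client, handle):
    """Return basic public information about a profile."""
    user = client.get_user(username=handle, user_fields=["public_metrics"]).data
    return {
        "profile": user.username,
        "followers": user.public_metrics["followers_count"],
        "posts": user.public_metrics["tweet_count"],
    }


def transform(rows, now=None):
    """Turn a list of extracted dicts into a timestamped DataFrame."""
    df = pd.DataFrame(rows)
    df["extracted_at"] = now or datetime.now(timezone.utc)
    return df


def load(df, table, engine):
    """Append the DataFrame to a database table through a SQLAlchemy engine."""
    df.to_sql(table, con=engine, if_exists="append", index=False)


def main():
    """Wire the steps together; needs real credentials, so it is not called here."""
    import os

    import tweepy  # imported lazily so the helpers above work without it

    client = tweepy.Client(bearer_token=os.environ["TWITTER_BEARER_TOKEN"])
    # Hypothetical variable, e.g. mysql+pymysql://user:password@host/dbname
    engine = create_engine(os.environ["DATABASE_URL"])
    handles = ["example_handle"]  # placeholder, not a real candidate profile
    load(transform([extract_mentions(client, h) for h in handles]), "profile_mentions", engine)
    load(transform([extract_profile(client, h) for h in handles]), "profile_info", engine)
```

Passing the client and the engine in as arguments keeps each step independent, so the transformation and loading can be tried against a local SQLite database before pointing the engine at MySQL.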
- Connecting the app to the MySQL database: Remote databases are an excellent way to keep a Shiny app up to date. The `pool` package helps establish and manage connections to remote storage. Some sensitive information is needed to build these bridges between the app and the storage. That is where the `dotenv` package helps: it lets the developer keep credentials in a `.env` file, upload it to the hosting service, and access them easily.
- Leveraging the power of `purrr`: When building an app UI, one can use HTML tags inside the `R` code. Just like some `ggplot2` layers, these tags are stored in lists. This means that `purrr` can be used to build such structures, especially when they are repetitive.
- Interactive dataviz: `ggiraph` is a `ggplot2`-friendly package for building interactive plots. It helps create plots that do not overwhelm users with data. Hover events and tooltips help the user focus on particular aspects of a plot.
- Tweepy functions: https://dev.to/twitterdev/a-comprehensive-guide-for-using-the-twitter-api-v2-using-tweepy-in-python-15d9
- Tweepy hands-on: https://youtu.be/q8q3OFFfY6c
- Docker + Lambda: https://youtu.be/2VtuNOEw8S4
Give a ⭐️ if you like this project!
React 👍 to our LinkedIn post!
Interact ❤️ with our Twitter post!