SQL-AIRFLOW

This is a simple ETL pipeline using Airflow. First, I fetch data from the Twitter API. Then I create a PostgreSQL database that stores this data in three tables and validate it (the transform step). Finally, I export the tables to CSV files stamped with the date of the conversion.
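The DAG wires these three steps together. Below is a minimal sketch of what such a DAG might look like; the task names, callables, and schedule are hypothetical illustrations, not the repo's actual DAG.py.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_tweets():
    # Fetch tweets from the Twitter API (e.g. via tweepy) and stage them.
    ...

def load_and_validate():
    # Create the three PostgreSQL tables, insert the staged tweets,
    # and validate the rows (the transform step).
    ...

def export_csv():
    # Dump each table to a CSV file stamped with the current date,
    # e.g. tweets_2023-01-31.csv.
    ...

with DAG(
    dag_id="twitter_etl",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_tweets)
    load = PythonOperator(task_id="load", python_callable=load_and_validate)
    export = PythonOperator(task_id="export", python_callable=export_csv)

    extract >> load >> export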

Prerequisites

  1. Set up PostgreSQL:
sudo apt-get update
sudo apt-get install postgresql postgresql-contrib

# Create a new user and a database owned by it
sudo -u postgres createuser --login --pwprompt user_name
sudo -u postgres createdb --owner=user_name database_name
  2. Create a .env file with your credentials (a quick connection check using these values is sketched after this list):
# Twitter API credentials
ACCESS_KEY=''
ACCESS_SECRET='' 
CONSUMER_KEY=''
CONSUMER_SECRET=''
# PostgreSQL connection
DB_NAME=database
DB_USER=user
DB_PASS=password
  3. Set up a virtual environment:
python3 -m venv etl
source etl/bin/activate
  4. Install dependencies:
pip install -r requirements.txt
  5. Set the Airflow home directory (local run):
export AIRFLOW_HOME=$(pwd)
mkdir -p $AIRFLOW_HOME/dags
cp DAG.py $AIRFLOW_HOME/dags
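Once steps 1 and 2 are done, a quick sanity check can confirm that the .env values reach PostgreSQL. This sketch assumes the python-dotenv and psycopg2 packages are available (an assumption about what requirements.txt provides):

import os

import psycopg2
from dotenv import load_dotenv

load_dotenv()  # reads DB_NAME, DB_USER, DB_PASS from the .env file

conn = psycopg2.connect(
    dbname=os.environ["DB_NAME"],
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASS"],
    host="localhost",
)
with conn, conn.cursor() as cur:
    cur.execute("SELECT version();")
    print(cur.fetchone()[0])  # prints the PostgreSQL server version
conn.close()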
