
## Information


Install the required dependencies:

```bash
pip install -r requirements.txt
```

## Steps

`write_csv_to_postgres.py` -> downloads a CSV file from a URL and saves it to the local working directory as `churn_modelling.csv`. It then reads the CSV file and writes it to a local PostgreSQL table.
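As a minimal sketch of this step, assuming the script uses pandas and SQLAlchemy (the source URL, credentials, and table name below are placeholders, not the repo's actual values):

```python
import pandas as pd
import requests
from sqlalchemy import create_engine

CSV_URL = "https://example.com/churn_modelling.csv"  # placeholder URL
LOCAL_PATH = "churn_modelling.csv"

# Download the CSV and save it to the local working directory
response = requests.get(CSV_URL, timeout=30)
response.raise_for_status()
with open(LOCAL_PATH, "wb") as f:
    f.write(response.content)

# Read it back and write it to a local PostgreSQL table
df = pd.read_csv(LOCAL_PATH)
engine = create_engine("postgresql://user:password@localhost:5432/churn_db")  # hypothetical credentials
df.to_sql("churn_modelling", engine, if_exists="replace", index=False)
```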

`create_df_and_modify.py` -> reads the same PostgreSQL table into a pandas dataframe and modifies it. Then it creates 3 separate dataframes.
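For illustration only (the concrete modifications and split logic live in the repo; the `Geography` column and its values are an assumption), this step might look like:

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:password@localhost:5432/churn_db")  # hypothetical credentials

# Load the table written by the previous step
df = pd.read_sql("SELECT * FROM churn_modelling", engine)

# Example modification -- the real script's transformations may differ
df = df.dropna()

# Split into 3 separate dataframes on a hypothetical categorical column
df_france = df[df["Geography"] == "France"]
df_germany = df[df["Geography"] == "Germany"]
df_spain = df[df["Geography"] == "Spain"]
```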

`write_df_to_postgres.py` -> writes these 3 dataframes to 3 separate tables in the PostgreSQL server.
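Continuing the sketch above (it reuses `df_france`, `df_germany`, and `df_spain` from the previous snippet; the target table names are placeholders):

```python
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:password@localhost:5432/churn_db")  # hypothetical credentials

# Write each dataframe to its own PostgreSQL table
for table_name, frame in [
    ("churn_france", df_france),
    ("churn_germany", df_germany),
    ("churn_spain", df_spain),
]:
    frame.to_sql(table_name, engine, if_exists="replace", index=False)
```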

## Apache Airflow

Run the following command to clone the necessary repo to your local machine:

```bash
git clone https://github.com/dogukannulu/docker-airflow.git
```

After cloning the repo, run the following command only once:

```bash
docker build --rm --build-arg AIRFLOW_DEPS="datadog,dask" --build-arg PYTHON_DEPS="flask_oauthlib>=0.9" -t puckel/docker-airflow .
```

Then run the following command:

```bash
docker-compose -f docker-compose-LocalExecutor.yml up -d
```

Now you have a running Airflow container and you can reach the web UI at http://localhost:8080. If you get a `No module named ...` error, you can access the container's bash shell with the following command:

```bash
docker exec -it <container_id> /bin/bash
```

Then:

```bash
pip install <necessary libraries>
```

After all these steps, we can move all the `.py` files into the `dags` folder of the `docker-airflow` repo.
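For orientation, here is a minimal DAG sketch that would schedule the three scripts in order. The DAG id, schedule, and the assumption that each script exposes a `main()` function are mine, not necessarily the repo's; the old-style `python_operator` import matches the Airflow 1.10 line shipped in the puckel image:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator  # Airflow 1.10-style import

import write_csv_to_postgres
import create_df_and_modify
import write_df_to_postgres

default_args = {"owner": "airflow", "start_date": datetime(2023, 1, 1)}

with DAG("churn_pipeline", default_args=default_args,
         schedule_interval="@daily", catchup=False) as dag:
    # One task per script, assuming each exposes a main() entry point
    extract = PythonOperator(task_id="write_csv_to_postgres",
                             python_callable=write_csv_to_postgres.main)
    transform = PythonOperator(task_id="create_df_and_modify",
                               python_callable=create_df_and_modify.main)
    load = PythonOperator(task_id="write_df_to_postgres",
                          python_callable=write_df_to_postgres.main)

    extract >> transform >> load
```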

## PostgreSQL

I am using a local PostgreSQL server and installed pgAdmin to manage the tables instead of using psql. I defined the necessary credentials for accessing the PostgreSQL server as environment variables in `.zshrc` (on macOS).
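A sketch of how the scripts might pick those credentials up at runtime (the variable names are hypothetical; the repo may use different ones):

```python
import os

from sqlalchemy import create_engine

# Assumes exports along these lines in ~/.zshrc (names are hypothetical):
#   export POSTGRES_USER=...
#   export POSTGRES_PASSWORD=...
#   export POSTGRES_HOST=localhost
#   export POSTGRES_PORT=5432
#   export POSTGRES_DB=churn_db

user = os.getenv("POSTGRES_USER")
password = os.getenv("POSTGRES_PASSWORD")
host = os.getenv("POSTGRES_HOST", "localhost")
port = os.getenv("POSTGRES_PORT", "5432")
db = os.getenv("POSTGRES_DB")

engine = create_engine(f"postgresql://{user}:{password}@{host}:{port}/{db}")
```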