This is an Airflow 2 template repo. It contains an out-of-the-box Airflow 2 instance that
- uses a decoupled remote Git repo as the DAG folder
- automatically syncs it at a given interval using a cron script
- is containerized and ready to be scaled horizontally

These features were not natively supported by Airflow (v2.1.0) at the time this solution was designed.
If you are using a commercially managed Airflow solution such as AWS MWAA, you should be able to store your DAGs in an S3 bucket, in which case you probably do not need this.
I have not tested this. I just migrated it from one of my existing projects and abstracted it into a template.
I am not an expert at OS-level configuration, so some of the strategies used here may not be optimal (mostly hackish) or secure. Please use at your own discretion, or better, submit PRs / patches and share the knowledge with me!
Prepare another Git repo to store your DAGs. It is recommended to have a separate branch for each Airflow / data pipeline environment, so a possible Git model could be:
- the `master`/`main` branch collects changes and PRs and is deployed to the staging environment
- the `production` branch is used for the actual Production Airflow
- local testing automatically uses your personal development branch
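For example, setting up such branches in your DAG repo could look like the sketch below (the branch names follow the model above; `my-dev-branch` is just a hypothetical personal branch name):

```sh
# Inside your DAG repo: create the environment branches described above.
git checkout -b production        # branch deployed to the actual Production Airflow
git push -u origin production
git checkout -b my-dev-branch     # hypothetical personal branch used for local testing
```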
This repo will be installed into the image via pip. You can either prepare it as a package, or simply do a `pip install git+ssh ...` against a private or public repo.
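A sketch of the latter, using pip's standard VCS-install syntax (the organisation, repo name and branch below are placeholders, not values from this template):

```sh
# Placeholders only: swap in your own organisation, repo and branch.
# The @production suffix pins the branch pip installs from.
pip install "git+ssh://git@github.com/<your-org>/<your-dag-repo>.git@production"
```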
One important thing when developing your DAGs is being able to visualise the Airflow tasks while you build them, so the strategy used here is to have your DAG repo and this Airflow scheduler repo reside in the same local environment.
The scheduler repo reads the DAG repo from the configured path and mounts that directory as a volume into the container, so every change you make on your local machine is instantly reflected by the Airflow instance hosted inside the container.
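The idea behind that bind mount, sketched with a plain `docker run` for illustration (this template actually wires it up through `docker/docker-compose.yml`; the `../my-dag-repo/dags` and `/opt/airflow/dags` paths are assumptions, adjust them to your layout):

```sh
# Illustration only -- the template configures this as a volume in docker/docker-compose.yml.
docker run --rm \
  -v "$(pwd)/../my-dag-repo/dags:/opt/airflow/dags" \
  apache/airflow:2.1.0 \
  airflow dags list    # edits made on the host are picked up immediately
```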
To make this happen you will need to modify the following files; all fields requiring modification are enclosed in `<>` signs.
- modify `.ssh/id_rsa` to include your deployment key if your plugin repo and DAG repo are private
- modify `docker/docker-compose.yml` to use the DAG folder name of your repo; this ensures that in local dev mode we are using the local DAG folder mounted into the container
- modify `.script/refresh_dags.sh` to use the DAG folder name of your repo (a sketch of such a refresh script is shown after this list)
- modify `.env` to include Environment Variables such as AWS credentials in your container environment
- modify `Dockerfile` to include your git clone repo link, repo name and your PRODUCTION branch; this will be overwritten in local dev mode by the `docker-compose.yml` file. In production, depending on your container solution, remember to remove the `AIRFLOW__CORE__DAGS_FOLDER` overwrite to allow Airflow to read the DAG folder from the path configured in `airflow.cfg`
- modify `requirements.txt` to include your customized plugins as a third-party library, and other libraries if needed
- modify `airflow.cfg` to include your DAG repo name (and other configurations if you need to)
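The cron-driven refresh mentioned above boils down to pulling the DAG repo inside the container on a schedule. A minimal sketch of what such a refresh script could look like, assuming the DAG repo is cloned under `/opt/airflow/` and the branch to track comes from an environment variable (`DAG_REPO_DIR` and `DAG_BRANCH` are hypothetical names, not necessarily what this template ships):

```sh
#!/usr/bin/env bash
# Sketch of a cron-driven DAG refresh; align the variable names and paths with your setup.
set -euo pipefail

DAG_REPO_DIR="${DAG_REPO_DIR:-/opt/airflow/<your-dag-repo>}"
DAG_BRANCH="${DAG_BRANCH:-production}"

cd "$DAG_REPO_DIR"
git fetch origin "$DAG_BRANCH"
git reset --hard "origin/$DAG_BRANCH"   # mirror the remote branch, discarding local drift

# Example crontab entry (the interval is an assumption):
# */5 * * * * /opt/airflow/.script/refresh_dags.sh
```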
For a deployed environment we will not be using any mounted volume; instead, Environment Variables will be used to overwrite the existing configuration.
Most importantly, you will need to make sure all required configuration is supplied through those Environment Variables. This includes secrets such as `AIRFLOW__CELERY__RESULT_BACKEND`, `AIRFLOW__CORE__FERNET_KEY` and `AIRFLOW__CORE__SQL_ALCHEMY_CONN`.
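As an illustration of those overrides (the connection strings below are placeholders, and the `AIRFLOW__<SECTION>__<KEY>` naming is Airflow's standard environment-variable override convention):

```sh
# Placeholder values only -- inject real secrets via your secret manager / CI variables,
# never commit them to the repo.
export AIRFLOW__CORE__SQL_ALCHEMY_CONN="postgresql+psycopg2://user:pass@db-host:5432/airflow"
export AIRFLOW__CELERY__RESULT_BACKEND="db+postgresql://user:pass@db-host:5432/airflow"
export AIRFLOW__CORE__FERNET_KEY="<generate one, e.g. with cryptography.fernet.Fernet.generate_key()>"
```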
- run `make build` to build the image
- run `make up` to start the service
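Put together, a first local run might look like this (the `make` targets are the ones listed above; the port is an assumption based on Airflow's default webserver port, so check `docker/docker-compose.yml` for the actual mapping):

```sh
make build     # build the Airflow image
make up        # start the Airflow services via docker-compose
# then open the webserver UI, by default on port 8080:
# http://localhost:8080
```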
- Make the environment configurable by an Environment Variable