Schedule your DAGs (Directed Acyclic Graphs) solely on a Kubernetes cluster, using the Kubernetes scheduler and its flexible yet powerful API to run your workflows.
Write DAGs in Airflow (or another workflow manager: Argo, Kubeflow, etc.).
Kubedag will read them and generate k8s CronJobs for the root tasks.
k8s will do the scheduling; kubedag will trigger dependent tasks or wait if needed.
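The core idea can be sketched in a few lines of plain Python. This is an illustrative model only, not the actual kubedag code: root tasks (no upstream dependencies) become k8s CronJobs, and any other task becomes runnable once all of its upstream tasks have completed. The function names here are invented for the sketch.

```python
def root_tasks(deps):
    """deps maps task -> set of upstream tasks; roots have no upstreams.
    These are the tasks kubedag would turn into k8s CronJobs."""
    return {task for task, upstream in deps.items() if not upstream}

def runnable_tasks(deps, completed):
    """Tasks not yet run whose upstream tasks have all completed."""
    return {
        task
        for task, upstream in deps.items()
        if task not in completed and upstream <= completed
    }

# The A->C, B->C example from the demo below:
deps = {"A": set(), "B": set(), "C": {"A", "B"}}
print(root_tasks(deps))                  # A and B become CronJobs
print(runnable_tasks(deps, {"A"}))       # B still pending; C waits for B
print(runnable_tasks(deps, {"A", "B"}))  # now C can be triggered
```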
Very much a work in progress: ugly code, still PoC-ing.
The commands below assume `k` is aliased to `kubectl`.
- minikube start
- k run --image postgres --env="POSTGRES_USER=airflow" --env="POSTGRES_PASSWORD=airflow" --env="POSTGRES_DB=airflow" --port 5432 airflow-db
- k apply -f postgres-service.yaml
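The contents of `postgres-service.yaml` are not shown in this README; an illustrative guess at a minimal Service exposing the `airflow-db` pod might look like the fragment below (the real file in the repo may differ). The `run: airflow-db` selector matches the label that `kubectl run` puts on the pod.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: airflow-db
spec:
  selector:
    run: airflow-db   # label added by `kubectl run`
  ports:
    - port: 5432
      targetPort: 5432
```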
- k run --image apache/airflow:2.0.0 --env="AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@airflow-db/airflow" --restart=Never airflow-ctl -- airflow db init
- sh run.sh
- meet dependencies (A->C, B->C)
- job completed - small enhancements needed
- UI (logs?)
- from k8s stdout logs (write structlog)
- or propagate them somewhere
- make a CLI with click
- separate the example from the lib; right now they are merged into one repo
- cleanup
- dind (Docker-in-Docker)? needed?
- execution date? how to set it up?
- testing / sandbox
- k8s resources already exist when redeployed? clean things up
- airflow db? do we need it?
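On the open execution-date question above: one option is to follow Airflow's convention, where a run's logical execution_date is the start of the data interval that just closed, i.e. the CronJob's fire time minus one schedule interval. A stdlib-only sketch (kubedag's actual behavior is undecided; the function name is hypothetical):

```python
from datetime import datetime, timedelta

def execution_date(scheduled_at: datetime, interval: timedelta) -> datetime:
    """Airflow-style logical date: the start of the just-closed interval."""
    return scheduled_at - interval

# A daily CronJob firing at midnight on Jan 2 covers the Jan 1 data interval:
run = execution_date(datetime(2021, 1, 2), timedelta(days=1))
print(run)  # 2021-01-01 00:00:00
```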
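For the planned CLI: the TODO above names click, but the same command shape can be sketched dependency-free with stdlib argparse (the subcommands and flags here are invented for illustration, not a real kubedag interface):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical command layout; the real kubedag CLI (planned with
    # click) may look entirely different.
    parser = argparse.ArgumentParser(prog="kubedag")
    sub = parser.add_subparsers(dest="command", required=True)

    deploy = sub.add_parser("deploy", help="read DAGs and create k8s CronJobs")
    deploy.add_argument("--dags-folder", default="dags/")

    sub.add_parser("cleanup", help="delete generated k8s resources")
    return parser

args = build_parser().parse_args(["deploy", "--dags-folder", "examples/"])
print(args.command, args.dags_folder)
```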