Convert the LocalExecutor to run tasks using new Task SDK supervisor code #44427

ashb · 2024-11-27T17:10:59Z

This also lays the groundwork for a more general-purpose "workload" execution
system, making a single interface for executors to run tasks and callbacks.

Also in this PR we set up the supervise function to send Task logs to a file,
and handle the task log template rendering in the scheduler before queueing
the workload.

Additionally, we don't pass the activity directly to supervise(); instead we
pass its properties/fields, to reduce the coupling between SDK and Executor.
(More separation will appear in PRs over the next few weeks.)

The big change of note here is that rather than sending an airflow command
line to execute (["airflow", "tasks", "run", ...]) and going back in via the
CLI parser we go directly to a special purpose function. Much simpler.

It doesn't remove any of the old behaviour (CeleryExecutor still uses
LocalTaskJob via the CLI parser etc.), nor does anything currently send
callback requests via this new workload mechanism.

The airflow.executors.workloads module currently needs to be shared between
the Scheduler (or more specifically the Executor) and the "worker" side of
things. In the future these will be separate python dists and this module will
need to live somewhere else.

Right now we check whether executor.queue_workload is different from the
BaseExecutor version (which currently just raises an error) to see which
executors support this new mechanism. That check will be removed as soon as all
the in-tree executors have been migrated.
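
As an illustration of the override check described above, here is a minimal sketch. It is not the code from this PR: the classes below are stand-ins defined only for the example, and `supports_workloads` is a hypothetical helper name. It relies on the fact that, in Python 3, looking up a method on a class returns the plain function object, so an identity comparison tells you whether a subclass overrode it.

```python
class BaseExecutor:
    """Stand-in for the real base class; only the method under test is modelled."""

    def queue_workload(self, workload):
        # The description says the base version just raises an error for now.
        raise ValueError(f"Unhandled workload {workload!r} in {type(self).__name__}")


class LegacyExecutor(BaseExecutor):
    """Hypothetical executor that has not been migrated (no override)."""


class MigratedExecutor(BaseExecutor):
    """Hypothetical executor that implements the new mechanism."""

    def __init__(self):
        self.queued = []

    def queue_workload(self, workload):
        self.queued.append(workload)


def supports_workloads(executor) -> bool:
    # Hypothetical helper: True when the executor's class overrides queue_workload.
    return type(executor).queue_workload is not BaseExecutor.queue_workload
```

A caller could then branch on `supports_workloads(executor)` to choose between the new workload path and the old command-line path, and the helper can be deleted once every executor overrides the method.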


airflow/executors/local_executor.py

task_sdk/src/airflow/sdk/execution_time/supervisor.py

task_sdk/src/airflow/sdk/execution_time/task_runner.py

kaxil

Minor comments/questions but directionally lgtm

airflow/executors/workloads.py

task_sdk/src/airflow/sdk/execution_time/supervisor.py

jscheffl

I do not fully understand the code but maybe I need a night of sleep. Just some comments not blocking.

I was scratching my head regarding the general supervisor approach - Do we have platform limitations by this? I assume no, all major/general operating systems have the socket mechanisms, also Windows, correct? Do we see limitations for running outside Linux?

ashb · 2024-11-27T21:31:10Z

Yes, Windows supports socketpair etc. (that's why I chose that API). This supervisor approach is what we do today with LocalTaskJob and the StandardTaskRunner etc.; this is just a simplified re-implementation of it.

To fully support Windows we will need something other than an os.fork-based approach; the rest should be possible. Windows supports keeping "inheritable" sockets open to launched processes, so it'll be slower because of launching a whole new Python interpreter, but again, this is the current behaviour of StandardTaskRunner on Windows.
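
For readers unfamiliar with the pattern mentioned here, the following is a minimal sketch of a parent process supervising a forked child over a `socket.socketpair()`. It is not the supervisor code from this PR, which does considerably more; `run_supervised` is a hypothetical name, and the example is POSIX-only because it uses `os.fork`, which is the part the comment says would need replacing on Windows.

```python
import os
import socket


def run_supervised(task):
    """Fork a child that runs task() and reports str(result) over a socketpair.

    Returns (child_output, exit_code) as seen by the parent.
    """
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: keep only our end of the pair, send the result, then exit
        # without running the parent's cleanup handlers.
        exit_code = 1
        try:
            parent_sock.close()
            child_sock.sendall(str(task()).encode())
            child_sock.close()
            exit_code = 0
        finally:
            os._exit(exit_code)

    # Parent: close the child's end so recv() sees EOF once the child is done.
    child_sock.close()
    chunks = []
    while True:
        data = parent_sock.recv(4096)
        if not data:
            break
        chunks.append(data)
    parent_sock.close()
    _, status = os.waitpid(pid, 0)
    return b"".join(chunks).decode(), os.waitstatus_to_exitcode(status)
```

Calling `run_supervised(lambda: 42)` returns the child's output and exit code to the parent. On a platform without `os.fork`, the child would instead have to be launched as a new interpreter that inherits the socket, which is the slower path the comment describes.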


ashb requested review from XD-DENG, o-nikolas and pierrejeambrun as code owners November 27, 2024 17:11

ashb added the area:task-execution-interface-aip72 (AIP-72: Task Execution Interface (TEI) aka Task SDK) label Nov 27, 2024

ashb requested a review from hussein-awala as a code owner November 27, 2024 17:11

boring-cyborg bot added the area:Executors-core (LocalExecutor & SequentialExecutor), area:Scheduler (including HA (high availability) scheduler) and area:task-sdk labels Nov 27, 2024

ashb requested review from kaxil, amoghrajesh and jscheffl November 27, 2024 17:18

kaxil reviewed Nov 27, 2024

airflow/executors/local_executor.py (outdated, resolved)

kaxil reviewed Nov 27, 2024

task_sdk/src/airflow/sdk/execution_time/supervisor.py (resolved)

kaxil reviewed Nov 27, 2024

task_sdk/src/airflow/sdk/execution_time/supervisor.py (resolved)

kaxil reviewed Nov 27, 2024

task_sdk/src/airflow/sdk/execution_time/task_runner.py (outdated, resolved)

ashb force-pushed the localexecutor-uses-task-sdk branch from c9a0845 to 17f1a5e Compare November 27, 2024 17:36

kaxil approved these changes Nov 27, 2024


jscheffl reviewed Nov 27, 2024

airflow/executors/workloads.py (outdated, resolved)

jscheffl reviewed Nov 27, 2024

airflow/executors/workloads.py (resolved)

jscheffl reviewed Nov 27, 2024

airflow/executors/workloads.py (outdated, resolved)

jscheffl reviewed Nov 27, 2024

task_sdk/src/airflow/sdk/execution_time/supervisor.py (outdated, resolved)

jscheffl approved these changes Nov 27, 2024


ashb force-pushed the localexecutor-uses-task-sdk branch from 17f1a5e to c660a89 Compare November 27, 2024 21:55

ashb force-pushed the localexecutor-uses-task-sdk branch from c660a89 to 5a7aca4 Compare November 28, 2024 10:55

ashb merged commit 14919fa into main Nov 28, 2024
49 of 50 checks passed

ashb deleted the localexecutor-uses-task-sdk branch November 28, 2024 12:45

ashb mentioned this pull request Nov 28, 2024

Swap internal RPC server for API server in the helm chart #44463

Merged

This was referenced Jan 6, 2025

Change Celery executor to accept new "Activity" payload format and update it to run tasks via task SDK supervisor instead of LocalTaskJob #45426

Closed

Convert the KubernetesExecutor to run tasks using new Task SDK supervisor code #45427

Open
