Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Setup/Teardown in nested Task Groups creating unwanted dependencies with downstream tasks #36345

Closed
1 of 2 tasks
ardubev16 opened this issue Dec 21, 2023 · 3 comments · Fixed by #36456
Closed
1 of 2 tasks
Assignees
Labels
area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet

Comments

@ardubev16
Copy link

ardubev16 commented Dec 21, 2023

Apache Airflow version

2.8.0

If "Other Airflow 2 version" selected, which one?

No response

What happened?

The code below, in the "How to reproduce" section, produces the following DAG, I think there is an extra connection rendered between test_task and outer_end.
Screenshot from 2023-12-20 16-09-22

What you think should happen instead?

I would expect that middle_end is executed after test_group and, that outer_end is executed after test_group2. Since inner_end is a teardown task it is not considered a leaf of test_group, thus the only leaf of test_group is test_task. So, I would expect only the connection between test_task and middle_end, and not the connection between test_task and outer_end.

How to reproduce

Ddecorate the following function to become a DAG and call it to create the DAG:

def nesting_groups_test():
    @task
    def test_task():
        print("Hello world!")

    @task_group
    def test_group():
        inner_start = EmptyOperator(task_id="inner_start")
        inner_end = EmptyOperator(task_id="inner_end")

        test_task_r = test_task()
        inner_start >> test_task_r >> inner_end.as_teardown(setups=inner_start)

    @task_group
    def test_group2():
        middle_end = EmptyOperator(task_id="middle_end")
        test_group_r = test_group()
        test_group_r >> middle_end

    outer_start = EmptyOperator(task_id="outer_start")
    outer_end = EmptyOperator(task_id="outer_end")
    test_group2_r = test_group2()
    outer_start >> test_group2_r >> outer_end

Operating System

Ubuntu 22.04.3 LTS

Versions of Apache Airflow Providers

No response

Deployment

Docker-Compose

Deployment details

No response

Anything else?

This seems to be just a rendering issue since I've done some simple tests and the tasks have been executed in order, but I'm not 100% sure.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@ardubev16 ardubev16 added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Dec 21, 2023
Copy link

boring-cyborg bot commented Dec 21, 2023

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

@potiuk
Copy link
Member

potiuk commented Dec 21, 2023

CC: @dstandish - I might take a close look (I want to learn more about the setup/teardown implementation) - possibly after holidays but If you happen to be around and know by heart that could be also helpful :)

@dstandish
Copy link
Contributor

Looks like bug, thanks.

@dstandish dstandish self-assigned this Dec 27, 2023
dstandish added a commit to astronomer/airflow that referenced this issue Dec 27, 2023
When arrowing `group` >> `task`, the "leaves" of `group` are connected to `task`. When calculating leaves in the group, teardown tasks are ignored, and we recurse upstream to find non-teardowns.

What was happening, and what this fixes, is you might recurse to a work task that already has another non-teardown downstream in the group.  In that case you should ignore the work task (because it already has a non-teardown descendent).

Resolves apache#36345
dstandish added a commit that referenced this issue Dec 27, 2023
When arrowing `group` >> `task`, the "leaves" of `group` are connected to `task`. When calculating leaves in the group, teardown tasks are ignored, and we recurse upstream to find non-teardowns.

What was happening, and what this fixes, is you might recurse to a work task that already has another non-teardown downstream in the group.  In that case you should ignore the work task (because it already has a non-teardown descendent).

Resolves #36345
ephraimbuddy pushed a commit that referenced this issue Jan 11, 2024
When arrowing `group` >> `task`, the "leaves" of `group` are connected to `task`. When calculating leaves in the group, teardown tasks are ignored, and we recurse upstream to find non-teardowns.

What was happening, and what this fixes, is you might recurse to a work task that already has another non-teardown downstream in the group.  In that case you should ignore the work task (because it already has a non-teardown descendent).

Resolves #36345

(cherry picked from commit 949fc57)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants