Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Manually marking dag run state will change end date #31555

Closed
1 of 2 tasks
bbovenzi opened this issue May 25, 2023 · 3 comments · Fixed by #36144
Closed
1 of 2 tasks

Manually marking dag run state will change end date #31555

bbovenzi opened this issue May 25, 2023 · 3 comments · Fixed by #36144
Labels
affected_version:main_branch Issues Reported for main branch area:core kind:bug This is a clearly a bug

Comments

@bbovenzi
Copy link
Contributor

Apache Airflow version

main (development)

What happened

May-25-2023 14-22-49

When manually marking a dag run as success or failure, the end date of the run will also change. This is most noticable in the grid view since the duration will immediately change.

This has been the case for a while.

What you think should happen instead

We probably shouldn't alter the end_date.

How to reproduce

Manually mark a dag run, especially an old one, and see the dag run duration bar charts change dramatically with one outlier, ruining the value of the rest of the chart.

Operating System

any

Versions of Apache Airflow Providers

No response

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@bbovenzi bbovenzi added kind:bug This is a clearly a bug area:core needs-triage label for new issues that we didn't triage yet and removed needs-triage label for new issues that we didn't triage yet labels May 25, 2023
@tirkarthi
Copy link
Contributor

While setting task state the task's existing end_date is checked. Meanwhile for dagrun it seems the check is not done and it seems intentional and given the commit has tests it might be breaking backwards compatibility.

if self.state in State.finished or self.state == State.UP_FOR_RETRY:
self.end_date = self.end_date or current_time
self.duration = (self.end_date - self.start_date).total_seconds()

commit b4f1c73
Author: Yingbo Wang ybwang@gmail.com
Date: Fri Aug 31 16:49:39 2018 -0700

[AIRFLOW-2951] Update dag_run table end_date when state change (#3798)

The existing airflow only change dag_run table end_date value when
a user teminate a dag in web UI. The end_date will not be updated
if airflow detected a dag finished and updated its state.

This commit add end_date update in DagRun's set_state function to
make up tho problem mentioned above.

@eladkal
Copy link
Contributor

eladkal commented Aug 11, 2023

Came across this one when testing 2.7.0rc1
I think it's not intuitive that marking DagRun as failure/success changes the end date of the DagRun thus causing the UI effect as seen in the gif.

This bug can also easily mess metrics collectors.

@eladkal eladkal added the affected_version:main_branch Issues Reported for main branch label Aug 11, 2023
@boushphong
Copy link
Contributor

boushphong commented Nov 13, 2023

I just tried to fix this.
I have no clue how dag_run.end_date field here is always equal to the timezone.utcnow() to be honest. Even before assigning dag_run.end_date = timezone.utcnow() at line 368. This seems to be the problem, it just doesn't get the end_date from it's latest run but always resolves to the value of timezone.utcnow(). Hence the session.merge(dagrun) will always alter the end_date.

dag_run = session.execute(
select(DagRun).where(DagRun.dag_id == dag_id, DagRun.run_id == run_id)
).scalar_one()
dag_run.state = state
if state == DagRunState.RUNNING:
dag_run.start_date = timezone.utcnow()
dag_run.end_date = None
else:
dag_run.end_date = timezone.utcnow()
session.merge(dag_run)

perhaps someone could pick this up from my clue maybe. Maybe the code below would work after one could resolve the above problem. I suspect the problem comes from the @provide_session decorator. But I'm not too sure.

if state == DagRunState.RUNNING: 
     dag_run.start_date = timezone.utcnow() 
     dag_run.end_date = None 
elif state == DagRunState.QUEUE: 
     dag_run.end_date = timezone.utcnow()
session.merge(dagrun)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affected_version:main_branch Issues Reported for main branch area:core kind:bug This is a clearly a bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants