-
Notifications
You must be signed in to change notification settings - Fork 14.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add detailed information to logging when a dag or a task finishes. #19097
Conversation
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
|
Also, please check linter errors locally; this will not pass CI. |
airflow/models/dagrun.py
Outdated
@@ -526,6 +526,28 @@ def update_state( | |||
else: | |||
self.set_state(State.RUNNING) | |||
|
|||
if self.get_state() == State.FAILED or self.get_state() == State.SUCCESS: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use self.state
here, but I kind of feel maybe this logging should be done in set_state
, or when the state is actually set to instead? Not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did consider doing this in set_state(), but decided against it a for a few reasons
- The existing logging messages are at this layer
- While I believe think it would logically equivalent right now, it's possible the further code changes could end up tweaking the value of the fields that this is logging after the set_state() call, and that would be more obvious if the logging was at this layer rather than in the setter.
- Generally i prefer not having logging in the lowest level getter/setter method because they rarely have the full context across why the value is being referenced. In this case, the logging message isn't using anything that's specific to the update_state() context, but we could easily augment the message in the future to provide more context. E.g. failed due to deadlock vs task failure.
I can certainly change to use self._state directly here though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, the reasoning makes sense. In that case, would it be possible to move the TaskInstance logging to a similar abstraction level, instead of doing it in SchedulerJob? Or does the TI lack such abstraction to do the same? I see there's a TaskInstance._log_state()
function, which is called (for SUCCESS and FAILURE states) in _run_raw_task
, but this new log seems to be emitted at a difference place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think having it in TaskInstance._log_state()
would certainly make sense from an abstraction point of view. However, looking at the code, _log_state()
seems like it doesn't get called if the task fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is indirectly called in handle_failure
; not sure if it covers all cases, can you check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it doesn't appear to cover all of the cases. e.g.
airflow/airflow/models/taskinstance.py
Lines 1363 to 1370 in f5ad26d
except AirflowException as e: | |
if not test_mode: | |
self.refresh_from_db(lock_for_update=True, session=session) | |
# for case when task is marked as success/failed externally | |
# or dagrun timed out and task is marked as skipped | |
# current behavior doesn't hit the callbacks | |
if self.state in State.finished: | |
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yeah. Would it be a good idea to just add a _log_state
here? (and other similar paths in _run_raw_task
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer not to make major changes to code flow just for the additional logging. doesn't seem the worth the risk tradeoff of potentially breaking existing behavior. It seems like there are a lot of edge conditions in terms of error/skipping handling in this code.
…uration based on start_date and end_date
I think this looks sufficient for getting a quick fix in and we can move around if needed.
Awesome work, congrats on your first merged pull request! |
…19097) * add detailed information to logging when a dag or a task finishes. * make logging of start_date/end_date ISO format to be consist * fix pre-commit * use only %s in new logging statements to gracefully handle when certain variables are None * fix precommit * use self._state instead of get_state(). add computation for dag run duration based on start_date and end_date * make linter happy with format * fix typo * put back missing reference to _state Co-authored-by: Daniel Imberman <daniel.imberman@gmail.com> (cherry picked from commit 324c31c)
…19097) * add detailed information to logging when a dag or a task finishes. * make logging of start_date/end_date ISO format to be consist * fix pre-commit * use only %s in new logging statements to gracefully handle when certain variables are None * fix precommit * use self._state instead of get_state(). add computation for dag run duration based on start_date and end_date * make linter happy with format * fix typo * put back missing reference to _state Co-authored-by: Daniel Imberman <daniel.imberman@gmail.com> (cherry picked from commit 324c31c)
…pache#19097) * add detailed information to logging when a dag or a task finishes. * make logging of start_date/end_date ISO format to be consist * fix pre-commit * use only %s in new logging statements to gracefully handle when certain variables are None * fix precommit * use self._state instead of get_state(). add computation for dag run duration based on start_date and end_date * make linter happy with format * fix typo * put back missing reference to _state Co-authored-by: Daniel Imberman <daniel.imberman@gmail.com>
Adding 2 new log messages for when a task run and a dag run reaches a terminal state (SUCCESS/FAILED).
Examples: