Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix task_instance can not get task attribute correctly when turning on sentry #36968

Conversation

Lee-W
Copy link
Member

@Lee-W Lee-W commented Jan 23, 2024

Closes: #36957

As reported by @jkramer-ginkgo, we also encountered a similar issue, and I'm able to reproduce it with the steps provided. Among the suggested solutions by @jkramer-ginkgo , I think it might make more sense to get ti from kwargs instead of removing this decorator. The root cause is that

cannot get task attribute
task = task_instance.task
. But this decorator is also used in https://github.com/apache/airflow/blob/2.8.0/airflow/models/taskinstance.py#L2280, so the original logic would still be needed. I also change _schedule_downstream_tasks arguments to be keyword-lonly so that we can ensure that we can get ti from kwargs in enrich_error.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@Lee-W Lee-W force-pushed the get_ti_from_kwargs_for_schedule_downstream_tasks branch from 31880fe to cd40c4c Compare January 23, 2024 09:30
@eladkal eladkal added this to the Airflow 2.8.2 milestone Jan 23, 2024
@eladkal eladkal added the type:bug-fix Changelog: Bug Fixes label Jan 23, 2024
@Lee-W Lee-W force-pushed the get_ti_from_kwargs_for_schedule_downstream_tasks branch from cd40c4c to 69e23d8 Compare January 23, 2024 10:10
@Lee-W Lee-W changed the title fix(sentry): get ti from kwargs for TaskInstnace._schedule_downstream_tasks fix task_instance can not get task attribute correctly when turning on sentry Jan 23, 2024
airflow/sentry.py Outdated Show resolved Hide resolved
tests/models/test_taskinstance.py Outdated Show resolved Hide resolved
@Lee-W Lee-W force-pushed the get_ti_from_kwargs_for_schedule_downstream_tasks branch from 7bccb9b to 040f23b Compare January 23, 2024 11:22
@jkramer-ginkgo
Copy link
Contributor

I'm not totally convinced this is the best fix. Per my comment I don't think @Sentry.enrich_errors makes sense on _schedule_downstream_tasks. The task is already wrapped by @Sentry.enrich_errors in _run_raw_task and the task being passed to _schedule_downstream_tasks is not being run again: it's downstream tasks will be scheduled and again wrapped by @Sentry.enrich_errors within _run_raw_task

@Lee-W Lee-W force-pushed the get_ti_from_kwargs_for_schedule_downstream_tasks branch 3 times, most recently from 53fa87c to 9f5e895 Compare January 24, 2024 00:25
@Lee-W
Copy link
Member Author

Lee-W commented Jan 24, 2024

I'm not totally convinced this is the best fix. Per my comment I don't think @Sentry.enrich_errors makes sense on _schedule_downstream_tasks. The task is already wrapped by @Sentry.enrich_errors in _run_raw_task and the task being passed to _schedule_downstream_tasks is not being run again: it's downstream tasks will be scheduled and again wrapped by @Sentry.enrich_errors within _run_raw_task

I thought we still wanted to have enriched errors when we scheduled the tasks because errors might also occur at scheduling time. But I too don't have the full context of why this was designed this way back then 🤔

@@ -178,8 +179,12 @@ def wrapper(_self, *args, **kwargs):

with sentry_sdk.push_scope():
try:
# Is a LocalTaskJob get the task instance
if hasattr(_self, "task_instance"):
ti_from_kwargs = kwargs.get("ti", None)
Copy link
Contributor

@dirrao dirrao Jan 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why task instance is mandatory for sentry error logging?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed by add_tagging and add_breadcrumbs

@Lee-W Lee-W force-pushed the get_ti_from_kwargs_for_schedule_downstream_tasks branch from 9f5e895 to 0750bb8 Compare January 25, 2024 04:41
@jedcunningham jedcunningham deleted the get_ti_from_kwargs_for_schedule_downstream_tasks branch February 9, 2024 17:24
@ephraimbuddy ephraimbuddy removed this from the Airflow 2.8.2 milestone Feb 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scheduler exceptions when Sentry and schedule_after_task_execution are enabled
6 participants