Add log lookup exception for empty op subtypes #35536

Merged 2 commits on Jan 12, 2024
13 changes: 9 additions & 4 deletions airflow/utils/log/file_task_handler.py
@@ -29,6 +29,7 @@
 from typing import TYPE_CHECKING, Any, Callable, Iterable
 from urllib.parse import urljoin

+import httpx
 import pendulum

 from airflow.configuration import conf
@@ -78,8 +79,6 @@ def _set_task_deferred_context_var():


 def _fetch_logs_from_service(url, log_relative_path):
-    import httpx
-
Comment on lines -81 to -82
Contributor

Why was this import moved to the top of the file? Is it due to a pre-commit change?

Contributor Author

No, I did this because I need this module on line 506 (httpx.UnsupportedProtocol). I looked at the blame and there didn't seem to be any reason for the import to be function-scoped, so I moved it to module scope.

Contributor Author

Actually, it looks like Ash moved this from module scope to function scope as a performance optimization here: 1a8a897#diff-e7f34f73940eb52d92bb991abedc1c963431c5373c12dff739c8fb7d03e93d3aL24

I'm going to move it back to function scope in a follow-up PR.

Contributor Author

PR: #36753
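
The trade-off discussed in this thread can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the follow-up PR: the standard-library `json` module stands in for `httpx`, and `parse_payload` and `parse_or_message` are hypothetical names. It shows that a function-scoped import defers loading the module until the function is first called, and that an `except` branch can repeat the import locally to reach the exception type.

```python
def parse_payload(payload: str) -> dict:
    # Function-scoped import: the module is only loaded when this function
    # runs, not when the enclosing module is imported.
    import json  # assumption: stand-in for a heavier dependency such as httpx

    return json.loads(payload)


def parse_or_message(payload: str):
    try:
        return parse_payload(payload)
    except Exception as e:
        # The exception class lives in the lazily imported module, so the
        # import is repeated here. Repeated imports are cheap because Python
        # caches loaded modules in sys.modules.
        import json

        if isinstance(e, json.JSONDecodeError):
            return "payload was not valid JSON"
        raise
```

Keeping the import inside the functions preserves the startup-time benefit while still letting the handler inspect the exception type.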

     from airflow.utils.jwt_signer import JWTSigner

     timeout = conf.getint("webserver", "log_fetch_timeout_sec", fallback=None)
@@ -170,6 +169,9 @@ class FileTaskHandler(logging.Handler):
     """

     trigger_should_wrap = True
+    inherits_from_empty_operator_log_message = (
+        "Operator inherits from empty operator and thus does not have logs"
+    )

     def __init__(self, base_log_folder: str, filename_template: str | None = None):
         super().__init__()
@@ -555,8 +557,11 @@ def _read_from_logs_server(self, ti, worker_log_rel_path) -> tuple[list[str], li
                 messages.append(f"Found logs served from host {url}")
                 logs.append(response.text)
         except Exception as e:
-            messages.append(f"Could not read served logs: {e}")
-            logger.exception("Could not read served logs")
+            if isinstance(e, httpx.UnsupportedProtocol) and ti.task.inherits_from_empty_operator is True:
+                messages.append(self.inherits_from_empty_operator_log_message)
+            else:
+                messages.append(f"Could not read served logs: {e}")
+                logger.exception("Could not read served logs")
         return messages, logs

     def _read_remote_logs(self, ti, try_number, metadata=None) -> tuple[list[str], list[str]]: