Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scheduled emails are randomly missed #25058

Closed
kevinn-veeva opened this issue Aug 22, 2023 · 0 comments · Fixed by #25239
Closed

Scheduled emails are randomly missed #25058

kevinn-veeva opened this issue Aug 22, 2023 · 0 comments · Fixed by #25239

Comments

@kevinn-veeva
Copy link

Occasionally, scheduled emails and alerts won't send. From what we've observed, the scheduled task where this happens takes longer than 1 second to complete:

Jul 12, 2023 @ 12:00:00.050   [2023-07-12 12:00:00,050: INFO/MainProcess] Scheduler: Sending due task reports.scheduler (reports.scheduler)
Jul 12, 2023 @ 12:00:00.050   [2023-07-12 12:00:00,050: INFO/MainProcess] Scheduler: Sending due task reports.scheduler (reports.scheduler)
Jul 12, 2023 @ 12:00:01.227    [2023-07-12 12:00:01,227: INFO/MainProcess] Task reports.scheduler[5fdabb10-1acb-4791-8348-23f5805b49f9] succeeded in 1.1726092300377786s

In that one second period, the cron_schedule_window method is unable to recognize that a daily scheduled email for 12:00PM wasn't sent.

This seems to be due to how the croniter constructs its time period to check against:

   time_now = time_now.astimezone(tz)
    start_at = time_now - timedelta(seconds=1)
    stop_at = time_now + timedelta(seconds=window_size)
    crons = croniter(cron, start_at)
    for schedule in crons.all_next(datetime):
        if schedule >= stop_at:
            break
        # convert schedule back to utc
        yield schedule.astimezone(utc).replace(tzinfo=None)

In the case of a 12:00:00 PM daily job - since start_at ends up as time_now - 1 second, if the task takes longer than 1 second to run, then the time_now ends up being 12:00:01 (time_now) - 1 second = 12:00:00 instead of 12:00:00 (time_now) - 1 second = 11:59:59.

The croniter is searching for the next run time based on the specified start_at value. So, if the start_at is 12:00 and the scheduled email is supposed to run at 12:00, then croniter won't find anything since 12:00 has already passed.

For example, if the start_at is 12:00:00, it finds the next day:

from croniter import croniter
from datetime import datetime

base = datetime(2023, 7, 12, 12, 00, 00, 00)
iter = croniter('0 12 * * mon,tue,wed,thu,fri', base)
print(iter.get_next(datetime))

2023-07-13 12:00:00

Whereas if the base start datetime is 11:59:59, it picks up the cron job:

from croniter import croniter
from datetime import datetime

base = datetime(2023, 7, 12, 11, 59, 59, 00)
iter = croniter('0 12 * * mon,tue,wed,thu,fri', base) print(iter.get_next(datetime))

2023-07-12 12:00:00 

Expected results

Scheduled report gets sent to the provided email - "Scheduling alert" and "Report sent to email" logs are visible.

Actual results

Scheduled report does not get sent to the provided email randomly. There are no error log indicating that there was a failure - "Scheduling alert" and "Report sent to email" messages are not in the logs

Environment

  • superset version: 2.0
  • python version: 3.9
  • node.js version: v16.17.1
  • any feature flags active:
  •   "SQLLAB_BACKEND_PERSISTENCE": True,
      "THUMBNAILS": True,
      "THUMBNAILS_SQLA_LISTENERS": True,
      "LISTVIEWS_DEFAULT_CARD_VIEW": True,
      "DASHBOARD_RBAC": True,
      "DASHBOARD_NATIVE_FILTERS": True,
      "ENABLE_TEMPLATE_PROCESSING": True,
      "ALERT_REPORTS": True,
      "DASHBOARD_CROSS_FILTERS": True
    
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant