Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Scheduler] add "hashed" cron expression support #49792

Closed
wants to merge 0 commits into from

Conversation

kbond
Copy link
Member

@kbond kbond commented Mar 24, 2023

Q A
Branch? 6.3
Bug fix? no
New feature? yes
Deprecations? no
Tickets n/a
License MIT
Doc PR todo

This is an interesting feature from zenstruck/schedule-bundle that I borrowed from Jenkins. It helps with the problem touched upon in Fabien's talk at SFLive Paris 2023.

From the zenstruck/schedule-bundle docs (some of the issues discussed below don't apply to symfony/scheduler the same way):

If you have many tasks scheduled at midnight (0 0 * * *) this could create a very long running schedule right at this time. Tasks scheduled at the same time are run synchronously. This may cause an issue if a task has a memory leak.

This bundle extends the standard Cron expression syntax by adding a # (for hash) symbol. # is replaced with a random value for the field. The value is deterministic based on the task's description. This means that while the value is random, it is predictable and consistent. A task with the description my task and a defined frequency of # # * * * will have a calculated frequency of 56 20 * * * (every day at 8:56pm). Changing the task's description will change it's calculated frequency. If the task from the previous example's description is changed to another task, it's calculated frequency would change to 24 12 * * * (every day at 12:24pm).

A hash range #(x-y) can also be used. For example, # #(0-7) * * * means daily, some time between midnight and 7am. Using the # without a range creates a range of any valid value for the field. # # # # # is short for #(0-59) #(0-23) #(1-28) #(1-12) #(0-6). Note the day of month range is 1-28, this is to account for February which has a minimum of 28 days.

Usage

// MyMessage MUST be stringable to use "hashed cron"
$schedule->add(RecurringMessage::cron('#midnight', new MyMessage()));

@vtsykun
Copy link
Contributor

vtsykun commented Mar 25, 2023

With this approach, you cannot set the start time to the nearest second, so the tasks will always start at 00 seconds and we will have a burst of CPU load at every minutes.

At the same time, this component is very accurate, so the example below will run at a random time depending on when the task was received.

        return (new Schedule())
            ->add(RecurringMessage::cron('* * * * *', new \stdClass()))
            ->add(RecurringMessage::every('15 minutes',  new \stdClass()));

Selection_1374

@kbond
Copy link
Member Author

kbond commented Mar 25, 2023

With this approach, you cannot set the start time to the nearest second, so the tasks will always start at 00 seconds and we will have a burst of CPU load at every minutes.

This problem already exists with the existing CronExpressionTrigger, no?

Copy link
Member

@fabpot fabpot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could "just" create a HashedCronExpression instead.
This could even be part of the CronExpression package instead.

@kbond
Copy link
Member Author

kbond commented Mar 25, 2023

I wonder if we could "just" create a HashedCronExpression instead.

Yes, if not using #'s, it works exactly like a normal cron expression.

This could even be part of the CronExpression package instead.

I'll see if there is interest in this there. (dragonmantank/cron-expression#155)

@kbond
Copy link
Member Author

kbond commented Mar 27, 2023

I wonder if we could "just" create a HashedCronExpression instead.

I've modified this to add hash support directly on CronExpression via the fromHash named constructor.

@kbond
Copy link
Member Author

kbond commented Mar 29, 2023

I've made the context optional if the message is stringable.

@kbond kbond force-pushed the hashed-cron branch 2 times, most recently from 051c256 to ab9ca22 Compare March 29, 2023 19:34
@fabpot
Copy link
Member

fabpot commented Apr 4, 2023

IIUC, Jenkins is using P and not #. Why not use the same here?

@kbond
Copy link
Member Author

kbond commented Apr 4, 2023

I personally thought the # better conveyed the meaning but I can change to P.

@fabpot
Copy link
Member

fabpot commented Apr 4, 2023

I think it makes sense to keep the "standard" set by Jenkins. I know that some hosting platforms are also using the Jenkins P letter for hashed crons. I agree that using # would be "better", but being more like others is probably a better idea.

@kbond
Copy link
Member Author

kbond commented Apr 4, 2023

It's been a while since I've used Jenkins - I believe the syntax is H (for hash), not P.

I can change to H but how should we handle the aliases (#hourly, #daily, #midnight, etc). I find these exceptionally useful.

(We could support both # and H and keep the hash aliases prefixed with #.)

@@ -33,17 +54,54 @@ public function __toString(): string
return $this->expression->getExpression();
}

public static function fromSpec(string $expression = '* * * * *'): self
public static function fromSpec(string $expression = '* * * * *', ?string $context = null): self
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the context? Can't we just always use the message object string representation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do this when using RecurringMessage::cron(). You're thinking to pass the message in here? I think it's good as-is so users's can use this manually if they can't/don't make their Message stringable.

Also, is fromSpec still an appropriate name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fromSpec name seems ok to me.
Forcing messages to be stringable does not seem to be a huge constraint (as we don't care about the content itself).

Copy link
Member Author

@kbond kbond Apr 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You want the type to be ?\Stringable $message = null?

@fabpot
Copy link
Member

fabpot commented Apr 25, 2023

Thank you @kbond.

fabpot added a commit that referenced this pull request Apr 25, 2023
This PR was squashed before being merged into the 6.3 branch.

Discussion
----------

[Scheduler] add "hashed" cron expression support

| Q             | A
| ------------- | ---
| Branch?       | 6.3
| Bug fix?      | no
| New feature?  | yes
| Deprecations? | no
| Tickets       | n/a
| License       | MIT
| Doc PR        | todo

This is an interesting feature from zenstruck/schedule-bundle that I borrowed from Jenkins. It helps with the problem touched upon in [Fabien's talk at SFLive Paris 2023](https://speakerdeck.com/fabpot/s?slide=33).

From the [zenstruck/schedule-bundle docs](https://github.com/zenstruck/schedule-bundle/blob/1.x/doc/define-tasks.md#hashed-cron-expression) _(some of the issues discussed below don't apply to symfony/scheduler the same way)_:

> If you have many tasks scheduled at midnight (0 0 * * *) this could create a very long running schedule right at this time. Tasks scheduled at the same time are run synchronously. This may cause an issue if a task has a memory leak.
>
> This bundle extends the standard Cron expression syntax by adding a # (for hash) symbol. # is replaced with a random value for the field. The value is deterministic based on the task's description. This means that while the value is random, it is predictable and consistent. A task with the description my task and a defined frequency of # # * * * will have a calculated frequency of 56 20 * * * (every day at 8:56pm). Changing the task's description will change it's calculated frequency. If the task from the previous example's description is changed to another task, it's calculated frequency would change to 24 12 * * * (every day at 12:24pm).
>
> A hash range #(x-y) can also be used. For example, # #(0-7) * * * means daily, some time between midnight and 7am. Using the # without a range creates a range of any valid value for the field. # # # # # is short for #(0-59) #(0-23) #(1-28) #(1-12) #(0-6). Note the day of month range is 1-28, this is to account for February which has a minimum of 28 days.

### Usage

```php
// MyMessage MUST be stringable to use "hashed cron"
$schedule->add(RecurringMessage::cron('#midnight', new MyMessage()));
```

Commits
-------

6ff8c28 [Scheduler] add "hashed" cron expression support
@fabpot fabpot closed this Apr 25, 2023
@kbond kbond deleted the hashed-cron branch April 25, 2023 12:25
@kbond
Copy link
Member Author

kbond commented Apr 25, 2023

Thanks again for all the help with this @TimWolla!

javiereguiluz added a commit to symfony/symfony-docs that referenced this pull request Jan 22, 2024
This PR was squashed before being merged into the 6.3 branch.

Discussion
----------

[Scheduler] Document hashed Cron expression

Follow #19440

Try document symfony/symfony#49792

I copied a lot https://github.com/zenstruck/schedule-bundle/blob/1.x/doc/define-tasks.md#hashed-cron-expression

Commits
-------

f41721c [Scheduler] Document hashed Cron expression
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants