Return an error when scheduling a reducer with a long delay #77

gefjon · 2023-07-13T15:26:29Z

Description of Changes

Prior to this commit, a module could crash SpacetimeDB by scheduling a reducer with a delay longer than about two years. This was due to our use of `tokio_util::time::DelayQueue` to handle scheduling. `DelayQueue`'s internal data structure imposes a limit of 64^6 ms on delays (68,719,476,736 ms, roughly 795 days), a little more than two years.
Attempting to insert with a delay longer than that panics.

With this commit, we avoid the panic by checking ourselves that the requested delay is not longer than 64^6 ms.
This requires bubbling a `ScheduleError` up from `Scheduler::schedule` to `WasmInstanceEnv::schedule`, where it is converted into a `RuntimeError`, which crashes the module rather than SpacetimeDB itself.
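A minimal sketch of the kind of check described above, using only the standard library. The names `MAX_SCHEDULE_DELAY`, `check_delay`, `schedule`, and the `DelayTooLong` variant are hypothetical and chosen for illustration; they are not taken from the actual commit. The sketch also assumes it is acceptable to reject a delay exactly equal to the limit, which is the conservative choice.

```rust
use std::fmt;
use std::time::Duration;

/// Assumed upper bound on a schedulable delay: 64^6 ms (a little over two years).
/// Hypothetical constant name, not taken from the commit.
const MAX_SCHEDULE_DELAY: Duration = Duration::from_millis(64u64.pow(6));

/// Hypothetical error type standing in for the `ScheduleError` mentioned above.
#[derive(Debug, PartialEq)]
enum ScheduleError {
    DelayTooLong(Duration),
}

impl fmt::Display for ScheduleError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ScheduleError::DelayTooLong(d) => write!(
                f,
                "requested delay {:?} exceeds the maximum of {:?}",
                d, MAX_SCHEDULE_DELAY
            ),
        }
    }
}

impl std::error::Error for ScheduleError {}

/// Validate the delay before it reaches the queue, returning an error
/// instead of letting the queue panic. Delays at or above the limit are rejected.
fn check_delay(delay: Duration) -> Result<Duration, ScheduleError> {
    if delay >= MAX_SCHEDULE_DELAY {
        Err(ScheduleError::DelayTooLong(delay))
    } else {
        Ok(delay)
    }
}

/// Stand-in for a scheduler entry point: the `?` operator bubbles the error
/// up to the caller. A real scheduler would insert into the delay queue here;
/// this sketch just returns the validated delay in milliseconds.
fn schedule(delay: Duration) -> Result<u64, ScheduleError> {
    let delay = check_delay(delay)?;
    Ok(delay.as_millis() as u64)
}
```

With this shape, a caller such as the Wasm host function can match on the `Err` and translate it into whatever error type it reports to the module, instead of the process panicking inside the queue.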

`Scheduler::schedule` could also fail because its transaction to compute a new id is fallible. That transaction seems unlikely to ever fail, and if it does, we have bigger problems, so `unwrap`ping might still be reasonable for that case. This commit converts it into a handleable `Err` anyway, as doing so adds essentially no complexity.

API

This is a breaking change to the module API
This is a breaking change to the ClientAPI

If the API is breaking, please state below what will break

jdetter

Tested; SpacetimeDB no longer crashes. We don't really get a meaningful error when using `spacetime call` from the command line:

Error: Response text: The Wasm instance encountered a fatal error.

Caused by:
    HTTP status server error (530 <unknown status code>) for url (http://localhost:3000/database/call/93dda09db9a56d8fa6c024d843e805d8/test)

And also it appears that the module logs are empty. I think this error is going to be extremely uncommon so this is fine for now.

gefjon requested review from cloutiertyler and jdetter July 13, 2023 15:26

jdetter approved these changes Jul 13, 2023

gefjon merged commit 882d4cf into master Jul 13, 2023
4 checks passed

cloutiertyler deleted the phoebe/handle-schedule-duration-too-long branch August 1, 2023 21:55
