
A simple globally available store of data per-registration #1331

Open
slightlyoff opened this issue Jun 28, 2018 · 23 comments
Comments

@slightlyoff
Contributor

Something we've long punted has been the question of how to deal with volatile but infrequently changing data that is globally useful on a per-registration basis; e.g. a "kill switch deadline" for a SW version.

Thoughts about a .data property on registrations similar to the one we provide for notificationclick? I could imagine getters/setters at both registration time and later in the lifecycle, allowing services to move a lot of this sort of information out of globals.
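A rough sketch of how that might look (registration.data is hypothetical here; the name and shape are made up purely for illustration):

```js
// Hypothetical API -- registration.data does not exist today.
// At registration time, the page seeds per-registration data:
navigator.serviceWorker.register('/sw.js').then((registration) => {
  registration.data = { killSwitchDeadline: Date.now() + 7 * 24 * 60 * 60 * 1000 };
});

// Inside the service worker, the same data would be readable later in the
// lifecycle without reaching for an async storage API:
self.addEventListener('fetch', (event) => {
  if (Date.now() > self.registration.data.killSwitchDeadline) {
    return; // deadline passed: skip respondWith() and fall through to the network
  }
  // ...normal handling...
});
```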

@jakearchibald
Contributor

Can you give some more details on the use case? As in, what would the data be? When would it be set? When would it be used?

If the data was associated with the registration it would need to be immutable, since multiple threads have access to the registration. Unless the access was async of course, but then you may as well use async key val.

I guess any proposal like this should be weighed against async key/value storage.

@jungkees
Collaborator

Ah.. right. The notification's data attribute is readonly and not concurrently accessed from multiple threads. For service workers, we're queuing the (lifecycle) jobs that can concurrently access a registration. @jakearchibald, what's "async key val" here?

@jungkees
Collaborator

FWIW, if we really need this, would we be able to run the getter and setter in a parallel queue (https://html.spec.whatwg.org/#parallel-queue)?

@mkruisselbrink
Collaborator

I think Jake's question (and at least mine) is: if this has to be an asynchronous API anyway, what's the benefit of having something SW-specific over just using a "normal" storage API (i.e. cache storage, IDB, or some not-yet-existing async localStorage-style API)?

@slightlyoff
Contributor Author

slightlyoff commented Jun 29, 2018 via email

@jakearchibald
Contributor

jakearchibald commented Jun 30, 2018 via email

@slightlyoff
Contributor Author

slightlyoff commented Jun 30, 2018 via email

@asakusuma

We've run into the exact problem for the exact use case mentioned by @slightlyoff. Another use case is storing the CSRF token, since cookies aren't available in the service worker.

Side note, perhaps the expiration use-case is universal enough that we might want to introduce some sort of response header for the service worker file that dictates how long the service worker is allowed to be used?

@asakusuma

Forgot to mention: we ran some experiments against real traffic, which support the idea that IndexedDB is too slow for critical-path operations. IndexedDB access time was often over 1 second, and we experienced a significant number of IndexedDB access timeouts.

Would it be possible to have a read and write sync store, with the caveat that any properties that are writable from the worker are not guaranteed to be up-to-date when reading? We have experimented with a caching scheme with multiple caches, where a sync-accessed property would be very helpful for figuring out which cache to use for a certain request. Trying to request from the wrong cache is acceptable.

@asutherland

This also came up in #1157. Having some small set of synchronously available data seems to be a recurring request.

As an implementer, I'd posit the most important things in such an API are:

  1. It should only allow a very small amount of data. So small that it's clear the data is precious, and content code immediately worries about headroom rather than treating it as a can that gets kicked down the road until browsers have to preload a megabyte of data every time a SW gets loaded and/or turn to complicated data-race implementations like the ones used for localStorage. In other words: if it's small enough, no one has to worry about falling off the fast path.
  2. The exact data limits should be specified precisely by the standard, so there isn't a cycle of browser implementations loosening their limits as the vagaries of structured-clone persistence sizes cause site breakage, which in turn enables further rounds of expansion.

URLSearchParams is roughly an example of such an existing API that could be abused. It has a string serialization, and the size constraint could be on registration.params.toString().length. This encoding also lends itself to simple tunneling of the state from the server.
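A minimal sketch of that shape, assuming a hypothetical registration.params setter (not a real API) with the limit enforced on the serialized string:

```js
// Hypothetical: a URLSearchParams-backed per-registration store whose
// serialized length is capped by the spec (1024 here is only illustrative).
const params = new URLSearchParams({ version: '42', killSwitchDeadline: '1657000000000' });
if (params.toString().length > 1024) {
  throw new Error('per-registration params exceed the allowed size');
}
// registration.params = params; // the hypothetical setter
```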

@asakusuma

@asutherland as a user/consumer, both your points seem good to me.

On another note, does it make more sense for the store to be per-registration, or per version? Or do we need both? If someone wanted to implement an expiration date for a service worker version, I think that would make more sense as part of a per-version store.

Another use-case for a small, sync store would be feature flags. This follows the same theme of "small piece of data that blocks critical path logic."

@jakearchibald
Contributor

Why do we feel that this thing will be faster than IndexedDB? Depending on the design, we may be taking whatever hit it incurs on every service worker wakeup, even if the data isn't needed.

@asakusuma

@jakearchibald I would expect a simple key-value store that has a very small size limit to be faster than IndexedDB. I believe we are appealing to the general principle of smaller storage space === faster retrieval.

As @asutherland put it:

If it's small enough, no one has to worry about falling off the fast path

@asutherland

Why do we feel that this thing will be faster than IndexedDB? Depending on the design, we may be taking whatever hit it incurs on every service worker wakeup, even if the data isn't needed.

I think the more realistic competition here is the https://github.com/WICG/cookie-store API. In Firefox, cookieStore.get() calls will not involve any I/O for the foreseeable future. Even if we took tips from spinning-disk-era disk defragmenters and tried to pre-heat the IndexedDB connection, IndexedDB would still lose every race. I do think browsers should optimize IndexedDB, but realistically, hacks like abusing cookies will catch on if they're the easiest path to consistently low latency.

Implementation-wise, in Firefox, I think we'd throw my ugly proposal on the registration. Because the reality is that what I proposed above is already available. You can do (new URLSearchParams(location.search)).get("foo") in your SW after having registered it via navigator.serviceWorker.register("sw.js?foo=bar&baz=bop"); to synchronously get out "bar". But that's only available to those willing to tunnel their SW configurations through a GET and deal with constant reinstallations of their SW whenever they want to update things. But if we're willing to pay the storage price for the information on the registration there, why not make it slightly more ergonomic and let it be separable?
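Spelled out, that already-available pattern looks roughly like this (parameter names are just examples):

```js
// In the page: tunnel configuration through the registration URL.
navigator.serviceWorker.register('sw.js?foo=bar&baz=bop');

// In sw.js: read the configuration back synchronously. Changing the query
// string changes the script URL, so every config change forces a reinstall.
const config = new URLSearchParams(location.search);
console.log(config.get('foo')); // "bar"
```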

Also, a related factor that I expect weighs heavily in requesting synchronous access is that not invoking respondWith() in fetch both looks and actually is faster than pass-through fetch in many cases. No matter how fast we make the async operation, it's not going to look as fast as never going async in the first place.

@asakusuma

I do think browsers should optimize IndexedDB, but realistically, hacks like abusing cookies will catch on if they're the easiest path to consistently low latency.

I agree. For instance, we have been experimenting with using Cache as a key/value store: the URL is the key, and the value is put into the response body as a string. It's a hack, but it works better than IndexedDB.
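For reference, that workaround looks roughly like this (the cache name and key scheme are just illustrative):

```js
// Using the Cache API as a small async key/value store: synthetic URLs are
// the keys, Response bodies are the values.
async function kvSet(key, value) {
  const cache = await caches.open('kv');
  await cache.put('/__kv__/' + encodeURIComponent(key), new Response(value));
}

async function kvGet(key) {
  const cache = await caches.open('kv');
  const response = await cache.match('/__kv__/' + encodeURIComponent(key));
  return response ? response.text() : undefined;
}
```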

Implementation-wise, in Firefox, I think we'd throw my ugly proposal on the registration. Because the reality is that what I proposed above is already available.

Sounds like a great start. For our needs, it would be nice to eventually have a store that is writable from the install hook of the service worker, so that we can implement a timeout/expiration date specific to each version, i.e. once a version is installed, it can only operate for a set number of days.

Also, a related factor that I expect weighs heavily in requesting synchronous access is that not invoking respondWith() in fetch both looks and actually is faster than pass-through fetch in many cases. No matter how fast we make the async operation, it's not going to look as fast as never going async in the first place.

Makes sense. I'm curious whether there are any cases where not invoking respondWith is actually slower, and, on average, how much faster not invoking respondWith is compared to pass-through fetch.

To be clear, I don't really care what the final solution is, as long as it provides a fast, reliable way to persist a small amount of information that can be written to from the window scope and from the install step.

@jakearchibald
Contributor

I'd still like to know some use-cases. What kinds of data do developers want to store? When do they want to set it? When do they want to update it?

The only example I've seen so far is from @asakusuma:

Another use case is storing the CSRF token, since cookies aren't available in the service worker.

But the answer there is https://github.com/WICG/cookie-store.

@asakusuma

asakusuma commented Jul 18, 2018

@jakearchibald some more use cases:

Expiration date or timeout for a service worker

The goal here is to create a safeguard against a service worker operating too long or forever. On install, the service worker records an entry in the store noting the version (a string uniquely identifying the version, embedded in the script) and the current timestamp. Then, on fetch (or any functional event), the service worker checks the timestamp in the store associated with its version. If the timestamp was too long ago, the handler exits early and is essentially a no-op. Variations here: it might be more useful to record the timestamp in the activate step, though I think allowing writes during activate would be more challenging. It might also be nice to have the store be per-version, if we decide that there are enough use cases that are specific to service worker versions.
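A sketch of that pattern, assuming the proposed store were synchronously readable and writable during install (self.registration.data, the version string, and the age limit are all hypothetical):

```js
const SW_VERSION = 'v42';                    // embedded in the script
const MAX_AGE_MS = 14 * 24 * 60 * 60 * 1000; // illustrative: two weeks

self.addEventListener('install', (event) => {
  // Hypothetical synchronous write to the per-registration store.
  self.registration.data['installed:' + SW_VERSION] = Date.now();
});

self.addEventListener('fetch', (event) => {
  const installedAt = self.registration.data['installed:' + SW_VERSION];
  if (Date.now() - installedAt > MAX_AGE_MS) {
    return; // too old: exit early without respondWith(), essentially a no-op
  }
  // ...normal respondWith() logic...
});
```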

Throttling or debouncing

In order to throttle an operation, you need to know the last time it happened. Let's say you want to implement some sort of heartbeat or need to check the server regularly for something. You could do the check on every fetch, but this would be way too noisy. So you could throttle the check that is kicked off by fetch. We tried doing this with IndexedDB, but IndexedDB was far too unreliable. This use case would require allowing an active worker to make writes, which would complicate things.
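A sketch of that throttle, with kvGet/kvSet standing in for whatever store is available (the interval and endpoint are made up, and the check isn't race-free):

```js
const HEARTBEAT_INTERVAL_MS = 5 * 60 * 1000; // illustrative: at most every 5 minutes

self.addEventListener('fetch', (event) => {
  event.waitUntil((async () => {
    // kvGet/kvSet: small async key/value helpers, e.g. the Cache-based sketch above.
    const last = Number(await kvGet('lastHeartbeat')) || 0;
    if (Date.now() - last < HEARTBEAT_INTERVAL_MS) return; // throttled
    await kvSet('lastHeartbeat', String(Date.now()));
    await fetch('/heartbeat'); // the periodic server check
  })());
});
```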

Cache pointers

Consider a single service worker that handles requests for multiple applications. You might have multiple apps on a single domain, or you might serve a different app at the exact same URL, depending on network speed or locale. Or you may simply want to be able to update the app immediately without having to always use skipWaiting(). At this point, the service worker acts more as a generalized traffic server for the domain, deployed to browsers, than as a tightly coupled part of a particular app. Thus, the upgrade lifecycle of the app(s) is separate from the install/activate lifecycle of the service worker, so the service worker would have to orchestrate updating the cache with a new version of a certain app, all while active. You could do this by keeping a reference to the active cache. This reference would need to be read on every fetch event that might be served from cache. When a new version of the app is detected, you populate a new cache, and once the new cache is ready, you change the pointer to point to the new cache and delete the old cache. Such a feature would require allowing an active worker to make writes.
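Roughly, with kvGet/kvSet again standing in for the store and all names illustrative:

```js
// Every potentially cached fetch reads the pointer to find the live cache.
async function handleAppRequest(request) {
  const activeCacheName = await kvGet('active-cache'); // e.g. "app-shell-v7"
  const cache = await caches.open(activeCacheName);
  return (await cache.match(request)) || fetch(request);
}

// When a new app version is detected, while the same SW stays active:
async function rollForward(newCacheName, urls) {
  const cache = await caches.open(newCacheName);
  await cache.addAll(urls);                            // populate the new cache first
  const oldCacheName = await kvGet('active-cache');
  await kvSet('active-cache', newCacheName);           // flip the pointer
  if (oldCacheName) await caches.delete(oldCacheName); // then drop the old cache
}
```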

Feature flags

You want to either run an A/B experiment, or be able to turn a feature on/off, without going through the normal install/activate cycle. There are a number of ways to implement this, but there would need to be some sort of server endpoint that provides the latest feature flags. This endpoint could be called periodically (probably requiring the throttle use case above) to update the flags, which would be stored in the proposed store. These flags might determine critical-path behavior for fetch, which means reads must be more or less immediate. The flags could also be updated from the client, which could update the store via some API on the registration object.
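A sketch, again with a made-up endpoint, flag name, and kvGet/kvSet as the stand-in store:

```js
// Refresh the flags from the server (throttled as in the sketch above).
async function refreshFlags() {
  const response = await fetch('/api/feature-flags');
  await kvSet('flags', JSON.stringify(await response.json()));
}

// Read the flags when deciding critical-path behavior on fetch.
self.addEventListener('fetch', (event) => {
  event.respondWith((async () => {
    const flags = JSON.parse((await kvGet('flags')) || '{}');
    if (flags.useExperimentalPath) {
      // ...experimental handling for the A/B test...
    }
    return fetch(event.request);
  })());
});
```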

The last two use cases boil down to being able to make adjustments to behavior without going through the normal install/activate phase. This is needed when you don't want to use skipWaiting() all the time, but you want some changes to be immediate.

@mfalken
Member

mfalken commented Oct 26, 2018

F2F: There was a fairly long discussion. It boiled down to: let's wait for Convenience Storage (https://github.com/domenic/async-local-storage) and then reevaluate.

@wanderview
Member

To clarify for those not present at the meeting, "convenience storage" is whatever final name is chosen for this proposal:

https://domenic.github.io/async-local-storage/

@asakusuma

Since Convenience Storage is just a layer on top of IndexedDB, it won't be able to solve the latency issues. We need an API that can sit on the critical path for responding to HTTP requests, so ideally it should take less than 5ms.

@wanderview
Member

@asakusuma Do I remember correctly that you experienced the IDB slowness in both chrome and firefox? And that you were able to replace some IDB usage with Cache API as a key/val?

This would be somewhat surprising to me, since I know that IDB and the Cache API are both implemented on SQLite in Firefox. I would expect them to have similar performance characteristics.

Perhaps the difference is that Cache API gets warmed up in the process of loading the service worker scripts themselves, so any slowness has already been experienced in the worker startup time. Then when the script hits IDB it has to warm a second database.

Having a separate bug for this slow IDB behavior might be useful, though.

@asakusuma

@wanderview

Do I remember correctly that you experienced the IDB slowness in both chrome and firefox? And that you were able to replace some IDB usage with Cache API as a key/val?

Correct. However, I did not do a statistically high-fidelity analysis of the IDB timeouts across browsers; I just eyeballed the data, knowing our browser traffic breakdown, so I could be off. I'd like to do a more detailed experiment that provides latency percentiles, instead of just an error count when the timeout is blown. The timeout there was on the order of seconds. I'd say that anything over 100ms is a non-starter for anything time-sensitive, and over 10ms is sub-optimal.

@asutherland

It's worth calling out that:

  • Google has recently created an IndexedDB macro-benchmark at https://github.com/google/forklift
  • At Mozilla it's our intent to focus on optimizing our IndexedDB and ServiceWorker implementations in the next few quarters with an eye on ServiceWorker time-to-first-read from IDB.
  • Mozilla intends to contribute to and build on the forklift macro-benchmark.

Re: IDB latency in Firefox, our implementation currently does an awkward open-close-open thing (due to mozStorage's ownership assertions interacting with Gecko's IDB threading model) that could be bad for latency due to WAL checkpointing and wasted/serialized I/O. This is among the low-hanging fruit we plan to look at.
