Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider relying on eTags (or other headers) for service worker dependencies to check for updates #839

Closed
delapuente opened this issue Feb 23, 2016 · 57 comments

Comments

@delapuente
Copy link

For the sake of modularization and isolation. Could it be possible to improve the update algorithm to rely on the eTag / content-size / other headers sent by the server to decide when a service worker changed? Now we need to include a mark of change in the sw and this forces a lot of developments to postprocess service worker files.

@jakearchibald jakearchibald added this to the Version 2 milestone Feb 23, 2016
@annevk
Copy link
Member

annevk commented Feb 23, 2016

I think @ehsan and I had a proposal for this at some point.

@jakearchibald
Copy link
Contributor

If the request is made with If-Modified-Since or If-None-Match, and the response is 200, we could assume this is a new SW even if it's byte identical. Although this would cause unnecessary service worker updates on servers that send out Etag or Last-Modified but don't correctly return 304.

Else we just add Service-Worker-Force-Update: 1 or something.

@delapuente
Copy link
Author

More than checking if the main sw file changed, I'm referring to its dependencies imported via importScripts().

@jakearchibald
Copy link
Contributor

Yep, understand that, you'd be sending an etag (or last-modified) that represented the SW and its dependencies.

Maybe that's too hacky?

@delapuente
Copy link
Author

I think that undermines the purpose of the ETag about digesting the content being served. It would do the trick but I think is better to come up with something more pluggable in the actual ecosystem not requiring to hack with semantics. ;)

@jakearchibald
Copy link
Contributor

Yeah, I think you're right. Service-Worker-Force-Update: 1 would work.

@jakearchibald
Copy link
Contributor

I think we should have a JS API for this, reg.update({force: true}) perhaps.

@jakearchibald
Copy link
Contributor

F2F: agreement on serviceworker.skipWaiting() and reg.update({force: true}) - but may need to look at naming of force.

Lots of problems using etags for this, eg if CDN stop serving etags for some reasons.

Rough idea around some-header-name: value where value is a digest like etags.

@jakearchibald
Copy link
Contributor

reg.update({force: true}) will leave you with the existing worker if the update fails, as usual.

@annevk
Copy link
Member

annevk commented Apr 14, 2016

Was anyone able to dig up the previous proposal from @ehsan and I? I can't find it.

@collimarco
Copy link

+1 for this

There should be a way to update the imported scripts even when the main code of the service worker is unchanged.

This is also a big issue for BaaS whose scripts are imported in the customers' service workers: I have more extensively described our situation in this blog post.

@wanderview
Copy link
Member

@jakearchibald Can you remind me why we don't just check importScripts() for byte changes as well? I seem to recall it was since importScripts could be called async way back when, but now we've made that throw. Since we only allow sync importScripts maybe we can just include it in the byte check.

@mfalken
Copy link
Member

mfalken commented Jul 11, 2016

See issue #639 for the importScripts discussion, which roughly concluded with "let's revisit this later".

@jakearchibald
Copy link
Contributor

@wanderview

Can you remind me why we don't just check importScripts() for byte changes as well?

It could mean a lot of network requests just for an update check. I was worried about that. But also it means a change to a third party script would result in a whole new SW install, which sounds a bit… invasive.

I like reg.update({force: true}) as it give us a script-land way to do Chrome's "update on reload", but maybe for third party scripts we need to revisit the idea of providing access to the cache SW uses to store its scripts.

importScripts('//example.com/whatever.js');

// then later…
self.registration.getCache().then(cache => {
  cache.add('//example.com/whatever.js');
});

My big worry here is security. We'd only want the SW to be able to access this, as giving access from pages would turn a minor XSS into a huge problem.

@wanderview
Copy link
Member

It could mean a lot of network requests just for an update check.

Sure, but this is a trade off sites can make for themselves. Do I want the convenience of structuring my code in separate files? Or do I want to minimize the number of SW script 304s my server has to send?

I could see smaller shops opting for developer convenience while huge sites compacting into a single file to minimize server load.

But also it means a change to a third party script would result in a whole new SW install, which sounds a bit… invasive.

What I'm hearing is that this is what developers expect to happen and they are surprised when it doesn't.

I like reg.update({force: true}) as it give us a script-land way to do Chrome's "update on reload"

I like this too, but I think its a different use case. From what I can tell developers want to compose their service worker scripts from decoupled sources and have things just update to the latest. Every step we add to get the updates to trigger creates friction and requires more tight coupling between modules.

At the very least it seems we could do this as an opt-in to register(). Something like update-checks-imports:true or something.

@jakearchibald
Copy link
Contributor

update-checks-imports:true is interesting. Or something called during the install event to set which scripts should be checked.

If I show a "please refresh for latest version" message when there's an update waiting, I'm not sure I'd want that just because Google Analytics or whoever had updated something.

Then again, third party services like Analytics will live in Foreign Fetch instead.

@collimarco
Copy link

this is what developers expect to happen and they are surprised when it doesn't.

Exactly! IMHO The web is so great (and Javascript is becoming pervasive) for its simplicity. Please don't create a giant and complex monster: leave that to native apps. And please don't fall into premature optimization.

update-checks-imports:true is interesting

I agree. But I don't think that that choice should be left to the user. What if he denies? Then he would never get updates and the scripts will finally break.

I think that if you don't want all scripts to be refreshed by default you should leave to the developer the choice. For example: importScripts('//example.com/whatever.js', check-for-updates: true);. So the developer can prevent the refresh for large files (like Analytics) and allow small and more useful scripts to be refreshed automatically.

@delapuente
Copy link
Author

update-checks-imports:true is interesting. Or something called during the install event to set which scripts should be checked.

I was thinking about this:

importScripts('//3rd-party.com/whatever.js', { forceUpdate: true }); // makes update algorithm to byte-to-byte compare the dependency.

This way, the developer can mark the dependencies causing updates, this trade off @wanderview was talking about is made explicit at the same time you can preserve file sanity via modularization.

We could make forceUpdate to be true by default (so we should change current spec but it will end with a more predictable API) or false (preserving current spec).

Furtheremore, if, at some time, the developer want a dependency to be part of the checking no longer, she simply flips the flag.

@jakearchibald
Copy link
Contributor

jakearchibald commented Jul 27, 2016

Having the option in importScripts makes real sense. Nice. Unfortunately it doesn't cover JavaScript modules so well.

@wanderview
Copy link
Member

But do we want to make a service worker specific importScripts() interface? This would not work if someone uses a library that then uses the existing importScripts() API internally. The top level import would get updated but not any of its dependencies.

I think it would be better to put this on the install or activate event personally. It can then automatically apply for all importScripts, modules, or other future added methods of bringing in script.

@jakearchibald
Copy link
Contributor

Sure. The thing I liked about the importScripts solution is it was at a resource level, but I'm sure we can achieve that via another API.

pondering

If the API was something like alsoCheckTheByteEqualityOfThese(requests), could you include things that you weren't using in modules and importScripts? That would enable you to have a single resource that echoed the version number. Dunno if that's useful.

@delapuente
Copy link
Author

Sure. The thing I liked about the importScripts solution is it was at a resource level, but I'm sure we can achieve that via another API.

That was the idea.

But do we want to make a service worker specific importScripts() interface? This would not work if someone uses a library that then uses the existing importScripts() API internally.

Well, what I would find extremely uncomfortable is to re-declare my imports for marking purposes only. Perhaps introducing a new import function (importScriptForcingUpdate(...))?

Dealing with ES6 modules is complicated but what about a pragma:

import "my-library";

"force update";
import "my-other-library";

I don't really like it and I don't really know if there is a standard mechanism to introduce "use strict"-like pragmas in ES6 but declarative APIs are this kind of inconvenient.

@jakearchibald
Copy link
Contributor

F2F:

  • Should we check all imported scripts by default? Yes
  • Check the flattened imported scripts & the main script, if any of them are byte-by-byte different, including !ok responses, trigger an update (where a !ok response will fail the update)
  • The browser may optimise for this, eg if the main script has changed it doesn't need to check its imported scripts
  • No opt-out of this

@jakearchibald
Copy link
Contributor

@ithinkihaveacat

Are the byte-for-byte checks shared across different service workers? (If two service workers import the same script, are the etags/hashes "shared"?)

When a service worker fetches a script (either the main script or imported), it will go to the network (optionally) via the HTTP cache. It won't go to the cache API or the script cache of any other service workers.

@ithinkihaveacat
Copy link
Contributor

@jakearchibald Can that lead to a situation where the resources A and B are both updated at the origin, but the browser only notices that B has changed (because of timing-related artefacts of the HTTP cache), and so updates the SW using the "old" version of A and the "new" version of B?

@jakearchibald
Copy link
Contributor

@ithinkihaveacat yep. Same is true for HTML documents. We have made the HTTP cache opt-in because of developer confusion around this (#893). Developers who opt into the HTTP cache should understand how it works.

jungkees added a commit that referenced this issue Dec 12, 2016
This changes the behavior of the service worker script resource
comparison. Before this, only the main service worker script was
compared to a new version. With this change, all the imported scripts
stored in the imported scripts map as well as the main script are
inspected against the corresponding network resources (based on the
urls.)

Note:
 - Service worker's script resource map has been renamed and moved to
 service worker's script resource's imported scritps map.
 - registration's last update check time's always updated whenever the
 response is fetched from the network (regardless it's a main script or
 an imported script.)

Fixes #839.
@wanderview
Copy link
Member

Yea, I agree they are orthogonal issues.

@jakearchibald
Copy link
Contributor

An interesting point has been raised internally - is it possible we could damage sites relying on caching by making this change. I'll reach out to our biggest users and see how they feel. Worst comes to worst, we could make no-cache opt-in.

@wanderview
Copy link
Member

@jakearchibald When you talk to these sites, can you also mention there is a work around if they are serving unique hashed resources? They can set cache-control:immutable with a very large max-age to avoid these network requests at all in firefox/chrome.

@ithinkihaveacat
Copy link
Contributor

ithinkihaveacat commented Dec 13, 2016

@wanderview Sites that are able to generate unique hashed resources wouldn't really need this feature though, right?

I thought the point was to make it possible for service workers to do e.g. importScripts('https://www.gstatic.com/firebasejs/firebase-app.js'); and be able to quickly and reliably pick up changes to https://www.gstatic.com/firebasejs/firebase-app.js even if the service worker itself remained byte-for-byte identical.

If the default for all network activity related to service worker update checks becomes no-cache (as per #839 (comment)) then that's going to result in a lot of 304s for any widely deployed resource. (However, not doing this would lead to browsers potentially getting themselves into an inconsistent state #839 (comment).)

@wanderview
Copy link
Member

I thought the point was to make it possible for service workers to do e.g. importScripts('https://www.gstatic.com/firebasejs/firebase-app.js'); and be able to quickly and reliably pick up changes to https://www.gstatic.com/firebasejs/firebase-app.js even if the service worker itself remained byte-for-byte identical.

I guess I thought people typically versioned 3rd party dependencies. Allowing external dependencies to float at-will in production seems kind of crazy to me.

@ithinkihaveacat
Copy link
Contributor

I guess I thought people typically versioned 3rd party dependencies. Allowing external dependencies to float at-will in production seems kind of crazy to me.

I suppose it depends on the use case. Something like https://www.google-analytics.com/ga.js isn't versioned, and that works out fine.

https://www.hodinkee.com/OneSignalSDKWorker.js consists of one line:

importScripts('https://cdn.onesignal.com/sdks/OneSignalSDK.js');

Obviously whatever's in OneSignalSDK.js could be inlined into OneSignalSDKWorker.js (would even save a network request), but then OneSignal need to get Hodinkee to deploy a new version every time they update their SDK.

jungkees added a commit that referenced this issue Dec 15, 2016
This changes the behavior of the service worker script resource
comparison. Before this, only the main service worker script was
compared to a new version. With this change, all the imported scripts
stored in the imported scripts map as well as the main script are
inspected against the corresponding network resources (based on the
urls.)

Note:
 - Service worker's script resource map has been renamed and moved to
 service worker's script resource's imported scritps map.
 - registration's last update check time's always updated whenever the
 response is fetched from the network (regardless it's a main script or
 an imported script.)

Fixes #839.
@delapuente
Copy link
Author

Is this already implemented in Chrome or Firefox?

@wanderview
Copy link
Member

Is this already implemented in Chrome or Firefox?

Updating based on importScripts() in FF has been started, but not completed:

https://bugzilla.mozilla.org/show_bug.cgi?id=1290951

Related to this, defaulting updates to no-cache is already implemented in FF53:

https://bugzilla.mozilla.org/show_bug.cgi?id=1290944

@jakearchibald
Copy link
Contributor

@KenjiBaheux & I should email SW users to make sure big users of SW are aware of this.

@mfalken
Copy link
Member

mfalken commented Apr 4, 2017

The F2F resolution was to check importScripts in the byte-for-byte comparison; however issue #893 changed so that useCache would specify caching the importScripts by default.

poshaughnessy added a commit to SamsungInternet/dashboard that referenced this issue Jun 19, 2017
NB. Currently, updating the data resource files will not make the
service worker re-cache them. The service worker file itself will
need to be updated. However, browser are working on this. See:
w3c/ServiceWorker#839
@jungkees
Copy link
Collaborator

jungkees commented Mar 7, 2018

While working on this issue with @mattto, I found out we need to discuss about when to fetch and compare the imported classic scripts for Update. (See #1283 (comment).) Now we have two options:

  1. Fetch imported scripts during the first evaluation of the main script in Update.
  2. Fetch imported scripts (of newestWorker) before evaluating the main script.

(2) allows us to return early even before starting a worker. @jakearchibald seemed to be concerned about double-download (https://github.com/w3c/ServiceWorker/pull/1023/files#r92201798) here, but we can avoid importScripts() in the main script downloading the scripts from the network because we fill in the cache before that time.

But if the imported scripts in (2) have errors, we can't avoid running the main script and the cached scripts before catching those errors anyway.

Thoughts?

/cc @jakearchibald @wanderview @aliams @cdumez

EDIT: I tried it with (1) in #1283.

@mfalken
Copy link
Member

mfalken commented Mar 7, 2018

I responded in #1283 (comment), but to reiterate here, I'm much more concerned about needlessly starting a service worker in the common case than in the error case. I think we should avoid starting a service worker until the byte-to-byte update check (including importScripts) shows that an update is possible. Otherwise almost every navigation will start a new service worker to do an update check which will usually just be wasted.

@jungkees
Copy link
Collaborator

jungkees commented Mar 7, 2018

That's a fair point. I agree to (2). We can do this without doing a double-download.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants