Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_Update algorithm should unregister SW on 404 and 410 errors #204

Closed
jyasskin opened this issue Mar 25, 2014 · 31 comments
Closed

_Update algorithm should unregister SW on 404 and 410 errors #204

jyasskin opened this issue Mar 25, 2014 · 31 comments

Comments

@jyasskin
Copy link
Member

Currently the _Update algorithm says

  • If fetching the script fails due to the server returns a 4xx or 5xx response or equivalent, or there is a DNS error, or the connection times out, then:
    1. Reject promise with a new NetworkError.
    2. Set serviceWorkerRegistration.updatePromise to null.
    3. Abort these steps.

Both @annevk and @jakearchibald have suggested that all 4xx responses should be interpreted as unregistering the SW instead of as an update failure. AppCache only treates a 404 or 410 as deleting the AppCache, and I think some other responses like 403 may be transient, so I'm only proposing to match AppCache. If y'all think all 4xx responses should unregister instead, I won't argue.

@jakearchibald
Copy link
Contributor

👍 Happy to unregister on 404/410 in line with appcache.

@annevk
Copy link
Member

annevk commented Mar 26, 2014

The main reason I suggested 4xx was that in that case the server can be reached and is active but apparently is no longer hosting an accessible service worker in that location.

With your suggested approach HTTP-only sites with a 403 space would be more vulnerable to attacks.

@jyasskin
Copy link
Member Author

This is a proposal for HTTPS sites. Even unregistering on all 4xx errors wouldn't be enough for HTTP sites; see #199 for that.

@piranna
Copy link

piranna commented Mar 27, 2014

I would unregister for all 4xx & 5xx errors. If a user agent receive them, both the client and the server are online so it's feasable to receive the updated ServiceWorker in the near future, and during this time the ServiceWorker served resources can be available directly by the server. I would only prevent to unregister the SW after the stated 24 hours life limit if the client is offline.

@annevk
Copy link
Member

annevk commented Mar 27, 2014

@jyasskin the same reasoning applies to HTTPS I think, apart from the attacks.

@piranna no, 5xx could be because the server is undergoing maintenance or is under heavy traffic or some such. Makes no sense to unregister there.

@piranna
Copy link

piranna commented Mar 27, 2014

@annevk, I didn't take in account that situation. Could it make sense to automatically unregister on any 4xx error, and left th SW untouched until the 5xx error gets fixed so a decision can be done? I think it should fix the problems we are having here...

@annevk
Copy link
Member

annevk commented Mar 27, 2014

Yes, that follows from OP and subsequent comments.

jungkees added a commit that referenced this issue Mar 31, 2014
@slightlyoff
Copy link
Contributor

Unregistration on 4xx is broken. It is hostile to intermittent data center downtime, etc. If you need this for security, use SSL.

@annevk annevk reopened this Apr 4, 2014
@annevk
Copy link
Member

annevk commented Apr 4, 2014

Why would a data center return 4xx in that case and not 5xx?

@jyasskin
Copy link
Member Author

IIUC, restricting to SSL obsoletes this issue.

@jeffposnick
Copy link
Contributor

What if a 404/410 triggered a SW unregistration, but only on localhost origins?

The decision to close this issue made assumptions that we were dealing with a production server, using HTTPS, but this has become a recurring pain point for developers who are using SWs on a common localhost:port combination, e.g. localhost:3000 or localhost:8000. As soon as the first project that uses a SW is registered, that SW will remain in control for subsequent navigations, even if the developer switches to a project that has no SW.

See facebook/create-react-app#2438, for one example of the attempts to work around this, but that type of one-off solution doesn't scale.

@asutherland
Copy link

I think it's absolutely desirable to address this. I worry about the complexity and potential security fallout of putting the developer special-case in the heart of the spec, though. This seems like a use-case best handled by each browser's devtools. They can be more proactive with UI and heuristics than would ever make sense in the spec.

For example, an update check might take up to 24 hours to hit the network and trigger a 404/410-based uninstall if updateViaCache="all" (formerly useCache, #1107) and Cache-Control says the data is still good. But devtools can by default show a UI affordance in the URL bar that indicates the page was intercepted by a ServiceWorker against localhost, and for good measure throw up a door-hangar or panel resulting from a proactive update check that notices the SW is gone, etc.

This is also more likely to play well with devtools' affordances like Chrome's "Update on reload" checkbox option under the "Service Workers" page of the "Application" tab.

Having said that, there are header-based mitigations that tools that are running their own webserver on localhost could use to try and proactively mitigate the situation, such as is discussed in facebook/create-react-app#2438. For example, the link header could be used to install a "cleanup" service worker when the webserver gets a chance to serve the user a page because the refresh button was hit and bypassed the existing SW. I think there's also been talk of a header to let an origin request that its persistent storage be blown away?

@wanderview
Copy link
Member

Note, we have seen production servers orphaning service worker scripts on clients with a 404. We have many users continually trying to update:

https://plus.google.com/serviceworker.js

@jakearchibald
Copy link
Contributor

That's a 400 rather than a 404/410 which is a bit weird.

FWIW I'm still not convinced we made the right choice here, and would be happy to make 404/410 unregister.

I get that this may potentially trigger some false positives in event of a server glitch (#204 (comment)), but I feel these situations are rarer than issues with orphaning service worker scripts as @wanderview describes.

@slightlyoff is your view here unchanged?

@wanderview
Copy link
Member

For the orphaned production scripts would could unregister after N consecutive 404s or something. Or 404s over time period X.

That would not help with the localhost case, though.

@jakearchibald
Copy link
Contributor

Are intermittent 404s a thing that happens? Are they more common than server glitches we aren't defending against, such as serving a blank file.

Appcache unregistered on 404 and I don't recall any complaints about that, but they may have been drowned out by other complaints.

@jeffposnick
Copy link
Contributor

Re: #204 (comment), regarding whether specialized behavior for localhost might be best addressed via browsers' DevTools, that's another avenue that I'm exploring, with some FRs for Chrome:

(I haven't filed similar requests against Firefox.)

@wanderview
Copy link
Member

Are intermittent 404s a thing

I don't know. I defer to people running sites on this.

@wanderview
Copy link
Member

Note, if we do auto-unregister on 404, what do we do with data in the Cache API? Will this cause problems if a new service worker is registered on top of the old storage?

@jakearchibald
Copy link
Contributor

Note, if we do auto-unregister on 404, what do we do with data in the Cache API?

The Cache API is just part of origin storage, so I wouldn't expect the cache API to be removed.

@wanderview
Copy link
Member

The Cache API is just part of origin storage, so I wouldn't expect the cache API to be removed.

Sure, but does this solve the use case of "I tested this on localhost ages ago and I have a bunch of unexpected state on localhost messing up my localhost today"?

Still, it would be nice to fix the orphaned update case as that uses actually CPU, bandwidth, etc on real sites today.

@NekR
Copy link

NekR commented Jun 10, 2017

Cache API could be used from main thread for many things. Will browsers browser be able to detect if cache is ServiceWorker only? Unless it will be stored as cache's metadata then probably not.

Sure, but does this solve the use case of "I tested this on localhost ages ago and I have a bunch of unexpected state on localhost messing up my localhost today"?

Only if the same (hard-coded?) cache name is used, but I see how this may happen too. SW tools should save from this.

Still, it would be nice to fix the orphaned update case as that uses actually CPU, bandwidth, etc on real sites today.

Can you share stats on how often that happens with how many websites? (assuming Google Plus SW isn't the only case)

@NekR
Copy link

NekR commented Jun 10, 2017

Throwing in some other ideas:

  • Unregister on 404 only if page (on navigation)[1] doesn't call navigator.serviceWorker.register() anymore
  • Have an unregisterOnFailure option in navigator.serviceWorker.register() (I don't really like this)

[1] During a navigation, i.e. between current navigation and the next one in the same tab

@jakearchibald
Copy link
Contributor

Reopening as there's some interest.

@ro-savage
Copy link

ro-savage commented Jun 27, 2017

I'm all for unregistering the service worker to avoid issues I've encountered in Create React App.

Beyond localhost issues described by @jeffposnick. What if I create an app that has a service-worker.js, deploy it on something.com. Then later on I released a new app without service workers on something.com

As it stands my new app, even though it has never touched a service worker would have to have something to deregister old service workers, or else every user would always see the old version of the app.

If the server responds with a 404 that is quite an obvious error of not finding a file and I have never encountered a server responding with 404 when it was actually a 500 error (but yes, it could happen under certain edge-case circumstances).

Even if it does, worst case someone loses access to the site temporarily (no more offline after 1st reload). On the other hand, worst case of ignoring 404 is someone is stuck forever seeing the incorrect webapp.

@NekR
Copy link

NekR commented Jun 27, 2017

If the server responds with a 404 that is quite an obvious error of not finding a file and I have never encountered a server responding with 404 when it was actually a 500 error something..

I'm pretty sure there are such webservers, e.g. an entry is temporarily missing from database or just a glitch happened. But yes, that's probably not very common for static files such as SW file.

Even if it does, worst case someone loses access to the site temporarily (no more offline after 1st reload). On the other hand, worst case of ignoring 404 is someone is stuck forever seeing the incorrect webapp.

This sounds reasonable to me.

@jakearchibald
Copy link
Contributor

F2F: Facebook & Google servers will occasionally 404 due to bugs and do not want service workers to disappear because of this.

Service worker offers you enough tools & headers to see what's happening.

Jake promises never to bring this up again.

If we want to look at this in future, it should be in the form of an updatefailure event or some-such, that developers could use to monitor updates and unregister the service worker themselves if they get a series of 404s in a row.

@annevk
Copy link
Member

annevk commented Nov 8, 2017

This still seems somewhat sad. Even if 404 is triggered on accident, I have a hard time believing the same is true for 410.

@jakearchibald
Copy link
Contributor

Unregistering on 410 is fine, but is it useful? It doesn't fall with the "I removed the service worker and forgot about it" case. If you can put the effort into serving a 410, you can just create a service worker that unregisters itself.

@annevk
Copy link
Member

annevk commented Nov 8, 2017

A 410 seems easier to manage, but maybe Clear-Site-Data will also take care of this which is equally easy to manage.

@TomPradat
Copy link

I'm all for unregistering the service worker to avoid issues I've encountered in Create React App.

Beyond localhost issues described by @jeffposnick. What if I create an app that has a service-worker.js, deploy it on something.com. Then later on I released a new app without service workers on something.com

As it stands my new app, even though it has never touched a service worker would have to have something to deregister old service workers, or else every user would always see the old version of the app.

If the server responds with a 404 that is quite an obvious error of not finding a file and I have never encountered a server responding with 404 when it was actually a 500 error (but yes, it could happen under certain edge-case circumstances).

Even if it does, worst case someone loses access to the site temporarily (no more offline after 1st reload). On the other hand, worst case of ignoring 404 is someone is stuck forever seeing the incorrect webapp.

I can't agree more, this is likely to cause serious problems in the future. I don't think the fact that Google and Facebook don't want this is a solid argument.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests