Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve service worker script caching and update #1283

Merged
merged 4 commits into from
Mar 13, 2018
Merged
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
81 changes: 50 additions & 31 deletions docs/index.bs
Original file line number Diff line number Diff line change
Expand Up @@ -165,7 +165,9 @@ spec: webappsec-referrer-policy; urlPrefix: https://w3c.github.io/webappsec-refe

A [=/service worker=] has an associated <dfn export id="dfn-skip-waiting-flag">skip waiting flag</dfn>. Unless stated otherwise it is unset.

A [=/service worker=] has an associated <dfn export id="dfn-imported-scripts-updated-flag">imported scripts updated flag</dfn>. It is initially unset.
A [=/service worker=] has an associated <dfn export id="dfn-classic-scripts-imported-flag">classic scripts imported flag</dfn>. It is initially unset.

A [=/service worker=] has an associated <dfn export id="dfn-include-updated-resources-flag">include updated resources flag</dfn>. It is initially unset.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be "includes updated resources flag"? "Include" sounds it an action, as in something will be included, whereas "includes" sounds more like a statement.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this flag as part of the new snapshot.


A [=/service worker=] has an associated <dfn export id="dfn-set-of-event-types-to-handle">set of event types to handle</dfn> (a [=ordered set|set=]) whose [=list/item=] is an <a>event listener</a>'s event type. It is initially an empty set.

Expand Down Expand Up @@ -2027,22 +2029,28 @@ spec: webappsec-referrer-policy; urlPrefix: https://w3c.github.io/webappsec-refe
When the <dfn method for="ServiceWorkerGlobalScope" id="importscripts-method"><code>importScripts(|urls|)</code></dfn> method is called on a {{ServiceWorkerGlobalScope}} object, the user agent *must* <a>import scripts into worker global scope</a>, given this {{ServiceWorkerGlobalScope}} object and |urls|, and with the following steps to [=fetching scripts/perform the fetch=] given the [=/request=] |request|:

1. Let |serviceWorker| be |request|'s [=request/client=]'s [=environment settings object/global object=]'s [=ServiceWorkerGlobalScope/service worker=].
1. If |serviceWorker|'s <a>imported scripts updated flag</a> is unset, then:
1. Let |registration| be |serviceWorker|'s [=containing service worker registration=].
1. Set |request|'s [=service-workers mode=] to "`none`".
1. Set |request|'s [=request/cache mode=] to "<code>no-cache</code>" if any of the following are true:
* |registration|'s [=service worker registration/update via cache mode=] is "`none`".
* The [=current global object=]'s [=force bypass cache for importscripts flag=] is set.
* |registration|'s [=last update check time=] is not null and the time difference in seconds calculated by the current time minus |registration|’s [=last update check time=] is greater than 86400.
1. Let |response| be the result of <a lt="fetch">fetching</a> |request|.
1. If |response|’s <a for="response" href="https://github.com/whatwg/fetch/issues/376">cache state</a> is not "<code>local</code>", set |registration|’s [=service worker registration/last update check time=] to the current time.
1. [=Extract a MIME type=] from the |response|'s [=unsafe response=]'s [=response/header list=]. If this MIME type (ignoring parameters) is not a [=JavaScript MIME type=], return a [=network error=].
1. If |response|'s <a>unsafe response</a>'s [=response/type=] is not "<code>error</code>", and |response|'s [=response/status=] is an <a>ok status</a>, then:
1. [=map/Set=] <a>script resource map</a>[|request|'s [=request/url=]] to |response|.
1. Return |response|.
1. Else:
1. If <a>script resource map</a>[|url|] [=map/exists=], return <a>script resource map</a>[|url|].
1. Else, return a <a>network error</a>.
1. If |serviceWorker|'s [=service worker/script resource map=][|request|'s [=request/url=]] [=map/exists=], return [=service worker/script resource map=][|request|'s [=request/url=]].
1. If |serviceWorker|'s [=state=] is *installed*, *activating*, *activated*, or *redundant*, return a [=network error=].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Would it be better to say "Not parsed or installing"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Addressed.

1. Let |registration| be |serviceWorker|'s [=containing service worker registration=].
1. Set |request|'s [=service-workers mode=] to "`none`".
1. Set |request|'s [=request/cache mode=] to "<code>no-cache</code>" if any of the following are true:
* |registration|'s [=service worker registration/update via cache mode=] is "`none`".
* The [=current global object=]'s [=force bypass cache for importscripts flag=] is set.
* |registration|'s [=last update check time=] is not null and the time difference in seconds calculated by the current time minus |registration|’s [=last update check time=] is greater than 86400.
1. Let |response| be the result of <a lt="fetch">fetching</a> |request|.
1. If |response|’s <a for="response" href="https://github.com/whatwg/fetch/issues/376">cache state</a> is not "<code>local</code>", set |registration|’s [=service worker registration/last update check time=] to the current time.
1. [=Extract a MIME type=] from the |response|'s [=unsafe response=]'s [=response/header list=]. If this MIME type (ignoring parameters) is not a [=JavaScript MIME type=], return a [=network error=].
1. If |response|'s <a>unsafe response</a>'s [=response/type=] is not "<code>error</code>", and |response|'s [=response/status=] is an <a>ok status</a>, then:
1. Let |newestWorker| be the result of running [=Get Newest Worker=] with |registration|.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't newestWorker just be serviceWorker, it's the new installing worker?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the importScripts() is called during the evaluation in Update, serviceWorker ins't in the installing state yet. newestWorker here would be one of the workers of the registration. So, we're comparing the new pre-installing worker's resources with the current newest worker's resources here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, I think I saw the Run Service Worker algorithm being called in Install and thought that was the first time we are running the worker.

1. Let |resource| be null.
1. If |newestWorker| is not null, set |resource| to |newestWorker|'s [=service worker/script resource map=][|request|'s [=request/url=]], or null if it does not [=map/exist=].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by this, it makes it look like importScripts() in one service worker (serviceWorker) is affecting the state of another service worker (newestWorker), which I thought #1041 decided was wrong.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a case where serviceWorker doesn't have any stored response and hasn't gone to the installed state. (Cases other than that return early in the step 2 and 3.)

My intention here is to compare the response retrieved from the network (or HTTP cache depending on the cache mode) with the registration's newest worker's stored response, if any. I think we'd want to make it continue with Install only when importScripts() for serviceWorker has byte-differnce against newestWorker's stored resources. The service worker's include updated resources flag is used to mark this condition that is later checked in Update.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm understanding more.

I thought when we discussed this a while back (#839 (comment)), we decided the "update" service worker shouldn't run at all if the main script and imported scripts of the newest installed worker haven't changed. I.e., the update check should first fetch the main script and importScripts of the newest worker, if those haven't changed you halt the update. If there is a change you launch the new service worker. It'd be wasteful to spawn a worker just to do the importScripts then discover there was no change, and then terminate the worker before installation. But maybe we should talk about this one of the GitHub issues.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm understanding more.

I thought when we discussed this a while back (#839 (comment)), we decided the "update" service worker shouldn't run at all if the main script and imported scripts of the newest installed worker haven't changed. I.e., the update check should first fetch the main script and importScripts of the newest worker, if those haven't changed you halt the update. If there is a change you launch the new service worker. It'd be wasteful to spawn a worker just to do the importScripts then discover there was no change, and then terminate the worker before installation. But maybe we should talk about this one of the GitHub issues.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see.. We had already discussed this a while back ;). But I was worried about the comment in https://github.com/w3c/ServiceWorker/pull/1023/files#r92201798, too. And what should we do if any of the imported scripts were updated, later start a worker with them in the cache, and they contain errors. We couldn't avoid starting a work in this case anyway.

But I'm open to this option or any other smarter ways to improve performance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to spec it in a way that doesn't do a double download...

Currently in the common case a navigation most likely will not cause a separate service worker to run for the update check. This spec change will change it so that most likely one will, so the browser will often start two service workers for a navigation. That would really impact memory/performance for us.

I wouldn't worry too much about the uncommon error case causing an extra service worker to be started.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair point. I think we can also avoid a double download by populating the cache before the first execution of the main script.

1. Set |serviceWorker|'s [=service worker/include updated resources flag=] if any of the following are true:
* |newestWorker| is null.
* |resource| is null.
* |resource|'s [=response/body=] is not byte-for-byte identical with |response|'s [=response/body=].
1. [=map/Set=] |serviceWorker|'s [=service worker/script resource map=][|request|'s [=request/url=]] to |response|.
1. Set |serviceWorker|'s [=classic scripts imported flag=].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, I'm a bit confused what about the meaning of include updated resources flag and classic scripts imported flag. Could you help me understand what they are doing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The classic scripts imported flag is used to indicate the main script has some imported scripts so Update must not return early before actually checking the imported scripts even if the main scripts are byte identical. I named it classic .. because module scripts don't support importScripts() in the first place.

The job's potentially include updated resources flag is used to check if we can return early even without evaluating the script. That is, the script received from the network is byte identical and doesn't import any script. For module scripts, it checks all the responses gotten during fetching the module worker script graph.

The service worker's include updated resources flag is checked when Update couldn't determined from the earlier steps whether it could stop early.

1. Return |response|.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To aid my understanding, what exactly happens in the error response case? Does it result in an error so the installation/update fails, or does it mean we basically treat it like the importScripts() call never happened?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An error response makes it throw a "NetworkError" DOMException in fetch a classic worker-imported script step 4. That again makes Update reject the job promise in https://github.com/w3c/ServiceWorker/pull/1283/files#diff-27b79860afe28f01aed4f1f6228367faR2467. This particular behavior hasn't been changed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. That means if you do try { importScripts('fail.js'); } and catch the exception, you can later try to do importScripts('fail.js') and it'll retry again. I think that matches our implementation too.

(Acknowledged that this hasn't changed, just confirming my understanding.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if you catch the exception, it just installs it without populating the cache. When the service worker starts up later, importScript('fail.js') will get a network error again that throws and be caught again.

</section>
</section>

Expand All @@ -2069,7 +2077,7 @@ spec: webappsec-referrer-policy; urlPrefix: https://w3c.github.io/webappsec-refe
<section>
<h3 id="privacy">Privacy</h3>

[=/Service workers=] introduce new persistent storage features including <a>scope to registration map</a> (for [=/service worker registrations=] and their [=/service workers=]), [=request response list=] and <a>name to cache map</a> (for caches), and <a>script resource map</a> (for script resources). In order to protect users from any potential <a biblio data-biblio-type="informative" lt="unsanctioned-tracking">unsanctioned tracking</a> threat, these persistent storages *should* be cleared when users intend to clear them and *should* maintain and interoperate with existing user controls e.g. purging all existing persistent storages.
[=/Service workers=] introduce new persistent storage features including <a>scope to registration map</a> (for [=/service worker registrations=] and their [=/service workers=]), [=request response list=] and <a>name to cache map</a> (for caches), and [=service worker/script resource map=] (for script resources). In order to protect users from any potential <a biblio data-biblio-type="informative" lt="unsanctioned-tracking">unsanctioned tracking</a> threat, these persistent storages *should* be cleared when users intend to clear them and *should* maintain and interoperate with existing user controls e.g. purging all existing persistent storages.
</section>
</section>

Expand Down Expand Up @@ -2147,8 +2155,12 @@ spec: webappsec-referrer-policy; urlPrefix: https://w3c.github.io/webappsec-refe

A <a>job</a> has a <dfn id="dfn-job-worker-type">worker type</dfn> ("<code>classic</code>" or "<code>module</code>").

A [=job=] has a <dfn export id="dfn-job-script-resource-map">script resource map</dfn> which is an <a>ordered map</a> where the keys are [=/URLs=] and the values are [=/responses=].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only used in the update algorithm. Unless I'm missing something, it could just be a variable within the update algorithm.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I made it a variable.


A <a>job</a> has an <dfn id="dfn-job-update-via-cache-mode">update via cache mode</dfn>, which is "`imports`", "`all`", or "`none`".

A [=job=] has a <dfn id="dfn-job-potentially-include-updated-resources-flag">potentially include updated resources flag</dfn>. It is initially unset.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this is only used in the update algorithm.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I made it a variable.


A <a>job</a> has a <dfn id="dfn-job-client">client</dfn> (a [=/service worker client=]). It is initially null.

A <a>job</a> has a <dfn>referrer</dfn> (a [=/URL=] or null).
Expand Down Expand Up @@ -2418,6 +2430,11 @@ spec: webappsec-referrer-policy; urlPrefix: https://w3c.github.io/webappsec-refe
1. Else:
1. Invoke [=Reject Job Promise=] with |job| and "{{SecurityError}}" {{DOMException}}.
1. Asynchronously complete these steps with a <a>network error</a>.
1. Set |job|'s [=job/potentially include updated resources flag=] if any of the following are true:
* |newestWorker| is null.
* |newestWorker|'s [=classic scripts imported flag=] is set.
* |newestWorker|'s [=service worker/script resource map=][|request|'s [=request/url=]]'s [=response/body=] is not byte-for-byte identical with |response|'s [=response/body=].
1. Set |job|'s [=job/script resource map=][|request|'s [=request/url=]] to |response|.
1. If |response|'s <a for="response" href="https://github.com/whatwg/fetch/issues/376">cache state</a> is not "<code>local</code>", set |registration|'s <a>last update check time</a> to the current time.

Issue: The response's cache state concept had been removed from fetch. The fetch issue <a href="https://github.com/whatwg/fetch/issues/376">#376</a> tracks the request to restore the concept or add some similar way to check this state.
Expand All @@ -2434,20 +2451,23 @@ spec: webappsec-referrer-policy; urlPrefix: https://w3c.github.io/webappsec-refe
1. Invoke <a>Finish Job</a> with |job| and abort these steps.

Else, continue the rest of these steps after the algorithm's asynchronous completion, with |script| being the asynchronous completion value.

1. If |newestWorker| is not null, |newestWorker|'s [=service worker/script url=] [=url/equals=] |job|'s [=job/script url=], and |script|'s [=source text=] is a byte-for-byte match with |newestWorker|'s [=script resource=]'s [=source text=], if |script| is a [=classic script=], and |script|'s [=module script/module record=]'s \[[ECMAScriptCode]] is a byte-for-byte match with |newestWorker|'s [=script resource=]'s [=module script/module record=]'s \[[ECMAScriptCode]] otherwise, then:
1. If |job|'s [=job/potentially include updated resources flag=] is unset, then:
1. Invoke [=Resolve Job Promise=] with |job| and |registration|.
1. Invoke [=Finish Job=] with |job| and abort these steps.
1. Let |worker| be a new [=/service worker=].
1. Set |worker|'s [=service worker/script url=] to |job|'s [=job/script url=], |worker|'s <a>script resource</a> to |script|, and |worker|'s [=service worker/type=] to |job|'s <a>worker type</a>.
1. [=map/For each=] |url| → |response| of |job|'s [=job/script resource map=]:
1. Set |worker|'s [=service worker/script resource map=][|url|] to |response|.
1. Set |worker|'s <a>script resource</a>'s <a>HTTPS state</a> to |httpsState|.
1. Set |worker|'s <a>script resource</a>'s [=script resource/referrer policy=] to |referrerPolicy|.
1. Invoke <a>Run Service Worker</a> algorithm given |worker|, and with the *force bypass cache for importscripts flag* set if |job|'s [=job/force bypass cache flag=] is set.
1. If an uncaught runtime script error occurs during the above step, then:
1. Invoke [=Reject Job Promise=] with |job| and `TypeError`.
1. If |newestWorker| is null, invoke <a>Clear Registration</a> algorithm passing |registration| as its argument.
1. Invoke <a>Finish Job</a> with |job| and abort these steps.
1. Else:
1. Let |worker| be a new [=/service worker=].
1. Set |worker|'s [=service worker/script url=] to |job|'s [=job/script url=], |worker|'s <a>script resource</a> to |script|, and |worker|'s [=service worker/type=] to |job|'s <a>worker type</a>.
1. Set |worker|'s <a>script resource</a>'s <a>HTTPS state</a> to |httpsState|.
1. Set |worker|'s <a>script resource</a>'s [=script resource/referrer policy=] to |referrerPolicy|.
1. Invoke <a>Run Service Worker</a> algorithm given |worker|, and with the *force bypass cache for importscripts flag* set if |job|'s [=job/force bypass cache flag=] is set.
1. If an uncaught runtime script error occurs during the above step, then:
1. Invoke [=Reject Job Promise=] with |job| and `TypeError`.
1. If |newestWorker| is null, invoke <a>Clear Registration</a> algorithm passing |registration| as its argument.
1. Invoke <a>Finish Job</a> with |job| and abort these steps.
1. If |worker|'s [=classic scripts imported flag=] is set, and |worker|'s [=service worker/include updated resources flag=] is unset, then:
1. Invoke [=Resolve Job Promise=] with |job| and |registration|.
1. Invoke [=Finish Job=] with |job| and abort these steps.
1. Invoke <a>Install</a> algorithm with |job|, |worker|, and |registration| as its arguments.
</section>

Expand Down Expand Up @@ -2509,7 +2529,6 @@ spec: webappsec-referrer-policy; urlPrefix: https://w3c.github.io/webappsec-refe
1. Run the <a>Update Registration State</a> algorithm passing |registration|, "<code>installing</code>" and null as the arguments.
1. If |newestWorker| is null, invoke <a>Clear Registration</a> algorithm passing |registration| as its argument.
1. Invoke <a>Finish Job</a> with |job| and abort these steps.
1. Set |registration|'s <a>installing worker</a>'s <a>imported scripts updated flag</a>.
1. If |registration|'s <a>waiting worker</a> is not null, then:
1. [=Terminate Service Worker|Terminate=] |registration|'s [=waiting worker=].
1. Run the [=Update Worker State=] algorithm passing |registration|'s [=waiting worker=] and *redundant* as the arguments.
Expand Down