Include imported scripts to byte-check #1023

jungkees · 2016-12-12T11:52:18Z

This changes the behavior of the service worker script resource
comparison. Before this, only the main service worker script was
compared to a new version. With this change, all the imported scripts
stored in the imported scripts map as well as the main script are
inspected against the corresponding network resources (based on the
urls.)

Note:

Service worker's script resource map has been renamed and moved to
service worker's script resource's imported scritps map.
registration's last update check time's always updated whenever the
response is fetched from the network (regardless it's a main script or
an imported script.)

Fixes #839.

jungkees · 2016-12-12T11:53:30Z

/cc @wanderview @mattto @annevk

jakearchibald · 2016-12-13T15:51:03Z

docs/index.bs

+      : Output
+      :: True or false, a boolean
+
+      1. If |sourceScript| and |targetScript| are not both [=classic scripts=] or not both [=module scripts=], return false.


Oh wow good catch

jakearchibald · 2016-12-13T16:00:14Z

docs/index.bs

+              1. If |targetResponse|'s [=response/type=] is "<code>error</code>", or |targetResponse|'s [=response/status=] is not an [=ok status=], return false.
+              1. If the result of [=UTF-8 decoding=] |response|'s [=response/body=] is not a byte-for-byte match with the result of [=UTF-8 decoding=] |targetResponse|'s [=response/body=], return false.
+      1. If |sourceScript| is a [=module script=], then:
+          1. If |sourceScript|’s [=module script/module record=]'s \[[ECMAScriptCode]] is not a byte-for-byte match with |targetScript|’s [=module script/module record=]'s \[[ECMAScriptCode]], return false.


Shouldn't we check imported scripts too?

importScripts() is not supported in module scripts: https://html.spec.whatwg.org/#import-scripts-into-worker-global-scope step 1.

Yeah, I meant imported as in import whatever from "./foo.js"

I thought those urls are fetched, parsed and stored as part of the module record, but maybe I'm wrong. I noticed an environment settings object has a module map which is referenced throughout traversing the module script graph. I'll check the algorithms again.

Oh does it flatten it somehow? I couldn't figure it out 😞

@jakearchibald, I think you're right. I can't find any step that flattens the entire imported assets. It seems each module script has module record that has the parsed source text and info to the required modules (record's [[RequestedModules]] in ECMAScript language.) Getting to ModuleEvaluation() when running a module script, it evaluates all the related module scripts recursively finding the dependencies.

And I think we should compare all the module scripts stored in an environment settings object's module map. I'd want to confirm this with @domenic before updating this part.

@domenic, SW intends to compare all the script assets including the main script and the imported scripts for both classic and module scripts. For classic scripts, SW has it's own imported scripts map that's used for this byte-by-byte check. But for module scripts, I think I misunderstood that the main script's module record has a flattened asset that include all the imported scripts in the graph. I'd like to check with you whether this is wrong, and if so, iterating an environment setting's object's module map and compare the source texts of all the module scripts in the map would be a right way for our purpose?

Hmm, [[ECMAScriptCode]] appears to be a "parse result" according to the spec, but the spec doesn't define what that is 😢 . But since it's the result of parsing it doesn't feel appropriate for a byte-for-byte check.

The way ModuleDeclarationInstantiation performs "For each ImportEntry" makes me think that there isn't any flattening going on. Maybe HTML's module map is an easier way in, since it's at least a flattened view of the imported scripts - although this may contain dynamically imported items that we're not interested in.

If I'm right that [[ECMAScriptCode]] isn't the source text, and source text is only available on classic scripts, we need to store this ourselves.

"Perform the fetch" is called for the top level script, but also each module fetch, so we could use this hook to store our own collection of source texts for byte-by-byte comparison.

@domenic does the above sound right at all? Our intent is to refetch the top-level script and all its imports, and see if any are byte-different to the original.

@jungkees whoa, I replied before seeing your message. Weird how similar they are!

Right, there is no flattened representation of the graph. I think checking everything in the module map is more correct.

I guess [[ECMAScriptCode]] is indeed not correct :-/. However, what we can do is update HTML to move "source text" from classic scripts only to the base script type. We'd probably want to add a note saying something like "source text is only used for classic scripts in this specification, but the Service Workers specification needs to store it for reasons". I don't think you all should create a separate parallel registry :).

That sounds good. Moving the source text to the base script type will help us use the source text of scripts for both classic and module script cases. I'll spec it to checking everything in the module map with that change.

Another thing that needs a solution is the lifetime of the scripts and the map for SW. With other web workers, the lifetime of the scripts is tied to the WorkerGlobalScope object, so it doesn't store the main script and only stores the module scripts in the module map on the global. But with SW, the lifetime of the scripts outlives those globals, so it stores the main script and the imported scripts on a service worker concept rather than on the SW global. Because of this difference, I think we need to discuss around how to make the fetch a classic worker script and the fetch a module worker script graph work more correctly with SW. There's an opened issue for this: #1013.

jakearchibald · 2016-12-13T16:09:01Z

docs/index.bs

+          1. Let |sourceMap| be |sourceScript|'s [=script resource/imported scripts map=].
+          1. [=map/For each=] |url| → |response| of |sourceMap|:
+              1. Let |request| be a new [=/request=] whose [=request/url=] is |url|, [=request/client=] is |job|'s [=job/client=], [=request/type=] is "<code>script</code>", [=request/destination=] is "<code>script</code>", [=request/parser metadata=] is "<code>not parser-inserted</code>", [=request/synchronous flag=] is set, and whose [=request/use-URL-credentials flag=] is set.
+              1. Set |request|'s [=request/cache mode=] to "<code>no-cache</code>" if any of the following are true:


This is going to cause a double-download. I'll explain this in another comment…

jakearchibald · 2016-12-13T16:14:36Z

When the imported scripts are checked, we're doing it with no-cache, but then later when we run the script it'll execute importScripts and once again request the resources no-cache.

I hadn't thought about this until reading this PR, but I think it's something we need to solve.

"Check If Service Worker Resources Are Identical" could return a boolean as it does now, but also a map of all the scripts it fetched. Then, importScripts could check this cache before going to the network. This cache can be deleted once install completes - or whenever we set the "imported scripts updated flag" flag, which discards imports that were used in the previous version, but not this one.

jungkees · 2016-12-14T02:29:59Z

When the imported scripts are checked, we're doing it with no-cache, but then later when we run the script it'll execute importScripts and once again request the resources no-cache.

Right. I concerned about this too, but it was still better than invoking Run Service Worker for every navigation.

"Check If Service Worker Resources Are Identical" could return a boolean as it does now, but also a map of all the scripts it fetched.

It seems a bit irony that we should introduce this kind of temporary cache to support bypassing HTTP cache. But yeah I think this is definitely a way to avoid the double-fetch of the seemingly identical resource (unless the content is dynamically changing in the server.) I'd like to also hear opinions from @wanderview and @mattto.

jakearchibald · 2016-12-14T10:35:09Z

It seems a bit irony that we should introduce this kind of temporary cache to support bypassing HTTP cache

Hah, yeah. But right now we fetch the main script, do the comparison, then reuse that script we fetched. It seems weird to reuse for the main script but refetch imported scripts. Also, this double-fetch means the scripts we byte-check could be different to the ones we end up running, which seems weird.

wanderview · 2016-12-14T14:14:40Z

When the imported scripts are checked, we're doing it with no-cache, but then later when we run the script it'll execute importScripts and once again request the resources no-cache.

How about:

The byte-for-byte check could populate a "script resource map" like importScripts() does now.
Then the that "script resource map" can be set as the SW's map.
The importScripts() algorithm could then check the map before going to network/http cache.
After script eval or install would need something to remove resources put in the map in step (1), but no longer used by the new service worker script. This is annoying, but not too with some book keeping.

Note, step (3) would also fix any issues where we go to network/http cache multiple times if script eval calls importScripts() multiple times.

jungkees · 2016-12-15T04:23:25Z

I think @jakearchibald and @wanderview largely suggested the same thing. Basically byte-check will populate a temporary map if the fetched resource is new. importScripts() will consult this map (to populate SW's script resource's imported scripts map*) before fetching through HTTP cache/network until the imported scripts updated flag is set. Once the flag is set, importScripts() always returns the results from the imported scripts map.

I think if we define and maintain the temporary map on a service worker registration, we can just add and update entries throughout the registration's lifetime.

I'll add this change to the PR.

(*) This is the SW's script resource map in the current spec. I suggested changing the name and the location of the concept as such through this PR for clarity and a better interface to other algorithms.

This changes the behavior of the service worker script resource comparison. Before this, only the main service worker script was compared to a new version. With this change, all the imported scripts stored in the imported scripts map as well as the main script are inspected against the corresponding network resources (based on the urls.) Note: - Service worker's script resource map has been renamed and moved to service worker's script resource's imported scritps map. - registration's last update check time's always updated whenever the response is fetched from the network (regardless it's a main script or an imported script.) Fixes #839.

This change includes/considers the following: - Include imported scripts to byte-check (for classic scripts). - Compare responses' body instead of source text as per whatwg/html#3316. - Handle duplicate importScripts() as per #1041. - Replace *imported scripts updated flag* referenced in importScripts() by using service worker's state item. - Have Update's perform the fetch steps cover module scripts. - Avoid dobule-download of imported scripts pointed out in #1023 (comment). This change basically makes it check out if the main script resource is identical to the existing resource. If so, it returns; otherwise, it creates a new service worker and evalute it to check out if any imported scripts are changed. It continues with Install only when any of the resources has been changed. With the change, importScripts() returns resources from the cache for any duplicated requests including the request for the main script. Fixes #1041, #1212, #1023.

This change includes/considers the following: - Include imported scripts to byte-check (for classic scripts). - Compare responses' body instead of source text as per whatwg/html#3316. - Handle duplicate importScripts() as per #1041. - Remove imported scripts updated flag referenced in importScripts() by using service worker's state item instead. - Have Update's perform the fetch steps cover module scripts. - Confirm dobule-download of imported scripts pointed out in #1023 (comment) never happens; importing a script again always gets the result from the cache. This change basically gets the imported scripts included to the byte-check for updates. Update continues with Install if any of the (main or imported) resources has been changed. This change also makes any duplicated requests (including the main script) from importScripts() return the result from the cache. Fixes #839, #1041, #1212, #1023.

jungkees · 2018-03-13T08:49:48Z

#1283 obsoleted this PR.

jungkees requested review from mkruisselbrink and jakearchibald December 12, 2016 11:52

jakearchibald reviewed Dec 13, 2016

View reviewed changes

jakearchibald requested changes Dec 13, 2016

View reviewed changes

jungkees force-pushed the include-imported-scripts-for-byte-check branch from 7cd7668 to 548e59a Compare December 15, 2016 05:38

jeffposnick mentioned this pull request Jan 27, 2017

Move PrecacheConfig to external file GoogleChromeLabs/sw-precache#236

Closed

jungkees mentioned this pull request Oct 20, 2017

The classic script's source text isn't defined in HTML any more #1212

Closed

jungkees mentioned this pull request Mar 5, 2018

Improve service worker script caching and update #1283

Merged

jungkees closed this Mar 13, 2018

jungkees deleted the include-imported-scripts-for-byte-check branch March 13, 2018 08:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Include imported scripts to byte-check #1023

Include imported scripts to byte-check #1023

jungkees commented Dec 12, 2016

jungkees commented Dec 12, 2016

jakearchibald Dec 13, 2016

jakearchibald Dec 13, 2016

jungkees Dec 14, 2016 •

edited

Loading

jakearchibald Dec 14, 2016

jungkees Dec 14, 2016

jakearchibald Dec 14, 2016

jungkees Dec 14, 2016

jakearchibald Dec 14, 2016

jakearchibald Dec 14, 2016

domenic Dec 15, 2016

jungkees Dec 16, 2016

jakearchibald Dec 13, 2016

jakearchibald commented Dec 13, 2016

jungkees commented Dec 14, 2016

jakearchibald commented Dec 14, 2016

wanderview commented Dec 14, 2016

jungkees commented Dec 15, 2016 •

edited

Loading

jungkees commented Mar 13, 2018

Include imported scripts to byte-check #1023

Include imported scripts to byte-check #1023

Conversation

jungkees commented Dec 12, 2016

jungkees commented Dec 12, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jungkees Dec 14, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jakearchibald commented Dec 13, 2016

jungkees commented Dec 14, 2016

jakearchibald commented Dec 14, 2016

wanderview commented Dec 14, 2016

jungkees commented Dec 15, 2016 • edited Loading

jungkees commented Mar 13, 2018

jungkees Dec 14, 2016 •

edited

Loading

jungkees commented Dec 15, 2016 •

edited

Loading