Define the "Preload Cache" #590
What preload cache behavior is desired when |
I think we should just pick something and make sure it's tested. If we can get away with not creating a new global for |
From a developer ergonomics perspective, it would be nice if that wasn't the case. As it stands, there are subtle (potentially) observable differences in behavior between pushing and preloading a resource — one ends up in h2 push cache, other in renderer's memory cache, at least for Chrome. It would be nice to normalize all this, for everyone's sanity.. if we can.
Given that, afaik, every implementation has one. Yes? Which is what we're after here.. :) |
Gecko at least does not have a memory cache like chromium. We have things like image cache, but that is defined in the html spec. From my perspective the thing I want to know about these caches is: Do they sit above or below the network stack where service worker interception occurs? SW sits above the http cache and below the image cache in the spec today. |
Wanted to know if people expect that To give more context Chromium's current impl is like this: preload cache sits next to a memory cache, i.e. it's above the network stack and also above SW. Prefetch basically just puts things in HTTP cache (and never put them in memory cache or |
@wanderview wrote:
The H/2 PUSH cache sits above the network and above the HTTP cache, below SW. |
I think it's conceptually below the HTTP cache at least as implemented today (as resources don't get committed to the HTTP cache until they are "claimed" from the H2 push cache) |
Looking back at this, I think this work was simplified by @yutakahirano's recent(ish) refactoring of Chromium's implementation, which split the preload cache from the memory cache. |
Oh, I missed that the memory cache and the preload cache got separated. It seemed kinda preferable to me if all these mechanisms would end up in the same cache. I think at least that in Firefox that's what we're planning to use for |
I believe WebKit does the same thing. |
whatwg/html#154 is relevant here. |
The memory cache in blink is shared across documents but I think the preload cache should be per-document. I expect more predictability for the preload cache than the memory cache, and that's why we have the separate caches. |
Yutaka makes a good point. Our memory cache is a map of weak references, while our preload cache has strong references. This seems pretty reasonable to me. |
I think separation is better. In Gecko, there is a concept of resource memory cache, which each resource loader implements on its own (= Memory Cache). There is also a concept of sharing e.g. stylesheets among documents, this is also part of this Memory Cache concept. This is independent of preloads. Then, each DOM document instance keeps a strong map of preloads that consuming tags can look for and consume (and remove from the map). In reality, a preload creates an entry in the Memory Cache (in resource loaders), because preload in Gecko is nothing else than a speculative load with just a flag for higher priority. The map in the document (= Preload Cache) is there to have a central spot to look at when |
Are they fully separate? If you have |
@yutakahirano does per-document mean that a worker (whatever the type) does not have access to preloaded resources? |
Yes. In Blink, there is one memory cache in a renderer process, and only the main thread can use it. That means workers don't have access to the memory cache. There is one preload cache for each environment settings object. That means a preloaded resource is only matched in the preload cache with a request initiated by the same document. Please note that the preloaded resource can be matched in the memory cache with a request initiated by a different document - but the matching criteria (e.g., |
I would like to suggest that "preload" is not (or shouldn't be) defined as a cache at all, or at least not a cache in the way we usually use the word: an ephemeral storage with a size limit and something like an LRU mechanism. Rather, I'm proposing this alternative, which seems roughly equivalent to how @mayhemer describes the Gecko implementation here:
It puts the preload "cache" above the HTTP and SW caches, but below the resource-specific memory caches. This would make it easier to clarify how preloads should behave when the |
I agree that the "preload cache" doesn't need an eviction policy, size limits, etc. I'm not sure I see the benefits of defining that cache above Fetch. It seems like it would significantly increase the room for mistakes on the part of the different Fetch callers. I'm similarly not sure we need to evict preloads from the cache if their corresponding Also, I think it's worthwhile to also think of generalizing the "list of available images" from HTML as a generic cache for all resource types, which is how it's implemented in at least 2 engines. While I wouldn't want to couple both efforts, it'd be good to have a holistic high-level design to how they'd both work. |
I meant above fetch conceptually, like above what fetch currently does. Though I think we anyway need to add a layer that's equivalent to the browser engines' "resource loader" concepts, which sits between the individual resources and the current fetch, and does things like report resource timing. I think that preload handling should be done at that layer, as unlike fetch it has a concept of the document and it would be easier to create 1:1 mappings with a link element there rather than create an elaborate new API and storage in FETCH.
Allowing the applications to free memory if it preloaded a lot of resources that are no longer needed. But it's sort of a "side" case. Also, forcing the browser to make another request. If you want to preload a resource that has
I like the idea, but I think it's totally separated from the preload list. Preload (IMO) should be roughly equivalent to asking the document whether it currently has a |
Yeah, that's the question really: Do preload fetches go via the preload cache? The answer isn't clear to me. |
True, maybe the simplest would be to define it as reading from the same list. The trouble with this whole approach is that it makes preloads cancel their URLs' |
I'd love to get the opinions of @pmeenan, @yutakahirano, @ddragana, @achristensen07 and @cdumez on this. I suspect it'd require some implementation alignment, so it'd be great to know there's willingness for that. |
I'm happy to help work on aligning implementations to make it easier to reason about and more predictable. The no-cache/no-store semantics and multiple accesses in particular feel like the hairiest part to get right (multiple preloads of the same URL and if they create new references or not complicating the logic somewhat). Treating it like a one-time key with a strong reference to the link element vs a key-value cache that de-dupes multiple references. |
One thing that might make sense here is to make use of Something like:
This allows fine-grained control over the cache both from the server and from the client. |
I don't think we should tie preload to any specific cache headers. That doesn't seem web compatible, nor what we'd want. |
In a way current preload is not compatible with existing cache headers - preloading something that has I believe that if we don't tie preload to cache headers at all, then we should say that it trumps cache headers - not just for the "first load" - meaning that if you preloaded something it stays in the preload list regardless of whether its content on the server might have changed, and is treated as if it was immutable (during the lifetime of the |
This test is meant to support the discussion in whatwg/fetch#590. The test accesses a URL that returns an integer that gets incremented with each request (starting with 0). The test preloads it once, and then loads it twice. In Firefox, the test returns 0,0. In Chrome, the test returns 0,1 (preload is used for one request). In Safari, the test returns 1,2 (preload makes an unused request).
I added a test that shows the problem here. For a response that returns an integer that increments with each request, fetch (after preload) would return different values for the second request in Firefox (0), Chrome (1) and Safari (2). I believe that one of the goals of this effort is to make sure we can set the expected results for that test (which are currently not specified) :) |
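To make the three outcomes easier to compare, here is a minimal TypeScript model of the counter test. It is only a sketch: the strategy names and `runCounterTest` are hypothetical labels for the observed behaviors, not descriptions of how any engine is actually implemented.

```typescript
// Hypothetical labels for the three observed behaviors; not engine internals.
type Strategy = "reuse-for-all" | "consume-once" | "ignore-preload";

function runCounterTest(strategy: Strategy): [number, number] {
  let counter = 0;
  // Stand-in for the server: returns the current count, then increments it.
  const request = (): number => counter++;

  // The preload always goes to the network first and receives 0.
  const preloaded = request();
  let preloadAvailable = strategy !== "ignore-preload";

  const load = (): number => {
    if (preloadAvailable) {
      // In the consume-once model the preloaded response serves one load only.
      if (strategy === "consume-once") preloadAvailable = false;
      return preloaded;
    }
    return request();
  };

  // The test loads the resource twice after the preload.
  return [load(), load()];
}

console.log(runCounterTest("reuse-for-all"));  // [0, 0], the result reported for Firefox
console.log(runCounterTest("consume-once"));   // [0, 1], the result reported for Chrome
console.log(runCounterTest("ignore-preload")); // [1, 2], the result reported for Safari
```

Specifying the preload cache amounts to picking which of these rows the test should expect.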
I created a table summarizing the current browser behavior, based on this test. The test fetches different resources with different response types (Cache enabled/disabled or 404 error).
It's a bit difficult to tell which of this comes from behavior of preload vs. from behavior of the HTTP cache, but that difficulty is currently passed on to web developers - when browsers deal with cache so differently for simple use cases, it's difficult to understand how to optimize a website.
I think I'm most surprised by the "Once" result for Chrome with a valid image with cache enabled (and somewhat surprised that different content types behave differently given the fetch path in Chrome). I'd have expected the resource to land in the disk cache and be re-used (though I guess that's the point of this discussion, to get the behaviors to make sense). The "multiple" results from no-cache feel like they will likely break developer expectations. |
I think the "multiple" results for no-cache for images at least come from https://html.spec.whatwg.org/#the-list-of-available-images, and I think generally they are needed for compat (e.g., different CSS image loads from a style change should result in the same image). For stylesheets in the same document it's also long-standing behavior of Gecko at least. |
Yea, that makes sense to me. I'm surprised by the "once" behavior in Chrome though; it seems like preload does something extra that counteracts that memory cache. Note also that the tests activate many moving parts, so there could always be some fragile mistake in them (those fragile mistakes unfortunately also happen in websites that try to take advantage of preload). |
I updated the table to include comparison between load and preload, and also fix some issues with the test. |
After the TPAC conversation, this is the rough definition of preload cache I propose:
This definition totally separates preloads from the different resource caches or any type-specific behavior, though implementations are welcome to further optimize. Loading once and before the load event reduces issues with cache headers and reloading due to errors, and focuses preload on "loading something before it's used". The tests will ensure that there are no extra network fetches (e.g., in the case of invalid images), but they will allow implementations to make fewer fetches. |
A document keeps a list of preloaded resources, with a request and response for each. A preloaded resource is a result of <link rel=preload>. When consumed (from the FETCH algorithm), the response is reused if the request matches all relevant parameters, and removed from the store. When the document is fully loaded ("load" event), the store is cleared. See whatwg/fetch#590
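The store described in that commit message can be sketched roughly as follows. This is a minimal illustration, not spec text: `PreloadStore`, `PreloadKey`, and `StoredResponse` are hypothetical names, and the choice of URL, destination, mode, and credentials as the "relevant parameters" is an assumption made for the example.

```typescript
// Hypothetical shapes; the real set of matching parameters is up to the spec.
interface PreloadKey {
  url: string;
  destination: string;
  mode: string;
  credentials: string;
}

interface StoredResponse {
  status: number;
  body: string;
}

// One instance per document: no size limit, no eviction policy.
class PreloadStore {
  private entries = new Map<string, StoredResponse>();

  private static keyOf(k: PreloadKey): string {
    return JSON.stringify([k.url, k.destination, k.mode, k.credentials]);
  }

  // Called when a <link rel=preload> finishes loading.
  add(key: PreloadKey, response: StoredResponse): void {
    this.entries.set(PreloadStore.keyOf(key), response);
  }

  // Called from fetch: a match is handed over once and removed from the store.
  consume(key: PreloadKey): StoredResponse | undefined {
    const id = PreloadStore.keyOf(key);
    const hit = this.entries.get(id);
    if (hit !== undefined) this.entries.delete(id);
    return hit;
  }

  // Called when the document fires its "load" event.
  clear(): void {
    this.entries.clear();
  }
}
```

Because a consumed entry is removed, a second request for the same URL falls through to the normal fetch path, which is where cache headers apply as usual.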
A document keeps a list of preloaded resources, each with relevant parameters from the request, and the response once available. Once a <link rel=preload> element starts fetching a resource, that entry is added, and once the response is fully loaded, the fetch consuming the resource receives the response. See whatwg/fetch#590
Before any particular fetch steps are performed, see if there is a matching request already in the preload store and consume it. This is called from the main fetch to avoid race conditions. Depends on whatwg/html#7260, and together they fix #590. Tests: web-platform-tests/wpt#31539.
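The later revision, where an entry exists as soon as the preload starts and the consuming fetch receives the response once it is available, might look like the sketch below. All names (`PendingPreloads`, `Fetcher`) are hypothetical, matching is reduced to the URL alone for brevity, and ignoring a repeated preload of the same URL is an assumption rather than something the thread settles.

```typescript
// Hypothetical stand-in for the network layer.
type Fetcher = (url: string) => Promise<string>;

class PendingPreloads {
  private pending = new Map<string, Promise<string>>();
  private network: Fetcher;

  constructor(network: Fetcher) {
    this.network = network;
  }

  // The entry is added when the preload starts, before any response exists.
  // Assumption: a second preload of the same URL does not start a new request.
  preload(url: string): void {
    if (!this.pending.has(url)) {
      this.pending.set(url, this.network(url));
    }
  }

  // Checked before any other fetch step: a matching preload is consumed,
  // and the caller waits on the same in-flight response.
  fetch(url: string): Promise<string> {
    const hit = this.pending.get(url);
    if (hit !== undefined) {
      this.pending.delete(url);
      return hit;
    }
    return this.network(url);
  }
}
```

Storing the promise rather than the finished response means a fetch that arrives while the preload is still in flight does not trigger a second network request.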
We need to define the preload cache, as currently it is not defined and different implementations are doing observably different things.
This issue was opened on the Preload spec, but as the cache would mostly sit inside Fetch and it's not clear what action would be needed on the preload spec side, I'm "moving" it here.
Related previous discussion here was at #354 where @jakearchibald made a proposal to address this. I open this as a separate issue as I think the H2 push cache and the preload cache are inherently different things in different layers.
I think that anything we define here should, to some extent, look at how the different implementations tackle this today. The logic that they apply for this (e.g. in Blink or in WebKit) seems a bit complex, but was created for the "memory cache" case. We need to think if it can be safely simplified for the preload case. (and if memory cache itself should be standardized)