support efficient offline redirects #1457

Open
wanderview opened this issue Aug 13, 2019 · 18 comments

Comments

@wanderview
Member

From talking with a number of sites, it seems we don't currently have a great way of handling offline redirects. In this use case a site may have:

vanity.com => (redirect to) => actual.com

If actual.com uses a service worker to provide offline support it has no way to make the vanity.com redirect function offline.

Of course, it is possible to register a service worker on vanity.com with a FetchEvent handler that provides the offline redirect, but that is very heavyweight. It not only requires spinning up a worker thread to handle the event, but in browsers that implement site isolation it also requires creating a new process.

The problem extends further, because some sites I've talked to actually have multiple redirects in a chain that they have to deal with.
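
As a rough illustration of the chained case, here is a minimal sketch of collapsing a chain of redirects into its final target, assuming the site knows its own redirect map ahead of time. The hostnames, the `REDIRECTS` map, and the `resolveRedirectChain` helper are all hypothetical; none of this is an existing service worker API.

```javascript
// Hypothetical redirect map: each key redirects to its value.
const REDIRECTS = new Map([
  ['https://vanity.com/', 'https://www.vanity.com/'],
  ['https://www.vanity.com/', 'https://actual.com/'],
]);

// Follow the map until reaching a URL that has no further redirect.
// Returns null if the chain loops or needs more than maxHops redirects.
function resolveRedirectChain(startUrl, redirects = REDIRECTS, maxHops = 20) {
  const seen = new Set();
  let current = new URL(startUrl).href; // normalize, e.g. add the trailing slash
  for (let hops = 0; hops <= maxHops; hops++) {
    if (seen.has(current)) return null; // redirect loop
    seen.add(current);
    const next = redirects.get(current);
    if (next === undefined) return current; // end of the chain
    current = new URL(next, current).href; // allow relative targets
  }
  return null; // too many hops
}
```

A vanity.com service worker could then answer a navigation with a single synthetic redirect to the resolved URL instead of replaying every hop.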

A comprehensive declarative API like the one proposed in #1373 would be one way to solve this. It would allow the service worker on vanity.com to declare the redirect without requiring a FetchEvent to be fired.

I also wonder if there is some way to address this in a smaller API via something like #1454.

I wanted to file this as a separate issue, though, just to get the use case on people's minds.

@asutherland

Have the sites indicated what problems they're running into with HTTP 301 redirects? (Aggressive HTTP cache eviction? It's not actually one URL but instead a whole family of URLs? Concern about ability to later re-sell the domain and so they don't actually want a truly permanent redirect? Actually want the vanity domain to be the domain the user sees but don't want to actually serve off the domain?)

Because one advantage of something like a 301 redirect is that it can be propagated back towards the input method, for example Firefox's awesomebar, so that URLs can be updated and the redirect chain doesn't need to be followed on every visit.

@wanderview
Member Author

They use normal redirects today, but the problem is HTTP cache eviction. Users often type vanity.com while offline and get a broken tab instead of being redirected to the offline assets on actual.com.

For the sites I've spoken to the vanity.com service worker solution would still result in a 301 response, just synthetically instead of from the network stack. In theory browsers could still integrate that into things like awesomebar.

@annevk
Member

annevk commented Aug 15, 2019

If they're indeed 301s it seems the browser could take better care of keeping these around if the target is a service worker. Probably have to be somewhat careful that it continues to behave identically (there've been a number of problems with special redirects over the years), but other than that it seems fine.

@wanderview
Member Author

wanderview commented Aug 15, 2019

They may be temporary redirects to give them the flexibility to change the shape of actual.com in the future. I would have to check.

But still, my impression was that there is no guarantee even 301 permanent redirects remain cached; they can still be evicted. Stack Overflow suggests Chrome and Firefox set no expiry, but the entries will still get evicted to make room for new ones:

https://stackoverflow.com/questions/9130422/how-long-do-browsers-cache-http-301s

@wanderview
Member Author

Also, I would argue "enable offline support" and "permanently redirect URL A to B for all time" are semantically different things. We should not require permanent redirects without the flexibility to revert them in order to support offline IMO.

@annevk
Member

annevk commented Aug 15, 2019

It might well be that we need something else for 302/303/307, but what's the problem with what I mentioned for 301/308?

@wanderview
Member Author

Are you saying the HTTP cache should be spec'd to never evict 301/308 even if it has hit its configured storage limit? That seems impractical to me. Maybe we could make it less likely to evict, but it seems like there is always some point where eviction could happen, and in a way that is separate from origin storage. That still leaves us with an unreliable offline experience for sites with a redirect flow.

@annevk
Member

annevk commented Aug 15, 2019

Something to the effect that if the user entered them in the address bar and they redirected to a service worker controlled resource, you add them to a cache tied to the lifetime of that service worker (or registration or some such).
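
To make that suggestion concrete, here is a minimal in-memory model of a redirect store whose entries live and die with a service worker registration. It is only a sketch: `RegistrationRedirectCache` and its methods are hypothetical names, a real browser would persist this and hook it into navigation, and the scope strings are illustrative.

```javascript
// Model of a redirect store keyed by the registration that controls the target.
class RegistrationRedirectCache {
  constructor() {
    // scope URL -> Map(source URL -> target URL)
    this.byScope = new Map();
  }

  // Called when a user-entered URL redirected to targetUrl.
  // controllingScope is the scope of the registration controlling the
  // target, or null when no service worker controls it.
  record(sourceUrl, targetUrl, controllingScope) {
    if (controllingScope === null) return false; // leave it to the HTTP cache
    if (!this.byScope.has(controllingScope)) {
      this.byScope.set(controllingScope, new Map());
    }
    this.byScope.get(controllingScope).set(sourceUrl, targetUrl);
    return true;
  }

  // Returns the remembered target for sourceUrl, or null.
  lookup(sourceUrl) {
    for (const redirects of this.byScope.values()) {
      if (redirects.has(sourceUrl)) return redirects.get(sourceUrl);
    }
    return null;
  }

  // Called when the registration goes away; its redirects go with it.
  unregister(scope) {
    this.byScope.delete(scope);
  }
}
```

The point of keying by scope is that removing the registration removes every redirect that depended on it in one step.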

@wanderview
Member Author

Interesting suggestion, but it feels rather magical to me. I guess I was hoping we could come up with a solution using the networking primitives we've exposed to sites so they can clearly define their desired behavior.

@wanderview
Member Author

Actually, for a 301 maybe that would be a reasonable thing to do. It would be a somewhat complicated implementation, but it might be worth it if it solves the problem for this class of sites. I still don't love the magical implied behavior change based on the destination, but such is life.

@jakearchibald
Contributor

jakearchibald commented Sep 15, 2019

Thoughts ahead of TPAC:

What problems are we looking to solve here? Performance, developer ergonomics, or both?

How would #1457 (comment) interact with clear-site-data of either origin?

Here's some code to compare:

Handling redirects in a service worker

addEventListener('fetch', event => {
  const { request } = event;
  if (request.mode !== 'navigate') return;

  const url = new URL(request.url);

  if (url.pathname === '/cna/') {
    event.respondWith(
      Response.redirect('https://example.com/cool-new-app/', 301),
    );
  }
});

With static routing

I didn't sketch anything specifically for redirects, but I think it'd look something like this:

addEventListener('install', event => {
  event.router.add(
    {
      mode: 'navigate',
      url: { is: '/cna/', ignoreSearch: true },
    },
    { type: 'redirect', url: 'https://example.com/cool-new-app/', status: 301 },
  );
});
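
For comparison, here is a rough sketch of how a rule shaped like the one above could be evaluated without dispatching a FetchEvent. `event.router` is only a proposal, so `matchRedirectRule`, the `[condition, action]` pairing, and the `origin` parameter (standing in for the service worker's origin) are assumptions made for illustration.

```javascript
// Evaluate one declarative redirect rule against a request-like object.
// Returns { url, status } to redirect to, or null to fall through.
function matchRedirectRule(rule, request, origin) {
  const [condition, action] = rule;
  if (condition.mode && condition.mode !== request.mode) return null;

  const requestUrl = new URL(request.url);
  const ruleUrl = new URL(condition.url.is, origin); // resolve '/cna/' against the origin
  if (condition.url.ignoreSearch) {
    // Drop the query string from both sides before comparing.
    requestUrl.search = '';
    ruleUrl.search = '';
  }
  if (requestUrl.href !== ruleUrl.href) return null;

  return action.type === 'redirect'
    ? { url: action.url, status: action.status }
    : null;
}

// Hypothetical usage with the rule from the sketch above.
const rule = [
  { mode: 'navigate', url: { is: '/cna/', ignoreSearch: true } },
  { type: 'redirect', url: 'https://example.com/cool-new-app/', status: 301 },
];
const hit = matchRedirectRule(
  rule,
  { mode: 'navigate', url: 'https://old.example/cna/?ref=mail' },
  'https://old.example',
);
```

Because the match is pure data comparison, it could in principle run without starting the worker thread at all, which is the performance argument for the declarative form.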

@jakearchibald
Contributor

jakearchibald commented Sep 16, 2019

Notes:

  • With cross origin, there isn't really an opportunity to install the service worker, since it's a redirect. We'd have to bring back header registration. Seems complex.

With the 301:

  • What about querystring? It would be exact match.
  • We could add another header to make matching vaguer, but ew. Maybe that's less common though.

@wanderview
Member Author

My takeaway was that there was tentative consensus to investigate the 301 approach. We need to evaluate whether it's reasonably implementable.

@wanderview
Member Author

wanderview commented Sep 24, 2019

Chromium bug 1007289

@ralphch0

What would be the expected flow for these apps? When the service worker is being installed (or at some later point), requests would be preemptively made to these vanity urls to ensure that they are cached ahead of the user going offline? I recall that there are stability issues with this: i.e. if the request fails, it will invalidate the cache entry (?).

Also, would the resource ever be cached beyond the timeframe specified in its caching headers? I assume no? i.e. that the service worker would only prevent eviction up until the time specified by the resource cache headers.

@wanderview
Member Author

Good questions.

> What would be the expected flow for these apps? When the service worker is being installed (or at some later point), requests would be preemptively made to these vanity urls to ensure that they are cached ahead of the user going offline? I recall that there are stability issues with this: i.e. if the request fails, it will invalidate the cache entry (?).

We don't have a good solution for how to prime the redirect without the user actually navigating to it. You can force this to happen with an iframe for same-origin urls, but for a cross-origin vanity url this would not reliably work since some browsers double-key storage for nested iframes.

I think our position was that users who have a particular workflow would likely visit the vanity url if they use it, and so get it cached. Not great, but it's an improvement over the current behavior.

Also, note the cross-origin issues I note above also make it impossible to register a service worker on cross-origin vanity urls. You can't register the service worker if the url is always redirecting and double-keying in browsers prevents using an iframe to the origin to register the service worker.

> Also, would the resource ever be cached beyond the timeframe specified in its caching headers? I assume no? i.e. that the service worker would only prevent eviction up until the time specified by the resource cache headers.

What we talked about at TPAC face-to-face was using the heuristic that if max-age is 1 year or more then the redirect would be permanent. As in we would not age it out even after a year.

We could try to respect age headers, but it does raise the issue of how a site would re-prime the redirect. With our current idea I think the most natural thing would be for the redirect to expire and it would get re-primed on the next visit by the user. So you would have one request in the middle that is potentially not offlined.
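
The heuristic discussed at TPAC could be sketched roughly like this. It is only an illustration: `parseMaxAge` and `classifyRedirect` are hypothetical helpers, the returned labels are made up, and restricting the rule to 301/308 is an assumption based on the earlier discussion rather than something that was agreed.

```javascript
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;

// Extract max-age (in seconds) from a Cache-Control header value, or null.
function parseMaxAge(cacheControl) {
  const match = /(?:^|,)\s*max-age\s*=\s*"?(\d+)"?/i.exec(cacheControl || '');
  return match ? Number(match[1]) : null;
}

// Decide how a redirect response would be treated under the heuristic.
function classifyRedirect(status, cacheControl) {
  const permanentStatus = status === 301 || status === 308; // assumption
  const maxAge = parseMaxAge(cacheControl);
  if (permanentStatus && maxAge !== null && maxAge >= ONE_YEAR_SECONDS) {
    // Treated as permanent: not aged out, even after a year.
    return 'keep-with-service-worker';
  }
  // Everything else stays subject to normal HTTP cache rules and eviction.
  return 'normal-http-cache';
}
```

Under this sketch, a redirect with a shorter max-age simply expires and gets re-primed on the user's next online visit, which is the gap described above.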

@ralphch0

ralphch0 commented Oct 4, 2019

> I think our position was that users who have a particular workflow would likely visit the vanity url if they use it, and so get it cached. Not great, but it's an improvement over the current behavior.

This makes sense as a best effort solution. Though this makes it hard for us to promote that vanity url, as it can be unreliable since it's dependent on usage patterns (eg: user uses it all the time on their desktop, and then decides to use it for the first time offline on their laptop).

> Also, note the cross-origin issues I note above also make it impossible to register a service worker on cross-origin vanity urls. You can't register the service worker if the url is always redirecting and double-keying in browsers prevents using an iframe to the origin to register the service worker.

It seems to me that there is a general issue here, that installing service workers on third party sites is simply impossible in browsers that do double-keying. This is not just needed for vanity urls, but also for sites that own multiple domains that are interconnected (eg: you enable a feature, which should cause a service worker to be registered on two domains). We do this today on Chrome, by registering service workers via iframes.

> What we talked about at TPAC face-to-face was using the heuristic that if max-age is 1 year or more then the redirect would be permanent. As in we would not age it out even after a year.

Doesn't this introduce a security issue? Let's say site A wants to serve a permanent redirect with a lifetime of 1 year to site B. Site B registers a service worker and effectively hijacks the redirect forever. Site A may be owned by a separate entity from site B, and may want to lease their domain name this way. Storing a resource for less time than suggested is obviously fine (eg: cache eviction), but it seems to me that we should never exceed the max-age dictated by the source server.
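
The stricter policy argued for here, where a stored redirect is never used past the lifetime the source server gave it, could be sketched as a simple freshness check. `isRedirectFresh` is a hypothetical helper and the timestamps are illustrative.

```javascript
// Returns true while a stored redirect may still be used.
// storedAtMs and nowMs are epoch milliseconds; maxAgeSeconds is the
// lifetime dictated by the source server.
function isRedirectFresh(storedAtMs, maxAgeSeconds, nowMs) {
  const ageSeconds = (nowMs - storedAtMs) / 1000;
  // Evicting early is always allowed; using the entry late never is.
  return ageSeconds >= 0 && ageSeconds < maxAgeSeconds;
}
```

With this check, a service worker on site B could protect the redirect from eviction but could not extend it beyond what site A granted.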

@mangelozzi

mangelozzi commented Feb 22, 2022

Taking a step back, how about caching a redirect on the same website? Say you wish to cache a unique url at start up for the user, which is returned as a redirect after login, e.g. /login/ -> /redirect/ -> /some/default/url/123/. It seems that if one tries to cache the /redirect/ response in the service worker, it caches the /some/default/url/123/ response instead. One should be able to specify whether to cache the followed redirect or the redirect itself. I hope this makes sense.
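
One possible workaround with today's primitives, sketched under assumptions: after fetching, check the standard `response.redirected` flag and, if it is set, store a synthetic redirect to `response.url` instead of the followed body. `cacheRedirectItself` is a hypothetical helper; `fetchFn` and `makeRedirect` are injectable only so the logic can be exercised outside a service worker. The original status code is not visible after the redirect has been followed, so the 302 here is an assumption, and a multi-hop chain is collapsed into a single hop.

```javascript
// Store either a redirect to the final URL or the response itself
// under requestUrl, depending on whether fetch followed a redirect.
async function cacheRedirectItself(
  cache,
  requestUrl,
  fetchFn = fetch,
  makeRedirect = (url, status) => Response.redirect(url, status),
) {
  const response = await fetchFn(requestUrl);
  // `redirected` is true when at least one redirect was followed;
  // `url` is then the final URL of the chain.
  const toStore = response.redirected
    ? makeRedirect(response.url, 302) // status is an assumption
    : response;
  await cache.put(requestUrl, toStore);
  return toStore;
}
```

In a service worker this might be called during install as `cacheRedirectItself(await caches.open('v1'), '/redirect/')`, where the cache name is illustrative.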
