Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new cache-policy & cache middleware structure to support full page caching #1856

Merged
merged 12 commits into from
Sep 30, 2022

Conversation

syphar
Copy link
Member

@syphar syphar commented Sep 25, 2022

Resolves #1552

This is the implementation of #1552 , following the idea in #1552 (comment).

Basically:

  • set a default TTL in cloudfront.
  • always set no-cache headers by default in all pages,
  • override with longer caching times where useful,
  • remove the TTL for the default TTL to take effect.

This enables us to have things cached for longer in the CDN and just invalidating the needed paths when the content changes, without having to think about browser caches or other proxies / CDNs.

Invalidation after build was added in #1825.

For now I only focused on rustdoc pages and their redirects, and putting the policy structure & safeguard middleware into place. Some other pages I added it too.

Around CSP we believe that we can cache pages, even with nonces ( see #1569 (comment) ), online I can find sources that support both claims (see also https://serverfault.com/a/1064775/549071 ). To keep in mind too: for rustdoc pages we don't add CSP nonces yet, so the biggest part of the new caching is for now not affected by this caching change.

The only option to combine random CSP nonces & CDN caching would be to use edge workers to rewrite the HTML and update the nonce in the header. ( with fastly or cloudflare we could even do this in rust ;) ).

@github-actions github-actions bot added the S-waiting-on-review Status: This pull request has been implemented and needs to be reviewed label Sep 25, 2022
@syphar
Copy link
Member Author

syphar commented Sep 25, 2022

@jsha @rust-lang/docs-rs I would love your feedback on this before I invest more time.
This would fastly improve the user experience for everyone outside of the US.

Before merging I would probably add some more cache-policy checks in the rustdoc tests.

Copy link
Contributor

@jsha jsha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally looks great to me, and I like the direction. A couple of high-level comments:

Prior to #1569, docs.rs set no Cache-Control and no Expires on rustdoc pages, which put us in the heuristic caching world. Normally in heuristic caching, browsers look at the Last-Modified to figure out how long to cache (10% of time since Last-Modified). Since docs.rs also doesn't set Last-Modified, I think that means 0 cache time, but it's not guaranteed by spec.

After #1569, we explicitly set max-age=0[, stale-while-revalidate=XX] on /latest/ pages. After #1856 (this PR), we will revert the max-age=0 part and go back to heuristic caching in browsers. I think the result probably winds up the same in practice, though it was nice to be able to be explicit. I don't see a way to continue to give the explicit instruction to browsers without interfering with CloudFront's default TTL.

src/test/mod.rs Outdated Show resolved Hide resolved
src/web/cache.rs Outdated Show resolved Hide resolved
src/web/cache.rs Show resolved Hide resolved
src/web/cache.rs Show resolved Hide resolved
src/web/cache.rs Outdated Show resolved Hide resolved
src/web/cache.rs Outdated Show resolved Hide resolved
src/web/mod.rs Show resolved Hide resolved
src/web/rustdoc.rs Outdated Show resolved Hide resolved
@syphar
Copy link
Member Author

syphar commented Sep 26, 2022

Prior to #1569, docs.rs set no Cache-Control and no Expires on rustdoc pages, which put us in the heuristic caching world. Normally in heuristic caching, browsers look at the Last-Modified to figure out how long to cache (10% of time since Last-Modified). Since docs.rs also doesn't set Last-Modified, I think that means 0 cache time, but it's not guaranteed by spec.

10% of time since Last-Modified is good to know, this means we should never set it so we're sure we can invalidate in the CDN.

After #1569, we explicitly set max-age=0[, stale-while-revalidate=XX] on /latest/ pages. After #1856 (this PR), we will revert the max-age=0 part and go back to heuristic caching in browsers. I think the result probably winds up the same in practice, though it was nice to be able to be explicit. I don't see a way to continue to give the explicit instruction to browsers without interfering with CloudFront's default TTL.

Yeah, I thought the same, I like it better explicit. Without switching CDNs or lambda@edge I don't see any alternative.

@syphar syphar requested a review from jsha September 27, 2022 03:52
Copy link
Contributor

@jsha jsha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Looking forward to seeing this deployed.

Before deploy, do you want to collect some timing measurements so we get a before / after? For instance here are some popular pages that are likely to be available in the CloudFront cache shortly after the deploy:

https://docs.rs/reqwest/latest/reqwest/
https://docs.rs/clap/latest/clap/
https://docs.rs/reqwest/latest/reqwest/
https://docs.rs/serde_json/latest/serde_json/
https://docs.rs/regex/latest/regex/
https://docs.rs/chrono/0.4.0/chrono/struct.DateTime.html
https://docs.rs/anyhow/latest/anyhow/
https://docs.rs/rand/latest/rand/

It would also be interesting to see the drop in requests to origin after deploy.

@syphar
Copy link
Member Author

syphar commented Sep 28, 2022

Nice! Looking forward to seeing this deployed.

Before deploy, do you want to collect some timing measurements so we get a before / after? For instance here are some popular pages that are likely to be available in the CloudFront cache shortly after the deploy: [...]

To be clear: directly after deploy nothing will happen, apart from probably caching less in some cases because we explicitly set max-age=0 as default.
The change will come when infra sets the default TTL in the cloudfront config, which I was planning on just slowly increasing, perhaps starting with just 30m, then 1h, ... so we can easily just remove the setting again if anything is odd.

I don't think we to collect CDN-level metrics on this for now.

It would also be interesting to see the drop in requests to origin after deploy.

This we definitely will see in our metrics.

in general:

@jsha were you able to manually test too while reviewing?

I would also love another pair of eyes on this by @Nemo157 or @jyn514 to be safe that I didn't forget anything important.

@jsha
Copy link
Contributor

jsha commented Sep 28, 2022

I have not manually tested but I can try and do that later today.

@jsha
Copy link
Contributor

jsha commented Sep 29, 2022

Some notes from testing:

  • On ForeverInCdn URLs, we now emit Cache-Control: . I think this is legal according to HTTP but it's a minor waste of bytes, and potentially confusing to someone who sees it. It would be nice to omit the header entirely in this case.

  • NoStoreMustRevalidate is used in only one place, for builds.json. It's not clear to me that that file needs anything more stringent than NoCaching. Following the principle I mentioned above ("don't change incidental things in an already-big change"), I think it's fine to leave it; but we should consider aligning it with the NoCaching policy in a followup.

  • The crate page (e.g. http://0.0.0.0:3000/crate/regex-syntax/latest) seems to be getting Cache-Control: , while other pages from that tab group, like features (http://0.0.0.0:3000/crate/regex-syntax/latest/features) are getting Cache-Control: max-age=0. Since the CDN invalidation targets the whole /crate/<foo> prefix, I think it's safe to allow all these pages to be cached by the CDN. Though, thinking about it some more: /crate/builds needs to be updated even on a failed build. I think "normal" failed builds will purge the cache, but there are a few cases where it looks like an error return might cause the cache not to be purged.

@syphar
Copy link
Member Author

syphar commented Sep 29, 2022

Some notes from testing:

Thank you for investing the time!

On ForeverInCdn URLs, we now emit Cache-Control: . I think this is legal according to HTTP but it's a minor waste of bytes, and potentially confusing to someone who sees it. It would be nice to omit the header entirely in this case.

valid point, I removed it for these cases.

NoStoreMustRevalidate is used in only one place, for builds.json. It's not clear to me that that file needs anything more stringent than NoCaching. Following the principle I mentioned above ("don't change incidental things in an already-big change"), I think it's fine to leave it; but we should consider aligning it with the NoCaching policy in a followup.

I can see a point where builds.json should get must-revalidate where normal rustdoc pages don't. But also I agree on not changing more than we have to here.

The crate page (e.g. http://0.0.0.0:3000/crate/regex-syntax/latest) seems to be getting Cache-Control: , while other pages from that tab group, like features (http://0.0.0.0:3000/crate/regex-syntax/latest/features) are getting Cache-Control: max-age=0. Since the CDN invalidation targets the whole /crate/<foo> prefix, I think it's safe to allow all these pages to be cached by the CDN. Though, thinking about it some more: /crate/builds needs to be updated even on a failed build. I think "normal" failed builds will purge the cache, but there are a few cases where it looks like an error return might cause the cache not to be purged.

This PR is far from covering all edge-cases where we could cache more and it aims mainly for rustdoc pages. I would leave out adding new pages to cache for the sake of finishing this PR. When this change alive and kicking we can try to cache more.

@jsha
Copy link
Contributor

jsha commented Sep 29, 2022

Sounds good to me. 👍🏻

@syphar syphar merged commit b5451e0 into rust-lang:master Sep 30, 2022
@syphar syphar deleted the forever-cdn-cache branch September 30, 2022 04:50
@github-actions github-actions bot added S-waiting-on-deploy This PR is ready to be merged, but is waiting for an admin to have time to deploy it and removed S-waiting-on-review Status: This pull request has been implemented and needs to be reviewed labels Sep 30, 2022
@syphar syphar removed the S-waiting-on-deploy This PR is ready to be merged, but is waiting for an admin to have time to deploy it label Oct 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

full page caching for rustdoc pages in the CDN
2 participants