1668MB cache storage in chrome #9415

Closed
hackhat opened this issue Oct 26, 2018 · 19 comments · Fixed by #9907
Labels
status: confirmed Issue with steps to reproduce the bug that’s been verified by at least one reviewer.

Comments

@hackhat

hackhat commented Oct 26, 2018

[5 screenshots]

Even if I delete all these items, the cache size doesn't go down.

image

I went to "\AppData\Local\Google\Chrome\User Data***\Cache" and checked: the folder doesn't grow on refresh and isn't even as large as the storage figure Chrome shows. Might it be a Chrome bug?

config:

module.exports = {
  siteMetadata: {
    title: '***',
    siteUrl: `https://www.***.com/`,
  },
  plugins: [
    `gatsby-transformer-sharp`,
    `gatsby-plugin-sharp`,
    {
      resolve: `gatsby-plugin-sitemap`,
      options: {
        exclude: [
          '/internal/*',
        ]
      }
    },
    `gatsby-plugin-catch-links`,
    `gatsby-plugin-emotion`,
    `gatsby-plugin-typescript`,
    'gatsby-plugin-react-helmet',
    {
      resolve: `gatsby-plugin-manifest`,
      options: {
        name: 'gatsby-starter-default',
        short_name: 'starter',
        start_url: '/',
        background_color: '#000000',
        theme_color: '#000000',
        display: 'minimal-ui',
        icon: 'src/content/logo/square.png', // This path is relative to the root of the site.
      },
    },
    'gatsby-plugin-offline',
    {
      resolve: `gatsby-source-filesystem`,
      options: {
        path: `${__dirname}/src/content/pages`,
        name: 'markdown-pages',
      },
    },
    {
      resolve: `gatsby-source-filesystem`,
      options: {
        path: `${__dirname}/src/content/images`,
        name: 'images',
      },
    },
    {
      resolve: `gatsby-transformer-remark`,
      options: {
        plugins: [
          {
            resolve: `gatsby-remark-images`,
            options: {
              // It's important to specify the maxWidth (in pixels) of
              // the content container as this plugin uses this as the
              // base for generating different widths of each image.
              maxWidth: 1200,
              linkImagesToOriginal: false,
              backgroundColor: 'transparent',
              withWebp: true,
            },
          },
        ],
      },
    },
    {
      resolve: `gatsby-plugin-google-analytics`,
      options: {
        trackingId: "****",
        // Puts tracking script in the head instead of the body
        head: false,
        // Setting this parameter is optional
        anonymize: false,
        // Setting this parameter is also optional
        respectDNT: false,
        // Avoids sending pageview hits from custom paths
        exclude: [],
        // Any additional create only fields (optional)
        sampleRate: 100,
        siteSpeedSampleRate: 100,
        cookieDomain: "www.***.com",
      },
    },
  ],
}

Description

1668MB cache storage

It looks like every refresh adds about 10MB. It doesn't seem to be time-based (where the longer you wait, the more it adds); each refresh simply adds another 10-20MB of cache.

Steps to reproduce

I don't know

Expected result

certainly less than 1.6GB...

Actual result

1668MB cache storage

Environment

  System:
    OS: Windows 10
    CPU: x64 Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
  Binaries:
    Yarn: 1.9.4 - C:\Program Files (x86)\Yarn\bin\yarn.CMD
    npm: 5.5.1 - C:\Program Files\nodejs\npm.CMD
  Browsers:
    Edge: 42.17134.1.0
  npmPackages:
    gatsby: ^2.0.19 => 2.0.21
    gatsby-image: ^2.0.17 => 2.0.17
    gatsby-plugin-catch-links: ^2.0.4 => 2.0.4
    gatsby-plugin-emotion: ^2.0.5 => 2.0.5
    gatsby-plugin-google-analytics: ^2.0.6 => 2.0.6
    gatsby-plugin-manifest: ^2.0.6 => 2.0.6
    gatsby-plugin-offline: ^2.0.5 => 2.0.6
    gatsby-plugin-react-helmet: ^3.0.0 => 3.0.0
    gatsby-plugin-sharp: ^2.0.8 => 2.0.8
    gatsby-plugin-sitemap: ^2.0.1 => 2.0.1
    gatsby-plugin-typescript: ^2.0.0 => 2.0.0
    gatsby-plugin-typography: ^2.2.0 => 2.2.0
    gatsby-remark-images: ^2.0.4 => 2.0.4
    gatsby-source-filesystem: ^2.0.3 => 2.0.3
    gatsby-transformer-remark: ^2.1.7 => 2.1.7
    gatsby-transformer-sharp: ^2.1.5 => 2.1.5

@KyleAMathews
Contributor

Can you reproduce this with the default starter?

@hackhat
Author

hackhat commented Oct 26, 2018

It only happens when deployed, not on localhost. Check the updates:

Even if I delete all these items, the cache size doesn't go down.

image

I went to "\AppData\Local\Google\Chrome\User Data***\Cache" and checked: the folder doesn't grow on refresh and isn't even as large as the storage figure Chrome shows. Might it be a Chrome bug?

@kakadiadarpan added the "status: needs more info" label (Needs triaging and reproducible examples or more information to be resolved) Oct 26, 2018
@hackhat
Author

hackhat commented Oct 26, 2018

@kakadiadarpan what more info do you need?

@kakadiadarpan
Contributor

kakadiadarpan commented Oct 26, 2018

@hackhat Is it possible for you to provide us with a reproduction repo?

@krismorf
Contributor

krismorf commented Oct 26, 2018

I've also noticed a similar issue:

  1. go to https://5bd3136f02ed8308afdcbfc5--gyftshop.netlify.com/
  2. open console
  3. start browsing

You'll see that each page takes around 3-4MB of application storage.
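The per-page growth can also be measured from the console instead of read off the Application tab. A minimal sketch, assuming a browser that supports `navigator.storage.estimate()`; `usageInMB` is a hypothetical helper name, not part of Gatsby or Workbox:

```javascript
// Hypothetical helper: reports this origin's storage usage in megabytes,
// rounded to one decimal place.
// `storage` defaults to the browser's StorageManager (navigator.storage);
// it is a parameter only so the function can be exercised outside a browser.
async function usageInMB(storage = navigator.storage) {
  const { usage } = await storage.estimate();
  return Math.round((usage / (1024 * 1024)) * 10) / 10;
}
```

Calling `usageInMB().then(console.log)` before and after each navigation should give the size of each increment.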

@hackhat
Author

hackhat commented Oct 27, 2018

@kakadiadarpan Sorry, I can't provide production URLs.

@hackhat
Author

hackhat commented Oct 27, 2018

@kmorf can you please list all your Gatsby plugins?

@krismorf
Contributor

  System:
    OS: macOS 10.14
    CPU: x64 Intel(R) Core(TM) i5-6360U CPU @ 2.00GHz
    Shell: 5.3 - /bin/zsh
  Binaries:
    Node: 11.0.0 - /usr/local/bin/node
    npm: 6.4.1 - /usr/local/bin/npm
  Browsers:
    Chrome: 70.0.3538.77
    Safari: 12.0
  npmPackages:
    gatsby: ^2.0.31 => 2.0.31
    gatsby-image: ^2.0.17 => 2.0.17
    gatsby-plugin-canonical-urls: ^2.0.6 => 2.0.6
    gatsby-plugin-create-client-paths: ^2.0.1 => 2.0.1
    gatsby-plugin-feed: ^2.0.8 => 2.0.8
    gatsby-plugin-google-analytics: ^2.0.6 => 2.0.6
    gatsby-plugin-lunr: ^1.2.0 => 1.2.0
    gatsby-plugin-manifest: ^2.0.6 => 2.0.6
    gatsby-plugin-netlify: ^2.0.2 => 2.0.2
    gatsby-plugin-netlify-cache: ^1.0.0 => 1.0.0
    gatsby-plugin-netlify-cms: ^3.0.4 => 3.0.4
    gatsby-plugin-offline: ^2.0.9 => 2.0.9
    gatsby-plugin-react-helmet: ^3.0.0 => 3.0.0
    gatsby-plugin-robots-txt: ^1.3.0 => 1.3.0
    gatsby-plugin-sharp: ^2.0.8 => 2.0.8
    gatsby-plugin-sitemap: ^2.0.1 => 2.0.1
    gatsby-plugin-styled-components: ^3.0.0 => 3.0.0
    gatsby-remark-images: ^2.0.4 => 2.0.4
    gatsby-source-filesystem: ^2.0.5 => 2.0.5
    gatsby-transformer-remark: ^2.1.9 => 2.1.9
    gatsby-transformer-sharp: ^2.1.5 => 2.1.5
  npmGlobalPackages:
    gatsby-cli: 2.4.3

@ryanwiemer
Contributor

@davidbailey00,

Are the other offline issues that you were looking into possibly related to this problem of the cache increasing on every refresh?

It looks like you just submitted a PR to not precache the index page. (#9603)

@ryanwiemer
Contributor

@KyleAMathews,

I haven't been able to reproduce it with the default starter, but it is pretty apparent on https://www.gatsbyjs.org/. Each refresh adds about 10MB to the cache.

oct-31-2018 14-36-44

@hackhat
Author

hackhat commented Nov 1, 2018

Before cleaning:

image

image

After cleaning:

image

image

I'm inclined to think it's a bug in Chrome, as it doesn't free space on disk.

Anyway, we should investigate other browsers; Edge doesn't show this data.

Also, maybe we can add some automated tests to make sure it won't happen in the future, if it is a bug in Gatsby.

@vtenfys
Contributor

vtenfys commented Nov 1, 2018

> Are the other offline issues that you were looking into possibly related to this problem of the cache increasing on every refresh?

No, the other problems are all unrelated - the only reason to disable precaching the index page is because it's runtime cached anyway, and precaching it causes additional problems with POST requests.

I'm guessing that Chrome isn't deleting old files from the cache after each time the website is updated - maybe this is something we need to investigate. Is the list of files in the dev tools getting longer each time the cache storage increases? If so, it might be a bug on our side. If not, it sounds like a bug in Chrome, since there's no reason the cache size should increase if no additional files are cached.
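Whether the list of cached files is getting longer can be checked with the Cache Storage API rather than by eye. A minimal sketch, assuming it is pasted into the browser console on the affected site; `countCachedEntries` is a hypothetical helper name, not something Gatsby or Workbox provides:

```javascript
// Hypothetical helper: counts the entries in every cache for this origin.
// `cacheStorage` defaults to the browser's global `caches` object; it is a
// parameter only so the function can be exercised outside a browser.
async function countCachedEntries(cacheStorage = caches) {
  const counts = {};
  for (const name of await cacheStorage.keys()) {
    const cache = await cacheStorage.open(name);
    const requests = await cache.keys();
    counts[name] = requests.length;
  }
  return counts;
}
```

Running `countCachedEntries().then(console.table)` before and after a refresh would show whether any cache gained entries.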

@vtenfys
Contributor

vtenfys commented Nov 1, 2018

I'm also unable to reproduce this on gatsbyjs.org - I've cleared all site data, then done 20 reloads in a row and the cache size doesn't change. So it looks like either 1) the bug is platform-dependent (I'm using Chrome 70 on Ubuntu), or 2) it relies on visiting some other pages first, or 3) it relies on the website being updated, i.e. with resources at different locations (so they'll need to be cached again).

@hackhat
Author

hackhat commented Nov 2, 2018

@davidbailey00 to me it doesn't look like it adds new files to the cache in the dev tools.

@vtenfys
Contributor

vtenfys commented Nov 3, 2018

> to me doesn't look like it adds new files to cache in the dev tools.

In that case I strongly suspect this is a bug in Chrome (possibly platform-specific) rather than a problem on our side.

Because of this, along with the fact I haven't been able to reproduce, I'll close this issue for now - if anyone has more information or another reproduction, then I'll be happy to reopen and investigate.

@vtenfys closed this as completed Nov 3, 2018
@hackhat
Author

hackhat commented Nov 5, 2018

@davidbailey00 I think we should make sure that is the case. It's a pretty big deal if this is actually happening, whether in Gatsby or in Chrome, so we should not rule either of them out.

> Because of this, along with the fact I haven't been able to reproduce, I'll close this issue for now - if anyone has more information or another reproduction, then I'll be happy to reopen and investigate.

Well, another user and I were able to reproduce it, so I think it's a bit too early to close it. Even if it is platform-dependent, it might still be a bug.

Can anybody get in touch with Chrome team to ask what they think?

I think we should keep it open.

@DSchau
Contributor

DSchau commented Nov 8, 2018

It's fair to re-open this. Both @pieh and I were able to reproduce the issue, so it seems to be a real issue.

Gatsbyjs.com

It doesn't seem to be anything introduced super recently, as gatsbyjs.com exhibits the issue and that's on 2.0.5 of gatsby-plugin-offline. It's possible it's still a Chrome bug, but I think this deserves another look. Thanks to everyone for reporting/finding the issue!

Also note that clearing the storage does not fix the issue (at least for me). I'm able to reliably reproduce the issue even after clearing the storage. It grows by approximately 10MB each time.

@DSchau reopened this Nov 8, 2018
@DSchau added the "status: confirmed" label and removed the "status: needs more info" label Nov 8, 2018
@vtenfys
Contributor

vtenfys commented Nov 14, 2018

Hey there, sorry for closing this initially! It turned out that I had my adblocker enabled, which prevented the problem from occurring, so I couldn't reproduce it at first.

I've identified the cause of the problem - Google Analytics loads a tracking GIF with a unique URL on each reload, and each of these is cached by the offline plugin. Since these are cached opaquely, the data is padded out, which increases the storage space used considerably.

To fix this we need to prevent resources like this from being cached - for resources which support CORS, including the tracking GIFs, we can get the cache-control header and detect whether no-cache is specified (note: this is unrelated to our ?no-cache=1 parameter which is soon being removed).
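The header check described here could look roughly like the following. This is a sketch only, not the code that shipped in gatsby-plugin-offline: `isCacheable` is a hypothetical name, the function assumes it is given a Fetch API `Response` (or any object with a `type` and a `headers.get()` method), and skipping opaque responses altogether is an assumption made for the example.

```javascript
// Hypothetical helper: decide whether a fetched response should be stored.
function isCacheable(response) {
  // Opaque (no-CORS) responses hide their headers, so nothing can be
  // inspected; this sketch assumes they are simply skipped.
  if (response.type === "opaque") return false;

  const cacheControl = response.headers.get("cache-control") || "";
  // Split the header into directives, e.g. "no-cache, must-revalidate".
  const directives = cacheControl
    .toLowerCase()
    .split(",")
    .map((directive) => directive.trim());

  return !directives.includes("no-cache");
}
```

A service worker could call this on each response before `cache.put()`, so that a tracking GIF served with `no-cache` never reaches the cache.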

For resources which don't support CORS we have a bit of a problem since we can't detect whether these should be cached. As a result, I think it's best that we always use CORS when caching resources, and ignore any errors related to access control headers - this is what we were previously doing so I'll revert some recent changes.

Update: I've found a better way to fix this, by leveraging Workbox's auto caching! A fix should be published shortly :)

DSchau pushed a commit that referenced this issue Nov 14, 2018
…ively (#9923)

Fixes #9415

This PR leverages Workbox's automatic caching when a resource is fetched, by creating a `<link>` prefetch element for each resource. This means we don't have to worry about whether the resource can be fetched with CORS or not, since the browser handles it automatically. I've also changed the runtime caching RegExps so that only paths with certain extensions are cached, but 3rd party resources with these extensions can now be cached.
@alex-p-chan
Contributor

alex-p-chan commented Dec 7, 2018

If anyone else is having this issue with a large cache:
I believe it might be due to the urlPattern not being specific enough.
The service worker was caching analytics scripts and other third-party scripts, which it shouldn't, and that was blowing out my cache.
I managed to shrink my cache to 1/100th of its size by changing the options in gatsby-config.js as below. Obviously, change mysite.com to your own URL.

    {
      resolve: `gatsby-plugin-offline`,
      options: {
        dontCacheBustUrlsMatching: /(\.js$|\.css$|\/static\/)/,
        runtimeCaching: [
          {
            urlPattern: /(\.js$|\.css$|\/static\/)/,
            handler: `cacheFirst`,
          },
          {
            urlPattern: /^https?:\/\/(www\.mysite\.com|localhost:8000|localhost:9000|staging\.mysite\.com).*\.(png|jpg|jpeg|webp|svg|gif|tiff|js|woff|woff2|json|css)$/,
            handler: `staleWhileRevalidate`,
          },
          {
            urlPattern: /^https?:\/\/fonts\.googleapis\.com\/css/,
            handler: `staleWhileRevalidate`,
          },
        ],
      }
    }
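
To see what the host-anchored pattern lets through, the second urlPattern from the config above can be tried against a few URLs. This is an illustrative sketch; the sample URLs (including the Google Analytics one) are made up for the example, and only that one pattern is exercised:

```javascript
// Second urlPattern from the config above, copied verbatim.
const sitePattern = /^https?:\/\/(www\.mysite\.com|localhost:8000|localhost:9000|staging\.mysite\.com).*\.(png|jpg|jpeg|webp|svg|gif|tiff|js|woff|woff2|json|css)$/;

// Made-up sample URLs, for illustration only.
const samples = [
  "https://www.mysite.com/static/logo.png",
  "http://localhost:8000/page-data.json",
  "https://www.google-analytics.com/collect?v=1&z=123456",
  "https://cdn.example.com/widget.js",
];

// Keep only the URLs this pattern would hand to its caching handler.
const matched = samples.filter((url) => sitePattern.test(url));
console.log(matched);
```

Only the two first-party URLs match. Note that the first pattern in the config is not anchored to a host, so it would still match a third-party URL ending in `.js`.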

gpetrioli pushed a commit to gpetrioli/gatsby that referenced this issue Jan 22, 2019
…ively (gatsbyjs#9923)

Fixes gatsbyjs#9415
