Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache invalidation #7319

Closed
wants to merge 13 commits into from
Closed

Cache invalidation #7319

wants to merge 13 commits into from

Conversation

betodealmeida
Copy link
Member

CATEGORY

Choose one

  • Bug Fix
  • Enhancement (new features, refinement)
  • Refactor
  • Add tests
  • Build / Development Environment
  • Documentation

SUMMARY

This PR implements cache invalidation for charts (after #7032), in two places:

(1) In the frontend, every time a POST is done by SupersetClient to get chart data, we invalidate the cache. This happens when the user clicks "force refresh" in a dashboard, or clicks "run query" in the explore view. When this happens, we extract the chart id (slice_id) and delete all stored responses that reference that chart — this would include, eg, a cached response of the chart with extra filters from a dashboard.

(2) In the backend, we use a similar strategy. Unfortunately, we can't iterate over all the cached keys like in the frontend, to find all related cached responses. So we compute the key that would be generated if a GET request was made, and invalidate it.

TEST PLAN

I've tested manually, by modifying charts and ensuring that dashboards refresh correctly:

  1. Create chart in explore view, browser does a POST request.
  2. Reload chart, browser does GET and caches the response.
  3. Reload chart, browser reuses cached response.
  4. Reload chart after cache expiration, browser does conditional response.
  5. Click "Run Query". Browser does a POST, cache is invalidated for that chart. Server-side cache is also invalidated.

Similarly for dashboards. Before, if you clicked force refresh on a dashboard it would performa a POST and show the new data. But reloading the dashboard would show the old charts, since they were still cached in the browser.

I tried adding Cypress integration tests, but for some reason I get an error when writing to the Cache API (it works fine in Chrome) from an integration test:

TypeError: Failed to execute 'put' on 'Cache': parameter 2 is not of type 'Response'.
    at eval (webpack-internal:///./node_modules/@superset-ui/connection/esm/callApi/callApi.js:17)

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Requires DB Migration.
  • Confirm DB Migration upgrade and downgrade tested.
  • Introduces new feature or API
  • Removes existing feature or API

REVIEWERS

@betodealmeida betodealmeida added enhancement:request Enhancement request submitted by anyone from the community .js change:backend Requires changing the backend lyft Related to Lyft labels Apr 17, 2019
@betodealmeida
Copy link
Member Author

if (method === 'POST' && 'caches' in self) {
caches.open(CACHE_KEY).then(supersetCache =>
supersetCache
.keys()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what exactly is in keys here? this is one req per key?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keys are all the URLs stored in the cache.

Let's say you visited a dashboard that has a chart 1 and an extra filter name=dave. Your browser cache would look somewhat like this (using a pseudo-URL):

cache = {
  'http://example.com/?slice_id=1&name=dave': Response(...),
}

Now you go to the Explore view for that chart. When the page loads, it will do a GET and cache the request, so now your cache looks like this:

cache = {
  'http://example.com/?slice_id=1&name=dave': Response(...),
  'http://example.com/?slice_id=1': Response(...),
}

You decide to change some parameters in chart 1. You click "Run query", which does a POST request. You save the chart.

Without cache invalidation, the following would happen:

  1. You visit the dashboard again. You notice the chart is different than what you saved, because you're seeing the cached version.
  2. You click "Force refresh". This does a POST request, and you see the new chart. You're happy.
  3. You visit the dashboard again in the same day, before the cache expiration. You see the old chart again, because it's still cached in the browser and in the server. You're confused and sad.

With cache invalidation, the following will happen:

  1. You click "Run query" while exploring chart 1.
  2. This invalidates all the responses stored on the client that reference that chart. This means your browser cache would now be empty.
  3. On the server side we can't iterate over keys, so only the http://example.com/?slice_id=1 response would be invalidated.
  4. You save the chart.
  5. You visit the dashboard again. The browser cache is empty, so it does a GET request for chart 1 with the extra filter (http://example.com/?slice_id=1&name=dave).
  6. This is a cache hit on the server, so it returns old data.
  7. You click "Force refresh". This does a POST request, which returns the new chart and invalidates http://example.com/?slice_id=1&name=dave in the server. You're happy.
  8. You visit the dashboard again in the same day, before the cache expiration. The browser does a GET request, misses caches, and shows the new chart. You're happy, and everything is cached now, both on the server and on the client.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed response. Sorry if I'm misreading, but in the examples you gave above with ?slice_id=1 wouldn't this skip both of those keys because they do not contain form_data?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DiggidyDave, sorry for the confusion, I used a pseudo-URL in my examples to make it smaller. The full URL is actually /superset/explore_json/?form_data={"slice_id":1}.

@codecov-io
Copy link

codecov-io commented Apr 19, 2019

Codecov Report

Merging #7319 into lyftga will increase coverage by 0.08%.
The diff coverage is 62.5%.

Impacted file tree graph

@@            Coverage Diff             @@
##           lyftga    #7319      +/-   ##
==========================================
+ Coverage   64.58%   64.67%   +0.08%     
==========================================
  Files         425      426       +1     
  Lines       20884    20925      +41     
  Branches     2297     2305       +8     
==========================================
+ Hits        13488    13533      +45     
+ Misses       7270     7266       -4     
  Partials      126      126
Impacted Files Coverage Δ
superset/utils/decorators.py 87.93% <100%> (+0.89%) ⬆️
superset/assets/src/chart/chartAction.js 46.8% <11.11%> (-2.44%) ⬇️
superset/views/core.py 74.55% <76.47%> (-0.02%) ⬇️
...rset/assets/src/explore/reducers/exploreReducer.js 31.57% <0%> (-4.14%) ⬇️
superset/assets/src/chart/Chart.jsx 12.5% <0%> (-0.66%) ⬇️
superset/assets/src/explore/controls.jsx 42.06% <0%> (-0.34%) ⬇️
...ts/src/explore/components/ExploreViewContainer.jsx 33.79% <0%> (-0.25%) ⬇️
superset/assets/src/chart/ChartRenderer.jsx 7.57% <0%> (ø) ⬆️
...rset/assets/src/dashboard/components/Dashboard.jsx 89.28% <0%> (ø) ⬆️
...et/assets/src/explore/components/controls/index.js 100% <0%> (ø) ⬆️
... and 7 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5dab983...1895dff. Read the comment docs.

@DiggidyDave
Copy link
Contributor

I tried adding Cypress integration tests, but for some reason I get an error when writing to the Cache API (it works fine in Chrome) from an integration test:

Instead of end-to-end UI test for this, can we add some real unit tests and just mock the relevant storage/cache APIs? I think this needs some pretty robust testing, I'm afraid it is pretty tightly coupled to how URLs are structured and other things that could someday change and silently break this feature in an undiscoverable, hard-to-debug way.

@betodealmeida
Copy link
Member Author

@DiggidyDave I'll do that.

@graceguo-supercat
Copy link

graceguo-supercat commented Apr 22, 2019

For cypress test, you can stubbing response to avoid hitting real Cache API call (and avoid TypeError):
https://docs.cypress.io/api/commands/route.html#With-Stubbing

@DiggidyDave
Copy link
Contributor

DiggidyDave commented Apr 22, 2019

Cypress is for end to end UX testing, though is it not? This is definitely much more suitable for fast, crisp unit tests. (e2e ux tests should be rare.)

@DiggidyDave
Copy link
Contributor

w.r.t. my concerns about coupling this with URL structure etc: will this be affected by issues @mistercrunch raised over here? Just wanted to make sure.

@stale
Copy link

stale bot commented Jun 22, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
change:backend Requires changing the backend enhancement:request Enhancement request submitted by anyone from the community inactive Inactive for >= 30 days lyft Related to Lyft
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants