Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WiP] Shorten URL for explore #3795

Closed
graceguo-supercat opened this issue Nov 7, 2017 · 4 comments · Fixed by #3993
Closed

[WiP] Shorten URL for explore #3795

graceguo-supercat opened this issue Nov 7, 2017 · 4 comments · Fixed by #3993
Assignees

Comments

@graceguo-supercat
Copy link

graceguo-supercat commented Nov 7, 2017

Currently Superset uses stateful url for explore chart. We append all query parameters in GET request. When user open a slice, or change one control to make a new query, Superset make a new GET request. Superset also provided a shorten url function, but this shorten url is only generated per user request (click a button from UI). Recently we see many failed requests within airbnb, when GET request's parameters are too many too long, which cause chrome browser throw 400 Bad Request error.

I am trying to provide a solution for this problem:

  1. For each explore view request, we generate a cache_key for it, which is the MD5 hash of sorted query parameters. The slice/explore URL will be like:
    http://localhost:8088/superset/explore/table/313/92bb019eda2cfb5ffc8a33cb7bae733f
    Currently we already stored form_data as part of cache content with cache_key.

  2. Server-side parses cache_key to get parameters, and pass it to front-end as part of bootstrap data.

  3. In front-end, when user interact with controls, or update markdown slice, the arbitrary amount of query parameters, as well as the content of Markdown slice, will be POST to the server in the body of the request message. see [explore/dashboard] use POST instead of GET to fetch explore_json #2886.

  4. When generating dashboard bootstrap data, we should use shorten format as slice URL.

  5. Explore view will still accept query parameters in GET request. These url parameters will be merged with parameters hash.

@graceguo-supercat graceguo-supercat self-assigned this Nov 7, 2017
@graceguo-supercat graceguo-supercat changed the title WiP Shorten URL for explore view [WiP] Shorten URL for explore view Nov 7, 2017
@graceguo-supercat graceguo-supercat changed the title [WiP] Shorten URL for explore view [WiP] Shorten URL for explore Nov 7, 2017
@mistercrunch
Copy link
Member

We mentioned a temporary solution at the meeting, @fabianmenges what did you use for compression? lz-strings?
http://pieroxy.net/blog/pages/lz-string/index.html

@fabianmenges
Copy link
Contributor

fabianmenges commented Nov 8, 2017

I tried to use brotli but couldn't get it to work, the npm package seemed to be out of date and broken (can't remember which one that was). I ended up using lz-string as you suggested.

This is what I did in the swivel index.jsx, which keeps everything backwards compatible:

import { decompressFromBase64 } from 'lz-string';

let formData;
if (bootstrapData.lz_form_data) {
  formData = JSON.parse(decompressFromBase64(bootstrapData.lz_form_data));
} else if (bootstrapData.form_data) {
  formData = JSON.parse(bootstrapData.form_data);
} else {
  formData = {};
}

And the in the url generation in exploreUtils.js I added something like this:

import { compressToBase64 } from 'lz-string';
...
export function trimFormData(formData) {
  const cleaned = { ... formData };
  Object.entries(formData).forEach(([k, v]) => {
    if (v === null || v === undefined || v === false) {
      delete cleaned[k];
    }
  });
  return cleaned;
}

export function getExploreUrl(form_data, endpointType = 'base', force = false,...) {
  ...
  search.lz_form_data = compressToBase64(JSON.stringify(trimFormData(form_data)));
  ...
}

I didn't make any adaptions to the python side (for Swivel).
I can try to submit a PR early next week for making this work with the explore view, let me know.

@rumbin
Copy link
Contributor

rumbin commented Nov 8, 2017

I am concerned whether this won't clutter up the url table fairly quickly. Or did I miss a point here?

Wouldn't it suffice to store the query parameters in the cache backend? Of course, the cache timeout for these entries should then be set to quite a long time, though...

@graceguo-supercat
Copy link
Author

graceguo-supercat commented Nov 8, 2017

@rumbin I have same concern. I store this key in url table because I want to same some decoding time for each shorten URL request. For example, every time when request http://localhost:8088/superset/explore/table/313/?query=92bb019eda2cfb5ffc8a33cb7bae733f
coming in, i have to decode 92bb019eda2cfb5ffc8a33cb7bae733f to find the form_params are { a:1, b:2, c:true ...}.

Fabian and Max had a suggestion: we compress all form_data from front-end, and send compressed string as a request parameter. I assume in server-side we still need to decompress this lz_form_data parameter, so there is a similar cost for this solution as well.

I saw we had a payload data (which has charting data, form_data, etc) in cache, I probably just need to figure out how to get form_data from cache. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants