FilterBox visualisation runs one extra query #4249

Closed
3 tasks done
IanSavchenko opened this issue Jan 19, 2018 · 1 comment · Fixed by #4276

Comments

@IanSavchenko

Make sure these boxes are checked before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if any
  • I have reproduced the issue with at least the latest released version of superset
  • I have checked the issue tracker for the same issue and I haven't found one similar

Superset version

0.22.1

Expected results

For the FilterBox visualization (widget), Superset should run one query per filter to get the possible filter values.

Actual results

Superset runs N + 1 queries, where N is the number of filters in the widget (so two queries for a single filter).

Steps to reproduce

Create a FilterBox and run the query. You will see one extra query on the DB end. This is easy to reproduce with the default dashboard "World's Banks Data" and its existing FilterBox.

This is an issue for us because one of our FilterBox queries takes quite a long time (around a minute), and that time is effectively doubled because at least two queries are run.

I managed to find the source of the issue: file superset/viz.py, lines 276 and 278. See the code and my comments:

# first query executed here by design,
# but its results are actually ignored in the FilterBoxViz subclass method `get_data`
df = self.get_df()
if not self.error_message:
    # N queries are executed here
    data = self.get_data(df)

Here, in the subclass FilterBoxViz (lines 1532-1564):

# df not used here!
def get_data(self, df):
    qry = self.query_obj()
    filters = [g for g in self.form_data['groupby']]
    d = {}
    for flt in filters:
        qry['groupby'] = [flt]

        # N "legit" queries are executed in the loop here
        df = super(FilterBoxViz, self).get_df(qry)
        d[flt] = [{
            'id': row[0],
            'text': row[0],
            'filter': flt,
            'metric': row[1]}
            for row in df.itertuples(index=False)
        ]
    return d
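
The N + 1 pattern in the two excerpts above can be reproduced without Superset. The following is a minimal toy model, not Superset's actual code: `CountingDatasource` is a hypothetical stand-in for the database, `get_payload` is just the name this sketch gives to the calling method, and plain lists of tuples replace the pandas DataFrame.

```python
class CountingDatasource:
    """Hypothetical stand-in for the database; it only counts queries."""

    def __init__(self):
        self.query_count = 0

    def query(self, query_obj):
        self.query_count += 1
        # Return two fake (value, metric) rows for the first grouped column.
        column = query_obj["groupby"][0]
        return [(f"{column}_a", 10), (f"{column}_b", 20)]


class BaseViz:
    """Toy model of the base-class call pattern quoted above."""

    def __init__(self, datasource, form_data):
        self.datasource = datasource
        self.form_data = form_data

    def query_obj(self):
        return {"groupby": list(self.form_data["groupby"])}

    def get_df(self, query_obj=None):
        return self.datasource.query(query_obj or self.query_obj())

    def get_payload(self):
        df = self.get_df()        # base query, always executed
        return self.get_data(df)  # a subclass may run more queries here

    def get_data(self, df):
        return df


class FilterBoxViz(BaseViz):
    def get_data(self, df):
        # df is ignored; one query per filter column is issued instead.
        qry = self.query_obj()
        data = {}
        for flt in self.form_data["groupby"]:
            qry["groupby"] = [flt]
            rows = super().get_df(qry)
            data[flt] = [
                {"id": value, "text": value, "filter": flt, "metric": metric}
                for value, metric in rows
            ]
        return data


datasource = CountingDatasource()
viz = FilterBoxViz(datasource, {"groupby": ["region", "country"]})
payload = viz.get_payload()
print(datasource.query_count)  # 3: one base query plus one per filter
```

With two filters the counter ends at 3, which matches the N + 1 behaviour described in this report.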

I'm not a Python dev and I'm not really sure how to fix this. I would override some other methods in the FilterBoxViz subclass, but since get_df is used in other methods such as get_csv (which must also be broken for FilterBox right now, btw), I don't know what the right design is. If nobody steps in, I will try to make a PR, but this fix should be trivial for those who know this code.
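
One possible design, sketched here only as an illustration and not necessarily what the eventual fix does, is to let a subclass opt out of the base query. In this toy model the `runs_base_query` flag, the `get_payload` method name, and `CountingDatasource` are all hypothetical; none of them are taken from Superset.

```python
class CountingDatasource:
    """Hypothetical stand-in for the database; it only counts queries."""

    def __init__(self):
        self.query_count = 0

    def query(self, query_obj):
        self.query_count += 1
        column = query_obj["groupby"][0]
        return [(f"{column}_a", 10), (f"{column}_b", 20)]


class BaseViz:
    # Hypothetical flag: subclasses set it to False to skip the base query.
    runs_base_query = True

    def __init__(self, datasource, form_data):
        self.datasource = datasource
        self.form_data = form_data

    def query_obj(self):
        return {"groupby": list(self.form_data["groupby"])}

    def get_df(self, query_obj=None):
        return self.datasource.query(query_obj or self.query_obj())

    def get_payload(self):
        # Only hit the database up front when the subclass needs the result.
        df = self.get_df() if self.runs_base_query else None
        return self.get_data(df)

    def get_data(self, df):
        return df


class FilterBoxViz(BaseViz):
    runs_base_query = False  # get_data issues its own per-filter queries

    def get_data(self, df):
        qry = self.query_obj()
        data = {}
        for flt in self.form_data["groupby"]:
            qry["groupby"] = [flt]
            rows = super().get_df(qry)
            data[flt] = [
                {"id": value, "text": value, "filter": flt, "metric": metric}
                for value, metric in rows
            ]
        return data


datasource = CountingDatasource()
viz = FilterBoxViz(datasource, {"groupby": ["region", "country"]})
payload = viz.get_payload()
print(datasource.query_count)  # 2: exactly one query per filter
```

In a design like this, any other caller of get_df (get_csv, for example) would presumably need to respect the same flag, which is the part that requires knowledge of the real code base.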

mistercrunch added a commit to mistercrunch/superset that referenced this issue Jan 24, 2018
@mistercrunch
Member

Thanks for reporting this and providing all the detail. Fix is here: #4276

mistercrunch added a commit that referenced this issue Jan 25, 2018
michellethomas pushed a commit to michellethomas/panoramix that referenced this issue May 24, 2018
wenchma pushed a commit to wenchma/incubator-superset that referenced this issue Nov 16, 2018