fix: count(distinct column_name) in metrics #19842
Codecov Report
@@ Coverage Diff @@
## master #19842 +/- ##
=======================================
Coverage 66.55% 66.55%
=======================================
Files 1692 1692
Lines 64802 64804 +2
Branches 6657 6657
=======================================
+ Hits 43129 43131 +2
Misses 19973 19973
Partials 1700 1700
Thanks for fixing! I propose making a minor refactor of the code to make it more approachable for new developers, other than that LGTM 👍
  const column =
-   useVerboseName && this.column?.verbose_name
+   params.useVerboseName && this.column?.verbose_name
      ? `(${this.column.verbose_name})`
      : this.column?.column_name
      ? `(${this.column.column_name})`
      : '';
+ // transform from `count_distinct(column)` to `count(distinct column)`
+ if (
+   params.transformCountDistinct &&
+   aggregate === AGGREGATES.COUNT_DISTINCT &&
+   /^\(.*\)$/.test(column)
+ ) {
+   return `COUNT(DISTINCT ${column.slice(1, -1)})`;
+ }
I had to reread these lines a few times to understand what was going on (mostly there from before this PR so not your fault!). IMO it would be more readable if we could first do something like
const column =
  params.useVerboseName && this.column?.verbose_name
    ? this.column.verbose_name
    : this.column?.column_name
    ? this.column.column_name
    : '';
and then something like
if (params.transformCountDistinct && aggregate === AGGREGATES.COUNT_DISTINCT) {
  return `COUNT(DISTINCT ${column})`;
}
return `${aggregate}(${column})`;
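For reference, the suggestion above could be sketched as a standalone function. This is only an illustration under assumptions: `buildLabel`, `Column`, `LabelParams`, and the trimmed-down `AGGREGATES` map are hypothetical stand-ins, not the actual AdhocMetric code in this PR.

```typescript
// Hypothetical stand-ins for the AdhocMetric internals; names and shapes are assumptions.
const AGGREGATES = {
  COUNT: 'COUNT',
  COUNT_DISTINCT: 'COUNT_DISTINCT',
  SUM: 'SUM',
} as const;

type Aggregate = (typeof AGGREGATES)[keyof typeof AGGREGATES];

interface Column {
  column_name?: string;
  verbose_name?: string;
}

interface LabelParams {
  useVerboseName?: boolean;
  transformCountDistinct?: boolean;
}

function buildLabel(
  aggregate: Aggregate,
  column?: Column,
  params: LabelParams = {},
): string {
  // Resolve the bare column name first, without parentheses.
  const name =
    params.useVerboseName && column?.verbose_name
      ? column.verbose_name
      : (column?.column_name ?? '');

  // No column selected: return the aggregate alone rather than `COUNT()`.
  if (name === '') {
    return aggregate;
  }

  // Emit `COUNT(DISTINCT column)` instead of `COUNT_DISTINCT(column)` when requested.
  if (params.transformCountDistinct && aggregate === AGGREGATES.COUNT_DISTINCT) {
    return `COUNT(DISTINCT ${name})`;
  }

  return `${aggregate}(${name})`;
}
```

Note that the empty-name guard is an addition of this sketch: it keeps the existing behavior of returning no parentheses when no column is set, which the regex check in the current diff also protects against.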
The previous logic may not be covered by unit tests, so I deliberately left these lines alone. I think it would be better to add unit tests for this part of the code first and then refactor it all together.
Of course, if this needs to be done in this PR, I'm all for it. What do you think about this?
Agreed, if you think this is dangerous to refactor let's do it in a separate PR 👍
Lgtm! 1 suggestion - should we rename COUNT_DISTINCT to COUNT DISTINCT on our list of aggregates in Simple tab?
Thanks @kgabryje! the |
(cherry picked from commit 25e572a)
🏷️ preset:2022.17 |
SUMMARY
The aggregate function count_distinct isn't ANSI SQL. Currently, when the user creates a simple metric, the AdhocMetric control generates count_distinct(column) as the label. If the user switches to the SQL tab, this label is used as the SQL expression. Because it isn't ANSI SQL, using this snippet directly causes errors on many databases.
This PR transforms the count distinct metric from count_distinct(column) to count(distinct column), but does not change the original metric label (verbose name).
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
After
count.distinct.mov
TESTING INSTRUCTIONS
1. Open the birth_name dataset.
2. Select count_distinct as AGGREGATE and num as COLUMN in the metric popover.
3. Open the count_distinct metric and switch to custom SQL; verify that the SQL is COUNT(DISTINCT num) but the label is still count_distinct(num).
4. Edit COUNT(DISTINCT num) in custom SQL and verify that the label has been changed.
ADDITIONAL INFORMATION