[FIX] Dataset query to get only the latest facet for each version #2859
Signed-off-by: sophiely <ly.sophie200@gmail.com>
Codecov Report
All modified and coverable lines are covered by tests ✅

@@            Coverage Diff            @@
##               main    #2859   +/-   ##
=========================================
  Coverage     84.75%   84.75%
  Complexity     1456     1456
=========================================
  Files           253      253
  Lines          6566     6566
  Branches        305      305
=========================================
  Hits           5565     5565
  Misses          850      850
  Partials        151      151
Amazing, thanks! Do we have a similar problem elsewhere, or is this the only instance of it that you have seen?
df.facet,
df."name",
df.created_at,
rank() OVER (PARTITION BY df.dataset_version_uuid, "name"
❤️
LGTM 💯 🚀 🔥
For now I don't think so, but ofc I will let you know if I notice similar issues :)
Problem
Closes: #2860
Solution
Since the same facet type is replicated many times, we can rank the facets within each partition of dataset version and facet name, and then take only the most recent facet for each dataset version UUID and facet type.
The UI seems to display only one facet per type (facet name) and dataset version anyway, so we don't need to query as many facets (the extra rows are just duplicates).
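As a rough illustration of the ranking idea (not the actual query in this PR), here is a minimal sketch that runs against an in-memory SQLite database. The table name `dataset_facets`, the sample rows, and the `ORDER BY df.created_at DESC` clause are assumptions made for the example; only the partitioning by `dataset_version_uuid` and `name` is taken from the diff.

```python
import sqlite3

# In-memory database with a hypothetical, simplified facets table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dataset_facets (
    dataset_version_uuid TEXT,
    name TEXT,
    facet TEXT,
    created_at TEXT
);
INSERT INTO dataset_facets VALUES
    ('v1', 'schema',    '{"fields": 1}',  '2024-01-01T00:00:00'),
    ('v1', 'schema',    '{"fields": 2}',  '2024-01-02T00:00:00'),
    ('v1', 'ownership', '{"owner": "a"}', '2024-01-01T00:00:00'),
    ('v2', 'schema',    '{"fields": 3}',  '2024-01-03T00:00:00');
""")

# Rank facets inside each (dataset version, facet name) partition,
# newest first (assumed ordering), then keep only rank 1.
LATEST_FACETS = """
SELECT dataset_version_uuid, name, facet
FROM (
    SELECT df.dataset_version_uuid,
           df.name,
           df.facet,
           rank() OVER (PARTITION BY df.dataset_version_uuid, df.name
                        ORDER BY df.created_at DESC) AS r
    FROM dataset_facets df
) ranked
WHERE r = 1
ORDER BY dataset_version_uuid, name
"""

rows = conn.execute(LATEST_FACETS).fetchall()
for row in rows:
    print(row)
```

Note that `rank()` returns more than one row per partition when two facets share the same `created_at`; if exactly one row is required, `row_number()` might be a safer choice.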
Checklist
CHANGELOG.md (depending on the change, this may not be necessary)
.sql database schema migration according to Flyway's naming convention (if relevant)