-
Notifications
You must be signed in to change notification settings - Fork 14.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: handle subtle bug with load-examples #16052
feat: handle subtle bug with load-examples #16052
Conversation
Codecov Report
@@ Coverage Diff @@
## master #16052 +/- ##
==========================================
- Coverage 76.90% 76.66% -0.25%
==========================================
Files 995 995
Lines 52842 52869 +27
Branches 6709 6712 +3
==========================================
- Hits 40640 40533 -107
- Misses 11976 12110 +134
Partials 226 226
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
(cherry picked from commit 69c5cd7)
SUMMARY
This PR solves a bug caused by a number of small errors that separate are harmless:
Bug 1. When running
load-examples
the datasets created from configs (https://github.com/apache/superset/tree/master/superset/examples/configs/datasets/examples) were loaded with schema set to null. The bug was fixed in #16041, but there are many datasets in the wild withNULL
as their schema.Bug 2. When following the dashboard creation tutorial the user is instructed to create a dataset called
cleaned_sales_data
in thepublic
schema. This results in two datasets with the same name:[NULL].cleaned_sales_data
with UUID A (added without a schema byload-examples
)public.cleaned_sales_data
with UUID B (added with a schema by the user)This shouldn't be possible, because
tables
had a uniqueness constraint ofdatabase_id, table_name
at the time:superset/superset/connectors/sqla/models.py
Line 486 in a786373
But the logic was enforced by the application, not the DB.
If the user now runs
load-examples
again (with the fix from #16041) we'll try to import the dataset with the schema,public.cleaned_sales_data
, with the same UUID ("A"). The helperimport_from_dict
will then run a query similar to this to check for uniqueness:And this returns both datasets.
A solution would be to delete all the datasets that have schema set to
NULL
(assuming they should have one), and then runload-examples
again. But this would overwrite any custom datasets created by users.Instead, I changed the
load-examples
script to skip an import when duplicates are found. This should affect a small number of datasets, and since we have made no changes to thecleaned_sales_data
dataset it's fine to skip it.BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A
TESTING INSTRUCTIONS
To replicate:
load-examples
with a SHA before fix: set correct schema on config import #16041 was merged.cleaned_sales_data
in a given schema.cleaned_sales_data
, one with schema and another without.load-examples
again, it should work.ADDITIONAL INFORMATION