[schema] Updating the tables schema #5449
b0f8589
to
6de006c
Compare
Codecov Report
@@ Coverage Diff @@
## master #5449 +/- ##
==========================================
+ Coverage 59.11% 59.13% +0.01%
==========================================
Files 372 372
Lines 23756 23754 -2
Branches 2758 2758
==========================================
+ Hits 14044 14046 +2
+ Misses 9697 9693 -4
Partials 15 15
Continue to review full report at Codecov.
6de006c
to
08b8821
Compare
superset/connectors/sqla/models.py
Outdated
table_name = Column(String(250))
table_name = Column(String(127))
I think this will cause problems. When you create new "Tables" in SQL Lab it will autogenerate very long table names.
@fabianmenges unless the database or tab names are significantly long I suspect that this shouldn't generate very long table names.
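Whether narrowing the column is safe for an existing deployment can be checked before migrating. The following is a minimal sketch, assuming a SQLite metadata database and a simplified `tables` layout; the helper name `names_longer_than` and the sample rows are hypothetical and not part of this PR.

```python
import sqlite3


def names_longer_than(conn, limit=127):
    """Return (id, table_name) rows whose name would not fit in String(limit)."""
    return conn.execute(
        "SELECT id, table_name FROM tables "
        "WHERE LENGTH(table_name) > ? ORDER BY id",
        (limit,),
    ).fetchall()


# Simplified stand-in for the real metadata table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tables (id INTEGER PRIMARY KEY, table_name VARCHAR(250))"
)
conn.executemany(
    "INSERT INTO tables (table_name) VALUES (?)",
    [("birth_names",), ("x" * 200,)],  # second name is artificially long
)

too_long = names_longer_than(conn)
print(too_long)
```

If the list is empty for a given deployment, no existing name would be affected by the shorter limit; new auto-generated SQL Lab names would still need to stay under it.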
@@ -243,17 +244,13 @@ class TableModelView(DatasourceModelView, DeleteMixin, YamlExportMixin): # noqa
'is_sqllab_view': _('SQL Lab View'),
'template_params': _('Template parameters'),
}
validators_columns = {
What about SQL Lab tables?
@fabianmenges could you be more specific? It potentially seems wrong to use a period in a datasource name. Based on the logic for auto-generating the SQL Lab table name it should only contain a period if the username, database name, or tab name contain a period.
08b8821
to
25b3b50
Compare
Quick question: when will this PR be merged? I saw that it has been open for more than 2 months.
I made the point in #6718 that we may want to allow multiple datasources that point to the same physical table in the database. The concept of ownership effectively allows a user to "landgrab" a datasource and control it. It's also possible that an employee who has left the company is the owner of a datasource, in which case you have to ask an admin to grant you ownership of that table. Or if I want to add a metric to a table, I have to ask the owner to make me an owner as well. Also, if a table has many owners, it's likely that one person will break someone else's chart or dashboard. On the other hand, having many datasources around the
Thanks for looking at my PR #6718 which made me look at yours :-)
# Add the missing uniqueness constraint.
with op.batch_alter_table('tables', naming_convention=conv) as batch_op:
    batch_op.create_unique_constraint(
I noticed that when I ran this op.batch_alter_table alone, it didn't really remove the existing unique constraint. Also, '.schema' on the table showed that a new unique constraint wasn't really created. Usually alembic creates a temp table and copies the old table to it, but it didn't do so in this case. So I am curious whether you confirmed that this unique constraint is actually being created for SQLite.
Also, are you aware of this very old diff, 15b67b2 (from the days of caravel), that already adds the schema you are shooting for?
@agrawaldevesh this may be because in some deployments there may have been a uniqueness constraint on `table_name`. I'll update the migration logic.
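One way to confirm, independently of alembic, whether SQLite actually holds the constraint is to inspect the unique indexes it reports for the table. This is a minimal sketch using only the standard library; the helper `unique_column_sets` and the simplified table layout are assumptions for illustration, not the migration code from this PR.

```python
import sqlite3


def unique_column_sets(conn, table):
    """Return the column lists of every UNIQUE index SQLite reports for `table`."""
    column_sets = []
    # PRAGMA index_list rows start with (seq, name, unique, ...).
    for row in conn.execute(f"PRAGMA index_list('{table}')").fetchall():
        index_name, is_unique = row[1], row[2]
        if is_unique:
            # PRAGMA index_info rows are (seqno, cid, column_name).
            info = conn.execute(f"PRAGMA index_info('{index_name}')").fetchall()
            column_sets.append([r[2] for r in info])
    return column_sets


# Simplified stand-in for the migrated table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tables (
        id INTEGER PRIMARY KEY,
        database_id INTEGER NOT NULL,
        "schema" VARCHAR(255),
        table_name VARCHAR(250) NOT NULL,
        UNIQUE (database_id, "schema", table_name)
    )
    """
)

constraints = unique_column_sets(conn, "tables")
print(constraints)
```

Running the same helper against a real metadata database before and after the migration would show whether an older constraint on `table_name` alone is still present and whether the three-column constraint was added.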
batch_op.drop_constraint(
    generic_find_uq_constraint_name(
        'tables',
        {'database_id', 'schema', 'table_name'},
One of the issues with this schema uniqueness is the pesky 'sql' column on the table: Effectively we would ideally like to have two different views of the same table, both with different sql but with the same <database, schema, table_name>.
I think the issue here is that if there were two different views of the same table how would one differentiate between them as datasources. Note in a database construct view names need to be unique.
I didn't mean real 'SQL views'. I meant having the same table added twice, but with different 'sql', like so:
Table foo with sql: 'select foo.a, foo.b, count(1) from foo group by a, b'.
And another Table foo with sql: 'select foo.*, some_function() from foo'
Personally they feel like different views and thus should be represented as different entities in the database table. If they're both called `foo` it's not apparent to the user which one the datasource is referring to and thus this would lead to confusion.
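The trade-off discussed in this thread can be made concrete: with a uniqueness constraint on (database_id, schema, table_name), a second datasource for the same table is rejected even when its 'sql' differs. The sketch below assumes SQLite and a simplified `tables` layout; `add_datasource` is a hypothetical helper, and the two queries are the ones quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tables (
        id INTEGER PRIMARY KEY,
        database_id INTEGER NOT NULL,
        "schema" VARCHAR(255),
        table_name VARCHAR(250) NOT NULL,
        sql TEXT,
        UNIQUE (database_id, "schema", table_name)
    )
    """
)


def add_datasource(conn, database_id, schema, table_name, sql):
    """Insert a row; return False if the uniqueness constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                'INSERT INTO tables (database_id, "schema", table_name, sql) '
                "VALUES (?, ?, ?, ?)",
                (database_id, schema, table_name, sql),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = add_datasource(
    conn, 1, "main", "foo",
    "select foo.a, foo.b, count(1) from foo group by a, b",
)
second = add_datasource(
    conn, 1, "main", "foo",
    "select foo.*, some_function() from foo",
)
print(first, second)
```

Under this constraint the second definition would need its own distinct name, which matches the position that the two should be represented as different entities.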
c303533
to
af3fcfe
Compare
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue
Sadly yes. I think we never resolved how to implement the ideal solution for the various Superset metadata databases.
❗ Please consider rebasing your branch to avoid db migration conflicts.
I'm pretty surprised it hasn't fallen more out of sync :)
Closing in favor of #15909.
We've noticed a number of anomalies in our database caused by ill-defined forms and/or table schema definitions. This PR resolves a number of issues related to the `tables` table, including:

- The `table_name` column is non-nullable.

Note this migration will fail if the `table_name` column is NULL. One must manually fix these records, as programmatically trying to remedy them is difficult: the intent is unclear, and the tables may still function (from a query standpoint) if SQL is provided. The following query determines which records are problematic:

Finally, this migration will fail if the `tables` table is corrupt in terms of uniqueness. One must manually consolidate duplicate or invalid records, given there's no way of programmatically removing invalid records. The following query determines whether there are duplicates:

Note this PR is gated by #5445 and #7084, which ensure that empty strings associated with form-data won't persist in the database and are necessary for ensuring that the relevant entries are non-NULL.
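The two queries referred to above are not preserved in this copy of the description. As an illustration only, checks of this kind could look like the sketch below, which assumes SQLite, a simplified `tables` layout, and that uniqueness is defined over (database_id, schema, table_name) as in the migration diff; these are not the author's original queries.

```python
import sqlite3

# Records with a NULL table_name (illustrative query, not the original).
NULL_NAMES = "SELECT id FROM tables WHERE table_name IS NULL ORDER BY id"

# Groups that would violate the three-column uniqueness (illustrative).
DUPLICATES = """
    SELECT database_id, "schema", table_name, COUNT(*) AS n
    FROM tables
    GROUP BY database_id, "schema", table_name
    HAVING COUNT(*) > 1
    ORDER BY database_id
"""

# Simplified pre-migration table with made-up rows (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tables (
        id INTEGER PRIMARY KEY,
        database_id INTEGER,
        "schema" VARCHAR(255),
        table_name VARCHAR(250)
    )
    """
)
conn.executemany(
    'INSERT INTO tables (database_id, "schema", table_name) VALUES (?, ?, ?)',
    [
        (1, "main", "foo"),
        (1, "main", "foo"),   # duplicate of the first row
        (1, "main", None),    # NULL table_name
        (2, None, "bar"),
        (2, None, "bar"),     # duplicate with a NULL schema
    ],
)

null_ids = [row[0] for row in conn.execute(NULL_NAMES)]
duplicates = conn.execute(DUPLICATES).fetchall()
print(null_ids)
print(duplicates)
```

Both result sets would need to be empty before the migration can be expected to succeed; any rows returned have to be fixed or consolidated by hand.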
to: @fabianmenges @graceguo-supercat @michellethomas @mistercrunch @timifasubaa