Schema viewer enhancements. #2990

Closed
wants to merge 6 commits into from


@emtwo emtwo commented Oct 19, 2018

This is a prototype of where additional schema info might go. It's a follow-up for some discussion in mozilla#376 (comment)

Essentially each table would have a "?" icon that you can click on to get more information about that table's schema.

In the future, this drawer could hold more information, such as table or column descriptions, or a link to sample queries for the table that are entered on the data source page.

Here are a couple of screenshots:

[Screenshot: 2018-10-19, 10:25:11 AM]

[Screenshot: 2018-10-19, 10:25:31 AM]

Author

emtwo commented Oct 19, 2018

Some notes/thoughts:

  • @arikfr I know you mentioned using data type icons, I'm wondering if you feel they're still necessary in this table format since now there is more room to write the full type names?

  • I only wrote this for postgres but each data source type will need to have custom code added to gather this extra metadata. Perhaps we could start with landing just a couple of them? For now it should just not display the "?" icon if no extra metadata exists.

  • For now the data sample query is quite naive: "select * from table limit 1". Perhaps this could be the default, with an option for the data source creator to change it to a custom query.

Aside from these potential improvements, I feel like this PR as it is so far is already useful (maybe just add some tests and a couple more data sources) and we can land it then iterate with further improvements. @arikfr what do you think?

@emtwo emtwo force-pushed the emtwo/schema branch 2 times, most recently from 4f9ff8e to 294df87 Compare October 19, 2018 15:38
Member

arikfr commented Oct 19, 2018

> @arikfr I know you mentioned using data type icons, I'm wondering if you feel they're still necessary in this table format since now there is more room to write the full type names?

I think they are still necessary to show this data in the schema browser, so the user doesn't have to open the drawer every time to see this data.

> I only wrote this for postgres but each data source type will need to have custom code added to gather this extra metadata. Perhaps we could start with landing just a couple of them? For now it should just not display the "?" icon if no extra metadata exists.

Yes, it totally makes sense that not all data sources will support this. We can make a generic "SQL data source analyzer" that we apply to all the SQL-based data sources, but we still need to handle data sources that don't support it.

> Aside from these potential improvements, I feel like this PR as it is so far is already useful (maybe just add some tests and a couple more data sources) and we can land it then iterate with further improvements. @arikfr what do you think?

Yes, it definitely doesn't need to change much further to land. Even the icon support is something we can introduce later.

The only thing we should put some thought into is the new data structure the get_schema method will return (related issue: #2553). At first I thought that we shouldn't have a separate metadata object, but just merge it with columns. But now I realize that keeping it separate has the benefit of not having to change any existing code. The only thing I would consider is having only one of them per data source, so new data sources will have only metadata while old ones will have only columns.
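
To make the two shapes concrete, here is a minimal sketch of what a table entry might look like in each case. The field names inside the metadata entries (`name`, `type`) and the helper `column_names` are assumptions for illustration only; the actual structure returned by get_schema is exactly what is still being decided here.

```python
# Hypothetical shapes for one table entry returned by get_schema.
# The keys inside each "metadata" entry are assumptions, not the PR's format.
old_style_table = {
    "name": "users",
    "columns": ["id", "email"],
}

new_style_table = {
    "name": "orders",
    "metadata": [
        {"name": "id", "type": "integer"},
        {"name": "total", "type": "numeric"},
    ],
}


def column_names(table):
    """Return column names from whichever key the table entry provides."""
    if "columns" in table:
        return list(table["columns"])
    # Fall back to metadata when columns does not exist.
    return [entry["name"] for entry in table.get("metadata", [])]
```

With only one of the two keys present per table, consumers need a single fallback like this rather than logic to reconcile both.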

Collaborator

kocsmy commented Oct 19, 2018

Thanks for this, @emtwo

It feels redundant to have 10, 25, or 50 "?" icons, one next to each item. How about hiding it just like the >> icon, so we only show it on hover? I believe that'd be a more elegant design for this feature.

The type icons are something that can stay visible at all times.

@emtwo emtwo force-pushed the emtwo/schema branch 2 times, most recently from 0a9f113 to 2a2edb3 Compare October 29, 2018 17:11
Author

emtwo commented Oct 29, 2018

@arikfr, I've made the following updates/changes:

  • Use table.metadata when table.columns does not exist. This means that future data sources can have only metadata and this will work.

  • I added the method _get_table_sample to BaseQueryRunner as a general helper. Five data sources use it to start: presto, mysql, athena, pg, and redshift.

  • Added some tests

  • Made the ? button only visible on hover, similar to >>:

[Screenshot: 2018-10-29, 12:55:00 PM]
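
As a rough illustration of the shared helper mentioned above, here is a minimal sketch. The class below is a stand-in, not the real BaseQueryRunner: the `sample_query` attribute, the `_get_table_sample` signature, and the assumption that `run_query` returns a `(json_data, error)` pair are all hypothetical. `FakeRunner` exists only so the sketch can run without a database.

```python
import json


class BaseQueryRunner:
    """Stand-in base class containing only what this sketch needs."""

    # Naive default sample query; a data source could override this.
    sample_query = "select * from {table} limit 1"

    def run_query(self, query, user):
        # Assumed contract: return (json_string, error_or_None).
        raise NotImplementedError

    def _get_table_sample(self, table_name):
        """Return {column_name: sample_value} for one row, or {} if unavailable."""
        # Note: table_name is interpolated directly, so it must come from
        # the data source's own schema listing, never from user input.
        query = self.sample_query.format(table=table_name)
        data, error = self.run_query(query, None)
        if error is not None:
            return {}
        rows = json.loads(data).get("rows", [])
        return rows[0] if rows else {}


class FakeRunner(BaseQueryRunner):
    """Hypothetical runner that returns a canned result."""

    def run_query(self, query, user):
        self.last_query = query
        result = {"columns": [{"name": "id"}], "rows": [{"id": 1}]}
        return json.dumps(result), None
```

Keeping the query text in a class attribute is one way a data source could later swap in a custom sample query without touching the helper.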

Author

emtwo commented Nov 7, 2018

Note: relevant PR #2826

Member

jezdez commented Nov 7, 2018

For future reference: #2826

@emtwo emtwo force-pushed the emtwo/schema branch 2 times, most recently from b3405f1 to 7f78382 Compare November 26, 2018 15:45
Author

emtwo commented Nov 26, 2018

@arikfr I've updated this PR with some functionality that was discussed during the Redash work week:

  • I added two new tables, table_metadata and column_metadata, which store schema information about a table that used to be cached in redis, as well as new metadata such as column/table descriptions or examples.

  • The refresh_schema celery task that runs periodically now updates the schemas stored in these tables. It never deletes a table or column; it only marks the exists field as False, and if the table or column reappears, it marks it as True again.

  • Fetching a schema no longer looks it up in the redis cache, but instead looks up whatever data is currently available in the metadata tables.

  • The front-end will not display tables or columns with exists marked as False.
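
The refresh behaviour described in the list above can be sketched as follows. This is an in-memory illustration only: `TableMetadata`, `refresh_tables`, and `visible_tables` are hypothetical names, and a plain dict stands in for the table_metadata database table. Only the `exists` flag and its never-delete semantics come from the description above.

```python
from dataclasses import dataclass


@dataclass
class TableMetadata:
    """Hypothetical stand-in for one row of table_metadata."""

    name: str
    exists: bool = True
    description: str = ""


def refresh_tables(stored, current_names):
    """Sync stored rows with the table names the data source reports now.

    Rows are never deleted: missing tables get exists=False, and tables
    that reappear are flipped back to exists=True with their description kept.
    """
    current = set(current_names)
    for row in stored.values():
        row.exists = row.name in current
    # Add rows for tables seen for the first time.
    for name in current - set(stored):
        stored[name] = TableMetadata(name=name)
    return stored


def visible_tables(stored):
    """What the front-end would show: only rows with exists=True."""
    return sorted(name for name, row in stored.items() if row.exists)
```

Because rows are only flagged rather than removed, user-entered metadata such as descriptions survives a table temporarily disappearing from the source.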

@emtwo emtwo force-pushed the emtwo/schema branch 5 times, most recently from 447a500 to ac91e37 Compare November 26, 2018 21:19
@emtwo emtwo requested a review from arikfr November 26, 2018 21:31
@emtwo emtwo deleted the emtwo/schema branch January 16, 2019 15:05
@ghost ghost removed the review label Jan 16, 2019
@emtwo emtwo mentioned this pull request Jan 16, 2019