chore(db_engine_specs): Refactor get_index #23656

Merged

Conversation

@john-bodley john-bodley commented Apr 12, 2023

SUMMARY

Rather than having a normalize_indexes method that normalizes the response of the DB-API get_indexes function, the DB engine spec should provide a more flexible utility method named get_indexes, akin to get_table_names, get_column_names, etc., which has access to the database and related objects and can be overridden.

The motivation for the change is that at Airbnb, for Trino, we have both Iceberg- and Hive-backed tables, and the Trino DB-API get_indexes method returns an empty list for Iceberg-backed tables. Providing a get_indexes method within the DB engine spec allows us to easily add custom logic to handle both scenarios within our derived engine spec.
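
To make the idea concrete, here is a minimal, self-contained sketch of the pattern described above. It is not the code in this PR: `FakeInspector`, `CustomTrinoEngineSpec`, the dict-based `database` stand-in, and the `iceberg_partition_columns` key are all hypothetical names used only for illustration, and the real method signature may differ.

```python
from __future__ import annotations

from typing import Any


class FakeInspector:
    """Hypothetical stand-in for an inspector object that reports indexes."""

    def __init__(self, indexes: dict[str, list[dict[str, Any]]]) -> None:
        self._indexes = indexes

    def get_indexes(
        self, table_name: str, schema: str | None = None
    ) -> list[dict[str, Any]]:
        # Tables the inspector knows nothing about yield an empty list.
        return self._indexes.get(table_name, [])


class BaseEngineSpec:
    @classmethod
    def get_indexes(
        cls,
        database: dict[str, Any],
        inspector: FakeInspector,
        table_name: str,
        schema: str | None = None,
    ) -> list[dict[str, Any]]:
        # Default behavior: defer to whatever the inspector reports.
        return inspector.get_indexes(table_name, schema)


class CustomTrinoEngineSpec(BaseEngineSpec):
    @classmethod
    def get_indexes(
        cls,
        database: dict[str, Any],
        inspector: FakeInspector,
        table_name: str,
        schema: str | None = None,
    ) -> list[dict[str, Any]]:
        indexes = super().get_indexes(database, inspector, table_name, schema)
        if indexes:
            return indexes
        # Assumption: the derived spec can look up partition columns some
        # other way when the inspector returns nothing. Here the lookup is
        # a plain dict access on the stand-in database object.
        columns = database.get("iceberg_partition_columns", {}).get(table_name, [])
        if columns:
            return [{"name": "partition", "column_names": columns, "unique": False}]
        return []


inspector = FakeInspector(
    {"hive_table": [{"name": "partition", "column_names": ["ds"], "unique": False}]}
)
database = {"iceberg_partition_columns": {"iceberg_table": ["ds", "hour"]}}

hive_indexes = CustomTrinoEngineSpec.get_indexes(database, inspector, "hive_table")
iceberg_indexes = CustomTrinoEngineSpec.get_indexes(database, inspector, "iceberg_table")
base_indexes = BaseEngineSpec.get_indexes(database, inspector, "iceberg_table")
```

Because the override receives the database object as well as the inspector, the fallback logic lives entirely in the derived spec and the base behavior is unchanged.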

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

CI. Added unit tests.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

df = database.get_df(sql, schema)
return column_names, cls._latest_partition_from_df(df)

return column_names, cls._latest_partition_from_df(
Member Author

Same logic as before, just a cleaner presentation that includes adding keyword arguments. Previously it wasn't apparent what 1 and part_fields meant in cls._partition_query(table_name, database, 1, part_fields).
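
A small sketch of the readability point, using a hypothetical `partition_query` function. The parameter names `limit` and `order_by` are assumptions made for illustration and are not taken from the diff.

```python
from __future__ import annotations

from typing import Any


def partition_query(
    table_name: str,
    database: str,
    limit: int = 0,
    order_by: list[tuple[str, bool]] | None = None,
) -> dict[str, Any]:
    """Hypothetical stand-in that just records the arguments it was given."""
    return {
        "table_name": table_name,
        "database": database,
        "limit": limit,
        "order_by": order_by or [],
    }


part_fields = [("ds", True)]

# Positional call: the reader has to look up the signature to learn
# what `1` and `part_fields` stand for.
positional = partition_query("logs", "trino_db", 1, part_fields)

# Keyword call: the same arguments, but each one is named at the call site.
keyword = partition_query(
    table_name="logs",
    database="trino_db",
    limit=1,
    order_by=part_fields,
)
```

Both calls build the same result; only the second one documents itself.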

@john-bodley john-bodley force-pushed the john-bodley-refactor-get-indexes branch 2 times, most recently from 3d77d2a to bb68e0f Compare April 12, 2023 03:54
@pull-request-size pull-request-size bot added size/L and removed size/M labels Apr 12, 2023
@john-bodley john-bodley marked this pull request as ready for review April 12, 2023 04:17
@john-bodley john-bodley force-pushed the john-bodley-refactor-get-indexes branch 4 times, most recently from f9868f3 to d631e64 Compare April 12, 2023 06:15
codecov bot commented Apr 12, 2023

Codecov Report

Merging #23656 (db55c7b) into master (976e333) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head db55c7b differs from pull request most recent head ebd79e5. Consider uploading reports for the commit ebd79e5 to get more accurate results

@@           Coverage Diff           @@
##           master   #23656   +/-   ##
=======================================
  Coverage   68.08%   68.08%           
=======================================
  Files        1920     1920           
  Lines       73984    73990    +6     
  Branches     8092     8092           
=======================================
+ Hits        50374    50379    +5     
- Misses      21539    21540    +1     
  Partials     2071     2071           
Flag Coverage Δ
hive 53.18% <66.66%> (+0.01%) ⬆️
mysql 79.21% <100.00%> (+<0.01%) ⬆️
postgres 79.29% <100.00%> (+<0.01%) ⬆️
presto 53.09% <66.66%> (+0.01%) ⬆️
python 83.14% <100.00%> (+<0.01%) ⬆️
sqlite 77.78% <100.00%> (+<0.01%) ⬆️
unit 53.03% <75.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
superset/db_engine_specs/base.py 90.78% <100.00%> (-0.08%) ⬇️
superset/db_engine_specs/bigquery.py 70.22% <100.00%> (+0.55%) ⬆️
superset/db_engine_specs/presto.py 87.84% <100.00%> (-0.08%) ⬇️
superset/models/core.py 89.72% <100.00%> (-0.03%) ⬇️

@john-bodley john-bodley force-pushed the john-bodley-refactor-get-indexes branch 2 times, most recently from 780e135 to fd1f5cd Compare April 12, 2023 18:24
@john-bodley john-bodley force-pushed the john-bodley-refactor-get-indexes branch from fd1f5cd to ebd79e5 Compare April 12, 2023 19:37
@michael-s-molina michael-s-molina left a comment

LGTM

Comment on lines +568 to +569
table_name,
database,

Suggested change
- table_name,
- database,
+ table_name=table_name,
+ database=database,

Member Author

I tend to only provide keywords if the variable names differ.

@michael-s-molina michael-s-molina Apr 12, 2023

No problem. I generally like to always provide the pair to remove any ordering requirements. By the way, keyword arguments are one of my favorite Python features.
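
For readers unfamiliar with the ordering point, the following sketch shows one way Python can enforce it: a bare `*` in the signature makes the parameters keyword-only. The `describe_table` function is hypothetical and exists only for this illustration.

```python
def describe_table(*, table_name: str, database: str) -> str:
    """Hypothetical helper; the bare `*` makes both parameters keyword-only."""
    return f"{database}.{table_name}"


# With keywords, argument order at the call site no longer matters.
first = describe_table(table_name="logs", database="trino_db")
second = describe_table(database="trino_db", table_name="logs")

# A positional call is rejected outright, so arguments cannot be swapped silently.
try:
    describe_table("logs", "trino_db")
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Whether to require keywords or merely allow them is a style choice, as the exchange above shows.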

@john-bodley john-bodley merged commit b35b5a6 into apache:master Apr 12, 2023
john-bodley added a commit to airbnb/superset-fork that referenced this pull request Apr 12, 2023
@john-bodley john-bodley deleted the john-bodley-refactor-get-indexes branch April 12, 2023 21:31
john-bodley added a commit to airbnb/superset-fork that referenced this pull request Apr 14, 2023
john-bodley added a commit to airbnb/superset-fork that referenced this pull request Apr 26, 2023
sebastianliebscher pushed a commit to sebastianliebscher/superset that referenced this pull request Apr 28, 2023
@mistercrunch mistercrunch added 🏷️ bot 🚢 3.0.0 labels Mar 13, 2024