-
Notifications
You must be signed in to change notification settings - Fork 14.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: improve get_db_engine_spec_for_backend #21171
fix: improve get_db_engine_spec_for_backend #21171
Conversation
b20b711
to
094fe68
Compare
094fe68
to
a705479
Compare
a705479
to
b2e8c66
Compare
072f395
to
f77246c
Compare
Codecov Report
@@ Coverage Diff @@
## master #21171 +/- ##
==========================================
+ Coverage 66.37% 66.43% +0.06%
==========================================
Files 1781 1782 +1
Lines 67871 68018 +147
Branches 7239 7239
==========================================
+ Hits 45047 45190 +143
- Misses 20966 20970 +4
Partials 1858 1858
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
e364210
to
6f7876b
Compare
6f7876b
to
bfb87d7
Compare
@@ -78,20 +85,31 @@ def load_engine_specs() -> List[Type[BaseEngineSpec]]: | |||
return engine_specs | |||
|
|||
|
|||
def get_engine_specs() -> Dict[str, Type[BaseEngineSpec]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@john-bodley does AirBnB use this function outside of the Superset code base?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
SUMMARY
How does Superset map from a SQLAlchemy URI — eg,
postgres+psycopg2://user:password@host/db
— to a DB engine spec? Currently the mapping is done using only the backend part of the URI; in this examplepostgres
matches thePostgresEngineSpec
engine spec since it's listed in theengine_aliases
class attribute:superset/superset/db_engine_specs/postgres.py
Lines 166 to 168 in d408393
This matching has worked perfectly until the introduction of new DB engine specs that share the backend but have different drivers. Eg, for Databricks we currently have 3 DB engine specs:
superset/superset/db_engine_specs/databricks.py
Lines 49 to 52 in d408393
superset/superset/db_engine_specs/databricks.py
Lines 58 to 61 in d408393
superset/superset/db_engine_specs/databricks.py
Lines 76 to 79 in d408393
Whenever we encounter a database configured with a
databricks
backend — be itdatabricks+pyhive://
,databricks+pyodbc://
ordatabricsk+connector://
we discard the driver part, and perform a match using only the backend. The end result is that one of the 3 DB engine specs is chosen at random every time!To fix the problem, I extended the base DB engine spec with 2 new class attributes:
drivers: Dict[str, str] = {}
, a dictionary with driver names ("psycopg2") and their descriptionsdefault_driver: Optional[str] = None
, the recommended driver to be used when one is not selected (used only when the SQLAlchemy URI is not entered directly)With these new attributes, instead of simply matching the backend of a URI (
databricks
) we can use the driver as well (pyhive
) to find the associated DB engine spec.A few things of note: first, if the backend matches but no driver matches the new flow will still return a random DB engine specs from the ones that support the backend. This preserves the old behavior and allows people to use drivers that Superset is not aware of. Eg, someone could configure a database with the
postgres+newfancydriver://
URI and Superset would still usePostgresEngineSpec
as the engine spec.Second, if no backend matches we also preserve the old behavior of returning the
BaseEngineSpec
, since this allows users to explore databases that Superset is not aware of, though at limited functionality (no time grains, for example).BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A
TESTING INSTRUCTIONS
WIP
ADDITIONAL INFORMATION