feat: add Databricks ODBC engine spec (#16862)
* feat: add Databricks ODBC engine spec
* Rename Databricks specs

1 parent aa74721 · commit 0ea83c5 · 3 changed files: 96 additions, 1 deletion
docs/src/pages/docs/Connecting to Databases/databricks.mdx
---
name: Databricks
menu: Connecting to Databases
route: /docs/databases/databricks
index: 30
version: 1
---

## Databricks

To connect to Databricks, first install [databricks-dbapi](https://pypi.org/project/databricks-dbapi/) with the optional SQLAlchemy dependencies:

```bash
pip install databricks-dbapi[sqlalchemy]
```

There are two ways to connect to Databricks: using a Hive connector or an ODBC connector. Both ways work similarly, but only ODBC can be used to connect to [SQL endpoints](https://docs.databricks.com/sql/admin/sql-endpoints.html).

### Hive

To use the Hive connector you need the following information from your cluster:

- Server hostname
- Port
- HTTP path

These can be found under "Configuration" -> "Advanced Options" -> "JDBC/ODBC".

You also need an access token from "Settings" -> "User Settings" -> "Access Tokens".

Once you have all this information, add a database of type "Databricks (Hive)" in Superset, and use the following SQLAlchemy URI:

```
databricks+pyhive://token:{access token}@{server hostname}:{port}/{database name}
```

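The placeholders in the URI can also be assembled programmatically. A minimal sketch — the token, hostname, port, and database name below are made-up examples, not real credentials:

```python
# Build the Databricks (Hive) SQLAlchemy URI from its parts.
# All values here are hypothetical placeholders -- substitute your
# cluster's hostname and port and your own personal access token.
access_token = "dapi0123456789abcdef"  # from "Settings" -> "User Settings" -> "Access Tokens"
server_hostname = "dbc-a1b2c3d4-e5f6.cloud.databricks.com"
port = 443
database = "default"

uri = f"databricks+pyhive://token:{access_token}@{server_hostname}:{port}/{database}"
print(uri)
```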
You also need to add the following configuration to "Other" -> "Engine Parameters", with your HTTP path:

```
{"connect_args": {"http_path": "sql/protocolv1/o/****"}}
```

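The "Engine Parameters" field expects valid JSON, so generating the snippet with `json.dumps` avoids quoting mistakes. A sketch, keeping the source's masked HTTP path as a placeholder:

```python
import json

# The HTTP path below is a placeholder -- copy yours from
# "Configuration" -> "Advanced Options" -> "JDBC/ODBC".
engine_params = {"connect_args": {"http_path": "sql/protocolv1/o/****"}}

# The serialized string is what gets pasted into "Engine Parameters".
serialized = json.dumps(engine_params)
print(serialized)
```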
### ODBC

For ODBC you first need to install the [ODBC drivers for your platform](https://databricks.com/spark/odbc-drivers-download).

For a regular connection use this as the SQLAlchemy URI:

```
databricks+pyodbc://token:{access token}@{server hostname}:{port}/{database name}
```

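Because the access token sits in the password position of the URI, SQLAlchemy requires any special characters in it (such as `@` or `/`) to be percent-encoded. A sketch using the standard library — all values are hypothetical placeholders:

```python
from urllib.parse import quote_plus

# Hypothetical placeholder values -- substitute your own.
access_token = "dapi0123456789abcdef"
server_hostname = "dbc-a1b2c3d4-e5f6.cloud.databricks.com"
port = 443
database = "default"

# quote_plus is a no-op for plain alphanumeric tokens, but it keeps
# the URI valid if the token ever contains reserved characters.
uri = f"databricks+pyodbc://token:{quote_plus(access_token)}@{server_hostname}:{port}/{database}"
print(uri)
```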
And for the connection arguments:

```
{"connect_args": {"http_path": "sql/protocolv1/o/****", "driver_path": "/path/to/odbc/driver"}}
```

The driver path should be:

- `/Library/simba/spark/lib/libsparkodbc_sbu.dylib` (Mac OS)
- `/opt/simba/spark/lib/64/libsparkodbc_sb64.so` (Linux)

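If you script the setup, the driver path can be chosen from the current operating system. A small sketch (the helper function is ours, not part of Superset or databricks-dbapi), using the two default Simba driver locations listed above:

```python
import platform

# Default Simba Spark ODBC driver locations per platform.
DRIVER_PATHS = {
    "Darwin": "/Library/simba/spark/lib/libsparkodbc_sbu.dylib",  # Mac OS
    "Linux": "/opt/simba/spark/lib/64/libsparkodbc_sb64.so",
}

def default_driver_path(system=None):
    """Return the default ODBC driver path for the given (or current) OS."""
    system = system or platform.system()
    try:
        return DRIVER_PATHS[system]
    except KeyError:
        raise ValueError(f"No known default ODBC driver path for {system!r}")
```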
For a connection to a SQL endpoint you need to use the HTTP path from the endpoint:

```
{"connect_args": {"http_path": "/sql/1.0/endpoints/****", "driver_path": "/path/to/odbc/driver"}}
```
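Putting the two connect arguments together for an endpoint, the full "Engine Parameters" JSON can again be generated with `json.dumps`. The HTTP path keeps the source's masked placeholder, and the driver path shown is the Linux default from the list above:

```python
import json

# Placeholders: the endpoint HTTP path comes from your SQL endpoint's
# connection details; pick the driver path for your platform.
engine_params = {
    "connect_args": {
        "http_path": "/sql/1.0/endpoints/****",
        "driver_path": "/opt/simba/spark/lib/64/libsparkodbc_sb64.so",  # Linux default
    }
}

serialized = json.dumps(engine_params)
print(serialized)
```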