Support Databricks in SQLDatabase #4702
some minor comments on the API doc
langchain/sql_database.py
Outdated
    f"databricks://token:{api_token}@{host}?"
    f"http_path={http_path}&catalog={catalog}&schema={schema}"
)
return cls.from_uri(uri, engine_args=None, **kwargs)
Out of curiosity, are engine_args explicitly not allowed in this case?
Thanks for pointing it out. I just added the engine_args.
two small comments, otherwise looks good!
# Add documentation for Databricks integration
This is a follow-up of #4702. It documents the details of how to integrate Databricks using langchain. It also provides examples in a notebook.
## Who can review?
@dev2049 @hwchase17 since you are aware of the context. We will promote the integration after this doc is ready. Thanks in advance!
This PR adds support for Databricks runtime and Databricks SQL by using Databricks SQL Connector for Python.
As a cloud data platform, Databricks is accessed through a URL of the following form:
databricks://token:{api_token}@{hostname}?http_path={http_path}&catalog={catalog}&schema={schema}
The URL is complicated, and it may take users a while to figure it out. Since the api_token, hostname, and http_path fields are known in the Databricks notebook, I am proposing a new method, from_databricks, to simplify the connection to Databricks.
In Databricks Notebook
After the changes, Databricks users only need to specify the catalog and schema fields when using langchain.
In Jupyter Notebook
The method can be used on the local setup as well:
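On a local setup the connection values are not known automatically, so they have to be supplied by hand. As a minimal sketch of how those values fit into the URL format quoted in this description: the helper name `build_databricks_uri` and all sample values below are hypothetical, chosen only for illustration. The function mirrors the string formatting from the reviewed snippet and does not open a connection or call langchain.

```python
def build_databricks_uri(
    api_token: str,
    host: str,
    http_path: str,
    catalog: str,
    schema: str,
) -> str:
    """Assemble a Databricks connection URL in the format described in this PR.

    Hypothetical helper for illustration only; values are inserted as-is,
    without URL-encoding, exactly as in the reviewed snippet.
    """
    return (
        f"databricks://token:{api_token}@{host}?"
        f"http_path={http_path}&catalog={catalog}&schema={schema}"
    )


# Placeholder values (assumptions, not real credentials or endpoints).
uri = build_databricks_uri(
    api_token="dapi-EXAMPLE",
    host="example.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/abc123",
    catalog="samples",
    schema="nyctaxi",
)
print(uri)
```

In the reviewed snippet, a string of this shape is what gets handed to `cls.from_uri(uri, ...)`.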