Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

error connecting to Databricks #29

Open
Bartman0 opened this issue Feb 3, 2024 · 2 comments
Open

error connecting to Databricks #29

Bartman0 opened this issue Feb 3, 2024 · 2 comments

Comments

@Bartman0
Copy link

Bartman0 commented Feb 3, 2024

Hi, I try to get jdbc_fdw working with Databricks.
I have succeeded in compiling and deploying the extension. I see queries ending up at the Databricks side of things.
The connection is working therefore, and from the SQL endpoint monitoring I see succesfull queries being executing.

I enabled the lowest message levels on the PostgreSQL side (debug5), and I see this in the logs:

postgres=# select jdbc_exec('richardkooijman_databricks_jdbc6', 'SELECT * FROM dpd1_dev.richardkooijman.cms_afv_container');
DEBUG: StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
DEBUG: Added server = richardkooijman_databricks_jdbc6 to hashtable
DEBUG: In jq_connect_db_params
DEBUG: In jdbc_jvm_init
DEBUG: In jdbc_attach_jvm
DEBUG: Successfully created a JVM with 8192 MB heapsize and classpath set to '-Djava.class.path=/usr/lib/postgresql/15/lib'
DEBUG: In jdbc_create_JDBC_connection
DEBUG: Created a JDBC connection: jdbc:databricks://adb-5159569612410553.13.azuredatabricks.net:443/default;transportMode=http;ssl=1;AuthMech=3;EnableArrow=0;httpPath=/sql/1.0/warehouses/3d85a8325bf4f4c6;
DEBUG: In jq_exec_id(0x557a010e6048): SELECT * FROM dpd1_dev.richardkooijman.cms_afv_container
DEBUG: Get resultSetID successfully, ID: 1
DEBUG: In jq_iterate_all_row
DEBUG: In jq_clear
DEBUG: In jq_release_resultset_id: 1
DEBUG: In jdbc_detach_jvm
DEBUG: CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0
DEBUG: In jdbc_attach_jvm
DEBUG: In jdbc_detach_jvm
DEBUG: In jdbc_detach_jvm
LOG: duration: 11264.023 ms statement: select jdbc_exec('richardkooijman_databricks_jdbc6', 'SELECT * FROM dpd1_dev.richardkooijman.cms_afv_container');

This works, and gives me screens full of records.

however, if I try to make a foreign table with this command:
CREATE FOREIGN TABLE richardkooijman.cms_afv_container6
(
chipnumber text,
unitidnumber text,
color text,
containertype text,
emptyfrequency text,
fraction text,
mark text,
owner text,
replacementdate text,
well text,
adoptedcontainer text,
id text,
createdat text,
deliverydate text,
idnumber text,
modifiedat text,
operationaldate text,
placingdate text,
serialnumber text,
warrantydate text,
active text,
comment text,
outofservicedate text,
ownership text,
mf_insert_datetime text,
mf_run_id text,
containertype_id INT,
well_id INT) SERVER richardkooijman_databricks_jdbc6

and try to do:

postgres=# select * from richardkooijman.cms_afv_container6;
DEBUG: StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
PANIC: ERRORDATA_STACK_SIZE exceeded
SSL SYSCALL error: EOF detected
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.

it fails as you will probably see.

the DDL for the table on the other end is:

-- dpd1_dev.richardkooijman.cms_afv_container definition

CREATE TABLE dpd1_dev.richardkooijman.cms_afv_container (
chipnumber STRING,
unitidnumber STRING,
color STRING,
containertype STRING,
emptyfrequency STRING,
fraction STRING,
mark STRING,
owner STRING,
replacementdate STRING,
well STRING,
adoptedcontainer STRING,
id STRING,
createdat STRING,
deliverydate STRING,
idnumber STRING,
modifiedat STRING,
operationaldate STRING,
placingdate STRING,
serialnumber STRING,
warrantydate STRING,
active STRING,
comment STRING,
outofservicedate STRING,
ownership STRING,
mf_insert_datetime STRING,
mf_run_id STRING,
containertype_id INT,
well_id INT)
USING delta
TBLPROPERTIES (
'delta.minReaderVersion' = '1',
'delta.minWriterVersion' = '2');

I just replaced STRING with text as datatype.

any thoughts?
any hints on how to proceed?

@Bartman0
Copy link
Author

Bartman0 commented Feb 3, 2024

additional info:

CREATE SERVER richardkooijman_databricks_jdbc6 FOREIGN DATA WRAPPER jdbc_fdw
OPTIONS(
drivername 'com.databricks.client.jdbc.Driver',
url 'jdbc:databricks://adb-**************.13.azuredatabricks.net:443/default;transportMode=http;ssl=1;AuthMech=3;EnableArrow=0;httpPath=/sql/1.0/warehouses/3d85a8325bf4f4c6;',
querytimeout '30',
jarfile '/usr/lib/postgresql/15/lib/DatabricksJDBC42.jar',
maxheapsize '8192'
);

CREATE USER MAPPING FOR CURRENT_USER SERVER richardkooijman_databricks_jdbc6
OPTIONS(username 'token',password 'dapi03e795579baff811************-2');

@MinhLA1410
Copy link
Contributor

@Bartman0 , Thank you for reporting

JDBC FDW has not been tested with databrick. We do not have special experience and knowledge using databrick.
We would like to know if you have any additional resources or information. We are open to any suggestions and can consider them for future improvements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants