Updated dependencies for Spark 3.0.0 #30
…ss for Spark 3.0 changes
This is strange, all tests passed on my local dev environment. See the log below:
@dovijoel Thanks for your work. Did you do an E2E test with Databricks/Spark 3.0? AFAIK that will fail. This needs a source-level fix as well.
```diff
-val sparkVersion = "2.4.6"
 scalaVersion := "2.12.11"
 ThisBuild / useCoursier := false
+val sparkVersion = "3.0.0"
```
We would need to support the Spark 2.4/Scala 2.11 combo as well.
Ok, I will look into supporting both scenarios.
I think you can use something like `crossScalaVersions := Seq("2.12.10", "2.11.12")` to do this.
@rajmera3 it won't work on its own, because there is no Spark 3.0 build for Scala 2.11. Here we need a combo of (Spark 2.4 + Scala 2.11) and (Spark 3.0 + Scala 2.12).
One option here is to make the main line Spark 3.0/Scala 2.12 and move the stable/old version to a separate branch, e.g. a Spark 2.4 branch.
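The cross-build discussed above could be sketched in `build.sbt` by keying the Spark version off the Scala binary version being built. This is a hypothetical sketch, not the project's actual build definition; the version numbers and the `sparkVersionFor` helper are illustrative assumptions.

```scala
// Hypothetical build.sbt fragment: cross-build (Spark 2.4 + Scala 2.11)
// and (Spark 3.0 + Scala 2.12) from a single main line.
crossScalaVersions := Seq("2.12.11", "2.11.12")

// Illustrative helper: pick the Spark version matching the Scala binary version.
def sparkVersionFor(scalaBinary: String): String = scalaBinary match {
  case "2.11" => "2.4.6"
  case _      => "3.0.0"
}

libraryDependencies ++= {
  val sv = sparkVersionFor(scalaBinaryVersion.value)
  Seq(
    "org.apache.spark" %% "spark-core" % sv % "provided",
    "org.apache.spark" %% "spark-sql"  % sv % "provided"
  )
}
```

With settings like these, `sbt +package` would build both artifact combos in one pass, though any Spark 3.0-only source changes would still need version-specific source directories or a branch split.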
Can you please point me in the right direction on how to do an E2E test? After looking through the changes that were made for Spark 3.0, it seems that some classes are no longer accessible, specifically the internal Logger class.
I get a 404 when I access these links. Yes, we will need to remove the dependency on internal logging and have a logger created in the connector. 2 possibilities
I am not sure what other issues come up with DBR 3.0. Can you please post any errors/findings here?
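One way to drop the dependency on Spark's internal `Logging` trait (which is `private[spark]`) is a small connector-local logging trait. The sketch below is a hypothetical illustration, not the connector's actual code; it uses `java.util.logging` to stay dependency-free, whereas a real connector would more likely use SLF4J, as Spark itself does.

```scala
// Hypothetical sketch: a connector-local logging trait that replaces
// the dependency on org.apache.spark.internal.Logging.
import java.util.logging.{Level, Logger}

trait ConnectorLogging {
  // Lazily create one logger per concrete class; strip the trailing "$"
  // that Scala appends to companion-object class names.
  @transient private lazy val log: Logger =
    Logger.getLogger(getClass.getName.stripSuffix("$"))

  // Call-by-name message so it is only built when the level is enabled.
  protected def logInfo(msg: => String): Unit =
    if (log.isLoggable(Level.INFO)) log.info(msg)

  protected def logError(msg: => String, e: Throwable): Unit =
    log.log(Level.SEVERE, msg, e)
}

// Minimal usage demo.
object Demo extends ConnectorLogging {
  def main(args: Array[String]): Unit =
    logInfo("connector logging works without Spark internals")
}
```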
Thank you for your advice! I will incorporate it and let you know.
Hi @dovijoel, did you see my PR against your PR? 😄
I was able to validate the connector with the Databricks DBR 7.1 runtime successfully from the forked branch which had PR 30 merged (https://github.com/dovijoel/sql-spark-connector).

```python
print("Use Apache Spark Connector for SQL Server and Azure SQL to write to master SQL instance ")
servername = "jdbc:sqlserver://aravishsqlserver.database.windows.net"
dbname = "custommetastore"
url = servername + ";" + "databaseName=" + dbname + ";"
dbtable = "TBLS"
user = "aravish"
password = "xxxxxxxx"  # Please specify password here

print("read data from SQL server table ")
jdbcDF = spark.read \
    .format("com.microsoft.sqlserver.jdbc.spark") \
    .option("url", url) \
    .option("dbtable", dbtable) \
    .option("user", user) \
    .option("password", password).load()

jdbcDF.show(5)
```

Writes:

```python
jdbcDF.write \
    .format("com.microsoft.sqlserver.jdbc.spark") \
    .mode("overwrite") \
    .option("url", url) \
    .option("dbtable", "TBLS_Spark_SQL_Connector") \
    .option("user", user) \
    .option("password", password) \
    .save()
```
Thanks @aravish. Can you test with master and check that you get an error, then repeat the test with this PR branch?
Any progress on this? I can also validate that the spark-3.0 branch of @dovijoel works. I have compiled it successfully on Windows and used it to read and write data using DBR 7.3.
@MrWhiteABEX - we have the same issue, using an old Databricks runtime because this connector doesn't support Spark 3.0. As an Azure Databricks customer, it's important to us that Spark 3.0+ support comes ASAP. Please advise if there's anything I can do to help validate this in our DEV environment.
@gmdiana-hershey In our testing environment I'm already running DBR 7.2 with the Spark connector built from dovijoel's adjustments. No issues experienced so far. I think the only problem is that the CI build pipeline is broken by Scala 2.12, but I did not investigate it.
I just tested a build with Spark 3.0.1 & DBR 7.3, and it works just fine.
@aravish @shivsood any update on this?
I tried building it with sbt assembly to get a fat jar. I used the jar on DBR 7.3 with Spark 3.0.1, but when writing to a database, I get the following error:
It appears that it fails when writing the data to the database, since the table itself is successfully created in the database with all its columns.
Compiling the fat JAR worked for me, as I described here: #15 (comment). This has been running smoothly on Databricks Runtime 7.4 / Spark 3.1 over the last few days.
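For the fat-jar route mentioned above, a common stumbling block is duplicate `META-INF` entries at assembly time. The fragment below is a hypothetical sbt-assembly configuration sketch, not this repo's actual build; the plugin version and merge strategy are assumptions to adapt.

```scala
// Hypothetical project/plugins.sbt line:
// addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.15.0")

// Hypothetical build.sbt fragment for `sbt assembly`.
// Spark dependencies should be marked "provided" so they are not
// bundled into the fat jar (DBR supplies them at runtime).
assembly / assemblyMergeStrategy := {
  case PathList("META-INF", _*) => MergeStrategy.discard // drop signature files etc.
  case _                        => MergeStrategy.first   // keep first copy of other duplicates
}
```

Running `sbt assembly` with settings like these would produce a single jar that can be uploaded to a Databricks cluster as a library.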
```diff
 "com.novocode" % "junit-interface" % "0.11" % "test",

 //SQLServer JDBC jars
-"com.microsoft.sqlserver" % "mssql-jdbc" % "7.2.1.jre8"
+"com.microsoft.sqlserver" % "mssql-jdbc" % "8.2.1.jre8"
```
Why v8.2 for the JDBC driver? Did 7.2 cause some issue, or was this just alignment to the latest version?
Spark 3 runs on JDK 11, so you need to use:

```scala
// https://mvnrepository.com/artifact/com.microsoft.sqlserver/mssql-jdbc
libraryDependencies += "com.microsoft.sqlserver" % "mssql-jdbc" % "8.4.1.jre11"
```
Updated dependencies to work with Spark 3.0 as addressed in issue #15