Databricks 12.2 Support [databricks] #8282
Conversation
Signed-off-by: Andy Grove <andygrove@nvidia.com>
Just one small comment
This was going fine until I ran into this compilation error:
nm, this changed in 3.4
…pids/delta/shims/InvariantViolationExceptionShim.scala Co-authored-by: Jason Lowe <jlowe@nvidia.com>
…transaction/tahoe/rapids/GpuOptimisticTransaction.scala Co-authored-by: Jason Lowe <jlowe@nvidia.com>
…hims/SparkShims.scala Co-authored-by: Jason Lowe <jlowe@nvidia.com>
This commit fixes a compile failure on Databricks, introduced in NVIDIA#8240. Partition.files is a `SerializableFileStatus` on Databricks, as opposed to a `FileStatus` on Apache Spark. This commit solves the problem without shims. Signed-off-by: MithunR <mythrocks@gmail.com>
…bricks' into 332db
build
I got the
@andygrove we need to add a shim for RapidsShuffleManager for spark332db. This is why we added the test, because last time we merged a shim and we didn't know it (@nartal1 for the win!!)
@andygrove change looks good to me.
re: RapidsShuffleManager, ordinarily it should have been caught by a Scala test, SparkShimSuite, but we haven't prioritized making them runnable on DBR #1320
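For context on why a missing shim matters here: users enable the RAPIDS shuffle by pointing spark.shuffle.manager at a per-Spark-version class, so the 332db build needs its own copy of RapidsShuffleManager. Below is a minimal PySpark sketch of that configuration; the exact spark332db class name is an assumption based on the plugin's existing per-version naming convention.

```python
from pyspark.sql import SparkSession

# Sketch only: the spark332db package below assumes the plugin's existing
# per-version convention (e.g. ...spark330db.RapidsShuffleManager). If the
# shim class is missing from the build, Spark fails at startup when it tries
# to instantiate the configured shuffle manager.
spark = (
    SparkSession.builder
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
    .config("spark.shuffle.manager",
            "com.nvidia.spark.rapids.spark332db.RapidsShuffleManager")
    .getOrCreate()
)
```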
build
@@ -131,6 +131,7 @@ def generate_dest_data(spark):
@pytest.mark.parametrize("use_cdf", [True, False], ids=idfn)
@pytest.mark.parametrize("partition_columns", [None, ["a"]], ids=idfn)
@pytest.mark.skipif(is_before_spark_320(), reason="Delta Lake writes are not supported before Spark 3.2.x")
@pytest.mark.skipif(is_databricks122_or_later(), reason="https://github.com/NVIDIA/spark-rapids/issues/8423")
Technically we want all these to be xfail'd tests rather than skipped. We xfail tests that are failing due to bugs and should work at some point in the future when those bugs are fixed. We skip tests that we never expect to work. Definitely not must-fix.
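As an illustration of that suggestion, a conditional xfail marker could replace the skipif above. A minimal standalone sketch (the real suite would import is_databricks122_or_later from its spark_session helpers rather than defining a stand-in inline):

```python
import pytest

def is_databricks122_or_later():
    # Stand-in for the integration suite's helper; hard-coded True so the
    # xfail path is exercised when this sketch is run on its own.
    return True

# xfail rather than skipif: pytest still runs the test, reports XFAIL while
# issue #8423 is open, and reports XPASS once the underlying bug is fixed,
# flagging the marker for removal.
@pytest.mark.xfail(is_databricks122_or_later(),
                   reason="https://github.com/NVIDIA/spark-rapids/issues/8423")
def test_delta_write_round_trip():
    assert False  # placeholder for the real Delta Lake write assertion
```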
Makes sense.
Looks fine, but I didn't dig too deeply into the delta code because there is a follow-on issue for that.
build
Closes #7876
This PR adds support for Databricks 12.2.
SQL Plugin
Delta Lake
Add new Delta Lake shim for 332db, which is essentially a copy of 330db with minor modifications. I have filed a follow-on issue #8320 for using shimplify in this module and refactoring to remove duplicate code.
Changes
The following files are duplicated between 330db and 332db without modification:
Integration Tests
Status / TODO
Follow-on issues: