HIVE-27560: [2.3] Enhancing compatibility with Guava #4542
cc @viirya
This patch makes sense to me.
I believe it could help unblock Spark from upgrading Guava.
I tested it against the Spark master branch with Guava 32.0.1-jre [1], and everything looks good on my internal YARN 3.3 cluster.
Note that a few UTs failed because of IsolatedClientLoader [2], but that is a separate issue.
[1] https://github.com/pan3793/spark/tree/guava
[2] https://www.mail-archive.com/dev@spark.apache.org/msg30708.html
@sunchao are we good to go? I think apache/spark#42493 at least proves this PR makes Hive 2.3.10 compatible with Guava 32+.
LGTM
Merged to branch-2.3, thanks @LuciferYang @pan3793!
### What changes were proposed in this pull request?
This PR upgrades Spark's built-in Guava from 14 to 33.2.1-jre.
Currently, Spark uses Guava 14 because the previous built-in Hive 2.3.9 is incompatible with new Guava versions. HIVE-27560 (apache/hive#4542) makes Hive 2.3.10 compatible with Guava 14+ (thanks to LuciferYang).
### Why are the changes needed?
It's a long-standing issue, see prior discussions at #35584, #36231, and #33989.
### Does this PR introduce _any_ user-facing change?
Yes, some user-facing error messages changed.
### How was this patch tested?
GA passed.
Closes #42493 from pan3793/guava.
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
This PR made the following changes to allow Hive 2.3 to use the more compatible Guava API:
- Copied `com.google.common.base.Objects` from Guava 14.0.1 to `org.apache.hive.common.guava.Objects` and changed Hive to use `org.apache.hive.common.guava.Objects`. This change accommodates versions of Guava after 21, as `Objects` was renamed to `MoreObjects` after Guava 21.
- Copied `com.google.common.base.Stopwatch` from Guava 14.0.1 to `org.apache.hive.common.guava.Stopwatch` and changed Hive to use `org.apache.hive.common.guava.Stopwatch`. This change accommodates versions of Guava after 17, where `Stopwatch` underwent significant changes. Maintaining the use of a fixed API is simpler than using reflection for compatibility.
- Copied `org.spark_project.guava.util.concurrent.MoreExecutors.SameThreadExecutorService` and its corresponding utility method from Guava 14.0.1 to `org.apache.hive.common.guava.SameThreadExecutorUtil` and changed Hive to use `SameThreadExecutorUtil#sameThreadExecutor`. This change accommodates versions of Guava after 26, which no longer contain this API.
- Used reflection in the `TephraHBaseConnection#connect` method to start the `TransactionManager`. This is to accommodate versions of Guava after 17. Prior to Guava 16, `startAndWait` is called via reflection, while from Guava 17 onwards `startAsync` and `awaitRunning` are called through reflection.
- To ensure compatibility with Guava 21+, a `test` method was added to each implementation of `com.google.common.base.Predicate`.
- To ensure compatibility with Guava 26+, the `Futures.addCallback` method with an `Executor` parameter is used uniformly in Hive 2.3. `SameThreadExecutorUtil.sameThreadExecutor()` is passed in, matching the implementation in Guava 14.0.1.
- To ensure compatibility with Guava 20+, `Collections.emptyIterator()` is used instead of `Iterators.emptyIterator()`.
- To ensure compatibility with Guava 16+, the `Hasher#putString` method with a `Charset` parameter is used uniformly.
- The relocation of Guava in the `hive-exec` module has been removed, as it should no longer be necessary.

After this PR, Hive 2.3 uses APIs that exist from Guava 14.0.1 to 32.1.2-jre, which also makes it possible for downstream projects that depend on Hive to upgrade their own Guava dependencies.
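For illustration only, the reflection-based startup described for `TephraHBaseConnection#connect` could look roughly like the sketch below. It is not the code from this PR: `ServiceStarter`, `OldStyleService`, and `NewStyleService` are hypothetical stand-ins so the example runs without Guava on the classpath, and choosing the code path by checking which methods exist is an assumption about one reasonable way to do it.

```java
import java.lang.reflect.Method;

/** Hypothetical helper; a sketch of version-tolerant startup, not the actual Hive code. */
public class ServiceStarter {

    /** Stand-in for a service from an older Guava that only offers startAndWait(). */
    public static class OldStyleService {
        public boolean running;
        public void startAndWait() { running = true; }
    }

    /** Stand-in for a service from a newer Guava that offers startAsync()/awaitRunning(). */
    public static class NewStyleService {
        public boolean started;
        public boolean running;
        public NewStyleService startAsync() { started = true; return this; }
        public void awaitRunning() { running = started; }
    }

    /**
     * Starts the service through whichever API its class exposes and
     * returns the name of the code path that was taken.
     */
    public static String start(Object service) throws ReflectiveOperationException {
        Class<?> type = service.getClass();
        try {
            // Look up both newer methods before invoking either one.
            Method startAsync = type.getMethod("startAsync");
            Method awaitRunning = type.getMethod("awaitRunning");
            startAsync.invoke(service);
            awaitRunning.invoke(service);
            return "startAsync+awaitRunning";
        } catch (NoSuchMethodException e) {
            // Fall back to the older blocking call.
            type.getMethod("startAndWait").invoke(service);
            return "startAndWait";
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(start(new OldStyleService()));
        System.out.println(start(new NewStyleService()));
    }
}
```

Because no Guava method is referenced at compile time, the same class file links against either API generation; the cost is that a missing or renamed method only surfaces at runtime.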
Why are the changes needed?
To allow downstream projects that rely on Hive to upgrade the Guava version they depend on.
Does this PR introduce any user-facing change?
No
Is the change a dependency upgrade?
No
However, an additional `error_prone_annotations` 2.0.12 dependency needs to be added.
How was this patch tested?