-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-49795][CORE][SQL][SS][DSTREAM][ML][MLLIB][K8S][YARN][EXAMPLES] Clean up deprecated Guava API usage #48248
Conversation
still some work left undone, but it doesn't require attention at this time |
@@ -266,7 +266,7 @@ class OpenHashSet[@specialized(Long, Int, Double, Float) T: ClassTag]( | |||
/** | |||
* Re-hash a value to deal better with hash functions that don't differ in the lower bits. | |||
*/ | |||
private def hashcode(h: Int): Int = Hashing.murmur3_32().hashInt(h).asInt() | |||
private def hashcode(h: Int): Int = Hashing.murmur3_32_fixed().hashInt(h).asInt() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, this is a different implementation that fixes the original one's bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/**
* Returns a hash function implementing the <a
* href="https://github.com/aappleby/smhasher/blob/master/src/MurmurHash3.cpp">32-bit murmur3
* algorithm, x86 variant</a> (little-endian variant), using the given seed value, <b>with a known
* bug</b> as described in the deprecation text.
*
* <p>The C++ equivalent is the MurmurHash3_x86_32 function (Murmur3A), which however does not
* have the bug.
*
* @deprecated This implementation produces incorrect hash values from the {@link
* HashFunction#hashString} method if the string contains non-BMP characters. Use {@link
* #murmur3_32_fixed()} instead.
*/
@Deprecated
public static HashFunction murmur3_32() {
return Murmur3_32HashFunction.MURMUR3_32;
}
Yes, but this is the official fix provided, and there seems to be no other equivalent alternative.
@pan3793 any better suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change is safe for Spark. The difference between MURMUR3_32
and MURMUR3_32_FIXED
lies in the different supplementaryPlaneFix
parameters passed when constructing the Murmur3_32HashFunction
:
static final HashFunction MURMUR3_32 =
new Murmur3_32HashFunction(0, /* supplementaryPlaneFix= */ false);
static final HashFunction MURMUR3_32_FIXED =
new Murmur3_32HashFunction(0, /* supplementaryPlaneFix= */ true);
However, the supplementaryPlaneFix
parameter is only used in Murmur3_32HashFunction#hashString
, and Spark only utilizes Murmur3_32HashFunction#hashInt
. Therefore, there will be no logical changes to this method after this change.
@Override
public HashCode hashInt(int input) {
int k1 = mixK1(input);
int h1 = mixH1(seed, k1);
return fmix(h1, Ints.BYTES);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for checking, it is not a problem then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pan3793
@@ -290,9 +290,11 @@ public void onFailure(Throwable e) { | |||
try { | |||
return result.get(timeoutMs, TimeUnit.MILLISECONDS); | |||
} catch (ExecutionException e) { | |||
throw Throwables.propagate(e.getCause()); | |||
Throwables.throwIfUnchecked(e.getCause()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is unable to confirm what type of Throwable is wrapped in ExecutionException
, so still use
Throwables.throwIfUnchecked(e.getCause());
throw new RuntimeException(e.getCause());
all done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM.
Ping @pan3793 once more since he had a comment before.
Also, cc @panbingkun since he is working on this area. |
Thank you, @LuciferYang and @pan3793 . |
Thanks @dongjoon-hyun and @pan3793 |
+1, LGTM. |
Thanks @panbingkun ~ |
… Clean up deprecated Guava API usage ### What changes were proposed in this pull request? In order to clean up the usage of deprecated Guava API, the following changes were made in this pr: 1. Replaced `Files.write(from, to, charset)` with `Files.asCharSink(to, charset).write(from)`. This change was made with reference to: https://github.com/google/guava/blob/0c33dd12b193402cdf6962d43d69743521aa2f76/guava/src/com/google/common/io/Files.java#L275-L291 ```java /** * Writes a character sequence (such as a string) to a file using the given character set. * * param from the character sequence to write * param to the destination file * param charset the charset used to encode the output stream; see {link StandardCharsets} for * helpful predefined constants * throws IOException if an I/O error occurs * deprecated Prefer {code asCharSink(to, charset).write(from)}. */ Deprecated InlineMe( replacement = "Files.asCharSink(to, charset).write(from)", imports = "com.google.common.io.Files") public static void write(CharSequence from, File to, Charset charset) throws IOException { asCharSink(to, charset).write(from); } ``` 2. Replaced `Files.append(from, to, charset)` with `Files.asCharSink(to, charset, FileWriteMode.APPEND).write(from)`. This change was made with reference to: https://github.com/google/guava/blob/0c33dd12b193402cdf6962d43d69743521aa2f76/guava/src/com/google/common/io/Files.java#L350-L368 ```java /** * Appends a character sequence (such as a string) to a file using the given character set. * * param from the character sequence to append * param to the destination file * param charset the charset used to encode the output stream; see {link StandardCharsets} for * helpful predefined constants * throws IOException if an I/O error occurs * deprecated Prefer {code asCharSink(to, charset, FileWriteMode.APPEND).write(from)}. This * method is scheduled to be removed in October 2019. */ Deprecated InlineMe( replacement = "Files.asCharSink(to, charset, FileWriteMode.APPEND).write(from)", imports = {"com.google.common.io.FileWriteMode", "com.google.common.io.Files"}) public static void append(CharSequence from, File to, Charset charset) throws IOException { asCharSink(to, charset, FileWriteMode.APPEND).write(from); } ``` 3. Replaced `Files.toString(file, charset)` with `Files.asCharSource(file, charset).read()`. This change was made with reference to: https://github.com/google/guava/blob/0c33dd12b193402cdf6962d43d69743521aa2f76/guava/src/com/google/common/io/Files.java#L243-L259 ```java /** * Reads all characters from a file into a {link String}, using the given character set. * * param file the file to read from * param charset the charset used to decode the input stream; see {link StandardCharsets} for * helpful predefined constants * return a string containing all the characters from the file * throws IOException if an I/O error occurs * deprecated Prefer {code asCharSource(file, charset).read()}. */ Deprecated InlineMe( replacement = "Files.asCharSource(file, charset).read()", imports = "com.google.common.io.Files") public static String toString(File file, Charset charset) throws IOException { return asCharSource(file, charset).read(); } ``` 4. Replaced `HashFunction.murmur3_32()` with `HashFunction.murmur3_32_fixed()`. This change was made with reference to: https://github.com/google/guava/blob/0c33dd12b193402cdf6962d43d69743521aa2f76/guava/src/com/google/common/hash/Hashing.java#L99-L115 ```java /** * Returns a hash function implementing the <a * href="https://github.com/aappleby/smhasher/blob/master/src/MurmurHash3.cpp">32-bit murmur3 * algorithm, x86 variant</a> (little-endian variant), using the given seed value, <b>with a known * bug</b> as described in the deprecation text. * * <p>The C++ equivalent is the MurmurHash3_x86_32 function (Murmur3A), which however does not * have the bug. * * deprecated This implementation produces incorrect hash values from the {link * HashFunction#hashString} method if the string contains non-BMP characters. Use {link * #murmur3_32_fixed()} instead. */ Deprecated public static HashFunction murmur3_32() { return Murmur3_32HashFunction.MURMUR3_32; } ``` This change is safe for Spark. The difference between `MURMUR3_32` and `MURMUR3_32_FIXED` lies in the different `supplementaryPlaneFix` parameters passed when constructing the `Murmur3_32HashFunction`: https://github.com/google/guava/blob/0c33dd12b193402cdf6962d43d69743521aa2f76/guava/src/com/google/common/hash/Murmur3_32HashFunction.java#L56-L59 ```java static final HashFunction MURMUR3_32 = new Murmur3_32HashFunction(0, /* supplementaryPlaneFix= */ false); static final HashFunction MURMUR3_32_FIXED = new Murmur3_32HashFunction(0, /* supplementaryPlaneFix= */ true); ``` However, the `supplementaryPlaneFix` parameter is only used in `Murmur3_32HashFunction#hashString`, and Spark only utilizes `Murmur3_32HashFunction#hashInt`. Therefore, there will be no logical changes to this method after this change. https://github.com/google/guava/blob/0c33dd12b193402cdf6962d43d69743521aa2f76/guava/src/com/google/common/hash/Murmur3_32HashFunction.java#L108-L114 ```java Override public HashCode hashInt(int input) { int k1 = mixK1(input); int h1 = mixH1(seed, k1); return fmix(h1, Ints.BYTES); } ``` 5. Replaced `Throwables.propagateIfPossible(throwable, declaredType)` with `Throwables.throwIfInstanceOf(throwable, declaredType)` + `Throwables.throwIfUnchecked(throwable)`. This change was made with reference to: https://github.com/google/guava/blob/0c33dd12b193402cdf6962d43d69743521aa2f76/guava/src/com/google/common/base/Throwables.java#L156-L175 ``` /** * Propagates {code throwable} exactly as-is, if and only if it is an instance of {link * RuntimeException}, {link Error}, or {code declaredType}. * * <p><b>Discouraged</b> in favor of calling {link #throwIfInstanceOf} and {link * #throwIfUnchecked}. * * param throwable the Throwable to possibly propagate * param declaredType the single checked exception type declared by the calling method * deprecated Use a combination of {link #throwIfInstanceOf} and {link #throwIfUnchecked}, * which togther provide the same behavior except that they reject {code null}. */ Deprecated J2ktIncompatible GwtIncompatible // propagateIfInstanceOf public static <X extends Throwable> void propagateIfPossible( CheckForNull Throwable throwable, Class<X> declaredType) throws X { propagateIfInstanceOf(throwable, declaredType); propagateIfPossible(throwable); } ``` 6. Made modifications to `Throwables.propagate` with reference to https://github.com/google/guava/wiki/Why-we-deprecated-Throwables.propagate - For cases where it is known to be a checked exception, including `IOException`, `GeneralSecurityException`, `SaslException`, and `RocksDBException`, none of which are subclasses of `RuntimeException` or `Error`, directly replaced `Throwables.propagate(e)` with `throw new RuntimeException(e);`. - For cases where it cannot be determined whether it is a checked exception or an unchecked exception or Error, use ```java throwIfUnchecked(e); throw new RuntimeException(e); ``` to replace `Throwables.propagate(e)`。 https://github.com/google/guava/blob/0c33dd12b193402cdf6962d43d69743521aa2f76/guava/src/com/google/common/base/Throwables.java#L199-L235 ```java /** * ... * deprecated To preserve behavior, use {code throw e} or {code throw new RuntimeException(e)} * directly, or use a combination of {link #throwIfUnchecked} and {code throw new * RuntimeException(e)}. But consider whether users would be better off if your API threw a * different type of exception. For background on the deprecation, read <a * href="https://goo.gl/Ivn2kc">Why we deprecated {code Throwables.propagate}</a>. */ CanIgnoreReturnValue J2ktIncompatible GwtIncompatible Deprecated public static RuntimeException propagate(Throwable throwable) { throwIfUnchecked(throwable); throw new RuntimeException(throwable); } ``` ### Why are the changes needed? Clean up deprecated Guava API usage. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Pass GitHub Actions ### Was this patch authored or co-authored using generative AI tooling? No Closes apache#48248 from LuciferYang/guava-deprecation. Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
In order to clean up the usage of deprecated Guava API, the following changes were made in this pr:
Files.write(from, to, charset)
withFiles.asCharSink(to, charset).write(from)
. This change was made with reference to:https://github.com/google/guava/blob/0c33dd12b193402cdf6962d43d69743521aa2f76/guava/src/com/google/common/io/Files.java#L275-L291
Files.append(from, to, charset)
withFiles.asCharSink(to, charset, FileWriteMode.APPEND).write(from)
. This change was made with reference to:https://github.com/google/guava/blob/0c33dd12b193402cdf6962d43d69743521aa2f76/guava/src/com/google/common/io/Files.java#L350-L368
Files.toString(file, charset)
withFiles.asCharSource(file, charset).read()
. This change was made with reference to:https://github.com/google/guava/blob/0c33dd12b193402cdf6962d43d69743521aa2f76/guava/src/com/google/common/io/Files.java#L243-L259
HashFunction.murmur3_32()
withHashFunction.murmur3_32_fixed()
. This change was made with reference to:https://github.com/google/guava/blob/0c33dd12b193402cdf6962d43d69743521aa2f76/guava/src/com/google/common/hash/Hashing.java#L99-L115
This change is safe for Spark. The difference between
MURMUR3_32
andMURMUR3_32_FIXED
lies in the differentsupplementaryPlaneFix
parameters passed when constructing theMurmur3_32HashFunction
:https://github.com/google/guava/blob/0c33dd12b193402cdf6962d43d69743521aa2f76/guava/src/com/google/common/hash/Murmur3_32HashFunction.java#L56-L59
However, the
supplementaryPlaneFix
parameter is only used inMurmur3_32HashFunction#hashString
, and Spark only utilizesMurmur3_32HashFunction#hashInt
. Therefore, there will be no logical changes to this method after this change.https://github.com/google/guava/blob/0c33dd12b193402cdf6962d43d69743521aa2f76/guava/src/com/google/common/hash/Murmur3_32HashFunction.java#L108-L114
Throwables.propagateIfPossible(throwable, declaredType)
withThrowables.throwIfInstanceOf(throwable, declaredType)
+Throwables.throwIfUnchecked(throwable)
. This change was made with reference to:https://github.com/google/guava/blob/0c33dd12b193402cdf6962d43d69743521aa2f76/guava/src/com/google/common/base/Throwables.java#L156-L175
Throwables.propagate
with reference to https://github.com/google/guava/wiki/Why-we-deprecated-Throwables.propagateFor cases where it is known to be a checked exception, including
IOException
,GeneralSecurityException
,SaslException
, andRocksDBException
, none of which are subclasses ofRuntimeException
orError
, directly replacedThrowables.propagate(e)
withthrow new RuntimeException(e);
.For cases where it cannot be determined whether it is a checked exception or an unchecked exception or Error, use
to replace
Throwables.propagate(e)
。https://github.com/google/guava/blob/0c33dd12b193402cdf6962d43d69743521aa2f76/guava/src/com/google/common/base/Throwables.java#L199-L235
Why are the changes needed?
Clean up deprecated Guava API usage.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Pass GitHub Actions
Was this patch authored or co-authored using generative AI tooling?
No