When using the Spark Snowflake connector (2.11 and above, I believe) you will always end up seeing the following (harmless) stack trace on the driver:
Caused by: java.io.NotSerializableException: org.apache.spark.storage.StorageStatus
Serialization stack:
- object not serializable (class: org.apache.spark.storage.StorageStatus, value: org.apache.spark.storage.StorageStatus@715b4e82)
- element of array (index: 0)
- array (class [Lorg.apache.spark.storage.StorageStatus;, size 2)
at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:41)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:47)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:101)
at org.apache.spark.rpc.netty.NettyRpcEnv.serialize(NettyRpcEnv.scala:286)
at org.apache.spark.rpc.netty.RemoteNettyRpcCallContext.send(NettyRpcCallContext.scala:64)
at org.apache.spark.rpc.netty.NettyRpcCallContext.reply(NettyRpcCallContext.scala:32)
at org.apache.spark.storage.BlockManagerMasterEndpoint$$anonfun$receiveAndReply$1.applyOrElse(BlockManagerMasterEndpoint.scala:156)
at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:103)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
at org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
and a corresponding warning in the executor that triggered it (note there is no additional detail after this line):
WARN SnowflakeTelemetry$: Fail to get cluster statistic. reason: Exception thrown in awaitResult:
This is caused by commit 2a3f090, which attempts to gather some telemetry about the running Spark session. Specifically, the offending line is SparkEnv.get.blockManager.master.getStorageStatus.length, used to determine the number of nodes in the cluster. Given that StorageStatus is not serializable, I don't believe this call has ever succeeded (nor will it) until Spark actually fixes the issue upstream. I'm wondering if it makes sense to remove this line (or get the data in a different way) to clean up our logs and avoid having "ignorable" errors.
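For illustration, if the telemetry only needs a rough node count, one possible alternative is to ask the driver's status tracker instead of BlockManagerMaster. This is just a sketch, not what the connector does today; it assumes the telemetry code runs on the driver with access to the active SparkSession:

import org.apache.spark.sql.SparkSession

// Sketch only: approximate the cluster size without calling
// BlockManagerMaster.getStorageStatus (the call that trips the
// NotSerializableException above). getExecutorInfos returns one entry per
// executor plus (in recent Spark versions) one for the driver, so subtract
// one to approximate the number of worker nodes.
val spark = SparkSession.builder().getOrCreate()
val nodeCount = spark.sparkContext.statusTracker.getExecutorInfos.length - 1

Whether that count matches what the existing telemetry intends to report is an open question, but it avoids serializing StorageStatus over RPC.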
In truth, this is more an issue with Spark itself, which is recorded here: https://issues.apache.org/jira/browse/SPARK-43108