This repository has been archived by the owner on Sep 18, 2023. It is now read-only.

Three unit tests newly failed on master branch #493

Closed
rui-mo opened this issue Aug 30, 2021 · 4 comments
Labels
bug Something isn't working

Comments

@rui-mo
Collaborator

rui-mo commented Aug 30, 2021

Describe the bug
The following tests fail on the master branch:

  • columnar arrow_udf test *** FAILED ***
  • continuous mode with various UDFs - Scalar Pandas UDF *** FAILED ***
  • fallback arrow_udf test *** FAILED ***

Error is:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 2) (sr404 executor driver): java.lang.IllegalArgumentException: Could not load buffers for field _0: Utf8. error message: A buffer can only be associated between two allocators that share the same root
at org.apache.arrow.vector.VectorLoader.loadBuffers(VectorLoader.java:117)
at org.apache.arrow.vector.VectorLoader.load(VectorLoader.java:81)
at org.apache.spark.sql.execution.python.ColumnarArrowPythonRunner$$anon$2.$anonfun$writeIteratorToStream$1(ColumnarArrowPythonRunner.scala:163)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
at org.apache.spark.sql.execution.python.ColumnarArrowPythonRunner$$anon$2.writeIteratorToStream(ColumnarArrowPythonRunner.scala:171)
at org.apache.spark.api.python.BasePythonRunner$WriterThread.$anonfun$run$1(PythonRunner.scala:397)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1996)
at org.apache.spark.api.python.BasePythonRunner$WriterThread.run(PythonRunner.scala:232)
Caused by: java.lang.IllegalStateException: A buffer can only be associated between two allocators that share the same root
at org.apache.arrow.util.Preconditions.checkState(Preconditions.java:458)
at org.apache.arrow.memory.AllocationManager.associate(AllocationManager.java:96)
at org.apache.arrow.memory.AllocationManager.associate(AllocationManager.java:91)
at org.apache.arrow.memory.BufferLedger.retain(BufferLedger.java:320)
at org.apache.arrow.vector.BaseVariableWidthVector.loadFieldBuffers(BaseVariableWidthVector.java:320)
at org.apache.arrow.vector.VectorLoader.loadBuffers(VectorLoader.java:109)
... 8 more

To Reproduce
Steps to reproduce the behavior:
mvn clean test -P full-scala-compiler -Dbuild_arrow=OFF -Dbuild_protobuf=OFF -DfailIfNoTests=false -Dexec.skip=true -Dmaven.test.failure.ignore=true -Dtest=none -DwildcardSuites="org.apache.spark.sql.execution.python.ArrowEvalPythonExecSuite"

Additional context
We traced the failure to this commit: d6bc791
Before this commit, these tests passed.
It looks like a memory allocation issue.

@rui-mo added the bug label Aug 30, 2021
@rui-mo
Collaborator Author

rui-mo commented Aug 30, 2021

@zhztheplayer @xuechendi An issue was found on the master branch.

@zhztheplayer
Collaborator

I expected that d6bc791 might break something in this way, but I failed to reproduce it. Is this test included in CI now?

A valid solution may be to change the allocator in the Python runner from

    private val allocator = ArrowUtils.rootAllocator.newChildAllocator(
      s"stdin reader for $pythonExec", 0, Long.MaxValue)

to a task-restricted context allocator. @xuechendi Hi Chendi, do you remember why a global allocator was used here? Do you think there is any risk in changing to a local one?
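
To make the same-root rule from the stack trace concrete, here is a minimal toy model in plain Java. It does not use Arrow's real classes: `ToyAllocator`, `ToyBuffer`, and `canShare` are hypothetical names invented for illustration, and the check only mirrors the precondition message in the error above. It sketches why a child of a global root cannot accept buffers from a task-local tree, while two children of the same task root can share them.

```java
/** Toy model (NOT Arrow's real classes) of the same-root rule for sharing buffers. */
public class AllocatorRootSketch {

    /** Hypothetical stand-in for an allocator: it knows its parent and can find its root. */
    static final class ToyAllocator {
        final String name;
        final ToyAllocator parent; // null for a root allocator

        ToyAllocator(String name, ToyAllocator parent) {
            this.name = name;
            this.parent = parent;
        }

        ToyAllocator newChild(String childName) {
            return new ToyAllocator(childName, this);
        }

        /** Walks up the parent chain until it reaches the root. */
        ToyAllocator root() {
            ToyAllocator current = this;
            while (current.parent != null) {
                current = current.parent;
            }
            return current;
        }
    }

    /** Hypothetical stand-in for a buffer owned by exactly one allocator. */
    static final class ToyBuffer {
        final ToyAllocator owner;

        ToyBuffer(ToyAllocator owner) {
            this.owner = owner;
        }

        /** Mirrors the precondition in the stack trace: the target must share the owner's root. */
        void associate(ToyAllocator target) {
            if (owner.root() != target.root()) {
                throw new IllegalStateException(
                    "A buffer can only be associated between two allocators that share the same root");
            }
        }
    }

    /** Returns true if a buffer owned by 'from' could be handed to 'to' in this toy model. */
    static boolean canShare(ToyAllocator from, ToyAllocator to) {
        try {
            new ToyBuffer(from).associate(to);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ToyAllocator globalRoot = new ToyAllocator("global-root", null);
        ToyAllocator taskRoot = new ToyAllocator("task-root", null);

        ToyAllocator stdinReader = globalRoot.newChild("stdin reader"); // child of the global root
        ToyAllocator taskLocal = taskRoot.newChild("task-local");       // produces the buffers
        ToyAllocator taskRunner = taskRoot.newChild("python runner");   // same tree as taskLocal

        System.out.println(canShare(taskLocal, stdinReader)); // prints "false": different roots
        System.out.println(canShare(taskLocal, taskRunner));  // prints "true": same root
    }
}
```

In this model, the failing tests correspond to the first call and the proposed change corresponds to the second; whether the real runner behaves the same way depends on how its allocators are actually wired.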

@rui-mo
Collaborator Author

rui-mo commented Aug 31, 2021

We are testing them on Jenkins. The error can be reproduced by running the command below from the gazelle plugin home directory:
mvn clean test -P full-scala-compiler -Dbuild_arrow=OFF -Dbuild_protobuf=OFF -DfailIfNoTests=false -Dexec.skip=true -Dmaven.test.failure.ignore=true -Dtest=none -DwildcardSuites="org.apache.spark.sql.execution.python.ArrowEvalPythonExecSuite"

@zhztheplayer
Collaborator

Fixed in f07e6fb
