-
Notifications
You must be signed in to change notification settings - Fork 241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add retry to RoundRobin Partitioner and Range Partitioner #9419
Conversation
Signed-off-by: Ferdinand Xu <ferdinandx@nvidia.com>
build |
1 similar comment
build |
build |
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuRoundRobinPartitioning.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuRoundRobinPartitioning.scala
Outdated
Show resolved
Hide resolved
build |
1 similar comment
build |
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuRoundRobinPartitioning.scala
Outdated
Show resolved
Hide resolved
tests/src/test/scala/com/nvidia/spark/rapids/ShufflePartitionerRetrySuite.scala
Outdated
Show resolved
Hide resolved
tests/src/test/scala/com/nvidia/spark/rapids/ShufflePartitionerRetrySuite.scala
Outdated
Show resolved
Hide resolved
build |
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuRangePartitioner.scala
Outdated
Show resolved
Hide resolved
(parts, columns) | ||
} else { | ||
// Increase ref count since the caller will close the batch also. | ||
val spillableBatch = SpillableColumnarBatch(GpuColumnVector.incRefCounts(batch), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same leak question here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This batch will be close in its caller.
build |
1 similar comment
build |
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuRoundRobinPartitioning.scala
Outdated
Show resolved
Hide resolved
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few nits from me.
new Random(TaskContext.get().partitionId()) | ||
} else { | ||
// For unit test purpose where task context does not exist | ||
new Random |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Should we have a hard coded seed here instead? just so that we get deterministic results for unit tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it could be helpful when we need to validate result. We can touch this part when we have such needs.
TestUtils.withGpuSparkSession(new SparkConf()) { _ => | ||
val rrp = GpuRoundRobinPartitioning(partNum) | ||
// batch will be closed within columnarEvalAny | ||
val batch = buildBatch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it might be nice to build the batch right when we call columnarEvalAny
just so there is less code between where it is created and where it is consumed. Just because it could leak if there is an exception inbetween.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's separated due to buildBatch
method implement didn't have a retry in this test suite. We can do it and since it's just a test, so I skipped it.
|
||
val rp = GpuRangePartitioner(Array.apply(bounds), gpuSorter) | ||
// batch will be closed within columnarEvalAny | ||
val batch = buildBatch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same nit here about how long it lives for. Very minor, but a nice to have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
Currently there're three partitioners in general: round robin, range and single partitioner. This PR will cover round robin and range partitioner. And will create a follow-up PR for single partitioner.
It's tested by newly added UTs.
Related issue is 8502.