[SPARK-47085][SQL][3.4] reduce the complexity of toTRowSet from n^2 to n
### What changes were proposed in this pull request?
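Reduce the complexity of `RowSetUtils.toTRowSet` from O(n^2) to O(n) in the number of rows by traversing `rows` once with `map`/`foreach` instead of indexing into it with `rows(i)` inside a `while` loop.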

### Why are the changes needed?
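`toTRowSet` is called with `rows` built by `iter.take(maxRows).toList`, and `rows(i)` on a `List` takes time linear in `i`. The indexed `while` loop is therefore quadratic in the number of rows, which causes performance issues when a fetch returns many rows.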
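
A minimal sketch of the difference, assuming a `List` input as in the caller shown in the diff. The names `IndexedVsMapped`, `convert`, `indexed`, and `mapped` are hypothetical and exist only for illustration; `convert` is a stand-in for the per-row conversion, not Spark's `TRow` logic.

```scala
object IndexedVsMapped {
  // Hypothetical per-row conversion (assumption: stands in for building a TRow).
  def convert(row: Int): String = s"row-$row"

  // Indexed loop: on a List, rows(i) walks i links from the head,
  // so the loop performs roughly n^2 / 2 link traversals in total.
  def indexed(rows: Seq[Int]): Seq[String] = {
    val out = Vector.newBuilder[String]
    val size = rows.length
    var i = 0
    while (i < size) {
      out += convert(rows(i))
      i += 1
    }
    out.result()
  }

  // Single traversal: map visits each element exactly once,
  // whatever the underlying Seq implementation is.
  def mapped(rows: Seq[Int]): Seq[String] = rows.map(convert)
}
```

Both functions return the same elements; only the number of steps differs. With an indexed sequence such as `Vector` or `Array`, the first form would also be linear, so the cost depends on the concrete `Seq` passed in.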

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Tests, plus manual testing on AWS EMR.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #45164 from igreenfield/branch-3.4.

Authored-by: Izek Greenfield <izek.greenfield@adenza.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Izek Greenfield authored and dongjoon-hyun committed Feb 21, 2024
1 parent 081c7a7 commit ef02dbd
Showing 2 changed files with 5 additions and 11 deletions.
@@ -52,11 +52,7 @@ object RowSetUtils {
       rows: Seq[Row],
       schema: Array[DataType],
       timeFormatters: TimeFormatters): TRowSet = {
-    var i = 0
-    val rowSize = rows.length
-    val tRows = new java.util.ArrayList[TRow](rowSize)
-    while (i < rowSize) {
-      val row = rows(i)
+    val tRows = rows.map { row =>
       val tRow = new TRow()
       var j = 0
       val columnSize = row.length
@@ -65,9 +61,8 @@ object RowSetUtils {
         tRow.addToColVals(columnValue)
         j += 1
       }
-      i += 1
-      tRows.add(tRow)
-    }
+      tRow
+    }.asJava
     new TRowSet(startRowOffSet, tRows)
   }

@@ -159,8 +154,7 @@ object RowSetUtils {
     val size = rows.length
     val ret = new java.util.ArrayList[T](size)
     var idx = 0
-    while (idx < size) {
-      val row = rows(idx)
+    rows.foreach { row =>
       if (row.isNullAt(ordinal)) {
         nulls.set(idx, true)
         ret.add(idx, defaultVal)
@@ -113,7 +113,7 @@ private[hive] class SparkExecuteStatementOperation(
     val offset = iter.getPosition
     val rows = iter.take(maxRows).toList
     log.debug(s"Returning result set with ${rows.length} rows from offsets " +
-      s"[${iter.getFetchStart}, ${offset}) with $statementId")
+      s"[${iter.getFetchStart}, ${iter.getPosition}) with $statementId")
     RowSetUtils.toTRowSet(offset, rows, dataTypes, getProtocolVersion, getTimeFormatters)
   }
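
The commit message does not explain the logging change. One plausible reading is that `offset` is captured before the rows are consumed, so the end of the logged range has to be read from the iterator afterwards. The sketch below illustrates that reading with a hypothetical `PositionedIterator`, a simplified stand-in that is not Spark's actual iterator class. `LogRangeSketch` and `fetch` are likewise illustrative names.

```scala
// Hypothetical position-tracking iterator (assumption: simplified stand-in,
// not the iterator type used by SparkExecuteStatementOperation).
final class PositionedIterator[A](underlying: Iterator[A]) extends Iterator[A] {
  private var position = 0L
  def getPosition: Long = position
  override def hasNext: Boolean = underlying.hasNext
  override def next(): A = {
    val value = underlying.next()
    position += 1
    value
  }
}

object LogRangeSketch {
  // Returns (start offset, end offset, rows) for one fetch of at most maxRows rows.
  def fetch(iter: PositionedIterator[Int], maxRows: Int): (Long, Long, List[Int]) = {
    val offset = iter.getPosition        // position before any row is consumed
    val rows = iter.take(maxRows).toList // toList forces the lazy take and advances iter
    (offset, iter.getPosition, rows)     // position read afterwards marks the end of the batch
  }
}
```

In this sketch the start offset is still what a caller would pass on as the row set's starting position, while the position read after `toList` gives the exclusive upper bound of the range.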
