[SPARK-43885][SQL][FOLLOWUP] Instruction#dataType should not fail
### What changes were proposed in this pull request?

This is a followup of apache#41448. The plan produced by an optimizer rule should be resolved, and resolved expressions should be able to report their data type. The `Instruction` expression fails to report its data type and may break external optimizer rules. This PR makes it return a dummy `NullType`.
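
The failure mode described above can be sketched without Spark. The types below (`DataType`, `Expression`, `Literal`, `ThrowingInstruction`, `NullTypedInstruction`) are simplified stand-ins, not the real Catalyst classes, and `collectDataTypes` is a hypothetical external rule that reads `dataType` from every expression in a tree. It is only meant to show why throwing from `dataType` can break such a rule while returning a dummy type does not.

```scala
import scala.util.Try

object InstructionDataTypeSketch {
  // Simplified stand-ins for Catalyst types; NOT the real Spark classes.
  sealed trait DataType
  case object IntegerType extends DataType
  case object NullType extends DataType

  trait Expression {
    def dataType: DataType
    def children: Seq[Expression]
  }

  case class Literal(value: Int) extends Expression {
    def dataType: DataType = IntegerType
    def children: Seq[Expression] = Nil
  }

  // Models the behavior before this change: asking for the data type throws.
  case class ThrowingInstruction(condition: Expression) extends Expression {
    def dataType: DataType = throw new UnsupportedOperationException("dataType")
    def children: Seq[Expression] = Seq(condition)
  }

  // Models the behavior after this change: a dummy NullType is reported.
  case class NullTypedInstruction(condition: Expression) extends Expression {
    def dataType: DataType = NullType
    def children: Seq[Expression] = Seq(condition)
  }

  // Hypothetical external rule: reads the data type of every expression,
  // assuming a resolved tree can always answer.
  def collectDataTypes(e: Expression): Seq[DataType] =
    e.dataType +: e.children.flatMap(collectDataTypes)

  def main(args: Array[String]): Unit = {
    val cond = Literal(1)
    // The throwing variant makes the traversal fail.
    println(Try(collectDataTypes(ThrowingInstruction(cond))).isFailure)
    // The NullType variant lets the traversal complete.
    println(collectDataTypes(NullTypedInstruction(cond)))
  }
}
```

In this sketch the traversal never inspects what the instruction's type actually is, which mirrors the commit's reasoning that only `MergeRows` contains `Instruction` expressions and does not care about their data type.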

### Why are the changes needed?

To avoid breaking external optimizer rules.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

existing tests

Closes apache#42482 from cloud-fan/merge.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
cloud-fan committed Aug 15, 2023
1 parent 46580ab commit c9ff702
Showing 1 changed file with 6 additions and 2 deletions.
@@ -21,7 +21,7 @@ import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeSet, Expre
import org.apache.spark.sql.catalyst.plans.logical.MergeRows.{Instruction, ROW_ID}
import org.apache.spark.sql.catalyst.trees.UnaryLike
import org.apache.spark.sql.catalyst.util.truncatedString
-import org.apache.spark.sql.types.DataType
+import org.apache.spark.sql.types.{DataType, NullType}

case class MergeRows(
isSourceRowPresent: Expression,
@@ -74,7 +74,11 @@ object MergeRows {
def condition: Expression
def outputs: Seq[Seq[Expression]]
override def nullable: Boolean = false
-override def dataType: DataType = throw new UnsupportedOperationException("dataType")
+// We return NullType here as only the `MergeRows` operator can contain `Instruction`
+// expressions and it doesn't care about the data type. Some external optimizer rules may
+// assume optimized plan is always resolved and Expression#dataType is always available, so
+// we can't just fail here.
+override def dataType: DataType = NullType
}

case class Keep(condition: Expression, output: Seq[Expression]) extends Instruction {