[SPARK-47210][SQL] Addition of implicit casting without indeterminate support #45383

Closed
wants to merge 80 commits into from
Changes from 78 commits
b34544a
Implicit casting on collated expressions
mihailom-db Mar 5, 2024
fdbfa44
Fix doc files
mihailom-db Mar 5, 2024
ce9b027
Fix contains, startWith, endWith tests
mihailom-db Mar 5, 2024
e537190
Fix imports
mihailom-db Mar 5, 2024
b5a79c1
Fix docs and incorporate changes
mihailom-db Mar 6, 2024
8321d0c
Fix tests in CollationSuite
mihailom-db Mar 6, 2024
d178233
Add test and incorporate changes
mihailom-db Mar 7, 2024
a4b9be7
Fix godlen files
mihailom-db Mar 7, 2024
a6e7662
Incorporate StringType in findWiderCommonType
mihailom-db Mar 8, 2024
e1d7ad5
Merge branch 'master' into SPARK-47210
mihailom-db Mar 8, 2024
b3b1356
Fix ArrayType(StringType, _) casting in findWiderCommonType
mihailom-db Mar 11, 2024
7773d13
Fix type mismatch error
mihailom-db Mar 11, 2024
198a728
Merge branch 'apache:master' into SPARK-47210
mihailom-db Mar 11, 2024
255b1ab
Incorporate changes and fix errors
mihailom-db Mar 11, 2024
9ce417f
Merge branch 'master' into SPARK-47210
mihailom-db Mar 12, 2024
50f3aa2
Fix errors
mihailom-db Mar 12, 2024
ca0c84d
Rework casting
mihailom-db Mar 13, 2024
880a1b1
Merge branch 'master' into SPARK-47210
mihailom-db Mar 13, 2024
56d6c7c
Fix failing tests
mihailom-db Mar 14, 2024
94e5259
Fix array cast errors
mihailom-db Mar 14, 2024
ccb52ba
Fix additional errors
mihailom-db Mar 14, 2024
9b1387b
Fix explicit collation search
mihailom-db Mar 17, 2024
c9974e1
Fix scala style errors
mihailom-db Mar 18, 2024
fca9a65
Add support for ImplicitCastInputTypes
mihailom-db Mar 18, 2024
660d664
Fix accidental change in license header
mihailom-db Mar 18, 2024
c8edd93
Fix null casting
mihailom-db Mar 19, 2024
a91490b
Fix failing tests
mihailom-db Mar 19, 2024
49a8d61
Move implicit casting when strings present
mihailom-db Mar 19, 2024
4c4cd84
Fix unintentional changes
mihailom-db Mar 19, 2024
66122a6
improve types.py
mihailom-db Mar 20, 2024
50f46e4
Refactor code
mihailom-db Mar 21, 2024
cc86a87
Merge branch 'master' into SPARK-47210
mihailom-db Mar 21, 2024
c01e80c
Fix imports and failing tests
mihailom-db Mar 21, 2024
cc797a2
Disable casting of StructTypes
mihailom-db Mar 21, 2024
5d001ee
Fix imports
mihailom-db Mar 21, 2024
c68fc7d
Fix concat tests
mihailom-db Mar 21, 2024
1c926ab
Fix unnecessary repetition
mihailom-db Mar 21, 2024
dec39bf
Remove Elt test
mihailom-db Mar 21, 2024
e808446
Remove tests for Repeat
mihailom-db Mar 21, 2024
ca1a23a
Merge branch 'master' into SPARK-47210
mihailom-db Mar 21, 2024
116931c
Merge branch 'apache:master' into SPARK-47210
mihailom-db Mar 22, 2024
af487a2
Fix failing tests
mihailom-db Mar 22, 2024
4ba7055
Fix nullability for StringType->StringType
mihailom-db Mar 22, 2024
e490e42
Improve comments and switch tests from E2E to unit tests
mihailom-db Mar 24, 2024
00e88e7
Add new tests and remove compatibility test
mihailom-db Mar 25, 2024
85b4d16
Fix conflict resolution mistake
mihailom-db Mar 25, 2024
30f7225
Merge branch 'apache:master' into SPARK-47210
mihailom-db Mar 25, 2024
e89a354
Add indeterminate collation tests
mihailom-db Mar 26, 2024
788dc06
Fix test
mihailom-db Mar 26, 2024
75c0140
Block Alias on Indeterminate
mihailom-db Mar 27, 2024
2918413
Merge remote-tracking branch 'upstream/master' into SPARK-47210
mihailom-db Mar 28, 2024
f6ed55a
Remove introduction of indeterminate collation
mihailom-db Mar 28, 2024
98960c0
Fix import problem
mihailom-db Mar 28, 2024
de623c8
Fix failing tests
mihailom-db Mar 28, 2024
a92b4e1
Fix pyspark error
mihailom-db Mar 28, 2024
f7f3011
Merge branch 'apache:master' into SPARK-47210
mihailom-db Mar 28, 2024
f67808e
Fix errors
mihailom-db Mar 29, 2024
815ce42
Fix schema error
mihailom-db Mar 29, 2024
7fca38a
Merge remote-tracking branch 'upstream/master' into SPARK-47210
mihailom-db Mar 29, 2024
b19b0eb
Fix collated tests
mihailom-db Mar 29, 2024
a111f03
Add isExplicit flag
mihailom-db Mar 29, 2024
55bdd9b
Fix import error
mihailom-db Mar 29, 2024
a7228be
Fix imports in TypeCoercion
mihailom-db Mar 31, 2024
27a72c6
Merge remote-tracking branch 'upstream/master' into SPARK-47210
mihailom-db Apr 1, 2024
18ada04
Add support for explicit propagation in arrays
mihailom-db Apr 1, 2024
38670af
Fix tests to follow recent changes
mihailom-db Apr 1, 2024
01d891e
Incorporate changes
mihailom-db Apr 1, 2024
c5daf86
Fix error
mihailom-db Apr 1, 2024
9ac5678
Change var to val in StringType
mihailom-db Apr 1, 2024
0f1757d
Fix import style
mihailom-db Apr 1, 2024
506c8c0
Revert explicit flag addition
mihailom-db Apr 1, 2024
f743cf8
Narrow down expressions casting
mihailom-db Apr 2, 2024
4f8fe1d
Incorporate minor changes
mihailom-db Apr 2, 2024
52bf4dc
Incorporate changes
mihailom-db Apr 2, 2024
7cbeafe
Special case expressions
mihailom-db Apr 3, 2024
3e92e92
Return new line
mihailom-db Apr 3, 2024
b23e106
Remove indentation cosmetic
mihailom-db Apr 3, 2024
880ebed
Add more cosmetic changes
mihailom-db Apr 3, 2024
f96ecd9
Incorporate changes
mihailom-db Apr 3, 2024
e1e0cf4
Merge branch 'apache:master' into SPARK-47210
mihailom-db Apr 3, 2024
29 changes: 24 additions & 5 deletions common/utils/src/main/resources/error/error-classes.json
@@ -467,6 +467,24 @@
],
"sqlState" : "42704"
},
"COLLATION_MISMATCH" : {
"message" : [
"Could not determine which collation to use for string functions and operators."
],
"subClass" : {
"EXPLICIT" : {
"message" : [
"Error occurred due to the mismatch between explicit collations: <explicitTypes>. Decide on a single explicit collation and remove others."
]
},
"IMPLICIT" : {
"message" : [
"Error occurred due to the mismatch between multiple implicit non-default collations. Use COLLATE function to set the collation explicitly."
]
}
},
"sqlState" : "42P21"
},
"COLLECTION_SIZE_LIMIT_EXCEEDED" : {
"message" : [
"Can't create array with <numberOfElements> elements which exceeding the array size limit <maxRoundedArrayLength>,"
@@ -688,11 +706,6 @@
"To convert values from <srcType> to <targetType>, you can use the functions <functionNames> instead."
]
},
"COLLATION_MISMATCH" : {
"message" : [
"Collations <collationNameLeft> and <collationNameRight> are not compatible. Please use the same collation for both strings."
]
},
"CREATE_MAP_KEY_DIFF_TYPES" : {
"message" : [
"The given keys of function <functionName> should all be the same type, but they are <dataType>."
@@ -1598,6 +1611,12 @@
],
"sqlState" : "22003"
},
"INDETERMINATE_COLLATION" : {
"message" : [
"Function called requires knowledge of the collation it should apply, but indeterminate collation was found. Use COLLATE function to set the collation explicitly."
],
"sqlState" : "42P22"
},
"INDEX_ALREADY_EXISTS" : {
"message" : [
"Cannot create the index <indexName> on table <tableName> because it already exists."
41 changes: 41 additions & 0 deletions docs/sql-error-conditions-collation-mismatch-error-class.md
@@ -0,0 +1,41 @@
---
layout: global
title: COLLATION_MISMATCH error class
displayTitle: COLLATION_MISMATCH error class
license: |
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
---

<!--
DO NOT EDIT THIS FILE.
It was generated automatically by `org.apache.spark.SparkThrowableSuite`.
-->

[SQLSTATE: 42P21](sql-error-conditions-sqlstates.html#class-42-syntax-error-or-access-rule-violation)

Could not determine which collation to use for string functions and operators.

This error class has the following derived error classes:

## EXPLICIT

Error occurred due to the mismatch between explicit collations: `<explicitTypes>`. Decide on a single explicit collation and remove others.

## IMPLICIT

Error occurred due to the mismatch between multiple implicit non-default collations. Use COLLATE function to set the collation explicitly.


4 changes: 0 additions & 4 deletions docs/sql-error-conditions-datatype-mismatch-error-class.md
@@ -76,10 +76,6 @@ If you have to cast `<srcType>` to `<targetType>`, you can set `<config>` as `<c
cannot cast `<srcType>` to `<targetType>`.
To convert values from `<srcType>` to `<targetType>`, you can use the functions `<functionNames>` instead.

## COLLATION_MISMATCH

Collations `<collationNameLeft>` and `<collationNameRight>` are not compatible. Please use the same collation for both strings.

## CREATE_MAP_KEY_DIFF_TYPES

The given keys of function `<functionName>` should all be the same type, but they are `<dataType>`.
14 changes: 14 additions & 0 deletions docs/sql-error-conditions.md
@@ -390,6 +390,14 @@ Cannot find a short name for the codec `<codecName>`.

The value `<collationName>` does not represent a correct collation name. Suggested valid collation name: [`<proposal>`].

### [COLLATION_MISMATCH](sql-error-conditions-collation-mismatch-error-class.html)

[SQLSTATE: 42P21](sql-error-conditions-sqlstates.html#class-42-syntax-error-or-access-rule-violation)

Could not determine which collation to use for string functions and operators.

For more details see [COLLATION_MISMATCH](sql-error-conditions-collation-mismatch-error-class.html)

### [COLLECTION_SIZE_LIMIT_EXCEEDED](sql-error-conditions-collection-size-limit-exceeded-error-class.html)

[SQLSTATE: 54000](sql-error-conditions-sqlstates.html#class-54-program-limit-exceeded)
@@ -939,6 +947,12 @@ For more details see [INCONSISTENT_BEHAVIOR_CROSS_VERSION](sql-error-conditions-

Max offset with `<rowsPerSecond>` rowsPerSecond is `<maxSeconds>`, but 'rampUpTimeSeconds' is `<rampUpTimeSeconds>`.

### INDETERMINATE_COLLATION

[SQLSTATE: 42P22](sql-error-conditions-sqlstates.html#class-42-syntax-error-or-access-rule-violation)

Function called requires knowledge of the collation it should apply, but indeterminate collation was found. Use COLLATE function to set the collation explicitly.

### INDEX_ALREADY_EXISTS

[SQLSTATE: 42710](sql-error-conditions-sqlstates.html#class-42-syntax-error-or-access-rule-violation)
9 changes: 4 additions & 5 deletions python/pyspark/sql/types.py
@@ -264,11 +264,10 @@ def fromCollationId(self, collationId: int) -> "StringType":
return StringType(StringType.collationNames[collationId])

def collationIdToName(self) -> str:
return (
" collate %s" % StringType.collationNames[self.collationId]
if self.collationId != 0
else ""
)
if self.collationId == 0:
return ""
else:
return " collate %s" % StringType.collationNames[self.collationId]

@classmethod
def collationNameToId(cls, collationName: str) -> int:
@@ -47,9 +47,9 @@ private[sql] object ArrowUtils {
case LongType => new ArrowType.Int(8 * 8, true)
case FloatType => new ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
case DoubleType => new ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
case StringType if !largeVarTypes => ArrowType.Utf8.INSTANCE
case _: StringType if !largeVarTypes => ArrowType.Utf8.INSTANCE
case BinaryType if !largeVarTypes => ArrowType.Binary.INSTANCE
case StringType if largeVarTypes => ArrowType.LargeUtf8.INSTANCE
case _: StringType if largeVarTypes => ArrowType.LargeUtf8.INSTANCE
case BinaryType if largeVarTypes => ArrowType.LargeBinary.INSTANCE
case DecimalType.Fixed(precision, scale) => new ArrowType.Decimal(precision, scale)
case DateType => new ArrowType.Date(DateUnit.DAY)
@@ -77,6 +77,7 @@ object AnsiTypeCoercion extends TypeCoercionBase {
UnpivotCoercion ::
WidenSetOperationTypes ::
new AnsiCombinedTypeCoercionRule(
CollationTypeCasts ::
InConversion ::
PromoteStrings ::
DecimalPrecision ::
@@ -92,7 +93,7 @@ object AnsiTypeCoercion extends TypeCoercionBase {
ImplicitTypeCasts ::
DateTimeOperations ::
WindowFrameCoercion ::
GetDateFieldOperations:: Nil) :: Nil
GetDateFieldOperations :: Nil) :: Nil

val findTightestCommonType: (DataType, DataType) => Option[DataType] = {
case (t1, t2) if t1 == t2 => Some(t1)
@@ -138,15 +139,16 @@ object AnsiTypeCoercion extends TypeCoercionBase {
@scala.annotation.tailrec
private def findWiderTypeForString(dt1: DataType, dt2: DataType): Option[DataType] = {
(dt1, dt2) match {
case (StringType, _: IntegralType) => Some(LongType)
case (StringType, _: FractionalType) => Some(DoubleType)
case (StringType, NullType) => Some(StringType)
case (_: StringType, _: IntegralType) => Some(LongType)
case (_: StringType, _: FractionalType) => Some(DoubleType)
case (st: StringType, NullType) => Some(st)
// If a binary operation contains interval type and string, we can't decide which
// interval type the string should be promoted as. There are many possible interval
// types, such as year interval, month interval, day interval, hour interval, etc.
case (StringType, _: AnsiIntervalType) => None
case (StringType, a: AtomicType) => Some(a)
case (other, StringType) if other != StringType => findWiderTypeForString(StringType, other)
case (_: StringType, _: AnsiIntervalType) => None
case (_: StringType, a: AtomicType) => Some(a)
case (other, st: StringType) if !other.isInstanceOf[StringType] =>
findWiderTypeForString(st, other)
case _ => None
}
}
@@ -182,7 +184,7 @@ object AnsiTypeCoercion extends TypeCoercionBase {

// If a function expects a StringType, no StringType instance should be implicitly cast to
// StringType with a collation that's not accepted (aka. lockdown unsupported collations).
case (_: StringType, StringType) => None
case (_: StringType, _: StringType) => None
case (_: StringType, _: StringTypeCollated) => None

// If a function expects integral type, fractional input is not allowed.
@@ -191,7 +193,7 @@ object AnsiTypeCoercion extends TypeCoercionBase {
// Ideally the implicit cast rule should be the same as `Cast.canANSIStoreAssign` so that it's
// consistent with table insertion. To avoid breaking too many existing Spark SQL queries,
// we make the system to allow implicitly converting String type as other primitive types.
case (StringType, a @ (_: AtomicType | NumericType | DecimalType | AnyTimestampType)) =>
case (_: StringType, a @ (_: AtomicType | NumericType | DecimalType | AnyTimestampType)) =>
Some(a.defaultConcreteType)

// When the target type is `TypeCollection`, there is another branch to find the
@@ -0,0 +1,131 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.spark.sql.catalyst.analysis

import javax.annotation.Nullable

import scala.annotation.tailrec

import org.apache.spark.sql.catalyst.analysis.TypeCoercion.{hasStringType}
import org.apache.spark.sql.catalyst.expressions.{ArrayJoin, BinaryExpression, CaseWhen, Cast, Coalesce, Collate, Concat, ConcatWs, CreateArray, Expression, Greatest, If, In, InSubquery, Least, Substring}
import org.apache.spark.sql.errors.QueryCompilationErrors
import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.types.{AbstractDataType, ArrayType, DataType, StringType}

object CollationTypeCasts extends TypeCoercionRule {
  override val transform: PartialFunction[Expression, Expression] = {
    case e if !e.childrenResolved => e
    case ifExpr: If =>
      ifExpr.withNewChildren(
        ifExpr.predicate +: collateToSingleType(Seq(ifExpr.trueValue, ifExpr.falseValue)))
    case caseWhenExpr: CaseWhen =>
      val newValues = collateToSingleType(
        caseWhenExpr.branches.map(b => b._2) ++ caseWhenExpr.elseValue)
      caseWhenExpr.withNewChildren(
        interleave(Seq.empty, caseWhenExpr.branches.map(b => b._1), newValues))

Contributor

The code looks a bit complicated now. Can we follow the existing rule?

  object CaseWhenCoercion extends TypeCoercionRule {
    override val transform: PartialFunction[Expression, Expression] = {
      case c: CaseWhen if c.childrenResolved && !haveSameType(c.inputTypesForMerging) =>
        val maybeCommonType = findWiderCommonType(c.inputTypesForMerging)
        maybeCommonType.map { commonType =>
          val newBranches = c.branches.map { case (condition, value) =>
            (condition, castIfNotSameType(value, commonType))
          }
          val newElseValue = c.elseValue.map(castIfNotSameType(_, commonType))
          CaseWhen(newBranches, newElseValue)
        }.getOrElse(c)
    }
  }

Contributor

Actually, this exposes some gaps in this new rule:

  1. We should add a trigger condition and only enter the branch if !haveSameType(...)
  2. We should not blindly add cast but use castIfNotSameType
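
To illustrate the two points above, here is a minimal Python model, not Spark code. `StringType`, `have_same_type`, `cast_if_not_same_type`, and `coerce` are hypothetical stand-ins named after the Scala helpers mentioned in the comment, and an expression is modeled as a plain `(name, type)` pair. The sketch only shows the shape of the idea: the rule fires only when the inputs disagree, and a cast is added only to the children whose type actually differs from the target.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class StringType:
    """Hypothetical stand-in for a collated string type."""
    collation_id: int = 0


# An "expression" is modeled as (name, type); a cast wraps the name.
Expr = Tuple[str, StringType]


def have_same_type(types: List[StringType]) -> bool:
    """True when there is nothing to coerce (zero or one distinct type)."""
    return len(set(types)) <= 1


def cast_if_not_same_type(expr: Expr, target: StringType) -> Expr:
    """Wrap expr in a cast only when its type differs from the target."""
    name, dtype = expr
    if dtype == target:
        return expr
    return (f"cast({name})", target)


def coerce(children: List[Expr], target: StringType) -> List[Expr]:
    """Apply the rule only if the trigger condition holds."""
    if have_same_type([dtype for _, dtype in children]):
        return children  # rule does not fire; the input is returned untouched
    return [cast_if_not_same_type(child, target) for child in children]
```

With this shape, children that already agree are returned as the same list object, so repeated application of the rule cannot keep stacking casts.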

Contributor Author

We actually do not cast blindly in CollationTypeCasts. We only cast when the collations differ; all other cases fall through to the null branch in castStringType. It may be better to keep this check internal to the rule, since we will add a priority flag and will later need to handle casting of priorities in this rule as well. I am adding haveSameType for now so that the rule checks the input types, but I will change this code in the follow-up PR to include priorities in the internal implementation of the haveSameType check.

    case substrExpr: Substring =>
      // This case is necessary for changing Substring input to implicit collation
      substrExpr.withNewChildren(
        collateToSingleType(Seq(substrExpr.str)) :+ substrExpr.pos :+ substrExpr.len)

Contributor

I don't get it. Why do we find the common collation for a single type?

Contributor Author

For now we do not need it. I can revert this change and leave it for default collation handling. The reason it was there: if we add a flag for collation priority, we will need to change the priority of the input to implicit if it was default before.

    case otherExpr @ (
      _: In | _: InSubquery | _: CreateArray | _: ArrayJoin | _: Concat | _: Greatest | _: Least |
      _: Coalesce | _: BinaryExpression | _: ConcatWs) =>
      val newChildren = collateToSingleType(otherExpr.children)
      otherExpr.withNewChildren(newChildren)
  }

  /**
   * Extracts StringTypes from filtered hasStringType
   */
  @tailrec
  private def extractStringType(dt: DataType): StringType = dt match {
    case st: StringType => st
    case ArrayType(et, _) => extractStringType(et)
  }

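  /**
   * Casts the given expression to the collated StringType `st`, but only if the
   * expression's type is a StringType (or an ArrayType of StringType) whose
   * collation differs from that of `st`.
   * @param expr expression to cast
   * @param st target collated StringType
   * @return the cast expression, or None if no cast is needed
   */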
  def castStringType(expr: Expression, st: StringType): Option[Expression] =
    castStringType(expr.dataType, st).map { dt => Cast(expr, dt)}

  private def castStringType(inType: AbstractDataType, castType: StringType): Option[DataType] = {
    @Nullable val ret: DataType = inType match {
      case st: StringType if st.collationId != castType.collationId => castType
      case ArrayType(arrType, nullable) =>

Contributor

I disagree with special-casing the array type. The code looks broken: it assumes the children of the given expression can have both string type and array-of-string type, then tries to find a common collation between the string-type child and the array element. This makes no sense without knowing the semantics of the given expression.

Contributor Author

A simple example is ConcatWs. It can take ArrayType(StringType, _) for the input strings and StringType for the separator. Which collation do we want in that case? We need to cast the ArrayType to the proper collation if the separator is explicit.

Contributor

Then please match the ConcatWs expression explicitly to handle this case. What I disagree with is doing this blindly for all expressions.
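
One way to picture the request above is an opt-in list: array element types are rewritten only for expressions known to mix strings and arrays of strings. The following is a rough pure-Python sketch under that assumption, not the PR's implementation. `ARRAY_AWARE`, `target_type`, and `coerce_child_types` are hypothetical names, the small type classes are simplified models of the Spark types, and expressions are identified by name only.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class StringType:
    collation_id: int = 0


@dataclass(frozen=True)
class ArrayType:
    element_type: "DataType"
    contains_null: bool = True


@dataclass(frozen=True)
class IntegerType:
    pass


DataType = Union[StringType, ArrayType, IntegerType]

# Hypothetical opt-in list of expressions whose array children hold strings.
ARRAY_AWARE = frozenset({"ConcatWs"})


def target_type(in_type: DataType, target: StringType,
                recurse_into_arrays: bool) -> Optional[DataType]:
    """Return the type to cast to, or None when no cast is needed."""
    if isinstance(in_type, StringType):
        return target if in_type.collation_id != target.collation_id else None
    if recurse_into_arrays and isinstance(in_type, ArrayType):
        inner = target_type(in_type.element_type, target, recurse_into_arrays)
        return None if inner is None else ArrayType(inner, in_type.contains_null)
    return None


def coerce_child_types(expr_name: str, child_types: List[DataType],
                       target: StringType) -> List[DataType]:
    """Rewrite child types; arrays are touched only for opted-in expressions."""
    recurse = expr_name in ARRAY_AWARE
    result = []
    for child in child_types:
        new_type = target_type(child, target, recurse)
        result.append(child if new_type is None else new_type)
    return result
```

In this model, an array child of any expression outside the opt-in list is left alone, which is the behavior the reviewer asks for.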

        castStringType(arrType, castType).map(ArrayType(_, nullable)).orNull
      case _ => null
    }
    Option(ret)
  }

  /**
   * Collates input expressions to a single collation.
   */
  def collateToSingleType(exprs: Seq[Expression]): Seq[Expression] = {
    val st = getOutputCollation(exprs)

    exprs.map(e => castStringType(e, st).getOrElse(e))
  }

  /**
   * Based on the data types of the input expressions this method determines
   * a collation type which the output will have. This function accepts Seq of
   * any expressions, but will only be affected by collated StringTypes or
   * complex DataTypes with collated StringTypes (e.g. ArrayType)
   */
  def getOutputCollation(expr: Seq[Expression]): StringType = {
    val explicitTypes = expr.filter(_.isInstanceOf[Collate])
      .map(_.dataType.asInstanceOf[StringType].collationId)
      .distinct

    explicitTypes.size match {
      // We have 1 explicit collation
      case 1 => StringType(explicitTypes.head)
      // Multiple explicit collations occurred
      case size if size > 1 =>
        throw QueryCompilationErrors
          .explicitCollationMismatchError(
            explicitTypes.map(t => StringType(t).typeName)
          )
      // Only implicit or default collations present
      case 0 =>
        val implicitTypes = expr.map(_.dataType)
          .filter(hasStringType)
          .map(extractStringType)
          .filter(dt => dt.collationId != SQLConf.get.defaultStringType.collationId)
          .distinctBy(_.collationId)

        if (implicitTypes.length > 1) {
          throw QueryCompilationErrors.implicitCollationMismatchError()
        }
        else {
          implicitTypes.headOption.getOrElse(SQLConf.get.defaultStringType)
        }
    }
  }

  @tailrec
  final def interleave[A](base: Seq[A], a: Seq[A], b: Seq[A]): Seq[A] = a match {
    case elt :: aTail => interleave(base :+ elt, b, aTail)
    case _ => base ++ b
  }
}
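
The resolution order in `getOutputCollation` above (a single explicit collation wins, then a single implicit non-default collation, then the default) can be modeled in a few lines of plain Python. This is only a sketch for reading the diff, not Spark code. `Child`, `CollationMismatch`, and `output_collation` are hypothetical names, non-string children are simply left out of the input, and the default collation id is assumed to be 0.

```python
from typing import List, NamedTuple

DEFAULT_COLLATION_ID = 0  # assumption: id of the session default collation


class CollationMismatch(Exception):
    """Models the COLLATION_MISMATCH error class (EXPLICIT / IMPLICIT)."""


class Child(NamedTuple):
    collation_id: int
    explicit: bool = False  # True when the child is a COLLATE expression


def output_collation(children: List[Child],
                     default_id: int = DEFAULT_COLLATION_ID) -> int:
    # Distinct explicit collation ids, in first-seen order.
    explicit = list(dict.fromkeys(c.collation_id for c in children if c.explicit))
    if len(explicit) == 1:
        return explicit[0]
    if len(explicit) > 1:
        raise CollationMismatch(f"EXPLICIT: {explicit}")

    # No explicit collation: consider only non-default (implicit) ones.
    implicit = list(dict.fromkeys(
        c.collation_id for c in children if c.collation_id != default_id))
    if len(implicit) > 1:
        raise CollationMismatch("IMPLICIT")
    return implicit[0] if implicit else default_id
```

For example, a default-collated child combined with a child of collation id 2 resolves to 2, while two different non-default implicit collations raise the IMPLICIT mismatch.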