It would be useful to support collect_set on struct types in a reduction context (an aggregation with no GROUP BY keys).
For example:
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._
val data = Seq(
  Row("Adam", Map("hair" -> "black", "eye" -> "black"), Row("address", Map("state" -> "CA"), Map("city" -> "santa clara"))),
  Row("Bob", Map("hair" -> "red", "eye" -> "red"), Row("address", Map("state" -> "GA"), Map("city" -> "abc"))),
  Row("Cathy", Map("hair" -> "blue", "eye" -> "blue"), Row("address", Map("state" -> "NC"), Map("city" -> "xyz")))
)
val mapType = DataTypes.createMapType(StringType,StringType)
val schema = new StructType().add("name",StringType).add("properties", mapType).add("prop2", new StructType().add("propname",StringType).add("address", mapType).add("address2", mapType))
val mapTypeDF = spark.createDataFrame(spark.sparkContext.parallelize(data),schema)
mapTypeDF.write.format("parquet").mode("overwrite").save("/tmp/testparquet")
val df1 = spark.read.parquet("/tmp/testparquet")
df1.createOrReplaceTempView("df1")
df1.printSchema
spark.sql("SELECT collect_set(struct(name,name)) FROM df1").collect()
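To make "reduction context" concrete, here is a minimal sketch that contrasts the query above with a grouped form of the same aggregate. It assumes the same spark-shell session and the df1 temp view registered above. The AS names alias and the choice of name as the grouping key are illustrative only and are not part of the original report, and the grouped query is shown purely for contrast, with no claim about whether it runs on the GPU.

```scala
// Assumes `spark` and the `df1` temp view from the example above.

// Reduction context: no GROUP BY, so the whole table collapses into one output row.
val reduced = spark.sql(
  "SELECT collect_set(struct(name, name)) AS names FROM df1")

// Group-by context (for contrast only): one output row per distinct name.
// The grouping key is a hypothetical choice made for illustration.
val grouped = spark.sql(
  "SELECT name, collect_set(struct(name, name)) AS names FROM df1 GROUP BY name")

reduced.show(false)
grouped.show(false)
```

With the three sample rows above, the reduction yields a single row holding all distinct structs, while the grouped form yields one row per name.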
Not-supported messages:
!Exec <ObjectHashAggregateExec> cannot run on GPU because not all expressions can be replaced. The data type of following expressions will be converted in GPU runtime: buf#199: Converted BinaryType to ArrayType(StructType(StructField(name,StringType,true), StructField(name,StringType,true)),false)
@Expression <AggregateExpression> partial_collect_set(struct(name, name#182, name, name#182), 0, 0) AS buf#199 could run on GPU
!Expression <CollectSet> collect_set(struct(name, name#182, name, name#182), 0, 0) cannot run on GPU because input expression CreateNamedStruct struct(name, name#182, name, name#182) (StructType(StructField(name,StringType,true), StructField(name,StringType,true)) is not supported); expression CollectSet collect_set(struct(name, name#182, name, name#182), 0, 0) produces an unsupported type ArrayType(StructType(StructField(name,StringType,true), StructField(name,StringType,true)),false)