Commit

update doc
maryannxue committed Oct 19, 2018
1 parent ec368ac commit 84cb456
Showing 1 changed file with 0 additions and 1 deletion.
1 change: 0 additions & 1 deletion docs/sql-programming-guide.md
@@ -1973,7 +1973,6 @@ working with timestamps in `pandas_udf`s to get the best performance, see
- Since Spark 2.4, empty strings are saved as quoted empty strings `""`. In version 2.3 and earlier, empty strings are treated the same as `null` values and produce no characters in saved CSV files. For example, the row `"a", null, "", 1` was written as `a,,,1`. Since Spark 2.4, the same row is saved as `a,,"",1`. To restore the previous behavior, set the CSV option `emptyValue` to an empty (unquoted) string.
- Since Spark 2.4, the LOAD DATA command supports the wildcards `?` and `*`, which match any single character and zero or more characters, respectively. Example: `LOAD DATA INPATH '/tmp/folder*/'` or `LOAD DATA INPATH '/tmp/part-?'`. Special characters such as spaces now also work in paths. Example: `LOAD DATA INPATH '/tmp/folder name/'`.
- In Spark version 2.3 and earlier, HAVING without GROUP BY is treated as WHERE. This means that `SELECT 1 FROM range(10) HAVING true` is executed as `SELECT 1 FROM range(10) WHERE true` and returns 10 rows. This violates the SQL standard and has been fixed in Spark 2.4. Since Spark 2.4, HAVING without GROUP BY is treated as a global aggregate, which means that `SELECT 1 FROM range(10) HAVING true` returns only one row. To restore the previous behavior, set `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` to `true`.
- In Spark 2.4, the method `def udf(f: AnyRef, dataType: DataType): UserDefinedFunction` and the legacy `ScalaUDF` constructor `ScalaUDF(function: AnyRef, dataType: DataType, children: Seq[Expression], inputTypes: Seq[DataType], udfName: Option[String])` are not properly supported with the Scala 2.12 compiler: a null input of a Scala primitive type is converted to the type's corresponding default value in the UDF. Both still work with Scala 2.11, and all other UDF methods work with both Scala 2.11 and Scala 2.12.
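
The CSV note in the list above can be illustrated with a small, Spark-free model. This is only a sketch of the documented behavior, not Spark's CSV writer: the helper `render_csv_row` and its `empty_value` parameter are hypothetical names chosen to mirror the `emptyValue` option, and quoting or escaping of ordinary values is ignored.

```python
def render_csv_row(row, empty_value):
    """Render one row as a CSV line in a simplified model (not Spark's writer).

    None (null) always becomes an empty field; an empty string becomes
    `empty_value`. Quoting and escaping of other values are ignored.
    """
    fields = []
    for value in row:
        if value is None:
            fields.append("")           # null: no characters at all
        elif value == "":
            fields.append(empty_value)  # empty string: depends on the setting
        else:
            fields.append(str(value))
    return ",".join(fields)


row = ("a", None, "", 1)

# Spark 2.3-style: the empty string is indistinguishable from null.
spark_2_3_line = render_csv_row(row, empty_value="")
# Spark 2.4-style default: the empty string is written as a quoted "".
spark_2_4_line = render_csv_row(row, empty_value='""')

print(spark_2_3_line)  # a,,,1
print(spark_2_4_line)  # a,,"",1
```

In this model, passing an unquoted empty string as `empty_value` reproduces the 2.3 output, which corresponds to what the note describes for restoring the previous behavior.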

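The wildcard semantics described for LOAD DATA (`?` matches exactly one character, `*` matches zero or more) can be tried out with Python's standard `fnmatch` module as a stand-in. This is an assumption-laden sketch: the candidate paths are made up for illustration, and `fnmatch` is not the matcher Spark uses, so details may differ (for example, `fnmatch`'s `*` also matches `/`).

```python
from fnmatch import fnmatchcase


def matching_paths(paths, pattern):
    """Return the paths that match a shell-style pattern (case-sensitive)."""
    return [p for p in paths if fnmatchcase(p, pattern)]


# Hypothetical paths, chosen only to exercise the two wildcards.
candidates = [
    "/tmp/part-0",
    "/tmp/part-1",
    "/tmp/part-10",
    "/tmp/folder",
    "/tmp/folder1",
    "/tmp/folder name",
]

# `?` needs exactly one character, so "/tmp/part-10" is excluded.
single_char = matching_paths(candidates, "/tmp/part-?")
# `*` accepts zero or more characters, so "/tmp/folder" itself matches.
any_suffix = matching_paths(candidates, "/tmp/folder*")

print(single_char)  # ['/tmp/part-0', '/tmp/part-1']
print(any_suffix)   # ['/tmp/folder', '/tmp/folder1', '/tmp/folder name']
```
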
## Upgrading From Spark SQL 2.3.0 to 2.3.1 and above
