-
Notifications
You must be signed in to change notification settings - Fork 28.3k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[SPARK-47101][SQL] Allow comma to be used in top-level column names a…
…nd remove check nested type definition in `HiveExternalCatalog.verifyDataSchema` ### What changes were proposed in this pull request? > In Hive 0.13 and later, column names can contain any [Unicode](http://en.wikipedia.org/wiki/List_of_Unicode_characters) character (see [HIVE-6013](https://issues.apache.org/jira/browse/HIVE-6013)), however, dot (.) and colon (:) yield errors on querying, so they are disallowed in Hive 1.2.0 (see [HIVE-10120](https://issues.apache.org/jira/browse/HIVE-10120)). Any column name that is specified within backticks (`) is treated literally. Within a backtick string, use double backticks (``) to represent a backtick character. Backtick quotation also enables the use of reserved keywords for table and column identifiers. According to Hive Doc, the column names have the flexibility to contain any character from the Unicode set. This PR makes HiveExternalCatalog.verifyDataSchema: - Allow comma to be used in top-level column names - remove check invalid characters in nested type definition for hard-coded ",:;", which turns out to be incomplete. for example, "^%", etc., are not allowed. They are all delayed to Hive API calls instead. ### Why are the changes needed? improvement ### Does this PR introduce _any_ user-facing change? yes, some special characters are now allowed and errors for some invalid characters now throw Spark Errors instead of Hive Meta Errors ### How was this patch tested? new tests ### Was this patch authored or co-authored using generative AI tooling? no Closes #45180 from yaooqinn/SPARK-47101. Authored-by: Kent Yao <yao@apache.org> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
- Loading branch information
1 parent
9b6c84f
commit 65fe9ef
Showing
5 changed files
with
33 additions
and
125 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters