Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] JSQLParser Version : 4.6 : NOT support Chinese characters in <S_IDENTIFIER> #1741

Closed
moonfruit opened this issue Mar 7, 2023 · 9 comments · Fixed by #1740
Closed
Assignees
Labels

Comments

@moonfruit
Copy link
Contributor

moonfruit commented Mar 7, 2023

Always check against the Latest SNAPSHOT of JSQLParser and the Syntax Diagram

Failing SQL Feature:

  • Before 4.6, <S_IDENTIFIER> support Chinese characters, which also support by various databases. But start from 4.6 this is not supported.

SQL Example:

  • Simplified Query Example, focusing on the failing feature. This is supported before 4.6.
    select c as 中文 from t
    select * from t where 中文 = 'abc'

Software Information:

  • JSqlParser version 4.6
  • Database Oracle, MySQL, PostgreSQL

Tips:

Please write in English and avoid Screenshots (as we can't copy and paste content from it).
Try your example online with the latest JSQLParser and share the link in the error report.
Do provide Links or References to the specific Grammar and Syntax you are trying to use.

@moonfruit moonfruit changed the title [BUG] JSQLParser Version: 4.6 NOT support Chinese characters in <S_IDENTIFIER> [BUG] JSQLParser Version : 4.6 : NOT support Chinese characters in <S_IDENTIFIER> Mar 7, 2023
@manticore-projects
Copy link
Contributor

Greetings.

Thanks for reporting but unfortunately we may need a lot of help in resolving it (since at least I d not understand the Chinese characters).

So what has been implemented according to the SQL2016 Standard is:
a) Unicode specification
b) for Letters and Part-Letters and Digits

Maybe in a first step, can you tell me what the Unicode transcription of the Label above (meaning the Unicodes for each used character).

@manticore-projects
Copy link
Contributor

What has been implemented is this:

// Unicode characters and categories are defined here: https://www.unicode.org/Public/UNIDATA/UnicodeData.txt
// SQL:2016 states:
// An <identifier start> is any character in the Unicode General Category classes “Lu”, “Ll”, “Lt”, “Lm”, “Lo”, or “Nl”.
// An <identifier extend> is U+00B7, “Middle Dot”, or any character in the Unicode General Category classes “Mn”, “Mc”, “Nd”, “Pc”, or “Cf”.

// unicode_identifier_start
| <#UnicodeIdentifierStart: ("\u00B7" | <Ll> | <Lm> | <Lo> | <Lt> | <Lu> | <Nl>) >
| <#Ll: ["a"-"z","µ","ß"-"ö","ø"-"ÿ","ā","ă","ą","ć","ĉ","ċ","č","ď","đ","ē","ĕ","ė","ę","ě","ĝ","ğ","ġ","ģ","ĥ","ħ","ĩ","ī","ĭ","į","ı","ij","ĵ","ķ"-"ĸ","ĺ","ļ","ľ","ŀ","ł","ń","ņ","ň"-"ʼn","ŋ","ō","ŏ","ő","œ","ŕ","ŗ","ř","ś","ŝ","ş","š","ţ","ť","ŧ","ũ","ū","ŭ","ů","ű","ų","ŵ","ŷ","ź","ż","ž"-"ƀ","ƃ","ƅ","ƈ","ƌ"-"ƍ","ƒ","ƕ","ƙ"-"ƛ","ƞ","ơ","ƣ","ƥ","ƨ","ƪ"-"ƫ","ƭ","ư","ƴ","ƶ","ƹ"-"ƺ","ƽ"-"ƿ","dž","lj","nj","ǎ","ǐ","ǒ","ǔ","ǖ","ǘ","ǚ","ǜ"-"ǝ","ǟ","ǡ","ǣ","ǥ","ǧ","ǩ","ǫ","ǭ","ǯ"-"ǰ","dz","ǵ","ǹ","ǻ","ǽ","ǿ","ȁ","ȃ","ȅ","ȇ","ȉ","ȋ","ȍ","ȏ","ȑ","ȓ","ȕ","ȗ","ș","ț","ȝ","ȟ","ȡ","ȣ","ȥ","ȧ","ȩ","ȫ","ȭ","ȯ","ȱ","ȳ"-"ȹ","ȼ","ȿ"-"ɀ","ɂ","ɇ","ɉ","ɋ","ɍ","ɏ"-"ʓ","ʕ"-"ʯ","ͱ","ͳ","ͷ","ͻ"-"ͽ","ΐ","ά"-"ώ","ϐ"-"ϑ","ϕ"-"ϗ","ϙ","ϛ","ϝ","ϟ","ϡ","ϣ","ϥ","ϧ","ϩ","ϫ","ϭ","ϯ"-"ϳ","ϵ","ϸ","ϻ"-"ϼ","а"-"џ","ѡ","ѣ","ѥ","ѧ","ѩ","ѫ","ѭ","ѯ","ѱ","ѳ","ѵ","ѷ","ѹ","ѻ","ѽ","ѿ","ҁ","ҋ","ҍ","ҏ","ґ","ғ","ҕ","җ","ҙ","қ","ҝ","ҟ","ҡ","ң","ҥ","ҧ","ҩ","ҫ","ҭ","ү","ұ","ҳ","ҵ","ҷ","ҹ","һ","ҽ","ҿ","ӂ","ӄ","ӆ","ӈ","ӊ","ӌ","ӎ"-"ӏ","ӑ","ӓ","ӕ","ӗ","ә","ӛ","ӝ","ӟ","ӡ","ӣ","ӥ","ӧ","ө","ӫ","ӭ","ӯ","ӱ","ӳ","ӵ","ӷ","ӹ","ӻ","ӽ","ӿ","ԁ","ԃ","ԅ","ԇ","ԉ","ԋ","ԍ","ԏ","ԑ","ԓ","ԕ","ԗ","ԙ","ԛ","ԝ","ԟ","ԡ","ԣ","ԥ","ԧ","ԩ","ԫ","ԭ","ԯ","ՠ"-"ֈ","ა"-"ჺ","ჽ"-"ჿ","ᏸ"-"ᏽ","ᲀ"-"ᲈ","ᴀ"-"ᴫ","ᵫ"-"ᵷ","ᵹ"-"ᶚ","ḁ","ḃ","ḅ","ḇ","ḉ","ḋ","ḍ","ḏ","ḑ","ḓ","ḕ","ḗ","ḙ","ḛ","ḝ","ḟ","ḡ","ḣ","ḥ","ḧ","ḩ","ḫ","ḭ","ḯ","ḱ","ḳ","ḵ","ḷ","ḹ","ḻ","ḽ","ḿ","ṁ","ṃ","ṅ","ṇ","ṉ","ṋ","ṍ","ṏ","ṑ","ṓ","ṕ","ṗ","ṙ","ṛ","ṝ","ṟ","ṡ","ṣ","ṥ","ṧ","ṩ","ṫ","ṭ","ṯ","ṱ","ṳ","ṵ","ṷ","ṹ","ṻ","ṽ","ṿ","ẁ","ẃ","ẅ","ẇ","ẉ","ẋ","ẍ","ẏ","ẑ","ẓ","ẕ"-"ẝ","ẟ","ạ","ả","ấ","ầ","ẩ","ẫ","ậ","ắ","ằ","ẳ","ẵ","ặ","ẹ","ẻ","ẽ","ế","ề","ể","ễ","ệ","ỉ","ị","ọ","ỏ","ố","ồ","ổ","ỗ","ộ","ớ","ờ","ở","ỡ","ợ","ụ","ủ","ứ","ừ","ử","ữ","ự","ỳ","ỵ","ỷ","ỹ","ỻ","ỽ","ỿ"-"ἇ","ἐ"-"ἕ","ἠ"-"ἧ","ἰ"-"ἷ","ὀ"-"ὅ","ὐ"-"ὗ","ὠ"-"ὧ","ὰ"-"ώ","ᾀ"-"ᾇ","ᾐ"-"ᾗ","ᾠ"-"ᾧ","ᾰ"-"ᾴ","ᾶ"-"ᾷ","ι","ῂ"-"ῄ","ῆ"-"ῇ","ῐ"-"ΐ","ῖ"-"ῗ","ῠ"-"ῧ","ῲ"-"ῴ","ῶ"-"ῷ","ℊ","ℎ"-"ℏ","ℓ","ℯ","ℴ","ℹ","ℼ"-"ℽ","ⅆ"-"ⅉ","ⅎ","ↄ","ⰰ"-"ⱟ","ⱡ","ⱥ"-"ⱦ","ⱨ","ⱪ","ⱬ","ⱱ","ⱳ"-"ⱴ","ⱶ"-"ⱻ","ⲁ","ⲃ","ⲅ","ⲇ","ⲉ","ⲋ","ⲍ","ⲏ","ⲑ","ⲓ","ⲕ","ⲗ","ⲙ","ⲛ","ⲝ","ⲟ","ⲡ","ⲣ","ⲥ","ⲧ","ⲩ","ⲫ","ⲭ","ⲯ","ⲱ","ⲳ","ⲵ","ⲷ","ⲹ","ⲻ","ⲽ","ⲿ","ⳁ","ⳃ","ⳅ","ⳇ","ⳉ","ⳋ","ⳍ","ⳏ","ⳑ","ⳓ","ⳕ","ⳗ","ⳙ","ⳛ","ⳝ","ⳟ","ⳡ","ⳣ"-"ⳤ","ⳬ","ⳮ","ⳳ","ⴀ"-"ⴥ","ⴧ","ⴭ","ꙁ","ꙃ","ꙅ","ꙇ","ꙉ","ꙋ","ꙍ","ꙏ","ꙑ","ꙓ","ꙕ","ꙗ","ꙙ","ꙛ","ꙝ","ꙟ","ꙡ","ꙣ","ꙥ","ꙧ","ꙩ","ꙫ","ꙭ","ꚁ","ꚃ","ꚅ","ꚇ","ꚉ","ꚋ","ꚍ","ꚏ","ꚑ","ꚓ","ꚕ","ꚗ","ꚙ","ꚛ","ꜣ","ꜥ","ꜧ","ꜩ","ꜫ","ꜭ","ꜯ"-"ꜱ","ꜳ","ꜵ","ꜷ","ꜹ","ꜻ","ꜽ","ꜿ","ꝁ","ꝃ","ꝅ","ꝇ","ꝉ","ꝋ","ꝍ","ꝏ","ꝑ","ꝓ","ꝕ","ꝗ","ꝙ","ꝛ","ꝝ","ꝟ","ꝡ","ꝣ","ꝥ","ꝧ","ꝩ","ꝫ","ꝭ","ꝯ","ꝱ"-"ꝸ","ꝺ","ꝼ","ꝿ","ꞁ","ꞃ","ꞅ","ꞇ","ꞌ","ꞎ","ꞑ","ꞓ"-"ꞕ","ꞗ","ꞙ","ꞛ","ꞝ","ꞟ","ꞡ","ꞣ","ꞥ","ꞧ","ꞩ","ꞯ","ꞵ","ꞷ","ꞹ","ꞻ","ꞽ","ꞿ","ꟁ","ꟃ","ꟈ","ꟊ","ꟑ","ꟓ","ꟕ","ꟗ","ꟙ","ꟶ","ꟺ","ꬰ"-"ꭚ","ꭠ"-"ꭨ","ꭰ"-"ꮿ","ff"-"st","ﬓ"-"ﬗ","a"-"z"]>
| <#Lm: ["ʰ"-"ˁ","ˆ"-"ˑ","ˠ"-"ˤ","ˬ","ˮ","ʹ","ͺ","ՙ","ـ","ۥ"-"ۦ","ߴ"-"ߵ","ߺ","ࠚ","ࠤ","ࠨ","ࣉ","ॱ","ๆ","ໆ","ჼ","ៗ","ᡃ","ᪧ","ᱸ"-"ᱽ","ᴬ"-"ᵪ","ᵸ","ᶛ"-"ᶿ","ⁱ","ⁿ","ₐ"-"ₜ","ⱼ"-"ⱽ","ⵯ","ⸯ","々","〱"-"〵","〻","ゝ"-"ゞ","ー"-"ヾ","ꀕ","ꓸ"-"ꓽ","ꘌ","ꙿ","ꚜ"-"ꚝ","ꜗ"-"ꜟ","ꝰ","ꞈ","ꟲ"-"ꟴ","ꟸ"-"ꟹ","ꧏ","ꧦ","ꩰ","ꫝ","ꫳ"-"ꫴ","ꭜ"-"ꭟ","ꭩ","ー","゙"-"゚"]>
| <#Lo: ["ª","º","ƻ","ǀ"-"ǃ","ʔ","א"-"ת","ׯ"-"ײ","ؠ"-"ؿ","ف"-"ي","ٮ"-"ٯ","ٱ"-"ۓ","ە","ۮ"-"ۯ","ۺ"-"ۼ","ۿ","ܐ","ܒ"-"ܯ","ݍ"-"ޥ","ޱ","ߊ"-"ߪ","ࠀ"-"ࠕ","ࡀ"-"ࡘ","ࡠ"-"ࡪ","ࡰ"-"ࢇ","ࢉ"-"ࢎ","ࢠ"-"ࣈ","ऄ"-"ह","ऽ","ॐ","क़"-"ॡ","ॲ"-"ঀ","অ"-"ঌ","এ"-"ঐ","ও"-"ন","প"-"র","ল","শ"-"হ","ঽ","ৎ","ড়"-"ঢ়","য়"-"ৡ","ৰ"-"ৱ","ৼ","ਅ"-"ਊ","ਏ"-"ਐ","ਓ"-"ਨ","ਪ"-"ਰ","ਲ"-"ਲ਼","ਵ"-"ਸ਼","ਸ"-"ਹ","ਖ਼"-"ੜ","ਫ਼","ੲ"-"ੴ","અ"-"ઍ","એ"-"ઑ","ઓ"-"ન","પ"-"ર","લ"-"ળ","વ"-"હ","ઽ","ૐ","ૠ"-"ૡ","ૹ","ଅ"-"ଌ","ଏ"-"ଐ","ଓ"-"ନ","ପ"-"ର","ଲ"-"ଳ","ଵ"-"ହ","ଽ","ଡ଼"-"ଢ଼","ୟ"-"ୡ","ୱ","ஃ","அ"-"ஊ","எ"-"ஐ","ஒ"-"க","ங"-"ச","ஜ","ஞ"-"ட","ண"-"த","ந"-"ப","ம"-"ஹ","ௐ","అ"-"ఌ","ఎ"-"ఐ","ఒ"-"న","ప"-"హ","ఽ","ౘ"-"ౚ","ౝ","ౠ"-"ౡ","ಀ","ಅ"-"ಌ","ಎ"-"ಐ","ಒ"-"ನ","ಪ"-"ಳ","ವ"-"ಹ","ಽ","ೝ"-"ೞ","ೠ"-"ೡ","ೱ"-"ೲ","ഄ"-"ഌ","എ"-"ഐ","ഒ"-"ഺ","ഽ","ൎ","ൔ"-"ൖ","ൟ"-"ൡ","ൺ"-"ൿ","අ"-"ඖ","ක"-"න","ඳ"-"ර","ල","ව"-"ෆ","ก"-"ะ","า"-"ำ","เ"-"ๅ","ກ"-"ຂ","ຄ","ຆ"-"ຊ","ຌ"-"ຣ","ລ","ວ"-"ະ","າ"-"ຳ","ຽ","ເ"-"ໄ","ໜ"-"ໟ","ༀ","ཀ"-"ཇ","ཉ"-"ཬ","ྈ"-"ྌ","က"-"ဪ","ဿ","ၐ"-"ၕ","ၚ"-"ၝ","ၡ","ၥ"-"ၦ","ၮ"-"ၰ","ၵ"-"ႁ","ႎ","ᄀ"-"ቈ","ቊ"-"ቍ","ቐ"-"ቖ","ቘ","ቚ"-"ቝ","በ"-"ኈ","ኊ"-"ኍ","ነ"-"ኰ","ኲ"-"ኵ","ኸ"-"ኾ","ዀ","ዂ"-"ዅ","ወ"-"ዖ","ዘ"-"ጐ","ጒ"-"ጕ","ጘ"-"ፚ","ᎀ"-"ᎏ","ᐁ"-"ᙬ","ᙯ"-"ᙿ","ᚁ"-"ᚚ","ᚠ"-"ᛪ","ᛱ"-"ᛸ","ᜀ"-"ᜑ","ᜟ"-"ᜱ","ᝀ"-"ᝑ","ᝠ"-"ᝬ","ᝮ"-"ᝰ","ក"-"ឳ","ៜ","ᠠ"-"ᡂ","ᡄ"-"ᡸ","ᢀ"-"ᢄ","ᢇ"-"ᢨ","ᢪ","ᢰ"-"ᣵ","ᤀ"-"ᤞ","ᥐ"-"ᥭ","ᥰ"-"ᥴ","ᦀ"-"ᦫ","ᦰ"-"ᧉ","ᨀ"-"ᨖ","ᨠ"-"ᩔ","ᬅ"-"ᬳ","ᭅ"-"ᭌ","ᮃ"-"ᮠ","ᮮ"-"ᮯ","ᮺ"-"ᯥ","ᰀ"-"ᰣ","ᱍ"-"ᱏ","ᱚ"-"ᱷ","ᳩ"-"ᳬ","ᳮ"-"ᳳ","ᳵ"-"ᳶ","ᳺ","ℵ"-"ℸ","ⴰ"-"ⵧ","ⶀ"-"ⶖ","ⶠ"-"ⶦ","ⶨ"-"ⶮ","ⶰ"-"ⶶ","ⶸ"-"ⶾ","ⷀ"-"ⷆ","ⷈ"-"ⷎ","ⷐ"-"ⷖ","ⷘ"-"ⷞ","〆","〼","ぁ"-"ゖ","ゟ","ァ"-"ヺ","ヿ","ㄅ"-"ㄯ","ㄱ"-"ㆎ","ㆠ"-"ㆿ","ㇰ"-"ㇿ","䶿","鿿"-"ꀔ","ꀖ"-"ꒌ","ꓐ"-"ꓷ","ꔀ"-"ꘋ","ꘐ"-"ꘟ","ꘪ"-"ꘫ","ꙮ","ꚠ"-"ꛥ","ꞏ","ꟷ","ꟻ"-"ꠁ","ꠃ"-"ꠅ","ꠇ"-"ꠊ","ꠌ"-"ꠢ","ꡀ"-"ꡳ","ꢂ"-"ꢳ","ꣲ"-"ꣷ","ꣻ","ꣽ"-"ꣾ","ꤊ"-"ꤥ","ꤰ"-"ꥆ","ꥠ"-"ꥼ","ꦄ"-"ꦲ","ꧠ"-"ꧤ","ꧧ"-"ꧯ","ꧺ"-"ꧾ","ꨀ"-"ꨨ","ꩀ"-"ꩂ","ꩄ"-"ꩋ","ꩠ"-"ꩯ","ꩱ"-"ꩶ","ꩺ","ꩾ"-"ꪯ","ꪱ","ꪵ"-"ꪶ","ꪹ"-"ꪽ","ꫀ","ꫂ","ꫛ"-"ꫜ","ꫠ"-"ꫪ","ꫲ","ꬁ"-"ꬆ","ꬉ"-"ꬎ","ꬑ"-"ꬖ","ꬠ"-"ꬦ","ꬨ"-"ꬮ","ꯀ"-"ꯢ","힣","ힰ"-"ퟆ","ퟋ"-"ퟻ","豈"-"舘","並"-"龎","יִ","ײַ"-"ﬨ","שׁ"-"זּ","טּ"-"לּ","מּ","נּ"-"סּ","ףּ"-"פּ","צּ"-"ﮱ","ﯓ"-"ﴽ","ﵐ"-"ﶏ","ﶒ"-"ﷇ","ﷰ"-"ﷻ","ﹰ"-"ﹴ","ﹶ"-"ﻼ","ヲ"-"ッ","ア"-"ン","ᅠ"-"ᄒ","ᅡ"-"ᅦ","ᅧ"-"ᅬ","ᅭ"-"ᅲ","ᅳ"-"ᅵ"]>
| <#Lt: ["Dž","Lj","Nj","Dz","ᾈ"-"ᾏ","ᾘ"-"ᾟ","ᾨ"-"ᾯ","ᾼ","ῌ","ῼ"]>
| <#Lu: ["A"-"Z","À"-"Ö","Ø"-"Þ","Ā","Ă","Ą","Ć","Ĉ","Ċ","Č","Ď","Đ","Ē","Ĕ","Ė","Ę","Ě","Ĝ","Ğ","Ġ","Ģ","Ĥ","Ħ","Ĩ","Ī","Ĭ","Į","İ","IJ","Ĵ","Ķ","Ĺ","Ļ","Ľ","Ŀ","Ł","Ń","Ņ","Ň","Ŋ","Ō","Ŏ","Ő","Œ","Ŕ","Ŗ","Ř","Ś","Ŝ","Ş","Š","Ţ","Ť","Ŧ","Ũ","Ū","Ŭ","Ů","Ű","Ų","Ŵ","Ŷ","Ÿ"-"Ź","Ż","Ž","Ɓ"-"Ƃ","Ƅ","Ɔ"-"Ƈ","Ɖ"-"Ƌ","Ǝ"-"Ƒ","Ɠ"-"Ɣ","Ɩ"-"Ƙ","Ɯ"-"Ɲ","Ɵ"-"Ơ","Ƣ","Ƥ","Ʀ"-"Ƨ","Ʃ","Ƭ","Ʈ"-"Ư","Ʊ"-"Ƴ","Ƶ","Ʒ"-"Ƹ","Ƽ","DŽ","LJ","NJ","Ǎ","Ǐ","Ǒ","Ǔ","Ǖ","Ǘ","Ǚ","Ǜ","Ǟ","Ǡ","Ǣ","Ǥ","Ǧ","Ǩ","Ǫ","Ǭ","Ǯ","DZ","Ǵ","Ƕ"-"Ǹ","Ǻ","Ǽ","Ǿ","Ȁ","Ȃ","Ȅ","Ȇ","Ȉ","Ȋ","Ȍ","Ȏ","Ȑ","Ȓ","Ȕ","Ȗ","Ș","Ț","Ȝ","Ȟ","Ƞ","Ȣ","Ȥ","Ȧ","Ȩ","Ȫ","Ȭ","Ȯ","Ȱ","Ȳ","Ⱥ"-"Ȼ","Ƚ"-"Ⱦ","Ɂ","Ƀ"-"Ɇ","Ɉ","Ɋ","Ɍ","Ɏ","Ͱ","Ͳ","Ͷ","Ϳ","Ά","Έ"-"Ί","Ό","Ύ"-"Ώ","Α"-"Ρ","Σ"-"Ϋ","Ϗ","ϒ"-"ϔ","Ϙ","Ϛ","Ϝ","Ϟ","Ϡ","Ϣ","Ϥ","Ϧ","Ϩ","Ϫ","Ϭ","Ϯ","ϴ","Ϸ","Ϲ"-"Ϻ","Ͻ"-"Я","Ѡ","Ѣ","Ѥ","Ѧ","Ѩ","Ѫ","Ѭ","Ѯ","Ѱ","Ѳ","Ѵ","Ѷ","Ѹ","Ѻ","Ѽ","Ѿ","Ҁ","Ҋ","Ҍ","Ҏ","Ґ","Ғ","Ҕ","Җ","Ҙ","Қ","Ҝ","Ҟ","Ҡ","Ң","Ҥ","Ҧ","Ҩ","Ҫ","Ҭ","Ү","Ұ","Ҳ","Ҵ","Ҷ","Ҹ","Һ","Ҽ","Ҿ","Ӏ"-"Ӂ","Ӄ","Ӆ","Ӈ","Ӊ","Ӌ","Ӎ","Ӑ","Ӓ","Ӕ","Ӗ","Ә","Ӛ","Ӝ","Ӟ","Ӡ","Ӣ","Ӥ","Ӧ","Ө","Ӫ","Ӭ","Ӯ","Ӱ","Ӳ","Ӵ","Ӷ","Ӹ","Ӻ","Ӽ","Ӿ","Ԁ","Ԃ","Ԅ","Ԇ","Ԉ","Ԋ","Ԍ","Ԏ","Ԑ","Ԓ","Ԕ","Ԗ","Ԙ","Ԛ","Ԝ","Ԟ","Ԡ","Ԣ","Ԥ","Ԧ","Ԩ","Ԫ","Ԭ","Ԯ","Ա"-"Ֆ","Ⴀ"-"Ⴥ","Ⴧ","Ⴭ","Ꭰ"-"Ᏽ","Ა"-"Ჺ","Ჽ"-"Ჿ","Ḁ","Ḃ","Ḅ","Ḇ","Ḉ","Ḋ","Ḍ","Ḏ","Ḑ","Ḓ","Ḕ","Ḗ","Ḙ","Ḛ","Ḝ","Ḟ","Ḡ","Ḣ","Ḥ","Ḧ","Ḩ","Ḫ","Ḭ","Ḯ","Ḱ","Ḳ","Ḵ","Ḷ","Ḹ","Ḻ","Ḽ","Ḿ","Ṁ","Ṃ","Ṅ","Ṇ","Ṉ","Ṋ","Ṍ","Ṏ","Ṑ","Ṓ","Ṕ","Ṗ","Ṙ","Ṛ","Ṝ","Ṟ","Ṡ","Ṣ","Ṥ","Ṧ","Ṩ","Ṫ","Ṭ","Ṯ","Ṱ","Ṳ","Ṵ","Ṷ","Ṹ","Ṻ","Ṽ","Ṿ","Ẁ","Ẃ","Ẅ","Ẇ","Ẉ","Ẋ","Ẍ","Ẏ","Ẑ","Ẓ","Ẕ","ẞ","Ạ","Ả","Ấ","Ầ","Ẩ","Ẫ","Ậ","Ắ","Ằ","Ẳ","Ẵ","Ặ","Ẹ","Ẻ","Ẽ","Ế","Ề","Ể","Ễ","Ệ","Ỉ","Ị","Ọ","Ỏ","Ố","Ồ","Ổ","Ỗ","Ộ","Ớ","Ờ","Ở","Ỡ","Ợ","Ụ","Ủ","Ứ","Ừ","Ử","Ữ","Ự","Ỳ","Ỵ","Ỷ","Ỹ","Ỻ","Ỽ","Ỿ","Ἀ"-"Ἇ","Ἐ"-"Ἕ","Ἠ"-"Ἧ","Ἰ"-"Ἷ","Ὀ"-"Ὅ","Ὑ","Ὓ","Ὕ","Ὗ","Ὠ"-"Ὧ","Ᾰ"-"Ά","Ὲ"-"Ή","Ῐ"-"Ί","Ῠ"-"Ῥ","Ὸ"-"Ώ","ℂ","ℇ","ℋ"-"ℍ","ℐ"-"ℒ","ℕ","ℙ"-"ℝ","ℤ","Ω","ℨ","K"-"ℭ","ℰ"-"ℳ","ℾ"-"ℿ","ⅅ","Ↄ","Ⰰ"-"Ⱟ","Ⱡ","Ɫ"-"Ɽ","Ⱨ","Ⱪ","Ⱬ","Ɑ"-"Ɒ","Ⱳ","Ⱶ","Ȿ"-"Ⲁ","Ⲃ","Ⲅ","Ⲇ","Ⲉ","Ⲋ","Ⲍ","Ⲏ","Ⲑ","Ⲓ","Ⲕ","Ⲗ","Ⲙ","Ⲛ","Ⲝ","Ⲟ","Ⲡ","Ⲣ","Ⲥ","Ⲧ","Ⲩ","Ⲫ","Ⲭ","Ⲯ","Ⲱ","Ⲳ","Ⲵ","Ⲷ","Ⲹ","Ⲻ","Ⲽ","Ⲿ","Ⳁ","Ⳃ","Ⳅ","Ⳇ","Ⳉ","Ⳋ","Ⳍ","Ⳏ","Ⳑ","Ⳓ","Ⳕ","Ⳗ","Ⳙ","Ⳛ","Ⳝ","Ⳟ","Ⳡ","Ⳣ","Ⳬ","Ⳮ","Ⳳ","Ꙁ","Ꙃ","Ꙅ","Ꙇ","Ꙉ","Ꙋ","Ꙍ","Ꙏ","Ꙑ","Ꙓ","Ꙕ","Ꙗ","Ꙙ","Ꙛ","Ꙝ","Ꙟ","Ꙡ","Ꙣ","Ꙥ","Ꙧ","Ꙩ","Ꙫ","Ꙭ","Ꚁ","Ꚃ","Ꚅ","Ꚇ","Ꚉ","Ꚋ","Ꚍ","Ꚏ","Ꚑ","Ꚓ","Ꚕ","Ꚗ","Ꚙ","Ꚛ","Ꜣ","Ꜥ","Ꜧ","Ꜩ","Ꜫ","Ꜭ","Ꜯ","Ꜳ","Ꜵ","Ꜷ","Ꜹ","Ꜻ","Ꜽ","Ꜿ","Ꝁ","Ꝃ","Ꝅ","Ꝇ","Ꝉ","Ꝋ","Ꝍ","Ꝏ","Ꝑ","Ꝓ","Ꝕ","Ꝗ","Ꝙ","Ꝛ","Ꝝ","Ꝟ","Ꝡ","Ꝣ","Ꝥ","Ꝧ","Ꝩ","Ꝫ","Ꝭ","Ꝯ","Ꝺ","Ꝼ","Ᵹ"-"Ꝿ","Ꞁ","Ꞃ","Ꞅ","Ꞇ","Ꞌ","Ɥ","Ꞑ","Ꞓ","Ꞗ","Ꞙ","Ꞛ","Ꞝ","Ꞟ","Ꞡ","Ꞣ","Ꞥ","Ꞧ","Ꞩ","Ɦ"-"Ɪ","Ʞ"-"Ꞵ","Ꞷ","Ꞹ","Ꞻ","Ꞽ","Ꞿ","Ꟁ","Ꟃ","Ꞔ"-"Ꟈ","Ꟊ","Ꟑ","Ꟗ","Ꟙ","Ꟶ","A"-"Z"]>
| <#Nl: ["ᛮ"-"ᛰ","Ⅰ"-"ↂ","ↅ"-"ↈ","〇","〡"-"〩","〸"-"〺","ꛦ"-"ꛯ"]>

// unicode_identifier_extend
| <#UnicodeIdentifierExtend: (<Mn>|<Mc>|<Nd>|<Pc>|<Cf>)>
| <#Cf: ["­","؀"-"؅","؜","۝","܏","࢐"-"࢑","࣢","᠎","​"-"‏","‪"-"‮","⁠"-"⁤","⁦"-"","",""-""]>
| <#Mc: ["ः","ऻ","ा"-"ी","ॉ"-"ौ","ॎ"-"ॏ","ং"-"ঃ","া"-"ী","ে"-"ৈ","ো"-"ৌ","ৗ","ਃ","ਾ"-"ੀ","ઃ","ા"-"ી","ૉ","ો"-"ૌ","ଂ"-"ଃ","ା","ୀ","େ"-"ୈ","ୋ"-"ୌ","ୗ","ா"-"ி","ு"-"ூ","ெ"-"ை","ொ"-"ௌ","ௗ","ఁ"-"ః","ు"-"ౄ","ಂ"-"ಃ","ಾ","ೀ"-"ೄ","ೇ"-"ೈ","ೊ"-"ೋ","ೕ"-"ೖ","ೳ","ം"-"ഃ","ാ"-"ീ","െ"-"ൈ","ൊ"-"ൌ","ൗ","ං"-"ඃ","ා"-"ෑ","ෘ"-"ෟ","ෲ"-"ෳ","༾"-"༿","ཿ","ါ"-"ာ","ေ","း","ျ"-"ြ","ၖ"-"ၗ","ၢ"-"ၤ","ၧ"-"ၭ","ႃ"-"ႄ","ႇ"-"ႌ","ႏ","ႚ"-"ႜ","᜕","᜴","ា","ើ"-"ៅ","ះ"-"ៈ","ᤣ"-"ᤦ","ᤩ"-"ᤫ","ᤰ"-"ᤱ","ᤳ"-"ᤸ","ᨙ"-"ᨚ","ᩕ","ᩗ","ᩡ","ᩣ"-"ᩤ","ᩭ"-"ᩲ","ᬄ","ᬵ","ᬻ","ᬽ"-"ᭁ","ᭃ"-"᭄","ᮂ","ᮡ","ᮦ"-"ᮧ","᮪","ᯧ","ᯪ"-"ᯬ","ᯮ","᯲"-"᯳","ᰤ"-"ᰫ","ᰴ"-"ᰵ","᳡","᳷","〮"-"〯","ꠣ"-"ꠤ","ꠧ","ꢀ"-"ꢁ","ꢴ"-"ꣃ","ꥒ"-"꥓","ꦃ","ꦴ"-"ꦵ","ꦺ"-"ꦻ","ꦾ"-"꧀","ꨯ"-"ꨰ","ꨳ"-"ꨴ","ꩍ","ꩻ","ꩽ","ꫫ","ꫮ"-"ꫯ","ꫵ","ꯣ"-"ꯤ","ꯦ"-"ꯧ","ꯩ"-"ꯪ","꯬"]>
| <#Mn: ["̀"-"ͯ","҃"-"҇","֑"-"ֽ","ֿ","ׁ"-"ׂ","ׄ"-"ׅ","ׇ","ؐ"-"ؚ","ً"-"ٟ","ٰ","ۖ"-"ۜ","۟"-"ۤ","ۧ"-"ۨ","۪"-"ۭ","ܑ","ܰ"-"݊","ަ"-"ް","߫"-"߳","߽","ࠖ"-"࠙","ࠛ"-"ࠣ","ࠥ"-"ࠧ","ࠩ"-"࠭","࡙"-"࡛","࢘"-"࢟","࣊"-"࣡","ࣣ"-"ं","ऺ","़","ु"-"ै","्","॑"-"ॗ","ॢ"-"ॣ","ঁ","়","ু"-"ৄ","্","ৢ"-"ৣ","৾","ਁ"-"ਂ","਼","ੁ"-"ੂ","ੇ"-"ੈ","ੋ"-"੍","ੑ","ੰ"-"ੱ","ੵ","ઁ"-"ં","઼","ુ"-"ૅ","ે"-"ૈ","્","ૢ"-"ૣ","ૺ"-"૿","ଁ","଼","ି","ୁ"-"ୄ","୍","୕"-"ୖ","ୢ"-"ୣ","ஂ","ீ","்","ఀ","ఄ","఼","ా"-"ీ","ె"-"ై","ొ"-"్","ౕ"-"ౖ","ౢ"-"ౣ","ಁ","಼","ಿ","ೆ","ೌ"-"್","ೢ"-"ೣ","ഀ"-"ഁ","഻"-"഼","ു"-"ൄ","്","ൢ"-"ൣ","ඁ","්","ි"-"ු","ූ","ั","ิ"-"ฺ","็"-"๎","ັ","ິ"-"ຼ","່"-"໎","༘"-"༙","༵","༷","༹","ཱ"-"ཾ","ྀ"-"྄","྆"-"྇","ྍ"-"ྗ","ྙ"-"ྼ","࿆","ိ"-"ူ","ဲ"-"့","္"-"်","ွ"-"ှ","ၘ"-"ၙ","ၞ"-"ၠ","ၱ"-"ၴ","ႂ","ႅ"-"ႆ","ႍ","ႝ","፝"-"፟","ᜒ"-"᜔","ᜲ"-"ᜳ","ᝒ"-"ᝓ","ᝲ"-"ᝳ","឴"-"឵","ិ"-"ួ","ំ","៉"-"៓","៝","᠋"-"᠍","᠏","ᢅ"-"ᢆ","ᢩ","ᤠ"-"ᤢ","ᤧ"-"ᤨ","ᤲ","᤹"-"᤻","ᨗ"-"ᨘ","ᨛ","ᩖ","ᩘ"-"ᩞ","᩠","ᩢ","ᩥ"-"ᩬ","ᩳ"-"᩼","᩿","᪰"-"᪽","ᪿ"-"ᫎ","ᬀ"-"ᬃ","᬴","ᬶ"-"ᬺ","ᬼ","ᭂ","᭫"-"᭳","ᮀ"-"ᮁ","ᮢ"-"ᮥ","ᮨ"-"ᮩ","᮫"-"ᮭ","᯦","ᯨ"-"ᯩ","ᯭ","ᯯ"-"ᯱ","ᰬ"-"ᰳ","ᰶ"-"᰷","᳐"-"᳒","᳔"-"᳠","᳢"-"᳨","᳭","᳴","᳸"-"᳹","᷀"-"᷿","⃐"-"⃜","⃡","⃥"-"⃰","⳯"-"⳱","⵿","ⷠ"-"ⷿ","〪"-"〭","゙"-"゚","꙯","ꙴ"-"꙽","ꚞ"-"ꚟ","꛰"-"꛱","ꠂ","꠆","ꠋ","ꠥ"-"ꠦ","꠬","꣄"-"ꣅ","꣠"-"꣱","ꣿ","ꤦ"-"꤭","ꥇ"-"ꥑ","ꦀ"-"ꦂ","꦳","ꦶ"-"ꦹ","ꦼ"-"ꦽ","ꧥ","ꨩ"-"ꨮ","ꨱ"-"ꨲ","ꨵ"-"ꨶ","ꩃ","ꩌ","ꩼ","ꪰ","ꪲ"-"ꪴ","ꪷ"-"ꪸ","ꪾ"-"꪿","꫁","ꫬ"-"ꫭ","꫶","ꯥ","ꯨ","꯭","ﬞ","︀"-"️","︠"-"︯"]>
| <#Nd: ["0"-"9","٠"-"٩","۰"-"۹","߀"-"߉","०"-"९","০"-"৯","੦"-"੯","૦"-"૯","୦"-"୯","௦"-"௯","౦"-"౯","೦"-"೯","൦"-"൯","෦"-"෯","๐"-"๙","໐"-"໙","༠"-"༩","၀"-"၉","႐"-"႙","០"-"៩","᠐"-"᠙","᥆"-"᥏","᧐"-"᧙","᪀"-"᪉","᪐"-"᪙","᭐"-"᭙","᮰"-"᮹","᱀"-"᱉","᱐"-"᱙","꘠"-"꘩","꣐"-"꣙","꤀"-"꤉","꧐"-"꧙","꧰"-"꧹","꩐"-"꩙","꯰"-"꯹","0"-"9"]>
| <#Pc: ["‿"-"⁀","⁔","︳"-"︴","﹍"-"﹏","_"]>

Please where exactly do your characters belong to?

@manticore-projects
Copy link
Contributor

Would the block below suffice?

20941 characters from the CJK Unified Ideographs block.

Code points U+4E00 to U+9FCC.

[U+4E00 - U+62FF](http://en.wikipedia.org/wiki/List_of_CJK_Unified_Ideographs,_part_1_of_4)
[U+6300 - U+77FF](http://en.wikipedia.org/wiki/List_of_CJK_Unified_Ideographs,_part_2_of_4)
[U+7800 - U+8CFF](http://en.wikipedia.org/wiki/List_of_CJK_Unified_Ideographs,_part_3_of_4)
[U+8D00 - U+9FCC](http://en.wikipedia.org/wiki/List_of_CJK_Unified_Ideographs,_part_4_of_4)

@moonfruit
Copy link
Contributor Author

moonfruit commented Mar 7, 2023

Sorry, I don't know Unicode well enough to give an exact range of characters. However, version 4.5 contains the characters range "\u4e00"-"\u9fa5". The Java parser contains another characters range "\u4e00"-"\u9fea".
But accoding to CJK Unified Ideographs (Unicode block), I believe this should be U+4E00..U+9FFF

JSQLParser 4.5:
https://github.com/JSQLParser/JSqlParser/blob/af886c7e450d5dbd36e87c2c60abc631075c6b8a/src/main/jjtree/net/sf/jsqlparser/parser/JSqlParserCC.jjt#LL484C3206-L484C3206

Java Parser:
https://github.com/javaparser/javaparser/blob/42ecfabdcb8fe05e764eb2ca7ade495aa38d153e/javaparser-core/src/main/javacc/java.jj#L828

@manticore-projects
Copy link
Contributor

Ok, I would add those ranges tomorrow.
Sorry for the inconvenience, I was not aware that those characters do not count as Letters.

manticore-projects added a commit to manticore-projects/JSqlParser that referenced this issue Mar 7, 2023
@manticore-projects
Copy link
Contributor

I have submitted the change. Please feel very welcome to test it extensively, especially with corner cases.
Thanks again for reporting.

@moonfruit
Copy link
Contributor Author

moonfruit commented Mar 7, 2023

According to CJK Unified Ideographs (Unicode block), all four parts are consecutive, and the last part should end at U+9FFF.
So I think <CJK> should be "\u4E00"-"\u9FFF"

manticore-projects added a commit to manticore-projects/JSqlParser that referenced this issue Mar 8, 2023
@manticore-projects
Copy link
Contributor

According to CJK Unified Ideographs (Unicode block), all four parts are consecutive, and the last part should end at U+9FFF. So I think <CJK> should be "\u4E00"-"\u9FFF"

Thanks for pointing it out, it seems that about 20 more characters have been added only recently.
From a European perspective, its hard to say if such a volatile character set is a bug or a feature. Nevertheless we are committed to make it work.

Please do let me know, if anything else was needed.

@moonfruit
Copy link
Contributor Author

👍👍

wumpz pushed a commit that referenced this issue Mar 9, 2023
* Fixes #1684: Support CREATE MATERIALIZED VIEW with AUTO REFRESH

Support parsing create view statements in Redshift with AUTO REFRESH
option.

* Reduce cyclomatic complexity in CreateView.toString

Extract adding the force option into a dedicated method resulting in the
cyclomatic complexity reduction of the CreateView.toString method.

* Enhanced Keywords

Add Keywords and document, which keywords are allowed for what purpose

* Fix incorrect tests

* Define Reserved Keywords explicitly
Derive All Keywords from Grammar directly
Generate production for Object Names (semi-) automatically
Add parametrized Keyword Tests

* Fix test resources

* Adjust Gradle to JUnit 5

Parallel Test execution
Gradle Caching
Explicitly request for latest JavaCC 7.0.10

* Do not mark SpeedTest for concurrent execution

* Remove unused imports

* Adjust Gradle to JUnit 5

Parallel Test execution
Gradle Caching
Explicitly request for latest JavaCC 7.0.10

* Do not mark SpeedTest for concurrent execution

* Remove unused imports

* Sphinx Documentation

Update the MANTICORE Sphinx Theme, but ignore it in GIT
Add the content to the Sphinx sites
Add a Gradle function to derive Stable and Snapshot version from GIT Tags
Add a Gradle GIT change task
Add a Gradle sphinx task
Add a special Test case for illustrating the use of JSQLParser

* doc: request for `Conventional Commit` messages

* feat: make important Classes Serializable

Implement Serializable for persisting via ObjectOutputStream

* chore: Make Serializable

* doc: Better integration of the RR diagrams

- apply neutral Sphinx theme
- insert the RR diagrams into the sphinx sources
- better documentation on Gradle dependencies
- link GitHub repository

* Merge

* feat: Oracle Alternative Quoting

- add support for Oracle Alternative Quoting e.g. `q'(...)'`
- fixes #1718
- add a Logo and FavIcon to the Website
- document recent changes on Quoting/Escaping
- add an example on building SQL from Java
- rework the README.md, promote the Website
- add Spotless Formatter, using Google Java Style (with Tab=4 Spaces)

* style: Appease PMD/Codacy

* doc: fix the issue template

- fix the issue template
- fix the -SNAPSHOT version number

* Update issue templates

* Update issue templates

* feat: Support more Statement Separators

- `GO`
- Slash `/`
- Two empty lines

* feat: FETCH uses EXPRESSION

- `FETCH` uses `EXPRESSION` instead of SimpleJDBCParameter only
- Visit/Accept `FETCH` `EXPRESSION` instead of `append` to String
- Visit/Accept `OFFSET` `EXPRESSION` instead of `append` to String
- Gradle: remove obsolete/incompatible `jvmArgs` from Test()

* style: apply Spotless

* test: commit missing test

* feat: Unicode CJK Unified Ideographs (Unicode block)

fixes #1741

* feat: Unicode CJK Unified Ideographs (Unicode block)

fixes #1741

* feat: Functions with nested Attributes

Supports `SELECT schemaName.f1(arguments).f2(arguments).f3.f4` and similar constructs

fixes #1742
fixes #1050

---------

Co-authored-by: zaza <tzarna@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants