Add support for case sensitive identifiers #2863

cberner · 2015-05-04T22:58:34Z

We need to record whether identifiers are quoted or unquoted in the SQL.
We need to extend the SPI so that connectors can specify whether table, column, and other identifiers are case sensitive.
This should also work for fields in row types.
The rule should be that if either side is case insensitive, the match is done ignoring case.
For case-insensitive matching, if more than one item matches, throw an ambiguity exception.
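
To make the proposed rule concrete, here is a minimal Java sketch. It is an illustration only: the `Identifier` class, its `caseSensitive` flag, and the `resolve` method are hypothetical names and are not part of the Presto SPI. The sketch assumes each identifier carries a flag saying whether it must be compared case sensitively, ignores case when either side is case insensitive, and reports an ambiguity when more than one candidate matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; none of these names exist in the Presto SPI.
final class IdentifierMatching {
    // A name plus a flag saying whether it must be compared case sensitively
    // (for example, because it was quoted or the connector is case sensitive).
    static final class Identifier {
        final String name;
        final boolean caseSensitive;

        Identifier(String name, boolean caseSensitive) {
            this.name = name;
            this.caseSensitive = caseSensitive;
        }
    }

    // If either side is case insensitive, the comparison ignores case.
    static boolean matches(Identifier a, Identifier b) {
        if (a.caseSensitive && b.caseSensitive) {
            return a.name.equals(b.name);
        }
        return a.name.equalsIgnoreCase(b.name);
    }

    // Returns the single matching candidate, or empty if none match.
    // More than one match is reported as an ambiguity.
    static Optional<Identifier> resolve(Identifier wanted, List<Identifier> candidates) {
        List<Identifier> found = new ArrayList<>();
        for (Identifier candidate : candidates) {
            if (matches(wanted, candidate)) {
                found.add(candidate);
            }
        }
        if (found.size() > 1) {
            throw new IllegalStateException("Ambiguous identifier: " + wanted.name);
        }
        return found.isEmpty() ? Optional.empty() : Optional.of(found.get(0));
    }
}
```

With case-sensitive candidates TableOne and tableone, a case-sensitive lookup for TableOne resolves to exactly that table, while a case-insensitive lookup for tableone matches both and is rejected as ambiguous.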

jamiemccrindle · 2015-05-12T17:28:10Z

+1

electrum · 2015-05-12T18:02:53Z

I don't think the last one is correct. Case insensitive matching should only happen if the target is the correct case.

You can have multiple identifiers differing only in case. Case insensitive matching should match at most one.

dain · 2015-05-12T18:07:17Z

@electrum correct. I updated the description.

XiLongZheng · 2015-07-27T09:30:02Z

Is there any plan to fix this so that I can use upper case in table names?

dain · 2015-07-27T18:40:21Z

I took a look at this a few weeks back. The problem is the change is spread throughout the codebase and in parts that are actively being changed. After @martint's planner changes are in and @haozhun's changes to field dereferencing, the change should be doable.

ashish6976 · 2015-08-17T08:42:47Z

Is there any other way to use upper case in database and table names when querying from Presto?

dain · 2015-08-18T00:33:15Z

@ashish6976 This issue is about changing the current Presto behavior to support fully case-sensitive identifiers.

From the user's perspective, the current behavior of only supporting case-insensitive identifiers should just work, unless you happen to have an existing system containing identifiers that differ only in case. If you are seeing failures when there is only one identifier, please file a new issue, since that is a bug.

ashish6976 · 2015-08-18T07:31:19Z

@dain Here are the scenarios in which I am facing issues.

My MySQL server is running on CentOS.
I am using Presto to query my MySQL database through the MySQL connector; my catalog name is mysql.

Scenario 1 - Database name and table name are a combination of upper case and lower case letters

Database Name - TestDB
Table Names - EmployeeDetails, EmployeeTable
Query 1 - show schemas from mysql;
Output -
Schema

information_schema
performance_schema
testdb
(3 rows)

Query 20150818_064410_00003_837eu, FINISHED, 1 node
Splits: 2 total, 2 done (100.00%)
0:00 [3 rows, 61B] [25 rows/s, 524B/s]

Query 2 - show tables from mysql.testdb;
Output -
Table

(0 rows)

Query 20150818_064532_00004_837eu, FINISHED, 1 node
Splits: 2 total, 2 done (100.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]

In this case Presto is not able to fetch the names of the tables that are present in database TestDB.

Scenario 2 - Database name is in lower case; table name is a combination of upper case and lower case letters

Database Name - lowercasedb
Table Names - TableOne, TableTwo
Query 1 - show schemas from mysql;
Output -
Schema

information_schema
lowercasedb
performance_schema
testdb
(4 rows)

Query 20150818_065347_00005_837eu, FINISHED, 1 node
Splits: 2 total, 2 done (100.00%)
0:00 [4 rows, 77B] [27 rows/s, 522B/s]

Query 2 - show tables from mysql.lowercasedb;
Output -
Table

tableone
tabletwo
(2 rows)

Query 20150818_065432_00006_837eu, FINISHED, 1 node
Splits: 2 total, 2 done (100.00%)
0:00 [2 rows, 66B] [15 rows/s, 505B/s]

Query 3 - select * from mysql.lowercasedb.tableone;
Output -
Query 20150818_065535_00007_837eu failed: Table mysql.lowercasedb.tableone does not exist

In this scenario Presto is able to fetch the table names, but when I access the table it gives me the error shown above.

The MySQL output:
mysql> select * from lowercasedb.TableOne;
+-----------+-----------+
| ColumnOne | ColumnTwo |
+-----------+-----------+
|         1 | Row 1     |
|         2 | Row 2     |
+-----------+-----------+
2 rows in set (0.00 sec)

Scenario 3 - Database name and table name are in lower case letters

Database Name - lowercasetabledb
Table Names - tableone, tabletwo
Query 1 - show schemas from mysql;
Output -
Schema

information_schema
lowercasedb
lowercasetabledb
performance_schema
testdb
(5 rows)

Query 20150818_070234_00008_837eu, FINISHED, 1 node
Splits: 2 total, 2 done (100.00%)
0:00 [5 rows, 98B] [30 rows/s, 597B/s]

Query 2 - show tables from mysql.lowercasetabledb;
Output -
Table

tableone
tabletwo
(2 rows)

Query 20150818_070253_00009_837eu, FINISHED, 1 node
Splits: 2 total, 2 done (100.00%)
0:00 [2 rows, 76B] [17 rows/s, 652B/s]

Query 3 - select * from mysql.lowercasetabledb.tableone;
Output -
 columnone | columntwo
-----------+-----------
         1 | Row 1
         2 | Row 2
(2 rows)

Query 20150818_070319_00010_837eu, FINISHED, 1 node
Splits: 2 total, 2 done (100.00%)
0:00 [2 rows, 0B] [8 rows/s, 0B/s]

In this scenario I am able to access the tables in the database.

dain · 2015-08-18T17:18:25Z

@ashish6976 This is a bug in the mysql connector. Please open a new issue for the mysql connector.

saileshmittal · 2016-02-13T04:31:52Z

Any progress on this?

samanthrao · 2016-11-10T05:29:10Z

I have the same issue as @ashish6976, but in my case the database is MongoDB. Some of the database and collection names are in title case per company standards. When Presto loads the catalog, it converts everything to lowercase, which results in zero records being processed. The only thing that works is when a collection name or database name is lowercase.

Is there any alternative to overcome this?

umnya · 2017-02-20T09:24:18Z

Any update on this?

knoguchi · 2017-02-22T22:56:31Z

I've read the source a bit. There are places in the Presto code like this
https://github.com/prestodb/presto/blob/master/presto-spi/src/main/java/com/facebook/presto/spi/ColumnMetadata.java#L49

I think we have to remove those toLowerCase().
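
The effect of such a call can be seen in a small stand-alone sketch. The class below is hypothetical and is not Presto's code; it only assumes a remote store whose lookup is case sensitive and a metadata layer that lowercases every name before using it.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in, not Presto's actual classes.
final class LowercasingDemo {
    // Stand-in for a remote database whose table lookup is case sensitive.
    static final Map<String, String> REMOTE_TABLES = new HashMap<>();

    static {
        REMOTE_TABLES.put("TableOne", "2 rows");
    }

    // Mimics metadata code that normalizes every name to lower case.
    static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    // Case-sensitive lookup; returns null when the table is not found.
    static String lookup(String name) {
        return REMOTE_TABLES.get(name);
    }
}
```

Here `lookup(normalize("TableOne"))` finds nothing, because the original spelling is lost once the name is lowercased, while `lookup("TableOne")` succeeds. That mirrors Scenario 2 above, where the table is listed as tableone but cannot be queried.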

img22 · 2017-05-19T13:02:27Z

Is there a fix for this? I'm not able to query upper case tables

faisal00813 · 2017-06-14T06:26:07Z

Any update about the fix?

RameshByndoor · 2017-07-14T12:57:10Z

I am using version 0.180 of Presto and the issue is still present. All our MySQL table names are in UPPERCASE. Are there any alternatives or patches available?

MichaelAlo · 2017-07-24T23:37:51Z

Hi there,
Has the issue with upper case table names in MySQL and a Presto connection been fixed? :)
Cheers

Drizzt321 · 2017-07-27T21:36:47Z

I'm testing out Presto with MySQL and running into this exact issue. Is there any will at all to fix this rather big issue? Clearly it affects more than one or two people, and it would prevent us from using Presto, as our tables are created and managed by Hibernate. I have seen https://dev.mysql.com/doc/refman/5.7/en/identifier-case-sensitivity.html, which suggests that setting lower_case_table_names to 2 might work. I'm having trouble getting my MySQL instance to accept that setting, but I'll report back if I manage to and whether it works.

Drizzt321 · 2017-08-01T00:12:45Z

So, sadly, I can't get it set to '2'. I suppose that's because I'm running on Linux, where MySQL really doesn't like that value, so it keeps reverting to 0. I can get it set to 1, which simply lowercases all table names and stores them on disk lowercased, but that's not so useful with lots of existing tables.

martint · 2017-08-04T23:03:49Z

For reference, here are the relevant parts from the SQL spec:

<delimited identifier> ::=
  <double quote> <delimited identifier body> <double quote>

<delimited identifier body> ::=  <delimited identifier part>...
<delimited identifier part> ::=
    <nondoublequote character>
  | <doublequote symbol>

<Unicode delimited identifier> ::=
  U <ampersand> <double quote> <Unicode delimiter body> <double quote>
      <Unicode escape specifier>
<Unicode escape specifier> ::=
  [ UESCAPE <quote> <Unicode escape character> <quote> ]
<Unicode delimiter body> ::=
  <Unicode identifier part>...
<Unicode identifier part> ::=
    <delimited identifier part>
  | <Unicode escape value>

24) For every <identifier body> IB there is exactly one corresponding case-normal form CNF. CNF is an <identifier body> derived from IB as follows:
Let n be the number of characters in IB. For i ranging from 1 (one) to n, the i-th character Mi of IB is transliterated into the corresponding character 
or characters of CNF as follows:
Case:
   a) If Mi is a lower case character or a title case character for which an equivalent upper case sequence U is defined by Unicode, then let j be the
       number of characters in U; the next j characters of CNF are U.
   b) Otherwise, the next character of CNF is Mi.
25) The case-normal form of the <identifier body> of a <regular identifier> is used for purposes such as and including determination of identifier 
      equivalence, representation in the Definition and Information Schemas, and representation in diagnostics areas.

...

27) Two <regular identifier>s are equivalent if the case-normal forms of their <identifier body>s, considered as the repetition of a <character string literal> 
that specifies a <character set specification> of SQL_IDENTIFIER and an implementation-defined collation IDC that is sensitive to case, compare equally 
according to the comparison rules in Subclause 8.2, “<comparison predicate>”.

28) A <regular identifier> and a <delimited identifier> are equivalent if the case-normal form of the <identifier body> of the <regular identifier> and the 
<delimited identifier body> of the <delimited identifier> (with all occurrences of <quote> replaced by <quote symbol> and all occurrences of 
<doublequote symbol> replaced by <double quote>), considered as the repetition of a <character string literal> that specifies a <character set specification>
 of SQL_IDENTIFIER and IDC, compare equally according to the comparison rules in Subclause 8.2, “<comparison predicate>”.


29) Two <delimited identifier>s are equivalent if their <delimited identifier body>s, considered as the repetition of a <character string literal> that specifies
 a <character set specification> of SQL_IDENTIFIER and an implementation-defined collation that is sensitive to case, compare equally according to the
 comparison rules in Subclause 8.2, “<comparison predicate>”.

30) Two <Unicode delimited identifier>s are equivalent if their <Unicode delimiter body>s, considered as the repetition of a <character string literal> that
 specifies a <character set specification> of SQL_IDENTIFIER and an implementation-defined collation that is sensitive to case, compare equally according
 to the comparison rules in Subclause 8.2, “<comparison predicate>”.

31) A <Unicode delimited identifier> and a <delimited identifier> are equivalent if their <Unicode delimiter body> and <delimited identifier body>, 
respectively, each considered as the repetition of a <character string literal> that specifies a <character set specification> of SQL_IDENTIFIER and 
an implementation-defined collation that is sensitive to case, compare equally according to the comparison rules in Subclause 8.2, “<comparison predicate>”.

32) A <regular identifier> and a <Unicode delimited identifier> are equivalent if the case-normal form of the <identifier body> of the <regular identifier> 
and the <Unicode delimiter body> of the <Unicode delimited identifier> considered as the repetition of a <character string literal>, each specifying a
 <character set specification> of SQL_IDENTIFIER and an implementation-defined collation that is sensitive to case, compare equally according to the 
comparison rules in Subclause 8.2, “<comparison predicate>”.
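
Rules 24 and 27 to 29 can be approximated in a few lines of Java. This is a sketch, not Presto's implementation: it assumes `String.toUpperCase(Locale.ROOT)` is an acceptable stand-in for the case-normal form and plain `String.equals` for the case-sensitive collation. The class and method names are made up for illustration.

```java
import java.util.Locale;

// Hypothetical helper illustrating the SQL identifier equivalence rules.
final class SqlIdentifierEquivalence {
    // Rule 24 (approximation): lower case and title case characters are
    // replaced by their upper case equivalents; other characters are kept.
    static String caseNormalForm(String regularIdentifierBody) {
        return regularIdentifierBody.toUpperCase(Locale.ROOT);
    }

    // Extracts the body of a delimited identifier such as "Table""One":
    // strips the outer double quotes and turns each doubled quote into one.
    static String delimitedBody(String quoted) {
        String inner = quoted.substring(1, quoted.length() - 1);
        return inner.replace("\"\"", "\"");
    }

    // Rule 27: two regular identifiers compare by their case-normal forms.
    static boolean regularEqualsRegular(String a, String b) {
        return caseNormalForm(a).equals(caseNormalForm(b));
    }

    // Rule 28: a regular identifier's case-normal form is compared,
    // case sensitively, with the delimited identifier's body.
    static boolean regularEqualsDelimited(String regular, String delimitedBody) {
        return caseNormalForm(regular).equals(delimitedBody);
    }

    // Rule 29: two delimited identifiers compare case sensitively.
    static boolean delimitedEqualsDelimited(String bodyA, String bodyB) {
        return bodyA.equals(bodyB);
    }
}
```

Under these rules an unquoted tableone is equivalent to the quoted "TABLEONE" but not to the quoted "TableOne", which is why an engine that follows the spec needs to know whether each identifier was quoted.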

ghost · 2018-04-16T04:20:11Z

+1
Is there any progress on this? We are using version 0.192 and the problem from Scenario 2 is still there.

Drizzt321 · 2018-04-18T07:30:33Z

I have an open PR #8674 that stalled while waiting for replies from maintainers, and then I got busy with work and was unable to continue it. I don't see any time in the foreseeable future for me to pick this back up.

nihalbtq8 · 2018-07-03T08:49:05Z

This is still an issue currently. Is there any alternative available?

binque · 2018-08-28T07:05:17Z

Any update ?

kokosing · 2018-08-28T11:42:11Z

> Any update ?

Unfortunately, no ;(

dain · 2018-08-28T21:39:24Z

@hgschmie told me the other day he was looking at this

SanjayJosh · 2018-09-05T11:33:40Z

Any update on this? All the tables in my project are in upper case by default.
Alternatively, can anyone give pointers to what would need to change in the code, so that a custom build can be made?

tooptoop4 · 2018-09-13T01:57:37Z

Any update? Hive metastore tables in MySQL, such as DBS and PARTITIONS, can't be queried.

presto> select * from mysql.metastore.DBS;
Query 20180913_015448_00010_774as failed: line 1:15: Table mysql.metastore.dbs does not exist

ciscoring · 2018-11-09T07:55:30Z

Any update on this issue? My collections in MongoDB have names with upper case letters. Presto can't read these collections because it always converts the names to lower case.

alvespat · 2018-11-22T11:02:59Z

I'm getting the same issue with the SQL Server connector and tables defined in upper case: I am only able to query tables whose names are defined in lower case. :-( (Presto 0.203)

cberner · 2019-06-20T17:35:25Z

I no longer am actively working on Presto

mizunno · 2019-12-31T15:27:09Z

Any update on this? Thanks in advance.

xqliang · 2021-10-21T12:24:27Z

Fixed at Release 0.225.

JDBC Changes

Match schema and table names case insensitively. This behavior can be enabled by setting the case-insensitive-name-matching catalog configuration option to true.
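
As a rough illustration of what case-insensitive name matching involves, the sketch below maps the lowercased name used by the engine back to the name stored remotely. It is not the code from the release: `RemoteNameResolver` and `toRemoteName` are hypothetical names, and the sketch assumes the full list of remote names is available up front and that names differing only in case should be rejected as ambiguous.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; not the implementation shipped in release 0.225.
final class RemoteNameResolver {
    // Remote names grouped by their lowercased form.
    private final Map<String, List<String>> byLowerCase = new HashMap<>();

    RemoteNameResolver(Collection<String> remoteNames) {
        for (String remote : remoteNames) {
            byLowerCase
                    .computeIfAbsent(remote.toLowerCase(Locale.ROOT), key -> new ArrayList<>())
                    .add(remote);
        }
    }

    // Maps an engine-side (lowercased) name to the exact remote spelling.
    Optional<String> toRemoteName(String engineName) {
        List<String> matches = byLowerCase.getOrDefault(
                engineName.toLowerCase(Locale.ROOT), Collections.emptyList());
        if (matches.size() > 1) {
            throw new IllegalStateException("Ambiguous name " + engineName + ": " + matches);
        }
        return matches.isEmpty() ? Optional.empty() : Optional.of(matches.get(0));
    }
}
```

For remote tables TableOne and TableTwo, `toRemoteName("tableone")` yields TableOne, so the query from Scenario 2 could be sent to MySQL with the original spelling.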

yys12138 · 2021-11-18T03:15:30Z

I have a table with columns that differ only by case, for example 'toPkgVErsName' and 'toPkGVersName'. With Hive I can run 'select topkgversname from ......' correctly, but the same SQL fails in Presto (I use Hue to connect to both Hive and Presto).
The error message is:

Multiple entries with same key: [topkgversname] required binary topkgversname (STRING)=min: , max: , num_nulls: 0 and [topkgversname] required binary topkgversname (STRING)=min: , max: walleve-1.0.269.171227.eaddc8f, num_nulls: 0
    at io.prestosql.jdbc.AbstractPrestoResultSet.resultsException(AbstractPrestoResultSet.java:1731)
    at io.prestosql.jdbc.PrestoResultSet$ResultsPageIterator.computeNext(PrestoResultSet.java:216)
    at io.prestosql.jdbc.PrestoResultSet$ResultsPageIterator.computeNext(PrestoResultSet.java:176)
    at io.prestosql.jdbc.$internal.guava.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
    at io.prestosql.jdbc.$internal.guava.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
    at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1811)
    at java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:295)
    at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:207)
    at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:162)
    at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:301)
    at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
    at io.prestosql.jdbc.PrestoResultSet$AsyncIterator.lambda$new$0(PrestoResultSet.java:122)
    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Is there any setting that could fix this problem?

ebyhr · 2021-11-18T04:31:48Z

@yys12138 It seems you are using Trino (formerly PrestoSQL). The right repository is https://github.com/trinodb/trino, and you can ask questions in the community Slack: https://trino.io/slack.html

ashish6976 mentioned this issue Aug 19, 2015

MYSQL Connector does not identifies Upper Case Database Name and Table Name #3470

Closed

saileshmittal pushed a commit to twitter-forks/presto that referenced this issue Oct 2, 2015

Compare types ignoring cases. This is not complete solution and shoul…

ec109d1

…d be reverted once prestodb#2863 is resolved.

saileshmittal pushed a commit to twitter-forks/presto that referenced this issue Oct 6, 2015

Compare types ignoring cases. This is not complete solution and shoul…

7b6c0bc

…d be reverted once prestodb#2863 is resolved.

soulmachine mentioned this issue Dec 21, 2015

Always return NULL for columns with uppercase letters #4215

Closed

billonahill mentioned this issue Mar 22, 2016

PARQUET-304: Add an option to make requested schema case insensitive in read path apache/parquet-java#210

Open

cberner mentioned this issue Apr 12, 2016

Mysql connector cannot retrieve tables which includes underscore in table name #4983

Closed

cosinequanon mentioned this issue Jul 24, 2017

Issue with dbSendQuery and dbGetQuery and upper case tables names in MySQL prestodb/RPresto#81

Closed

Drizzt321 mentioned this issue Aug 4, 2017

Adding limited support for case-sensitive table names #8674

Closed

findepi mentioned this issue Mar 29, 2018

Can`t query mysql table using a uppercase name #10288

Closed

findepi mentioned this issue Sep 18, 2018

not all tables in information_schema is available in mysql catalog #11509

Closed

findepi mentioned this issue Oct 3, 2018

WIP: Support non-lower-case table names in JDBC connectors #11633

Closed

Praveen2112 mentioned this issue Dec 13, 2018

[WIP] Adding support for case sensitive table names #12071

Closed

arhimondr mentioned this issue Feb 18, 2019

Always escape row field names #12294

Closed

This was referenced May 10, 2019

Allow connectors to participate in query optimization trinodb/trino#18

Open

Add support for case sensitive identifiers trinodb/trino#17

Open

nezihyigitbasi mentioned this issue May 31, 2019

Fix Bug with ElasticSearchQueryBuilder, Column Name are not preserving case sensivity #12791

Closed

ebyhr mentioned this issue Jun 20, 2019

mysql connector - table does not exist error but table is in 'show tables' #12957

Closed

findepi mentioned this issue Jun 22, 2019

Resolve tables/schemas case-insensitively in JDBC connectors trinodb/trino#614

Merged

kewang1024 mentioned this issue Jul 16, 2019

Resolve tables/schemas case-insensitively in JDBC connectors #13087

Merged

beinan mentioned this issue Jan 13, 2021

Druid connector fails to identify tables with uppercase names. #15587

Closed

jtcohen6 mentioned this issue May 5, 2021

Updates catalog queries to use strict equality dbt-labs/dbt-presto#40

Merged
