Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL: Fix catalog filtering in SYS COLUMNS #40583

Merged
merged 10 commits into from
Apr 9, 2019
Merged

Conversation

costin
Copy link
Member

@costin costin commented Mar 28, 2019

Properly treat '%' as a wildcard for catalog filtering instead of doing
a straight string match.

Fix #40582

Properly treat '%' as a wildcard for catalog filtering instead of doing
a straight string match.

Fix elastic#40582
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

@costin costin requested a review from matriv March 28, 2019 10:55
Add integration test for SYS COLUMNS - currently running only inside
single_node since the cluster name is test dependent.
@costin
Copy link
Member Author

costin commented Mar 28, 2019

I have updated the PR to fix also alias filtering which was not performed (only the index name was considered). Also added a test suite since mocking doesn't really work for this case.

@@ -215,6 +216,8 @@ private Throwable reworkException(Throwable th) {

@SuppressForbidden(reason = "test reads from jar")
public static InputStream readFromJarUrl(URL source) throws IOException {
return source.openStream();
URLConnection con = source.openConnection();
con.setDefaultUseCaches(false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment why this is necessary?

boolean matchesAlias = false;
if (pattern != null && aliasMetadata != null) {
for (AliasMetaData aliasMeta : aliasMetadata) {
matchesAlias |= pattern.matcher(aliasMeta.alias()).matches();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we exit on first occurrence of true?

public static List<Object[]> readScriptSpec() throws Exception {
List<Object[]> list = new ArrayList<>();
list.addAll(CsvSpecTestCase.readScriptSpec());
return readScriptSpec("/disabled/command-sys.csv-spec", specParser());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like disabled in the path here.

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

for the drivers)
Move away from separateMappings() since in case of aliases pointing to
multiple indices, it can return duplicated entries (as there's no
merging) causing issues.
Hence why mergedMapping is used instead with the assumption that only
one table will be used at a time.
A more correct (yet quite expensive) alternative would be to return the
mapping per index yet in case of aliases, drop it and use field caps
instead. However as this is not CCS-friendly, using field_caps instead
seems a good solution for now.
Fix monotony of ORDINAL_POSITION
Copy link
Contributor

@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added two comments, but am not 100% sure I am right about those.


TABLE_CAT:s | TABLE_SCHEM:s| TABLE_NAME:s | COLUMN_NAME:s | DATA_TYPE:i | TYPE_NAME:s | COLUMN_SIZE:i| BUFFER_LENGTH:i|DECIMAL_DIGITS:i|NUM_PREC_RADIX:i | NULLABLE:i| REMARKS:s | COLUMN_DEF:s |SQL_DATA_TYPE:i|SQL_DATETIME_SUB:i|CHAR_OCTET_LENGTH:i|ORDINAL_POSITION:i|IS_NULLABLE:s|SCOPE_CATALOG:s|SCOPE_SCHEMA:s|SCOPE_TABLE:s|SOURCE_DATA_TYPE:sh|IS_AUTOINCREMENT:s|IS_GENERATEDCOLUMN:s
---------------+---------------+---------------+--------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+----------------+-----------------+----------------+---------------+---------------+---------------+---------------+----------------+----------------+------------------
x-pack_plugin_sql_qa_single-node_integTestCluster |null |test\_emp |birth_date |93 |DATETIME |24 |8 |null |null |1 |null |null |9 |3 |null |1 |YES |null |null |null |null |NO |NO
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may miss something, but is it ok for this to return `test\_emp' (including the escaped character)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bug - the test\_emp is escaped so the name should be test_emp.

sysColumnsWithTableLikeNoEscape
SYS COLUMNS TABLE LIKE 'test_emp';

// since there's no escaping test_emp means test*emp which matches also test_alias_emp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_emp should match test[one character]emp when there is no escape character, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately due to the current field caps features, we can't figure out the source index.

@bpintea
Copy link
Contributor

bpintea commented Apr 3, 2019

Gave the branch a try and if nothing was mistaken during the process:

  • sys columns catalog '%' table like 'employees' (and sys columns catalog '%' table like 'employees' like '%') returns an empty result set;
  • sys columns catalog 'distribution\_run' table like '%' will return a full list of all columns in all tables is the catalog (which is right); but sys columns catalog 'distribution\_run' table like '' will return the same, except for the table name column which is empty (which seems wrong).

@bpintea
Copy link
Contributor

bpintea commented Apr 9, 2019

Update after consulting with @costin:

sys columns catalog '%' table like 'employees' (and sys columns catalog '%' table like 'employees' like '%') returns an empty result set;

This (=empty result set) is correct, since the catalog argument cannot contain a string search pattern.

sys columns catalog 'distribution\_run' table like '' will return the same, except for the table name column which is empty (which seems wrong)

This is currently a known limitation; however, it should have no practical impact, since the applications ask for the columns of specific table.

@costin costin merged commit 8e61b77 into elastic:master Apr 9, 2019
costin added a commit that referenced this pull request Apr 9, 2019
Properly treat '%' as a wildcard for catalog filtering instead of doing
a straight string match.
Table filtering now considers aliases as well.
Add escaping char for LIKE queries with user defined params
Fix monotony of ORDINAL_POSITION
Add integration test for SYS COLUMNS - currently running only inside
single_node since the cluster name is test dependent.
Add pattern unescaping for index names

Fix #40582

(cherry picked from commit 8e61b77)
@costin costin deleted the fix-40582 branch April 9, 2019 15:57
costin added a commit that referenced this pull request Apr 9, 2019
Properly treat '%' as a wildcard for catalog filtering instead of doing
a straight string match.
Table filtering now considers aliases as well.
Add escaping char for LIKE queries with user defined params
Fix monotony of ORDINAL_POSITION
Add integration test for SYS COLUMNS - currently running only inside
single_node since the cluster name is test dependent.
Add pattern unescaping for index names

Fix #40582

(cherry picked from commit 8e61b77)
costin added a commit that referenced this pull request Apr 9, 2019
Properly treat '%' as a wildcard for catalog filtering instead of doing
a straight string match.
Table filtering now considers aliases as well.
Add escaping char for LIKE queries with user defined params
Fix monotony of ORDINAL_POSITION
Add integration test for SYS COLUMNS - currently running only inside
single_node since the cluster name is test dependent.
Add pattern unescaping for index names

Fix #40582

(cherry picked from commit 8e61b77)
@polyfractal polyfractal added v7.0.1 and removed v7.0.0 labels Apr 9, 2019
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
Properly treat '%' as a wildcard for catalog filtering instead of doing
a straight string match.
Table filtering now considers aliases as well.
Add escaping char for LIKE queries with user defined params
Fix monotony of ORDINAL_POSITION
Add integration test for SYS COLUMNS - currently running only inside
single_node since the cluster name is test dependent.
Add pattern unescaping for index names

Fix elastic#40582
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SQL: SYS COLUMNS doesn't consider catalog wildcard
8 participants