[YCQL] Unable to do explicit ORDER BY sort on clustered columns of index #1591

Closed
dwheeler opened this issue Jun 20, 2019 · 3 comments
Labels
community/request Issues created by external users kind/bug This issue is a bug

dwheeler commented Jun 20, 2019

Given a keyspace with the following schema:

CREATE TABLE test_table (
	a text,
	b text,
	c int,
	PRIMARY KEY (a, b))
WITH CLUSTERING ORDER BY (b ASC) AND default_time_to_live = 0;

CREATE INDEX test_table_index ON test_table (b, c) INCLUDE (a) 
WITH CLUSTERING ORDER BY (c DESC) 
  AND transactions = {'enabled' : FALSE, 'consistency_level' : 'user_enforced'};

Queries that use the secondary index and include an ORDER BY clause fail with the error:

SELECT * FROM test_table WHERE b = 'b2'  ORDER BY c DESC;
ERROR: Invalid Arguments. All hash columns must be set if order by clause is present.

Leaving out the ORDER BY clause succeeds:

SELECT * FROM test_table WHERE b = 'b2';
@rao-vasireddy rao-vasireddy added the kind/bug This issue is a bug label Jun 20, 2019
@kmuthukk kmuthukk changed the title Unable to sort on user enforced index [YCQL] Unable to do explicit ORDER BY sort on clustered columns of index Jun 20, 2019
@kmuthukk kmuthukk added the ycql label Jun 20, 2019
@kmuthukk
Collaborator

Thanks @dwheeler for reporting.
@nocaway - could you take an initial look at what's involved in addressing this?

@rkarthik007 rkarthik007 removed the ycql label Jun 24, 2019
@nocaway
Contributor

nocaway commented Jun 24, 2019

This is a bug in YugaByte semantic analysis.

  • The error check processed the SELECT statement against the table description without considering the table's secondary indexes.
  • The decision on which index to use for a query is made at a later time.

Implementation.

  • Currently, when the WHERE clause is processed, the column values are divided into list<hash_column>, list<range_column>, and list<regular_column> according to the table description. These lists are then checked against ONLY the PRIMARY index. Afterward, YugaByte walks the lists to decide which INDEX to use, but by that time the error check has already rejected the query.

  • The fix would be to change each list to an array<list>. Each entry of the array would be associated with an index: the PRIMARY index with array[0], index id 1 with array[1], and so on. The array can then be used to analyze the query against each INDEX.
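
To make the proposal above concrete, here is a minimal Python sketch of the per-index bucketing. It is not YugaByte code: KeySchema, ColumnBuckets, classify, order_by_allowed and usable_schemas are hypothetical names, and the ORDER BY check is simplified (it ignores sort direction and column prefix rules). It also assumes that in the reported index b is the hash column and c the range column.

```python
from dataclasses import dataclass, field


@dataclass
class KeySchema:
    """Key layout of a table or index (hypothetical model, not YugaByte's types)."""
    name: str
    hash_columns: list
    range_columns: list


@dataclass
class ColumnBuckets:
    """WHERE-clause columns split by their role in one particular schema."""
    hash_cols: list = field(default_factory=list)
    range_cols: list = field(default_factory=list)
    regular_cols: list = field(default_factory=list)


def classify(where_columns, schema):
    """Bucket the WHERE columns according to one schema (table or index)."""
    buckets = ColumnBuckets()
    for col in where_columns:
        if col in schema.hash_columns:
            buckets.hash_cols.append(col)
        elif col in schema.range_columns:
            buckets.range_cols.append(col)
        else:
            buckets.regular_cols.append(col)
    return buckets


def order_by_allowed(where_columns, order_by_columns, schema):
    """Simplified check: every hash column is set and ORDER BY uses range columns."""
    buckets = classify(where_columns, schema)
    all_hash_set = set(buckets.hash_cols) == set(schema.hash_columns)
    sorts_on_range = all(c in schema.range_columns for c in order_by_columns)
    return all_hash_set and sorts_on_range


# Entry 0 is the PRIMARY index; later entries are secondary indexes.
# Assumption: b is the hash column and c the range column of test_table_index.
schemas = [
    KeySchema("test_table", hash_columns=["a"], range_columns=["b"]),
    KeySchema("test_table_index", hash_columns=["b"], range_columns=["c"]),
]


def usable_schemas(where_columns, order_by_columns):
    """Names of the schemas against which the query passes the ORDER BY check."""
    return [s.name for s in schemas
            if order_by_allowed(where_columns, order_by_columns, s)]
```

Under these assumptions, usable_schemas(["b"], ["c"]) rejects the table's primary key (hash column a is not set) but accepts test_table_index, which is the case the original check missed.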

yugabyte-ci pushed a commit that referenced this issue Jun 26, 2019
Summary:
This diff fixed ORDER BY processing.

- Analyzing the ORDER BY clause should come AFTER instead of BEFORE choosing which INDEX to use for scanning.
- When an index is chosen, the current code creates a duplicate SELECT and runs analysis on that duplicate. After the statement is duplicated, the ORDER BY clause is removed from the user select. That way, we analyze the clause against ONLY the chosen index instead of the table primary key.

Test Plan: Add test function "testOrderBy()" to yb-cql::TestIndex suite.

Reviewers: mihnea

Reviewed By: mihnea

Subscribers: yql

Differential Revision: https://phabricator.dev.yugabyte.com/D6808
@nocaway
Contributor

nocaway commented Jun 26, 2019

This has been fixed by commit 592fcfa

Instead of making copies of column_list (schema) for the indexes as suggested in my previous comment, the existing code makes one new copy of the entire SELECT statement, replaces the target table with the chosen table_index, and reruns analysis on the entire SELECT against that index. I decided to keep this design of copying the entire SELECT statement, as changing it would be beyond the scope of this bug fix.

To fix the issue, the analysis of the ORDER BY clause is now done after the aforementioned copy of the original SELECT statement has been created, as part of analyzing that copy. As a result, the error check runs against the index schema instead of the schema of the target table.
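
The reordering described above can be sketched in Python as follows. This is an illustration only, not the code in the commit: Schema, Select, choose_index, analyze_order_by and analyze_select are hypothetical names, the index selection heuristic is made up for the example, and the key layout of test_table_index (hash column b, range column c) is an assumption.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Schema:
    """Hypothetical key layout of a table or index."""
    name: str
    hash_columns: list
    range_columns: list


@dataclass
class Select:
    """Hypothetical, heavily reduced SELECT statement."""
    target: Schema
    where_columns: list
    order_by: list = field(default_factory=list)


class InvalidArguments(Exception):
    pass


def analyze_order_by(stmt):
    """Run the ORDER BY error check against whatever schema stmt targets."""
    if not stmt.order_by:
        return
    if not set(stmt.target.hash_columns) <= set(stmt.where_columns):
        raise InvalidArguments(
            "All hash columns must be set if order by clause is present.")


def choose_index(stmt, indexes):
    """Made-up heuristic: first index whose hash columns are all constrained."""
    for idx in indexes:
        if set(idx.hash_columns) <= set(stmt.where_columns):
            return idx
    return None


def analyze_select(stmt, indexes):
    """Choose the index first, then check ORDER BY against the chosen schema."""
    chosen = choose_index(stmt, indexes)
    if chosen is None:
        analyze_order_by(stmt)          # no index: check against the table
        return stmt
    index_stmt = copy.deepcopy(stmt)    # duplicate the whole SELECT
    index_stmt.target = chosen          # retarget the copy at the index
    stmt.order_by = []                  # ORDER BY is dropped from the user select
    analyze_order_by(index_stmt)        # check runs against the index schema
    return index_stmt
```

With table = Schema("test_table", ["a"], ["b"]) and index = Schema("test_table_index", ["b"], ["c"]), calling analyze_select(Select(table, ["b"], ["c"]), [index]) returns a statement targeting the index and keeps its ORDER BY, whereas the same call with an empty index list raises InvalidArguments, mirroring the error in the original report.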

6 participants