Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use Calcites.getColumnTypeForRelDataType for SQL CAST operator conversion #13890

Merged
merged 3 commits into from
Mar 7, 2023

Conversation

clintropolis
Copy link
Member

Description

This PR modifies the SQL CAST operator conversion to use Calcites.getColumnTypeForRelDataType to convert calcite types to native Druid types instead of using its own custom SqlTypeName to ExprType mapping to make it more consistent with other SQL to Druid type conversions done for most other operators. This allows it to handle some additional casts which were not previously supported, such as those with ARRAY types. The test added to CalciteMultiValueStringQueryTest was not able to be planned prior to the changes to the CAST operator since during planning explicit casts would be added which would then fail since the mapping table was missing arrays.

Switching CAST to use Calcites.getColumnTypeForRelDataType also required making a small adjustment to this method to be able to handle SqlTypeName.DAY_INTERVAL_TYPES and SqlTypeName.YEAR_INTERVAL_TYPES as a Druid LONG type. This also allows some additional expressions to be planned that were not previously supported, shown in the changes to ExpressionsTest, GreatestExpressionTest, and LeastExpressionTest.


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@@ -1779,8 +1778,7 @@ public void testTimeMinusYearMonthInterval()
DruidExpression.functionCall("timestamp_shift"),
ImmutableList.of(
DruidExpression.ofColumn(ColumnType.LONG, "t"),
// RexNode type "interval year to month" is not reported as ColumnType.STRING
DruidExpression.ofLiteral(null, DruidExpression.stringLiteral("P13M")),
DruidExpression.ofLiteral(ColumnType.LONG, DruidExpression.stringLiteral("P13M")),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long type for string literal? Seems not right.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, so this was a result of merging the rules from cast into Calcites.getColumnTypeForRelDataType which had interval types as LONG, but agree it does look a bit strange so will fix to report these interval types as strings instead and see what that breaks

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, i see why it was treating the cast as LONG typed...

timestampDiff test for example starts spitting out explicit casts which are not necessary so I think i need to add a bit extra logic to cast to let intervals cast to longs without a cast ExpressionVirtualColumn{name='v0', expression='div(CAST(("__time" - 915148800000), 'LONG'),86400000)', outputType=LONG}

Copy link
Member Author

@clintropolis clintropolis Mar 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, updated to handle intervals as STRINGS so they match the argument literals in plans, but also handle them as longs for casting (and also reduction). I suppose instead I could just drop intervals from being handled by Calcites.getColumnTypeForRelDataType like it was before and just handle them in the cast, but string seems the most correct i think?

Copy link
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after the latest changes

@gianm gianm merged commit 3924f0e into apache:master Mar 7, 2023
@clintropolis clintropolis deleted the sql-cast-operator-refactor branch March 7, 2023 21:13
317brian pushed a commit to 317brian/druid that referenced this pull request Mar 10, 2023
…sion (apache#13890)

* use Calcites.getColumnTypeForRelDataType for SQL CAST operator conversion

* fix comment

* intervals are strings but also longs
@clintropolis clintropolis added this to the 26.0 milestone Apr 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants