Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Queries with date literals fail if date format does not includeepoch_millis #1847

Open
MaxKsyunz opened this issue Jul 11, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@MaxKsyunz
Copy link
Collaborator

MaxKsyunz commented Jul 11, 2023

What is the bug?
Timestamp literals are always converted to epoch_millis in OpenSearch DSL. This results in OpenSearch exception if the field does not support epoch_millis

How can one reproduce the bug?
Create an index with this mapping:

PUT /date-test
{
  "mappings": {
    "properties": {
      "dateField": {
        "type": "date",
        "format": "strict_date_optional_time"
      }
    }
  }
}

Add some data:

POST /date-test/_doc
{
  "dateField": "2000-03-04"
}

POST /date-test/_doc
{
  "dateField": "1999-01-02"
}

Execute SQL query:

POST /_plugins/sql
{
  "query": "select * from date-test where dateField = date \"2002-03-04\""
}

Results in OpenSearch error:

{
    "error": {
        "details": "Shard[0]: [date-test/Yd43agSsSlqDvbN9XM13tw] QueryShardException[failed to create query: failed to parse date field [1015200000000] with format [strict_date_optional_time]: [failed to parse date field [1015200000000] with format [strict_date_optional_time]]]; nested: OpenSearchParseException[failed to parse date field [1015200000000] with format [strict_date_optional_time]: [failed to parse date field [1015200000000] with format [strict_date_optional_time]]]; nested: IllegalArgumentException[failed to parse date field [1015200000000] with format [strict_date_optional_time]]; nested: DateTimeParseException[Text '1015200000000' could not be parsed at index 0];\n\nFor more details, please send request for Json format to see the raw response from OpenSearch engine.",
        "reason": "Error occurred in OpenSearch engine: all shards failed",
        "type": "SearchPhaseExecutionException"
    },
    "status": 400
}

What is the expected behavior?
Query work to return one row.

Additional context
The query executes successfully if format for dateField includes epoch_millis.

@MaxKsyunz MaxKsyunz added bug Something isn't working untriaged labels Jul 11, 2023
@Yury-Fridlyand
Copy link
Collaborator

2.5

AWS currently offers only 2.5 OpenSearch version. SQL plugin there interpret all date fields in TIMESTAMP SQL type. WHERE clause has a function which compares a TIMESTAMP with DATE provided by DATE literal, so cast_to_timestamp function is added to the right side of expression to provide values of the same type for comparison.
The filter expression was like

= (
  ReferenseExpression(
    dateField,
    TIMESTAMP),
  cast_to_timestamp(
    DateLiteral(
      "2002-03-04"
    )
  )
)

Hereby cast_to_timestamp function provides an ExprTimestampValue backed by Instant which is converted for epoch millis for query DSL. Epoch millis (a number) isn't comparable with any string, since query failed.

2.8

2.8 behaves the same way as 2.5 in that case. It has improvements for named formats, but custom formats are still not supported.

2.9

#1821 brings support of custom formats, so date field with format "yyyy-mm-dd" is interpreted as TIMESTAMP because there is a typo. mm (lowercase) stands for minutes, so this format refers date and time parts both. SQL plugin still adds cast_to_timestamp to the filter. Once format fixed to "yyyy-mm-dd" (MM in uppercase stands for months), SQL plugin detects that this field is in DATE format and nothing added to DSL (explain response):

{
  "from": 0,
  "size": 200,
  "timeout": "1m",
  "query": {
    "term": {
      "dateField": {
        "value": "2002-03-04",
        "boost": 1
      }
    }
  },
  "_source": {
    "includes": [
      "dateField"
    ],
    "excludes": []
  },
  "sort": [
    {
      "_doc": {
        "order": "asc"
      }
    }
  ]
}

@MaxKsyunz
Copy link
Collaborator Author

D'oh! Thank you for looking into this @Yury-Fridlyand.

While my example was problematic, users can still run into this bug if the field has a custom format that can include time. For example, with format = "strict_date_optional_time" users will get the same error. I'll update the ticket text.

@Yury-Fridlyand
Copy link
Collaborator

strict_date_optional_time is also a format for TIMESTAMP values, so cast_to_timestamp is added.
OpenSearchDateType (:opensearch module) has format information, but then it is simplified to ExprCoreType (:core module).
Analyzer (:core module) builds TypeEnvironment with simplified data types only, because they are used in DSL. ExprCoreType doesn't store type specific information like format.
Later, TermQuery (:opensearch module) builds a DSL query to OpenSearch using given formats. Unfortunately, it has no access to extended formats retrieved in the same module.

if (literal.type().equals(ExprCoreType.TIMESTAMP)) {
return literal.timestampValue().toEpochMilli();

Workaround:

  1. Add epoch_millis format to datetime fields.
  2. Use date formats for date fields and time for time.

Fix:
Update ExpType interface and make all complex types (all OpenSearchDataType, including OpenSearchDateType) are passed through :core module. Changes in FunctionDSL are also needed to use simplified types. Full types would become available in :opensearch when it is needed to build a DSL query.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants