[Doris On ES][Bug-fix] Solve the problem of time format processing #3941

Merged
merged 1 commit into from
Jun 28, 2020

Member

@wuyunfeng wuyunfeng commented Jun 24, 2020

#3936
Doris On ES can obtain a field value from either _source or docvalues:

  1. From _source, you get the original value exactly as it was indexed. During indexing, ES converts the value of a date field to a millisecond timestamp for docvalues.
  2. From docvalues, ES versions before 6.4 return the millisecond timestamp, while 6.4 and later return the formatted date value, for example 2020-06-18T12:10:30.000Z. ES (>= 6.4) also provides a format parameter for docvalue field requests; support for it in Doris On ES is coming soon.
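
As a rough illustration of the format parameter mentioned in item 2, the sketch below builds a search request body that asks ES (6.4 or later) to return the docvalue of a date field as epoch milliseconds. The helper name `build_docvalue_request` is hypothetical and is not part of Doris On ES; the field name `k1` comes from the example mapping in this PR.

```python
import json


def build_docvalue_request(field, fmt="epoch_millis"):
    """Build a search body that requests one docvalue field in a given format.

    Hypothetical helper for illustration; Doris On ES builds its own requests.
    """
    return {
        "query": {"match_all": {}},
        # ES >= 6.4 accepts an object with "field" and "format" here.
        "docvalue_fields": [{"field": field, "format": fmt}],
        # Skip _source because the value is read from docvalues instead.
        "_source": False,
    }


body = build_docvalue_request("k1")
print(json.dumps(body, indent=2))
```

Requesting `epoch_millis` would give the same millisecond value that ES versions before 6.4 return by default, so one code path could serve both version ranges.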

After this PR is merged, Doris On ES correctly handles only millisecond timestamps and string-formatted dates. If you provide a seconds timestamp, Doris On ES processes it incorrectly, because the value is divided by 1000 internally.
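
The behavior described above can be sketched in a few lines of Python. This is only an illustration, not the Doris On ES implementation: `to_date` is a hypothetical helper that treats every integer as a millisecond timestamp and tries a small, assumed set of string formats.

```python
from datetime import datetime, timezone


def to_date(value):
    """Convert an ES date value to a calendar date (UTC).

    Hypothetical helper: integers are treated as millisecond timestamps,
    strings are tried against a few known formats.
    """
    if isinstance(value, int):
        # Assumes milliseconds: divide by 1000 to get seconds.
        return datetime.fromtimestamp(value / 1000, tz=timezone.utc).date()
    for fmt in ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%d %H:%M:%S", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unsupported date value: {value!r}")


print(to_date(1592816393000))               # millisecond timestamp
print(to_date("2020-06-18T12:10:30.000Z"))  # formatted docvalue (ES >= 6.4)
print(to_date(1592816393))                  # seconds timestamp: lands in January 1970
```

The last call shows why a seconds timestamp goes wrong: dividing it by 1000 shrinks it to a moment a few weeks after the epoch.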

ES mapping:

{
   "timestamp_test": {
      "mappings": {
         "doc": {
            "properties": {
               "k1": {
                  "type": "date",
                  "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
               }
            }
         }
      }
   }
}
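
The mapping above is shown in the shape ES returns it, keyed by index name. As a minimal sketch of how the same index could be created, the snippet below derives a create-index request from it. It assumes an ES version that still uses mapping types (such as the 5.x and 6.x versions discussed here); `create_index_request` is a hypothetical helper, and sending the request is left to whatever HTTP client you use.

```python
import json

# Response-style mapping as shown above: keyed by index name.
mapping_response = {
    "timestamp_test": {
        "mappings": {
            "doc": {
                "properties": {
                    "k1": {
                        "type": "date",
                        "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis",
                    }
                }
            }
        }
    }
}


def create_index_request(response):
    """Return (method, path, body) for creating the single index in `response`."""
    index_name, definition = next(iter(response.items()))
    return "PUT", f"/{index_name}", {"mappings": definition["mappings"]}


method, path, body = create_index_request(mapping_response)
print(method, path)
print(json.dumps(body, indent=2))
```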

ES documents:

       {
            "_index": "timestamp_test",
            "_type": "doc",
            "_id": "AXLbzdJY516Vuc7SL51m",
            "_score": 1,
            "_source": {
               "k1": "2020-6-25"
            }
         },
         {
            "_index": "timestamp_test",
            "_type": "doc",
            "_id": "AXLbzddn516Vuc7SL51n",
            "_score": 1,
            "_source": {
               "k1": 1592816393000  ->  2020/6/22 16:59:53
            }
         }
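
The annotation on the second document can be checked with a short Python snippet. It assumes the annotated time was written in UTC+8, which the PR does not state; in UTC the same instant falls eight hours earlier on the same day.

```python
from datetime import datetime, timedelta, timezone

millis = 1592816393000
utc_time = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
# Assumption: the annotation was written in UTC+8.
local_time = utc_time.astimezone(timezone(timedelta(hours=8)))

print(utc_time.strftime("%Y-%m-%d %H:%M:%S"))    # 2020-06-22 08:59:53
print(local_time.strftime("%Y-%m-%d %H:%M:%S"))  # 2020-06-22 16:59:53
```

Either way the calendar date is 2020-06-22, which matches the query results shown in this PR.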

Doris Table:

CREATE EXTERNAL TABLE `timestamp_source` (
  `k1` date NULL COMMENT ""
) ENGINE=ELASTICSEARCH

enable_docvalue_scan = false

For ES 5.5:

mysql> select k1 from timestamp_source;
+------------+
| k1         |
+------------+
| 2020-06-25 |
| 2020-06-22 |
+------------+

For ES 6.5 or above:

mysql> select * from timestamp_source;
+------------+
| k1         |
+------------+
| 2020-06-25 |
| 2020-06-22 |
+------------+

enable_docvalue_scan = true

For ES 5.5:

mysql> select k1 from timestamp_dv; 
+------------+
| k1         |
+------------+
| 2020-06-25 |
| 2020-06-22 |
+------------+

For ES 6.5 or above:

mysql> select * from timestamp_dv; 
+------------+
| k1         |
+------------+
| 2020-06-25 |
| 2020-06-22 |
+------------+

@BabySid
Contributor

BabySid commented Jun 24, 2020

Could you add an example index mapping to this doc to show how to create an index?

Contributor

@morningman morningman left a comment

LGTM

@morningman morningman added approved Indicates a PR has been approved by one committer. area/doris-on-es Issues or PRs related to Doris on ElasticSearch kind/fix Categorizes issue or PR as related to a bug. labels Jun 26, 2020
@morningman morningman merged commit dc603de into apache:master Jun 28, 2020
morningman pushed a commit to morningman/doris that referenced this pull request Jun 28, 2020
…pache#3941)

@EmmyMiao87 EmmyMiao87 mentioned this pull request Aug 18, 2020