Add doc_values for fields that need to be sorted or aggregated in ElasticSearch, and disable all others. #12782

Merged
merged 14 commits into from
Nov 24, 2024

Conversation

@kezhenxu94
Member Author

Unlike what is suggested in #12741, I use @ElasticSearch.EnableDocValues so that the fields that need this feature opt in, because most fields won’t require it.
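
A minimal sketch of the opt-in idea, for illustration only. The `EnableDocValues` annotation below is a locally defined stand-in, not the real `@ElasticSearch.EnableDocValues`, and `AlarmRecordSketch` and `buildProperties` are hypothetical names; the actual mapping code in this PR may be structured differently.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

class DocValuesOptInSketch {
    /** Hypothetical stand-in for the annotation introduced by this PR. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface EnableDocValues {
    }

    /** Hypothetical model: only startTime is used for sorting. */
    static class AlarmRecordSketch {
        String name;
        @EnableDocValues
        long startTime;
    }

    /** Builds per-column mapping properties, disabling doc_values unless a field opts in. */
    static Map<String, Map<String, Object>> buildProperties(Class<?> model) {
        Map<String, Map<String, Object>> properties = new LinkedHashMap<>();
        for (Field field : model.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler- or tool-generated fields
            }
            Map<String, Object> column = new LinkedHashMap<>();
            // Simplified type mapping, assumed for this sketch only.
            column.put("type", field.getType() == long.class ? "long" : "keyword");
            if (!field.isAnnotationPresent(EnableDocValues.class)) {
                column.put("doc_values", false);
            }
            properties.put(field.getName(), column);
        }
        return properties;
    }

    public static void main(String[] args) {
        System.out.println(buildProperties(AlarmRecordSketch.class));
    }
}
```

With this shape, a newly added field gets `doc_values: false` unless someone deliberately annotates it, which matches the stated goal of opting in only where sorting or aggregation is needed.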

Comment on lines 338 to 340
if (!elasticSearchExtension.isDocValuesEnabled()) {
columnProperties.put("doc_values", false);
}
Member

@wu-sheng wu-sheng Nov 20, 2024

entity_id appears in all OAL-generated metrics; how do we add this annotation for it? I don't see any OAL engine-related changes.

@@ -78,6 +78,7 @@ public StorageID id() {
private String id0;
@Column(name = ID1, storageOnly = true)
private String id1;
@ElasticSearch.EnableDocValues

Member

Why does alarm start time need this?

Member Author

@kezhenxu94 kezhenxu94 Nov 21, 2024

Why does alarm start time need this?

All fields used for sorting or aggregation need this.

Member Author

@kezhenxu94 kezhenxu94 Nov 21, 2024

Why does alarm start time need this?

This reminds me of one potential issue here: if we restrict this to only the fields used in this repo, it might break third parties’ custom plugins if they add features to their own plugin that aggregate or sort on fields for which we didn’t enable doc_values. Extensibility will be restricted.
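
To make the concern concrete: Elasticsearch generally rejects a sort or aggregation on a keyword or numeric field whose doc_values are disabled, so a plugin would only find out at query time. A hedged sketch of a fail-fast guard follows; `SortableFieldGuardSketch`, `DOC_VALUES_FIELDS`, and `requireDocValues` are hypothetical names, and the index and field names are illustrative, not taken from the real storage model.

```java
import java.util.Map;
import java.util.Set;

class SortableFieldGuardSketch {
    // Hypothetical registry: index name -> fields that keep doc_values enabled.
    static final Map<String, Set<String>> DOC_VALUES_FIELDS = Map.of(
        "alarm_record", Set.of("start_time", "time_bucket")
    );

    /** Throws if the field cannot be used for sorting or aggregation. */
    static void requireDocValues(String index, String field) {
        Set<String> enabled = DOC_VALUES_FIELDS.getOrDefault(index, Set.of());
        if (!enabled.contains(field)) {
            throw new IllegalArgumentException(
                "Field '" + field + "' of index '" + index
                    + "' has doc_values disabled and cannot be sorted or aggregated");
        }
    }

    public static void main(String[] args) {
        requireDocValues("alarm_record", "start_time"); // allowed in this sketch
        try {
            requireDocValues("alarm_record", "name"); // not registered, so rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A check like this would not restore extensibility, but it would turn a late query failure into an early, descriptive error for plugin authors.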

Member

Yes, it will. Generally, it may be worth seeing how much benefit we get from this change. Could you run a benchmark for this?

Member

But to be honest, we have never guaranteed that users can read data from Elasticsearch on their own; we only guarantee access through our GraphQL/PromQL.

Member Author

@hanahmily can you share where you found that disabling doc_values can speed up query performance?

Member

No, he was talking about BanyanDB, and the concept was from this config.

Member

The key use case is the *_traffic indices and the trace/log indices. The condition fields there are not used for sorting or aggregation, so we could reduce the payload.

1. Value columns of metrics.
2. Conditions of logs and traces (SkyWalking and Zipkin), excluding latency and timestamp, which are used in sorting.
3. All searchable fields in metadata (*_traffic).

In the original issue, I only proposed these three use cases. I had nothing more in mind.

Member Author

The condition fields there are not used for sorting or aggregation, so we could reduce the payload.

What I’m wondering is: what “payload” are we trying to reduce? As mentioned, disabling doc_values mainly reduces disk space; I don’t see how it would speed up query performance as you said here #12782 (comment). If reducing disk space is our goal, disabling doc_values on all possible fields maximizes the outcome, which is why I tend to disable doc_values by default and opt in the fields that need this feature.

Member

Reducing disk space is a benefit, and the speed-up comes only from having smaller files/indices, which is what the BanyanDB side was asking about.
Sorry for the confusion.

@wu-sheng wu-sheng added the backend (OAP backend related) and enhancement (Enhancement on performance or codes) labels Nov 21, 2024
@wu-sheng
Member

Please note, we had better verify the impact on existing (last version) indices. Do Elasticsearch and our storage implementation support changing this config automatically on upgrade?

@kezhenxu94
Member Author

We actually support modifying index/template mappings. I just ran a test on the same ES server, starting from the previous commit (master branch) and then upgrading to this branch: the existing indices' mappings remain the same and the template mappings changed as expected, which will affect new indices created in the future.

Diff of sw_metrics_all template mapping

diff --git a/tmp/before-template.json b/tmp/after-template.json
index 603432b7f9..113f050e94 100644
--- a/tmp/before-template.json
+++ b/tmp/after-template.json
@@ -27,90 +27,110 @@
             },
             "properties": {
               "dest_service_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "agent_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "last_ping": {
-                "type": "long"
+                "type": "long",
+                "doc_values": false
               },
               "precision": {
                 "index": false,
-                "type": "integer"
+                "type": "integer",
+                "doc_values": false
               },
               "double_summation": {
                 "index": false,
-                "type": "double"
+                "type": "double",
+                "doc_values": false
               },
               "labels_json": {
                 "index": false,
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "tag_key": {
                 "type": "keyword"
               },
               "type": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "uuid": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "summation": {
                 "index": false,
-                "type": "long"
+                "type": "long",
+                "doc_values": false
               },
               "instance_traffic_name": {
                 "index": false,
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "percentage": {
                 "type": "integer"
               },
               "total_num": {
                 "index": false,
-                "type": "long"
+                "type": "long",
+                "doc_values": false
               },
               "time_bucket": {
                 "type": "long"
               },
               "service_layer": {
-                "type": "integer"
+                "type": "integer",
+                "doc_values": false
               },
               "component_id": {
                 "index": false,
                 "type": "integer"
               },
               "service_name": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "count": {
                 "index": false,
-                "type": "long"
+                "type": "long",
+                "doc_values": false
               },
               "entity_id": {
                 "type": "keyword"
               },
               "denominator": {
-                "type": "long"
+                "type": "long",
+                "doc_values": false
               },
               "numerator": {
-                "type": "long"
+                "type": "long",
+                "doc_values": false
               },
               "dest_process_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "start_time": {
                 "type": "long"
               },
               "related_service_layer": {
-                "type": "integer"
+                "type": "integer",
+                "doc_values": false
               },
               "instance_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "tag_value": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "ranks": {
                 "index": false,
@@ -118,58 +138,71 @@
               },
               "t_num": {
                 "index": false,
-                "type": "long"
+                "type": "long",
+                "doc_values": false
               },
               "service_traffic_name_match": {
                 "analyzer": "oap_analyzer",
                 "type": "text"
               },
               "related_service_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "name": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "service_traffic_name": {
                 "copy_to": "service_traffic_name_match",
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "short_name": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "s_num": {
                 "index": false,
-                "type": "long"
+                "type": "long",
+                "doc_values": false
               },
               "parameters": {
                 "index": false,
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "process_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "span_name": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "datatable_summation": {
                 "index": false,
                 "type": "text"
               },
               "detect_type": {
-                "type": "integer"
+                "type": "integer",
+                "doc_values": false
               },
               "tag_type": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "task_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "component_ids": {
                 "index": false,
                 "type": "keyword"
               },
               "layer": {
-                "type": "integer"
+                "type": "integer",
+                "doc_values": false
               },
               "int_value": {
                 "type": "integer"
@@ -178,122 +211,154 @@
                 "type": "keyword"
               },
               "remote_service_name": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "endpoint": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "attr0": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "total": {
                 "index": false,
-                "type": "long"
+                "type": "long",
+                "doc_values": false
               },
               "ebpf_profiling_schedule_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "endpoint_traffic_name_match": {
                 "analyzer": "oap_analyzer",
                 "type": "text"
               },
               "service_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "datatable_count": {
                 "index": false,
                 "type": "text"
               },
               "source_service_instance_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "service_instance": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "profiling_support_status": {
-                "type": "integer"
+                "type": "integer",
+                "doc_values": false
               },
               "value": {
                 "type": "long"
               },
               "source_service_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "address": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "datatable_value": {
                 "index": false,
                 "type": "text"
               },
               "represent_service_instance_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "dest_endpoint": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "represent_service_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "end_time": {
-                "type": "long"
+                "type": "long",
+                "doc_values": false
               },
               "match": {
                 "index": false,
-                "type": "long"
+                "type": "long",
+                "doc_values": false
               },
               "service_group": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "attr5": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "label": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "service_instance_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "related_instance_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "source_process_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "message": {
                 "index": false,
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "attr2": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "attr1": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "double_value": {
                 "type": "double"
               },
               "attr4": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "endpoint_traffic_name": {
                 "copy_to": "endpoint_traffic_name_match",
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "attr3": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "dest_service_instance_id": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "last_update_time_bucket": {
-                "type": "long"
+                "type": "long",
+                "doc_values": false
               },
               "source_endpoint": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "service": {
-                "type": "keyword"
+                "type": "keyword",
+                "doc_values": false
               },
               "dataset": {
                 "index": false,
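
A template diff like the one above can also be summarized programmatically. The following is a minimal sketch under stated assumptions: the before/after template properties are already parsed into maps, and `TemplateMappingDiffSketch`, `docValuesEnabled`, and `newlyDisabled` are hypothetical helpers, not part of the SkyWalking code base.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class TemplateMappingDiffSketch {
    /** A missing doc_values key means the Elasticsearch default applies (enabled for keyword and numeric types). */
    static boolean docValuesEnabled(Map<String, Object> column) {
        return !Boolean.FALSE.equals(column.get("doc_values"));
    }

    /** Returns the fields whose doc_values were enabled before and are disabled after. */
    static Set<String> newlyDisabled(Map<String, Map<String, Object>> before,
                                     Map<String, Map<String, Object>> after) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Map<String, Object>> entry : after.entrySet()) {
            Map<String, Object> old = before.get(entry.getKey());
            if (old != null && docValuesEnabled(old) && !docValuesEnabled(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two fields taken from the diff above: service_id changes, entity_id does not.
        Map<String, Map<String, Object>> before = new LinkedHashMap<>();
        before.put("service_id", Map.of("type", "keyword"));
        before.put("entity_id", Map.of("type", "keyword"));

        Map<String, Map<String, Object>> after = new LinkedHashMap<>();
        after.put("service_id", Map.of("type", "keyword", "doc_values", false));
        after.put("entity_id", Map.of("type", "keyword"));

        System.out.println(newlyDisabled(before, after)); // prints [service_id]
    }
}
```

Fields absent from the returned set (for example entity_id, time_bucket, and value in the diff) are the ones that keep doc_values and should therefore correspond to annotated columns.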

@wu-sheng
Member

OK, this seems good enough and should not break anything once we have added all the necessary annotations.

@kezhenxu94 kezhenxu94 marked this pull request as ready for review November 24, 2024 03:02
@wu-sheng wu-sheng added this to the 10.2.0 milestone Nov 24, 2024
@kezhenxu94 kezhenxu94 merged commit b832137 into apache:master Nov 24, 2024
168 checks passed
@kezhenxu94 kezhenxu94 deleted the docvalue branch November 24, 2024 04:17
Labels
backend OAP backend related. enhancement Enhancement on performance or codes
Development

Successfully merging this pull request may close these issues.

[Feature] Improve elasticsearch performance by using disable doc values
2 participants