[RAC][Rule Registry] BUG: Duplicated documents written after index bootstrapping #110499

Closed
Tracked by #101016
xcrzx opened this issue Aug 30, 2021 · 8 comments
Labels
bug, Team:Detection Alerts, Team:Detections and Resp, Team: SecuritySolution, Theme: rac

Comments

@xcrzx
Contributor

xcrzx commented Aug 30, 2021

Parent ticket: #101016

Summary

When I try to mass-write into uninitialized Rule Registry indices, some documents get duplicated.

Steps to reproduce

  1. Add a code snippet that writes 100 documents in parallel, e.g.:
diff --git a/x-pack/plugins/security_solution/server/plugin.ts b/x-pack/plugins/security_solution/server/plugin.ts
index 734ccc4d5ba..8e8023491c2 100644
--- a/x-pack/plugins/security_solution/server/plugin.ts
+++ b/x-pack/plugins/security_solution/server/plugin.ts
@@ -240,6 +240,19 @@ export class Plugin implements IPlugin<PluginSetup, PluginStart, SetupPlugins, S
         secondaryAlias: config.signalsIndex,
       });
 
+      Array.from({ length: 100 }, (e, i) =>
+        ruleDataClient.getWriter({ namespace: `testtt-${i}` }).bulk({
+          body: [
+            { index: {} },
+            {
+              '@timestamp': new Date(),
+              'kibana.alert.id': i,
+            },
+          ],
+          refresh: true,
+        })
+      );
+
       // Register rule types via rule-registry
       const createRuleOptions: CreateRuleOptions = {
         experimentalFeatures,
  2. Make sure indices and index templates do not exist yet. The issue seems to be reproducible only when namespace-level resources are uninitialized.
  3. Start Kibana

Expected behavior

After Kibana starts, 100 indices are created with one document in each.

Actual behavior

100 indices are created, but some of them contain two documents instead of one:

[Screenshot: 2021-08-30 at 16 24 25]

Example documents:

GET .internal.alerts-security.alerts-testtt-68-000001/_search

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : ".internal.alerts-security.alerts-testtt-68-000001",
        "_id" : "QTQil3sBVQ9RbCEMJdoR",
        "_score" : 1.0,
        "_source" : {
          "@timestamp" : "2021-08-30T12:55:16.869Z",
          "kibana.alert.id" : 68
        }
      },
      {
        "_index" : ".internal.alerts-security.alerts-testtt-68-000001",
        "_id" : "-jQhl3sBVQ9RbCEMr8jf",
        "_score" : 1.0,
        "_source" : {
          "@timestamp" : "2021-08-30T12:55:16.869Z",
          "kibana.alert.id" : 68
        }
      }
    ]
  }
}
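
To check many indices for the same symptom, something like the following could flag duplicates in a list of search hits. This is only a sketch: `findDuplicateHits` and the `Hit` type are hypothetical names, and it assumes two hits count as duplicates when they share `_index`, `kibana.alert.id`, and `@timestamp`.

```typescript
// Minimal shape of a search hit; only the fields used below (hypothetical type).
interface Hit {
  _index: string;
  _id: string;
  _source: { '@timestamp': string; 'kibana.alert.id': number };
}

// Groups hits by (index, alert id, timestamp) and returns the groups
// that contain more than one document.
function findDuplicateHits(hits: Hit[]): Hit[][] {
  const groups: Record<string, Hit[]> = {};
  for (const hit of hits) {
    const key = [
      hit._index,
      hit._source['kibana.alert.id'],
      hit._source['@timestamp'],
    ].join('|');
    (groups[key] = groups[key] || []).push(hit);
  }
  return Object.keys(groups)
    .map((key) => groups[key])
    .filter((group) => group.length > 1);
}

// The two hits from the search response above.
const exampleHits: Hit[] = [
  {
    _index: '.internal.alerts-security.alerts-testtt-68-000001',
    _id: 'QTQil3sBVQ9RbCEMJdoR',
    _source: { '@timestamp': '2021-08-30T12:55:16.869Z', 'kibana.alert.id': 68 },
  },
  {
    _index: '.internal.alerts-security.alerts-testtt-68-000001',
    _id: '-jQhl3sBVQ9RbCEMr8jf',
    _source: { '@timestamp': '2021-08-30T12:55:16.869Z', 'kibana.alert.id': 68 },
  },
];

const duplicateGroups = findDuplicateHits(exampleHits);
console.log(duplicateGroups.length); // 1
```

The two example hits differ only in `_id`, so they land in the same group.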

Relevant Kibana logs

[14:55:27.219] [debug][data][elasticsearch][query] 200
POST /.alerts-security.alerts-testtt-68/_bulk?refresh=true&require_alias=true
{"index":{}}
{"@timestamp":"2021-08-30T12:55:16.869Z","kibana.alert.id":68}

[14:55:27.219] [info][plugins][ruleRegistry] Installing namespace-level resources and creating concrete index for .alerts-security.alerts-testtt-68

[14:55:27.220] [debug][plugins][ruleRegistry] Checking if concrete write index exists for .alerts-security.alerts-testtt-68

[14:55:27.220] [debug][plugins][ruleRegistry] Fetching concrete indices for .internal.alerts-security.alerts-testtt-68-*

[14:55:27.631] [debug][data][elasticsearch][query] 404
GET /.internal.alerts-security.alerts-testtt-68-*/_alias/.alerts-security.alerts-testtt-68 [undefined]: {"error":"alias [.alerts-security.alerts-testtt-68] missing","status":404}

[14:55:27.631] [debug][plugins][ruleRegistry] Installing index template for .alerts-security.alerts-testtt-68

[14:55:27.631] [debug][plugins][ruleRegistry] Installing index template .alerts-security.alerts-testtt-68-index-template

[14:55:28.883] [debug][data][elasticsearch][query] 200
POST /_index_template/_simulate/.alerts-security.alerts-testtt-68-index-template
{"index_patterns":[".internal.alerts-security.alerts-testtt-68-*"],"composed_of":[".alerts-ecs-mappings",".alerts-security.alerts-mappings",".alerts-technical-mappings"],"template":{"settings":{"index.lifecycle":{"name":".alerts-ilm-policy","rollover_alias":".alerts-security.alerts-testtt-68"}},"mappings":{"_meta":{"kibana":{"version":"8.0.0"},"namespace":"testtt-68"}},"aliases":{".siem-signals-xcrzx-testtt-68":{"is_write_index":false}}},"_meta":{"kibana":{"version":"8.0.0"},"namespace":"testtt-68"},"priority":9}

[14:55:37.524] [debug][data][elasticsearch][query] 200
PUT /_index_template/.alerts-security.alerts-testtt-68-index-template
{"index_patterns":[".internal.alerts-security.alerts-testtt-68-*"],"composed_of":[".alerts-ecs-mappings",".alerts-security.alerts-mappings",".alerts-technical-mappings"],"template":{"settings":{"index.lifecycle":{"name":".alerts-ilm-policy","rollover_alias":".alerts-security.alerts-testtt-68"}},"mappings":{"_meta":{"kibana":{"version":"8.0.0"},"namespace":"testtt-68"}},"aliases":{".siem-signals-xcrzx-testtt-68":{"is_write_index":false}}},"_meta":{"kibana":{"version":"8.0.0"},"namespace":"testtt-68"},"priority":9}

[14:55:37.524] [debug][plugins][ruleRegistry] Creating concrete write index for .alerts-security.alerts-testtt-68

[14:56:12.182] [debug][data][elasticsearch][query] 400
PUT /.internal.alerts-security.alerts-testtt-68-000001
{"aliases":{".alerts-security.alerts-testtt-68":{"is_write_index":true}}} [resource_already_exists_exception]: index [.internal.alerts-security.alerts-testtt-68-000001/4KOx2_GdT9yKL_xAdYyxmg] already exists

[14:56:12.417] [debug][data][elasticsearch][query] 200
GET /.internal.alerts-security.alerts-testtt-68-000001

[14:56:45.719] [debug][data][elasticsearch][query] 200
POST /.alerts-security.alerts-testtt-68/_bulk?refresh=true&require_alias=true
{"index":{}}
{"@timestamp":"2021-08-30T12:55:16.869Z","kibana.alert.id":68}
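
The logs show the same bulk body being sent twice with `{"index":{}}`, so Elasticsearch auto-generates a different `_id` each time and stores two documents. One possible mitigation, shown here only as a sketch and not as the fix adopted in this ticket, is to derive a stable `_id` and use the bulk `create` action, so that a repeated request targets the same document and is rejected with a version conflict instead of writing a second copy. `deterministicId` and `buildIdempotentBulkBody` are hypothetical helpers, and the id scheme (namespace, alert id, timestamp) is an assumption.

```typescript
// Document shape taken from the reproduction snippet.
interface AlertDoc {
  '@timestamp': string;
  'kibana.alert.id': number;
}

// One line of a bulk request body: either an action or a document.
type BulkLine = { create: { _id: string } } | AlertDoc;

// Hypothetical helper: the same alert always maps to the same document id.
function deterministicId(namespace: string, doc: AlertDoc): string {
  return `${namespace}:${doc['kibana.alert.id']}:${doc['@timestamp']}`;
}

// Builds a bulk body that uses `create` with an explicit `_id`
// instead of `index` with an auto-generated one.
function buildIdempotentBulkBody(namespace: string, docs: AlertDoc[]): BulkLine[] {
  const body: BulkLine[] = [];
  for (const doc of docs) {
    body.push({ create: { _id: deterministicId(namespace, doc) } });
    body.push(doc);
  }
  return body;
}

const exampleDoc: AlertDoc = {
  '@timestamp': '2021-08-30T12:55:16.869Z',
  'kibana.alert.id': 68,
};
const exampleBody = buildIdempotentBulkBody('testtt-68', [exampleDoc]);
console.log(JSON.stringify(exampleBody[0]));
```

With this approach the caller would have to treat a per-item 409 on the repeated request as "already written" rather than as a failure.
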
@xcrzx added the bug, Team:Detections and Resp, Team: SecuritySolution, Theme: rac, and v7.16.0 labels Aug 30, 2021
@elasticmachine
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

@elasticmachine
Contributor

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@banderror changed the title from "[RAC] [Rule Registry] Duplicated documents written after index bootstrapping" to "[RAC][Rule Registry] BUG: Duplicated documents written after index bootstrapping" Sep 1, 2021
@banderror removed the v7.16.0 label Sep 1, 2021
@XavierM
Contributor

XavierM commented Sep 3, 2021

@oatkiller
Contributor

@banderror @peluja1012 Is this still required for 7.16?

@jasonrhodes
Member

@weltenwort + @Kerry350 can we confirm whether or not this is a critical bug, given how we've designed mappings updates for now?

@weltenwort
Member

I wasn't able to reproduce it with the changes coming as part of #113389.

What I did see, though, is that it can take almost 30 seconds to update about 200 indices. The lazy nature of that operation might mitigate the performance impact somewhat, but if several alerts start executing in short succession on startup it will produce some load.

@Kerry350 on a related note, during my experiments I removed the retry around the namespace-level init and didn't come across any failures that the retry might have fixed.
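
For context on what "lazy" namespace-level initialization can look like, here is a minimal sketch that shares one initialization promise per namespace, so concurrent writers wait on the same bootstrap instead of each starting their own. It is an illustration under assumptions, not the code from #113389: `ensureNamespaceInitialized`, `initPromises`, and the `install` callback are hypothetical names.

```typescript
// Hypothetical cache: one in-flight (or completed) initialization per namespace.
const initPromises = new Map<string, Promise<void>>();

// Runs `install` at most once per namespace while an attempt is pending or
// has succeeded; every caller for that namespace gets the same promise.
function ensureNamespaceInitialized(
  namespace: string,
  install: (namespace: string) => Promise<void>
): Promise<void> {
  let pending = initPromises.get(namespace);
  if (pending === undefined) {
    pending = install(namespace).catch((err) => {
      // Forget the failed attempt so a later write can try again.
      initPromises.delete(namespace);
      throw err;
    });
    initPromises.set(namespace, pending);
  }
  return pending;
}

// Usage: three concurrent writers, two distinct namespaces.
let installCalls = 0;
const fakeInstall = async (_namespace: string): Promise<void> => {
  installCalls += 1; // stands in for installing templates and the write index
};

const writerA = ensureNamespaceInitialized('testtt-68', fakeInstall);
const writerB = ensureNamespaceInitialized('testtt-68', fakeInstall);
const writerC = ensureNamespaceInitialized('testtt-69', fakeInstall);
Promise.all([writerA, writerB, writerC]).then(() => console.log(installCalls));
```

Dropping the cached promise on failure is one way to get retry behaviour without a dedicated retry loop; whether that is desirable here is a design question for the owners.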

@banderror added the Team:Detection Alerts label Oct 11, 2021
@banderror
Contributor

Hey everyone, I removed this ticket from the backlog of the Detection Rules area. We (@elastic/security-detections-response-rules) are not the owners anymore (however feel free to still ping us if you have any tech questions about the ticket).

Ownership of this ticket and other tickets related to rule_registry (like #101016) now goes to the Detection Alerts area (Team:Detection Alerts label). Please ping @peluja1012 and @marshallmain if you have any questions.

@jasonrhodes
Member

Thanks for the update @banderror!

As we can't currently reproduce this issue, I'm going to close it as fixed. Let's reopen if someone has repro steps that do in fact show the issue still exists.
