[Fleet] 'conflict' creating index-pattern SO when installing package during 8.0.0 to 8.1.0 upgrade (docker only?) #126611

Closed
hop-dev opened this issue Mar 1, 2022 · 9 comments · Fixed by #126900
Assignees
Labels
bug Fixes for quality problems that affect the customer experience Team:Fleet Team label for Observability Data Collection Fleet team

Comments

@hop-dev
Contributor

hop-dev commented Mar 1, 2022

Linked to #126113.

We have had two occurrences reported to the Fleet team where a conflict error is returned when creating saved objects during package installation, e.g. for kubernetes:

[2022-03-01T10:17:03.798+00:00][WARN ][plugins.fleet] Failure to install package [kubernetes]: [Error: Encountered 2 errors creating saved objects: [{"type":"index-pattern","id":"logs-*","error":{"type":"conflict"}},{"type":"index-pattern","id":"metrics-*","error":{"type":"conflict"}}]]

In Fleet, we use the import method with overwrite: true to create our saved objects during package install here.

Here is a summary of the 2 occurrences of this error that have been reported to us:

1. ECK (originally reported on internal Slack here)

"Another ECK e2e test failure: when upgrading from 8.0.0 to 8.1.0-SNAPSHOT Kibana fails to install the kubernetes package claiming there is a conflict in index pattern."

View Full Error Stack
[2022-03-01T10:17:03.798+00:00][WARN ][plugins.fleet] Failure to install package [kubernetes]: [Error: Encountered 2 errors creating saved objects: [{"type":"index-pattern","id":"logs-*","error":{"type":"conflict"}},{"type":"index-pattern","id":"metrics-*","error":{"type":"conflict"}}]]
[2022-03-01T10:17:03.799+00:00][ERROR][plugins.fleet] uninstalling kubernetes-0.14.0 after error installing: [Error: Encountered 2 errors creating saved objects: [{"type":"index-pattern","id":"logs-*","error":{"type":"conflict"}},{"type":"index-pattern","id":"metrics-*","error":{"type":"conflict"}}]]
[2022-03-01T10:17:05.989+00:00][ERROR][plugins.fleet] failed to uninstall or rollback package after installation error Error: Saved object [epm-packages/kubernetes] not found
[2022-03-01T10:17:23.555+00:00][WARN ][plugins.fleet] Failed installing package [kubernetes] due to error: [Error: Encountered 2 errors creating saved objects: [{"type":"index-pattern","id":"logs-*","error":{"type":"conflict"}},{"type":"index-pattern","id":"metrics-*","error":{"type":"conflict"}}]]
[2022-03-01T10:17:26.589+00:00][ERROR][plugins.fleet] Error: [Elastic Agent on ECK policy] could not be added. [kubernetes] could not be installed due to error: [Error: Encountered 2 errors creating saved objects: [{"type":"index-pattern","id":"logs-*","error":{"type":"conflict"}},{"type":"index-pattern","id":"metrics-*","error":{"type":"conflict"}}]]
    at /usr/share/kibana/x-pack/plugins/fleet/server/services/preconfiguration.js:293:19
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Promise.all (index 1)
    at ensurePreconfiguredPackagesAndPolicies (/usr/share/kibana/x-pack/plugins/fleet/server/services/preconfiguration.js:275:40)
    at createSetupSideEffects (/usr/share/kibana/x-pack/plugins/fleet/server/services/setup.js:88:7)
    at awaitIfPending (/usr/share/kibana/x-pack/plugins/fleet/server/services/setup_utils.js:38:20)
    at /usr/share/kibana/x-pack/plugins/fleet/server/plugin.js:243:9
[2022-03-01T10:17:26.591+00:00][ERROR][plugins.fleet] Error: [Elastic Agent on ECK policy] could not be added. [kubernetes] could not be installed due to error: [Error: Encountered 2 errors creating saved objects: [{"type":"index-pattern","id":"logs-*","error":{"type":"conflict"}},{"type":"index-pattern","id":"metrics-*","error":{"type":"conflict"}}]]
    at /usr/share/kibana/x-pack/plugins/fleet/server/services/preconfiguration.js:293:19
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Promise.all (index 1)
    at ensurePreconfiguredPackagesAndPolicies (/usr/share/kibana/x-pack/plugins/fleet/server/services/preconfiguration.js:275:40)
    at createSetupSideEffects (/usr/share/kibana/x-pack/plugins/fleet/server/services/setup.js:88:7)
    at awaitIfPending (/usr/share/kibana/x-pack/plugins/fleet/server/services/setup_utils.js:38:20)
    at /usr/share/kibana/x-pack/plugins/fleet/server/plugin.js:243:9
[2022-03-01T10:17:26.592+00:00][ERROR][plugins.fleet] Error: [Elastic Agent on ECK policy] could not be added. [kubernetes] could not be installed due to error: [Error: Encountered 2 errors creating saved objects: [{"type":"index-pattern","id":"logs-*","error":{"type":"conflict"}},{"type":"index-pattern","id":"metrics-*","error":{"type":"conflict"}}]]
    at /usr/share/kibana/x-pack/plugins/fleet/server/services/preconfiguration.js:293:19
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Promise.all (index 1)
    at ensurePreconfiguredPackagesAndPolicies (/usr/share/kibana/x-pack/plugins/fleet/server/services/preconfiguration.js:275:40)
    at createSetupSideEffects (/usr/share/kibana/x-pack/plugins/fleet/server/services/setup.js:88:7)
    at awaitIfPending (/usr/share/kibana/x-pack/plugins/fleet/server/services/setup_utils.js:38:20)
    at /usr/share/kibana/x-pack/plugins/fleet/server/plugin.js:243:9
[2022-03-01T10:17:26.592+00:00][WARN ][plugins.fleet] Fleet setup failed
[2022-03-01T10:17:26.592+00:00][WARN ][plugins.fleet] Error: [Elastic Agent on ECK policy] could not be added. [kubernetes] could not be installed due to error: [Error: Encountered 2 errors creating saved objects: [{"type":"index-pattern","id":"logs-*","error":{"type":"conflict"}},{"type":"index-pattern","id":"metrics-*","error":{"type":"conflict"}}]]
    at /usr/share/kibana/x-pack/plugins/fleet/server/services/preconfiguration.js:293:19
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Promise.all (index 1)
    at ensurePreconfiguredPackagesAndPolicies (/usr/share/kibana/x-pack/plugins/fleet/server/services/preconfiguration.js:275:40)
    at createSetupSideEffects (/usr/share/kibana/x-pack/plugins/fleet/server/services/setup.js:88:7)
    at awaitIfPending (/usr/share/kibana/x-pack/plugins/fleet/server/services/setup_utils.js:38:20)
    at /usr/share/kibana/x-pack/plugins/fleet/server/plugin.js:243:9
[2022-03-01T10:17:26.610+00:00][INFO ][plugins.securitySolution] Dependent plugin setup complete - Starting ManifestTask

2. Docker (#126113)
On-prem Docker installation upgrading from 7.17.0 to 8.0.0; see the issue for more detail.

[2022-02-21T20:17:58.530+00:00][WARN ][plugins.fleet] Failure to install package [docker]: [Error: Encountered 2 errors creating saved objects: [{"type":"index-pattern","id":"logs-*","error":{"type":"conflict"}},{"type":"index-pattern","id":"metrics-*","error":{"type":"conflict"}}]]
[2022-02-21T20:17:58.535+00:00][ERROR][plugins.fleet] uninstalling docker-1.0.0 after error installing: [Error: Encountered 2 errors creating saved objects: [{"type":"index-pattern","id":"logs-*","error":{"type":"conflict"}},{"type":"index-pattern","id":"metrics-*","error":{"type":"conflict"}}]]
[2022-02-21T20:17:58.537+00:00][WARN ][plugins.fleet] Failure to install package [linux]: [Error: Encountered 2 errors creating saved objects: [{"type":"index-pattern","id":"logs-*","error":{"type":"conflict"}},{"type":"index-pattern","id":"metrics-*","error":{"type":"conflict"}}]]
[2022-02-21T20:17:58.537+00:00][ERROR][plugins.fleet] uninstalling linux-0.4.1 after error installing: [Error: Encountered 2 errors creating saved objects: [{"type":"index-pattern","id":"logs-*","error":{"type":"conflict"}},{"type":"index-pattern","id":"metrics-*","error":{"type":"conflict"}}]]
[2022-02-21T20:17:58.554+00:00][WARN ][plugins.fleet] Failure to install package [apache]: [Error: Encountered 2 errors creating saved objects: [{"type":"index-pattern","id":"logs-*","error":{"type":"conflict"}},{"type":"index-pattern","id":"metrics-*","error":{"type":"conflict"}}]]
[2022-02-21T20:17:58.555+00:00][ERROR][plugins.fleet] uninstalling apache-1.3.2 after error installing: [Error: Encountered 2 errors creating saved objects: [{"type":"index-pattern","id":"logs-*","error":{"type":"conflict"}},{"type":"index-pattern","id":"metrics-*","error":{"type":"conflict"}}]]
@botelastic botelastic bot added the needs-team Issues missing a team label label Mar 1, 2022
@hop-dev hop-dev changed the title [Fleet] 'conflict' creating index-pattern SO when installing package in 8.0.0 to 8.1.0-SNAPSHOT upgrade (docker only?) [Fleet] 'conflict' creating index-pattern SO when installing package during 8.0.0 to 8.1.0 upgrade (docker only?) Mar 1, 2022
@hop-dev hop-dev added bug Fixes for quality problems that affect the customer experience Team:Fleet Team label for Observability Data Collection Fleet team labels Mar 1, 2022
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@botelastic botelastic bot removed the needs-team Issues missing a team label label Mar 1, 2022
@joshdover
Contributor

Some interesting facts about the second case (upgrade from 7.17.0 to 8.0.0):

  • No Spaces except the default space existed
  • There were no legacy-url-alias objects present after the upgrade
  • Upgrading these packages manually afterward succeeded without any problems

@elastic/kibana-security could there be other cases that explain why there were these {"type":"index-pattern","id":"logs-*","error":{"type":"conflict"}} errors even if there shouldn't have been any alias objects?

@jportner
Contributor

jportner commented Mar 2, 2022

@elastic/kibana-security could there be other cases that explain why there were these {"type":"index-pattern","id":"logs-*","error":{"type":"conflict"}} errors even if there shouldn't have been any alias objects?

Aliases wouldn't cause regular conflict errors like this, but it's good to know regardless that only the Default space is being used and there were no aliases present.
I'm struggling to see how these regular conflict errors could occur when using overwrite: true.

I did a brief review of the importSavedObjects and bulkCreate implementations and nothing jumped out at me.
I also haven't heard of any other consumers complaining about unexpected conflict errors.
If we can manage to reproduce this problem with ES query logs enabled, that would give us more information to troubleshoot.

@hop-dev
Contributor Author

hop-dev commented Mar 2, 2022

@jportner Just a bit more detail that may or may not be useful: I've debugged locally, and the conflict errors are coming from the create call here, with overwrite set to true: https://github.com/elastic/kibana/blob/main/src/core/server/saved_objects/import/import_saved_objects.ts#L145

@hop-dev
Contributor Author

hop-dev commented Mar 2, 2022

I also saw the conflicts being correctly detected and then discarded in the checkConflicts method.

@jportner
Contributor

jportner commented Mar 2, 2022

Thanks for that detail, so it seems like it's not a bug inside importSavedObjects. Yeah, in that case it would be really helpful to see ES query logs to better understand what is going on.

@joshdover
Contributor

I'm finding this line a bit curious in the repository bulk create method:

...(overwrite && versionProperties),

For context, the versionProperties object contains the if_seq_no and if_primary_term properties used for optimistic concurrency control. Without stepping through the code, it's hard to determine where we're getting those, since the import code explicitly removes the version info before calling bulkCreate. My guess is that it's being retrieved from the existing document in this logic, but I'm not familiar with how this preflight logic works:

versionProperties = getExpectedVersionProperties(version, existingDocument);

If my reading of this is right, it doesn't appear that bulkCreate currently exposes a way to disable optimistic concurrency control for overwriting docs.
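To illustrate the suspected race, here is a minimal sketch in TypeScript. It is not Kibana or Elasticsearch code: FakeIndex, preflight, and index are hypothetical stand-ins for an index that honours if_seq_no and if_primary_term. It assumes two writers both read the existing document's version before either of them overwrites it.

```typescript
// Hypothetical in-memory stand-in for an Elasticsearch index; not Kibana code.
interface StoredDoc {
  source: Record<string, unknown>;
  seqNo: number;
  primaryTerm: number;
}

interface VersionProperties {
  if_seq_no: number;
  if_primary_term: number;
}

type WriteResult = { status: 200 | 201 } | { status: 409; error: { type: 'conflict' } };

class FakeIndex {
  private docs = new Map<string, StoredDoc>();
  private nextSeqNo = 0;

  // Preflight read: returns the version of the existing document, if any.
  preflight(id: string): VersionProperties | undefined {
    const doc = this.docs.get(id);
    return doc ? { if_seq_no: doc.seqNo, if_primary_term: doc.primaryTerm } : undefined;
  }

  // Overwrite that enforces the expected version only when one is supplied.
  index(id: string, source: Record<string, unknown>, version?: VersionProperties): WriteResult {
    const existing = this.docs.get(id);
    if (
      version &&
      (!existing ||
        existing.seqNo !== version.if_seq_no ||
        existing.primaryTerm !== version.if_primary_term)
    ) {
      return { status: 409, error: { type: 'conflict' } };
    }
    this.docs.set(id, { source, seqNo: this.nextSeqNo++, primaryTerm: 1 });
    return { status: existing ? 200 : 201 };
  }
}

const fakeIndex = new FakeIndex();
// The index pattern already exists before the upgrade.
fakeIndex.index('logs-*', { title: 'logs-*' });

// Two writers both run their preflight check before either one writes.
const versionSeenByA = fakeIndex.preflight('logs-*');
const versionSeenByB = fakeIndex.preflight('logs-*');

// A overwrites first and bumps the sequence number; B's expected version is now stale.
const resultA = fakeIndex.index('logs-*', { title: 'logs-*' }, versionSeenByA);
const resultB = fakeIndex.index('logs-*', { title: 'logs-*' }, versionSeenByB);

// Without version properties, the same overwrite is accepted unconditionally.
const resultNoOcc = fakeIndex.index('logs-*', { title: 'logs-*' });

console.log(resultA.status, resultB.status, resultNoOcc.status);
```

In this model, the second writer gets a conflict even though both callers intended a plain overwrite, which matches the shape of the errors in the logs above. Whether this is what actually happens in the reported upgrades is still only a hypothesis at this point in the thread.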

@jportner
Contributor

jportner commented Mar 3, 2022

My guess is that it's being retrieved from the existing document in this logic, but I'm not familiar with how this preflight logic works:

versionProperties = getExpectedVersionProperties(version, existingDocument);

If my reading of this is right, it doesn't appear that bulkCreate currently exposes a way to disable optimistic concurrency control for overwriting docs.

Ah, good find, and sorry this didn't occur to me earlier. It looks like this would indeed result in a 409 error for this object.

There is a related issue for the SOC.update API that was opened a week ago: #126240
However, we determined that the consumer who was having problems wasn't running into this theoretical issue.
There was also a Slack discussion on the topic that I replied to, I'll DM you the link to that.

TL;DR we defensively added OCC here to make sure that a saved object didn't change spaces between the point where we 1. conducted the preflight check and 2. overwrote the object.
I'm a bit surprised you'd be running into this during import, are you importing the same index-pattern multiple times in quick succession? And why would it only be happening in Docker?

At any rate, if we determine this is the actual root cause and removing the preflight OCC solves the issue, we can do that. I'm not sure we get much practical benefit from this OCC, it just seemed like a good idea at the time.

@joshdover
Contributor

Makes sense, thanks for the explanation.

I'm a bit surprised you'd be running into this during import, are you importing the same index-pattern multiple times in quick succession? And why would it only be happening in Docker?

Yeah, it's because we're trying to install several packages in parallel, so they may try to import the same index pattern. That said, Mark and I just discussed this, and we think we can safely ignore this error if another package install created the index pattern and just move on. We'll experiment with that before considering removing the OCC from this.
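A rough sketch of the "ignore and move on" idea, in TypeScript. This is only an illustration under assumptions: the ImportError shape is copied from the error entries in the logs above, and partitionImportErrors is a hypothetical helper, not a Fleet function and not necessarily what the linked fix does.

```typescript
// Shape of one entry in the "Encountered 2 errors creating saved objects" log message.
interface ImportError {
  type: string;
  id: string;
  error: { type: string };
}

// Hypothetical helper: separate index-pattern conflicts (assumed to mean another
// parallel package install already wrote the object) from errors that should
// still fail the install.
function partitionImportErrors(errors: ImportError[]): {
  ignorable: ImportError[];
  fatal: ImportError[];
} {
  const ignorable: ImportError[] = [];
  const fatal: ImportError[] = [];
  for (const e of errors) {
    const isIndexPatternConflict = e.type === 'index-pattern' && e.error.type === 'conflict';
    if (isIndexPatternConflict) {
      ignorable.push(e);
    } else {
      fatal.push(e);
    }
  }
  return { ignorable, fatal };
}

// The two errors reported for the kubernetes package install.
const errorsFromLog: ImportError[] = [
  { type: 'index-pattern', id: 'logs-*', error: { type: 'conflict' } },
  { type: 'index-pattern', id: 'metrics-*', error: { type: 'conflict' } },
];

const { ignorable, fatal } = partitionImportErrors(errorsFromLog);
console.log(ignorable.length, fatal.length);
```

With the errors from the logs, nothing ends up in fatal, so the install could proceed. The trade-off is that a conflict on an index pattern is then always treated as benign, which only holds if the concurrent writer stored equivalent content.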

thbkrkr added a commit to elastic/cloud-on-k8s that referenced this issue Jun 27, 2023
Move version checks to agent builder to skip tests due to the following stack bugs:
- Kibana bug "index conflict on install policy" (elastic/kibana#126611)
- Elastic agent bug "deadlock on startup" (#6331 (comment))