Fail to bootstrap Fleet Server with self-signed certificates in 7.17 #3435
Comments
@jsoriano Looks like it happened due to b4de6f5, as stated by @cmacknz here - https://github.com/elastic/dev/issues/2547#issuecomment-2040631630 I'll add this to one of our upcoming sprints.
Hi @michel-laterman 👋 A customer is experiencing this issue after upgrading from 7.17.18 to 7.17.20 (https://github.com/elastic/sdh-beats/issues/4608) - could you confirm whether they should roll back until elastic/beats#38785 is merged? Edit: thank you @jlind23 for https://github.com/elastic/sdh-beats/issues/4608#issuecomment-2056255255
https://github.com/elastic/ingest-docs/tree/main/docs/en/ingest-management/release-notes
Never mind, the release note already exists (elastic/ingest-docs#1006); I was looking at the main branch for docs on 7.17.
It does not look like elastic/beats#38785 by itself will resolve the issue. I've tried to see if it's an issue with tlscommon by updating the module in beats for the 7.17 branch with whatever is in the elastic-agent-libs repo and building an elastic-agent, and altering my fleet-server to build with it as well (branch for beats here). I've altered my fleet-server to also report its config when the bootstrap message is updated so we can see it on the command line.
I'll investigate more tomorrow.
I can recreate the issue by updating the fleet-server beats import to I think the
Confirmed this has happened in ESS in both Fleet and APM, with downgrade of Fleet or APM (only) to
@cmacknz can this be handled with the highest priority, as it is breaking all setups with self-signed certificates and also negatively impacts apm-servers managed by EA/Fleet Server?
There is a PR to fix this now: #3473 Following up
Reopening so we can confirm the release will work.
Are there 7.17 snapshot containers to test this change with?
There should be 7.17.21-SNAPSHOT containers; I was able to create a cloud deployment for this version #3435 (comment)
Closing this as fixed now. There will be an accelerated 7.17.21 release to fix this soon.
Ah ok, I tried but I could confirm though that
Great, thanks!
This seems wrong, possibly something has been lost in the Buildkite migration. Let me follow up with eng prod.
Fleet Server has started to fail to bootstrap with self-signed certificates in the 7.17 branch.
It initially did not appear to affect any released version, but it affects 7.17.20. It fails with:
The very same certificates work with 7.17.19 and any other version tested, including 8.14.0-SNAPSHOT.
The only significant change since 7.17.19 is #3391, which updates beats dependency from 7.11.2 to 7.17.18.
One change included in Beats between these versions completely rewrites certificate validation: https://github.com/elastic/beats/pull/22495/files#diff-6225b0191feaa98542380b95e652c6f1f6805ee9a141330788011681cce0e487R153
There is a change in elastic-agent that solves an issue with certificate validation when bootstrapping, which was not backported to 7.x. Not sure if it could be related: https://github.com/elastic/elastic-agent/pull/1867/files#diff-7efa04d4079650519d21a6e8ba217a7874130cecf8adc92dc2a78d4cfd2aee09R556
For confirmed bugs, please report:
elastic-package stack up -v -d --version 7.17-SNAPSHOT. That under the hood: [...] /etc/ssl/certs).