Remove forcing localhost for insecure bootstrap #2198

Closed

Conversation

michel-laterman
Contributor

@michel-laterman michel-laterman commented Jan 26, 2023

What does this PR do?

Remove the forced use of localhost as the host when bootstrapping an insecure fleet-server instance.

Why is it important?

The bootstrapping process on 8.6.0+ forces a bind to localhost, which prevents other agents from enrolling with the server.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool

How to test this PR locally

  1. Start ES + Kibana, specify xpack.fleet.agents.fleet_server.hosts: [ http://HOSTNAME:8220 ] in the kibana config
  2. Enroll a fleet-server agent with the command that Kibana provides and append the --fleet-server-insecure-http flag
  3. After enrollment, run sudo lsof | grep 8220 and verify that something like TCP *:8220 (LISTEN) appears in the output. The Elastic Agent logs will also contain:
{"log.level":"info","@timestamp":"2023-01-26T20:58:42.791Z","message":"server listening","component":{"binary":"fleet-server","dataset":"elastic_agent.fleet_server","id":"fleet-server-default","type":"fleet-server"},"log":{"source":"fleet-server-default"},"ecs.version":"1.6.0","service.name":"fleet-server","bind":"0.0.0.0:8220","rdTimeout":60000,"wrTimeout":600000,"ecs.version":"1.6.0"}

You can also test with a docker container:

docker run --name elastic-agent --rm --env FLEET_SERVER_ENABLE=1 --env FLEET_SERVER_ELASTICSEARCH_HOST=$ELASTICSEARCH_HOSTS --env FLEET_SERVER_INSECURE_HTTP=true --env FLEET_SERVER_POLICY_ID=fleet-server-policy --env FLEET_SERVER_SERVICE_TOKEN=$SERVICE_TOKEN  docker.elastic.co/beats/elastic-agent-complete:8.7.0-SNAPSHOT container

The same log line will appear.
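
The check on the log line can also be scripted. The following is a minimal sketch, assuming only that the log entry is JSON with a "bind" field in host:port form as in the line above; the helper name bindIsLoopbackOnly is hypothetical and not part of elastic-agent or fleet-server.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// bindIsLoopbackOnly reports whether the "bind" field of a "server listening"
// log line points at a loopback-only address (hypothetical helper).
func bindIsLoopbackOnly(logLine string) (bool, error) {
	var entry struct {
		Bind string `json:"bind"`
	}
	if err := json.Unmarshal([]byte(logLine), &entry); err != nil {
		return false, fmt.Errorf("parse log line: %w", err)
	}
	host, _, err := net.SplitHostPort(entry.Bind)
	if err != nil {
		return false, fmt.Errorf("parse bind address %q: %w", entry.Bind, err)
	}
	if host == "localhost" {
		return true, nil
	}
	// 0.0.0.0 (all interfaces) is not a loopback address, so it yields false.
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback(), nil
}

func main() {
	// Trimmed-down version of the log line shown above.
	line := `{"message":"server listening","bind":"0.0.0.0:8220"}`
	loopback, err := bindIsLoopbackOnly(line)
	if err != nil {
		panic(err)
	}
	fmt.Println("loopback only:", loopback)
}
```

A result of false for the bind address means the server is reachable from other hosts, which is what this PR is meant to allow.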

Related issues

@michel-laterman michel-laterman added the bug, Team:Elastic-Agent-Control-Plane, Team:Fleet, and backport-v8.6.0 labels Jan 26, 2023
@michel-laterman michel-laterman requested a review from a team as a code owner January 26, 2023 21:15
@michel-laterman michel-laterman requested review from michalpristas and removed request for a team January 26, 2023 21:15
@elasticmachine
Contributor

elasticmachine commented Jan 26, 2023

💚 Build Succeeded

Build stats

  • Start Time: 2023-01-30T23:09:49.485+0000

  • Duration: 17 min 14 sec

Test stats 🧪

Test Results
Failed 0
Passed 4861
Skipped 13
Total 4874

💚 Flaky test report

Tests succeeded.

@elasticmachine
Contributor

elasticmachine commented Jan 26, 2023

🌐 Coverage report

Name           Metrics % (covered/total)   Diff
Packages       98.305% (58/59)             👍
Files          69.268% (142/205)           👍
Classes        69.231% (270/390)           👍
Methods        54.112% (829/1532)          👍 0.065
Lines          39.365% (9065/23028)        👍 0.127
Conditionals   100.0% (0/0)                💚

@@ -395,10 +395,6 @@ func (c *enrollCmd) prepareFleetTLS() error {
}
if c.options.FleetServer.Cert == "" && c.options.FleetServer.CertKey == "" {
if c.options.FleetServer.Insecure {
// running insecure, force the binding to localhost (unless specified)
Member

What was the original reason this was done? Just covering the case where the host was unset? Is there a test we can add to keep this bug from happening again?

Contributor Author

I don't know why this was done here. I'll try to add a couple unit tests for this method.

I think we need to do some minor cleanup around bootstrapping (in another PR), as we have both this method and createFleetServerBootstrapConfig altering config at different locations.

Contributor

@michalpristas michalpristas Jan 31, 2023

cc @blakerouse do you remember why we added it here?
introduced here: elastic/beats#24142

Contributor

This was added for security reasons: we didn't want the Fleet Server to be exposed outside of localhost in insecure mode.

We only allow it to be exposed in insecure mode when the user is specific and provides a host. I think we should really think about this before we make this change.
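
For readers following the thread, the rule described in this comment can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the actual prepareFleetTLS code: the fleetServerOptions struct and the effectiveHost function are hypothetical stand-ins, and only the field names visible in the diff hunk (Cert, CertKey, Insecure, Host) come from the source.

```go
package main

import "fmt"

// fleetServerOptions is a hypothetical, trimmed-down stand-in for the enroll
// command's Fleet Server options. It is not the real struct.
type fleetServerOptions struct {
	Host     string
	Cert     string
	CertKey  string
	Insecure bool
}

// effectiveHost applies the rule from the comment above: with no certificate
// and insecure mode enabled, an unset host falls back to localhost, while an
// explicitly provided host is kept as-is.
func effectiveHost(o fleetServerOptions) string {
	noCert := o.Cert == "" && o.CertKey == ""
	if noCert && o.Insecure && o.Host == "" {
		return "localhost"
	}
	return o.Host
}

func main() {
	unset := fleetServerOptions{Insecure: true}
	explicit := fleetServerOptions{Insecure: true, Host: "0.0.0.0"}
	fmt.Printf("unset host -> %q\n", effectiveHost(unset))
	fmt.Printf("explicit host -> %q\n", effectiveHost(explicit))
}
```

Under this rule, a user who wants an insecure Fleet Server reachable from other hosts has to provide the host at bootstrap time.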

Member

Reading the issue description, it sounds like it binds to localhost even when it was explicitly configured to bind to 0.0.0.0.

Does c.options.FleetServer.Host evaluate to "" when the host was set to 0.0.0.0 intentionally?

Contributor

@cmacknz to clarify, I ran into this issue by setting the host in the Fleet Server integration, through Kibana:

# kibana.yml
# ...
        inputs:
          - type: fleet-server
            vars:
              - name: host
                value: 0.0.0.0

If I configure the host directly in the Elastic Agent, it works.

I think Kibana's value is ignored because Elastic Agent has no way to tell whether the user explicitly requested 0.0.0.0 or simply relied on the default, which happens to be 0.0.0.0 as well.
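
The general ambiguity raised here (a plain string cannot distinguish "explicitly set to the default" from "not set") can be illustrated with a small sketch. This is only an illustration of the idea, not a description of how Elastic Agent handles the value: resolveHost and defaultHost are hypothetical names, and the default of 0.0.0.0 is taken from the comment above.

```go
package main

import "fmt"

// defaultHost is the default named in the comment above (an assumption of
// this sketch, not a value read from the Elastic Agent source).
const defaultHost = "0.0.0.0"

// resolveHost uses a pointer so that "not set" (nil) stays distinguishable
// from "explicitly set", even when the explicit value equals the default.
func resolveHost(configured *string) (host string, explicit bool) {
	if configured == nil {
		return defaultHost, false
	}
	return *configured, true
}

func main() {
	host, explicit := resolveHost(nil)
	fmt.Printf("host=%s explicit=%t\n", host, explicit)

	requested := "0.0.0.0"
	host, explicit = resolveHost(&requested)
	fmt.Printf("host=%s explicit=%t\n", host, explicit)
}
```

Both calls resolve to the same host, but only the second one records that the user asked for it.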

Contributor

The Kibana value is always ignored for Fleet Server, and it has always been that way. Only the values used during the bootstrap process are used for Fleet Server.

Contributor

@blakerouse in that case I'm fine with having to explicitly set the host in the Elastic Agent, since the root cause of the issue is my assumption that Kibana's values were actually taken into account.

It's still confusing to upgrade from 8.5 to 8.6 and realize that things aren't working anymore, but this could have easily been addressed with a changelog entry.

Contributor Author

Should we consider removing the ability to alter the host value in Kibana to make it clear that it's ignored?

Contributor

IMO yes, since this isn't obvious at all.

Member

@cmacknz cmacknz left a comment

Confirmed that the new test fails if the fix is removed, thanks!

antoineco added a commit to deviantony/docker-elk that referenced this pull request Feb 1, 2023
Required in insecure mode. The value from the Kibana policy gets ignored and the host is forced to localhost.

elastic/elastic-agent#2198
@michel-laterman
Contributor Author

Closing this PR. Users should set the host with the --fleet-server-host flag or, if they are running in a container, use the FLEET_SERVER_HOST env var to set the host explicitly.

@michel-laterman michel-laterman deleted the insecure-bootrap branch April 12, 2023 19:19
Labels
backport-v8.6.0, bug, Team:Elastic-Agent-Control-Plane, Team:Fleet
Development

Successfully merging this pull request may close these issues.

Fleet Server listens on loopback when TLS is disabled in v8.6.0
6 participants