Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Heartbeat TLS servername set incorrectly #17123

Closed
andrewvc opened this issue Mar 19, 2020 · 1 comment · Fixed by #17549
Closed

Heartbeat TLS servername set incorrectly #17123

andrewvc opened this issue Mar 19, 2020 · 1 comment · Fixed by #17549
Assignees
Labels
bug Heartbeat Team:obs-ds-hosted-services Label for the Observability Hosted Services team test-plan Add this PR to be manual test plan test-plan-ok This PR passed manual testing v7.8.0

Comments

@andrewvc
Copy link
Contributor

From: https://discuss.elastic.co/t/heartbeat-tls-servername-not-being-set-correctly/224280

When trying to monitor an SSL enabled endpoint it seems like Heartbeat is setting the ServerName to the IP address instead of the hostname. For setups where the certificate is only signed with the domain (and not all IP addresses) this causes verification to fail. For setups where different certificates are served depending on hostname (Kubernetes nginx-ingress being a popular example) it means that Heartbeat tries to verify the invalid fake certificate that is returned when no matching domains are found.

Looking through the code and docs I can't find a way to have the verified ServerName be the domain instead of the resolved IP address.

Example configuration:

- type: "tcp"
  name: "elastic.co"
  schedule: "@every 60s"
  hosts: ["tls://elastic.co:443"]
  ssl:
    enabled: true

Error message:

x509: cannot validate certificate for 151.101.194.217 because it doesn't contain any IP SANs"

Here is the section which is setting the ServerName

func (c *TLSConfig) BuildModuleConfig(host string) *tls.Config {
if c == nil {
// use default TLS settings, if config is empty.
return &tls.Config{ServerName: host}
}
config := c.ToConfig()
config.ServerName = host
return config
}

The host string which gets passed in actually ends up being the resolved IP address.

To confirm that this was the issue I hard coded it to "elastic.co" allows the certificate to be verified properly.

config.ServerName = "elastic.co"

I have reproduced this with the latest releases version (running from official Docker images) and also with the latest builds from the master branch.

@andrewvc andrewvc added bug Heartbeat [zube]: Ready Team:obs-ds-hosted-services Label for the Observability Hosted Services team labels Mar 19, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/uptime (:uptime)

andrewvc added a commit that referenced this issue Apr 20, 2020
Refactors the TCP monitor to make the code easier to follow, more testable, and fixes #17123 where TLS server name was not correctly sent. This is important because the code had accrued a lot of cruft and become very hard to follow. There were many wrappers and intermediate variable names that often subtly changed names as they crossed various functions. When debugging #17123 I frequently found myself lost tracing the execution.

This new code should be simpler to understand for a few reasons:

    Less code (almost a 2x reduction) and fewer, simpler, better organized functions
    Less variable passing/renaming due to use of struct for key config variables
    More consistent and descriptive variable names
    Creation of the dialer as late as possible, to remove the confusing partial states, and clarity as to when which dialer layers are used.
    Adds (frustratingly tricky) integration tests for #17123 using mismatched TLS certs, and also against a real SOCKS5 proxy
    Adds, for testing only, the ability to override the real network resolver for TCP checks which is invaluable in debugging TLS checks that depend on setting hostnames correctly. In the future if we decide to let users use a custom DNS resolver this will be nice.
    Reorganized giant TCP test file into multiple files
@zube zube bot added [zube]: Done and removed [zube]: Ready labels Apr 20, 2020
andrewvc added a commit to andrewvc/beats that referenced this issue Apr 20, 2020
Refactors the TCP monitor to make the code easier to follow, more testable, and fixes elastic#17123 where TLS server name was not correctly sent. This is important because the code had accrued a lot of cruft and become very hard to follow. There were many wrappers and intermediate variable names that often subtly changed names as they crossed various functions. When debugging elastic#17123 I frequently found myself lost tracing the execution.

This new code should be simpler to understand for a few reasons:

    Less code (almost a 2x reduction) and fewer, simpler, better organized functions
    Less variable passing/renaming due to use of struct for key config variables
    More consistent and descriptive variable names
    Creation of the dialer as late as possible, to remove the confusing partial states, and clarity as to when which dialer layers are used.
    Adds (frustratingly tricky) integration tests for elastic#17123 using mismatched TLS certs, and also against a real SOCKS5 proxy
    Adds, for testing only, the ability to override the real network resolver for TCP checks which is invaluable in debugging TLS checks that depend on setting hostnames correctly. In the future if we decide to let users use a custom DNS resolver this will be nice.
    Reorganized giant TCP test file into multiple files

(cherry picked from commit dfe8c4b)
andrewvc added a commit that referenced this issue Apr 21, 2020
Refactors the TCP monitor to make the code easier to follow, more testable, and fixes #17123 where TLS server name was not correctly sent. This is important because the code had accrued a lot of cruft and become very hard to follow. There were many wrappers and intermediate variable names that often subtly changed names as they crossed various functions. When debugging #17123 I frequently found myself lost tracing the execution.

This new code should be simpler to understand for a few reasons:

    Less code (almost a 2x reduction) and fewer, simpler, better organized functions
    Less variable passing/renaming due to use of struct for key config variables
    More consistent and descriptive variable names
    Creation of the dialer as late as possible, to remove the confusing partial states, and clarity as to when which dialer layers are used.
    Adds (frustratingly tricky) integration tests for #17123 using mismatched TLS certs, and also against a real SOCKS5 proxy
    Adds, for testing only, the ability to override the real network resolver for TCP checks which is invaluable in debugging TLS checks that depend on setting hostnames correctly. In the future if we decide to let users use a custom DNS resolver this will be nice.
    Reorganized giant TCP test file into multiple files

(cherry picked from commit dfe8c4b)
@andrewvc andrewvc added v7.8.0 test-plan Add this PR to be manual test plan labels May 5, 2020
@shahzad31 shahzad31 assigned shahzad31 and unassigned shahzad31 May 26, 2020
@justinkambic justinkambic self-assigned this May 26, 2020
@justinkambic justinkambic added the test-plan-ok This PR passed manual testing label May 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Heartbeat Team:obs-ds-hosted-services Label for the Observability Hosted Services team test-plan Add this PR to be manual test plan test-plan-ok This PR passed manual testing v7.8.0
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants