Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inputs.x509_cert unexpected behavior with SNI #8914

Closed
MorphBonehunter opened this issue Feb 26, 2021 · 11 comments · Fixed by #9289
Closed

inputs.x509_cert unexpected behavior with SNI #8914

MorphBonehunter opened this issue Feb 26, 2021 · 11 comments · Fixed by #9289
Labels
bug unexpected problem or unintended behavior

Comments

@MorphBonehunter
Copy link

MorphBonehunter commented Feb 26, 2021

Relevant telegraf.conf:

[global_tags]

[agent]
  interval = "60s"
  round_interval = false
  metric_batch_size = 500
  metric_buffer_limit = 5000
  collection_jitter = "0s"
  flush_interval = "60s"
  flush_jitter = "0s"
  precision = ""
  debug = false
  quiet = false
  logfile = ""
  hostname = ""
  omit_hostname = false

[[inputs.x509_cert]]
  sources = ["https://www.underverse.net:443", "https://www.gynpraxis-bahls.de:443"]
  [inputs.x509_cert.tagdrop]
    common_name  = ["Let's Encrypt Authority X3", "R3"]

System info:

Telegraf 1.17.3 on ArchLinux

Steps to reproduce:

I ran two little Websites on one of my Servers. DNS resolves to the same IP and i use SNI to distinguish between the sites.
After upgrading telegraf to 1.17.3 (from 1.17.0) i realized that the behavior changed of the x509 plugin.
Before i got the right certificates from the server and got 2 different measurements from the plugin.
After the update it shows for both entries the same certificate (its the default one).
So i guess PR #7897 with 545d996 (introduced in 1.17.1) breaks the original behavior of the plugin.

  1. Run the above config with telegraf 1.17.0 and inspect output (expected behavior).
  2. Run the above config with telegraf 1.17.3 and inspect output (actual behavior).

Expected behavior:

2021-02-26T14:07:10Z I! Starting Telegraf 1.17.0
2021-02-26T14:07:10Z D! [agent] Initializing plugins
2021-02-26T14:07:10Z D! [agent] Starting service inputs
> x509_cert,common_name=underverse.net,host=evelyn,issuer_common_name=R3,public_key_algorithm=RSA,san=*.underverse.net\,underverse.net,serial_number=4c11c31923c7824b9055dbd729f06c41ab2,signature_algorithm=SHA256-RSA,source=https://www.underverse.net:443,verification=valid age=5208391i,enddate=1616916039i,expiry=2567608i,startdate=1609140039i,verification_code=0i 1614348431000000000
2021-02-26T14:07:10Z D! [agent] Stopping service inputs
> x509_cert,common_name=gynpraxis-bahls.de,host=evelyn,issuer_common_name=R3,public_key_algorithm=RSA,san=gynpraxis-bahls.de\,www.gynpraxis-bahls.de,serial_number=32b96a849248c02179ae09479551704d315,signature_algorithm=SHA256-RSA,source=https://www.gynpraxis-bahls.de:443,verification=valid age=5208429i,enddate=1616916001i,expiry=2567570i,startdate=1609140001i,verification_code=0i 1614348431000000000
2021-02-26T14:07:10Z D! [agent] Input channel closed
2021-02-26T14:07:10Z D! [agent] Stopped Successfully

Actual behavior:

2021-02-26T14:07:31Z I! Starting Telegraf 1.17.3
2021-02-26T14:07:31Z D! [agent] Initializing plugins
2021-02-26T14:07:31Z D! [agent] Starting service inputs
> x509_cert,common_name=underverse.net,host=evelyn,issuer_common_name=R3,public_key_algorithm=RSA,san=*.underverse.net\,underverse.net,serial_number=4c11c31923c7824b9055dbd729f06c41ab2,signature_algorithm=SHA256-RSA,source=https://www.underverse.net:443,verification=valid age=5208412i,enddate=1616916039i,expiry=2567587i,startdate=1609140039i,verification_code=0i 1614348451000000000
> x509_cert,common_name=underverse.net,host=evelyn,issuer_common_name=R3,public_key_algorithm=RSA,san=*.underverse.net\,underverse.net,serial_number=4c11c31923c7824b9055dbd729f06c41ab2,signature_algorithm=SHA256-RSA,source=https://www.gynpraxis-bahls.de:443,verification=valid age=5208412i,enddate=1616916039i,expiry=2567587i,startdate=1609140039i,verification_code=0i 1614348451000000000
2021-02-26T14:07:31Z D! [agent] Stopping service inputs
2021-02-26T14:07:31Z D! [agent] Input channel closed
2021-02-26T14:07:31Z D! [agent] Stopped Successfully
@MorphBonehunter MorphBonehunter added the bug unexpected problem or unintended behavior label Feb 26, 2021
@moorglade
Copy link

moorglade commented Mar 15, 2021

I've encountered a similar problem: in my scenario, I specify 10 different domains (with different IP addresses) in the sources list, but since upgrading to 1.17.3, I only get proper measurements for two of them. This used to work fine in previous Telegraf releases (<= 1.16).

@MorphBonehunter
Copy link
Author

@moorglade you can "fix" this if you specify you 10 domains in 10 different inputs.
i mean from

[[inputs.x509_cert]]
  sources = ["https://domain1:443", "https://domain2:443", "..."]

to

[[inputs.x509_cert]]
  sources = ["https://domain1:443"]
[[inputs.x509_cert]]
  sources = ["https://domain2:443"]
...

But this is a workaround and i don't think that's the expected behavior.

@moorglade
Copy link

@MorphBonehunter thanks, I will do that if this issue not fixed, but nevertheless it would be nice to know if it's a bug or not ;)

@helmut72
Copy link

Thanks, had the same problem. This workaround works.

@lephisto
Copy link

NIce workaround, but is this scheduled to be fixed soon?

@jjh74
Copy link
Contributor

jjh74 commented May 19, 2021

I think I found what's wrong with this. x509_cert input sets/uses SNI in this order (https://github.com/influxdata/telegraf/blob/master/plugins/inputs/x509_cert/x509_cert.go#L93 / https://github.com/influxdata/telegraf/blob/master/plugins/inputs/x509_cert/x509_cert.go#L124):

  1. c.tlsCfg.ServerName
  2. c.ServerName
  3. u.Hostname() (uri hostname)

c.tlsCfg.ServerName is not reset between certificates so x509_cert (re)uses first u.Hostname() for all certs (unless configuration has explicit tls_server_name or server_name).

Something like this probably fixes the issue:

--- a/plugins/inputs/x509_cert/x509_cert.go
+++ b/plugins/inputs/x509_cert/x509_cert.go
@@ -102,6 +102,7 @@ func (c *X509Cert) getCert(u *url.URL, timeout time.Duration) ([]*x509.Certifica
 
                serverName, err := c.serverName(u)
                if err != nil {
+                       c.tlsCfg.ServerName = "" // reset SNI, otherwise we reuse c.tlsCfg.ServerName
                        return nil, err
                }
                c.tlsCfg.ServerName = serverName
@@ -112,10 +113,12 @@ func (c *X509Cert) getCert(u *url.URL, timeout time.Duration) ([]*x509.Certifica
 
                hsErr := conn.Handshake()
                if hsErr != nil {
+                       c.tlsCfg.ServerName = "" // reset SNI, otherwise we reuse c.tlsCfg.ServerName
                        return nil, hsErr
                }
 
                certs := conn.ConnectionState().PeerCertificates
+               c.tlsCfg.ServerName = "" // reset SNI, otherwise we reuse c.tlsCfg.ServerName
 
                return certs, nil
        case "file":
@@ -208,6 +211,15 @@ func getTags(cert *x509.Certificate, location string) map[string]string {
 func (c *X509Cert) Gather(acc telegraf.Accumulator) error {
        now := time.Now()
 
+       if c.tlsCfg.ServerName != "" && c.ServerName == "" {
+               // Set SNI from c.tlsCfg.ServerName and reset c.tlsCfg.ServerName
+               // we need to reset c.tlsCfg.ServerName for each certificate when there's
+               // no explicit SNI (c.tlsCfg.ServerName or c.ServerName) otherwise we'll always use
+               // first uri HostName for all certs (see issue 8914)
+               c.ServerName = c.tlsCfg.ServerName
+               c.tlsCfg.ServerName = ""
+       }
+
        for _, location := range c.Sources {
                u, err := c.locationToURL(location)
                if err != nil {

If this looks ok to maintainers I can create PR (CC: @ssoroka)

@srebhan
Copy link
Member

srebhan commented May 20, 2021

@jjh74 please submit a PR and let's discuss there. Please reference this issue (and the potential duplicate) in the PR. Thank you for caring!

@jjh74
Copy link
Contributor

jjh74 commented May 21, 2021

@srebhan I'll create a PR, but I think in current master https/tcp uris are broken after #6952
sourcesToURLs() (https://github.com/influxdata/telegraf/blob/master/plugins/inputs/x509_cert/x509_cert.go#L70) (strings.Index(source, ":\") != 1) always matches https:// and tcp://.

This results in: [inputs.x509_cert] could not find file: &{https://example.org:443 false false }
[inputs.x509_cert] could not find file: &{tcp://2.example.org:443 false false }

@jjh74 jjh74 mentioned this issue May 21, 2021
2 tasks
@aslgithub
Copy link

@srebhan I'll create a PR, but I think in current master https/tcp uris are broken after #6952
sourcesToURLs() (https://github.com/influxdata/telegraf/blob/master/plugins/inputs/x509_cert/x509_cert.go#L70) (strings.Index(source, ":") != 1) always matches https:// and tcp://.

This results in: [inputs.x509_cert] could not find file: &{https://example.org:443 false false }
[inputs.x509_cert] could not find file: &{tcp://2.example.org:443 false false }

@jjh74 @srebhan I've just raised #9384 as I have found this issue in testing v1.19.0.
#6952 is the only change I can see between working on v1.18.1 and broken on v1.19.0

@depinski
Copy link

Just did a code review and arrived at the same conclusion as @jjh74. Line 124 of the plugin breaks the functionality quite obviously

@srebhan
Copy link
Member

srebhan commented Jun 22, 2021

@depinski there is no doubt about this. We have some fixes in #9289 and #9400 and I'm working on getting those in ASAP. If you want to test, you can use the artifacts produced in #9400 as it comprises the other PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants