plugin telegraf_inputs-openldap.conf broken after update to telegraf-1.29.5 from telegraf-1.29.2 #15436

Closed
paulusc opened this issue May 31, 2024 · 13 comments
Labels
bug unexpected problem or unintended behavior

Comments

@paulusc

paulusc commented May 31, 2024

Relevant telegraf.conf

# OpenLDAP cn=Monitor plugin
[[inputs.openldap]]
  host = "myhost.mydomain.com"
  port = 636

  # ldaps, starttls, or no encryption. default is an empty string, disabling all encryption.
  # note that port will likely need to be changed to 636 for ldaps
  # valid options: "" | "starttls" | "ldaps"
  tls = "ldaps"

  # skip peer certificate verification. Default is false.
  insecure_skip_verify = true

  # Path to PEM-encoded Root certificate to use to verify server certificate
  #tls_ca = "/etc/openldap/cacerts/openldapCA.pem"

  # dn/password to bind with. If bind_dn is empty, an anonymous bind is performed.
  bind_dn = "cn=StatisticsActt,ou=InternalAccount,dc=mydomain,dc=com"
  bind_password = "godknowswhatitisfromday1"

  # reverse metric names so they sort more naturally
  # Defaults to false if unset, but is set to true when generating a new config
  reverse_metric_names = true

Logs from Telegraf

[inputs.openldap] Error in plugin: LDAP Result Code 200 "Network Error": remote error: tls: handshake failure

System info

telegraf-1.29.2.5 RHEL7

Docker

No response

Steps to reproduce

...

Expected behavior

Telegraf able to communicate openldap metrics to the host

Actual behavior

no communication

Additional info

happens when upgrading telegraf from 1.29.2 to 1.29.5
Version 1.30.2 was also tested and shows the same issue:
[inputs.openldap] Error in plugin: LDAP Result Code 200 "Network Error": remote error: tls: handshake failure

@paulusc paulusc added the bug unexpected problem or unintended behavior label May 31, 2024
@powersj
Contributor

powersj commented May 31, 2024

Hi,

tls: handshake failure

Can you:

  1. Confirm what cipher your server is using? (e.g. openssl s_client -connect myhost.mydomain.com:636)
  2. Confirm this is consistently happening?
  3. Confirm that you carried over any local certificates when you upgraded?

happens when upgrading telegraf from 1.29.2 to 1.29.5

This is the diff between 1.29.2 and 1.29.5. Unfortunately, I see no changes to any LDAP plugin code or any relevant dependencies. There was a similar report in #15236 where it turned out a gRPC library changed the default cipher suites allowed. Knowing what is expected could shed light on that.

@powersj powersj added the waiting for response waiting for response from contributor label May 31, 2024
@paulusc
Author

paulusc commented Jun 1, 2024

Hi,
The cipher is :
openssl s_client -connect localhost:636 | grep Cipher
. . .
New, TLSv1/SSLv3, Cipher is AES256-GCM-SHA384
Cipher : AES256-GCM-SHA384
Yes, this is happening consistently

To narrow down the issue, we upgraded step by step from the last known working version, 1.29.2-1, until we got the tls: handshake failure message. For us, the last good/working version is telegraf-1.29.4-1.x86_64

For the sake of simplicity, localhost is used in the plugin conf file:
grep -vP "#|^$" /etc/telegraf/telegraf.d/telegraf_inputs-openldap.conf
[[inputs.openldap]]
host = "localhost"
port = 636
tls = "ldaps"
insecure_skip_verify = true
bind_dn = "cn=StatisticsAcct,ou=InternalAccount,dc=mydomain,dc=com"
bind_password = "----------------"
reverse_metric_names = true

With 1.29.5 the error appears: tls: handshake failure
yum install telegraf-1.29.5-1
systemctl restart telegraf
systemctl status telegraf -l
. . .
2024-05-31T19:53:37Z D! [agent] Starting service inputs
2024-05-31T19:54:00Z E! [inputs.openldap] Error in plugin: LDAP Result Code 200 "Network Error": remote error: tls: handshake failure
Thanks

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Jun 1, 2024
@powersj
Contributor

powersj commented Jun 3, 2024

To narrow down the issue, we upgraded step by step from the last known working version, 1.29.2-1

Thank you very much for doing this. I think this does possibly narrow it down to the upgrade to go1.22.

New, TLSv1/SSLv3, Cipher is AES256-GCM-SHA384

I believe the TLS 1.0 and SSL 3.0 part is the issue here. From the go1.22 docs:

By default, the minimum version offered by crypto/tls servers is now TLS 1.2

What we do for other plugins is expose some common TLS options that allow the user to specify the minimum version (e.g. VersionTLS10), cipher suites, etc. We need to do a little refactoring to expose all of these for this plugin.

Let me double check with the team today and we can hopefully get a PR up for you to test.

@srebhan
Member

srebhan commented Jun 5, 2024

@paulusc please use the newer inputs.ldap plugin instead of the openldap one as it supports more TLS options

[[inputs.ldap]]
  server = "ldaps://myhost.mydomain.com:636"

  bind_dn = "cn=StatisticsActt,ou=InternalAccount,dc=mydomain,dc=com"
  bind_password = "godknowswhatitisfromday1"

  reverse_field_names = true

  ## TLS options
  tls_min_version = "TLS10"
  tls_cipher_suites = ["TLS_AES_256_GCM_SHA384"]
  insecure_skip_verify = true

according to your config above.

@srebhan srebhan added the waiting for response waiting for response from contributor label Jun 5, 2024
@paulusc
Author

paulusc commented Jun 9, 2024

@srebhan the latest version available (1.30.2) was not able to digest the tls_cipher_suites option. We installed telegraf-nightly.x86_64.rpm; this one does not show errors, but no data is received. Still working on it.

## TLS options
 tls_min_version = "TLS10"
 tls_cipher_suites = ["TLS_AES_256_GCM_SHA384"]
 insecure_skip_verify = true

# rpm -qip telegraf-nightly.x86_64.rpm
Name        : telegraf
Version     : 1.31.0
Release     : 0
Architecture: x86_64
Install Date: (not installed)
Group       : default
Size        : 240184724
License     : MIT
Signature   : (none)
Source RPM  : telegraf-1.31.0-0.src.rpm
Build Date  : Wed 05 Jun 2024 08:58:57 PM EDT
Build Host  : 8a5e59fbdbc3
Relocations : /
Packager    : support@influxdb.com
Vendor      : InfluxData
URL         : https://github.com/influxdata/telegraf
Summary     : Plugin-driven server agent for reporting metrics into InfluxDB.
Description :
Plugin-driven server agent for reporting metrics into InfluxDB.

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Jun 9, 2024
@srebhan
Member

srebhan commented Jun 11, 2024

@paulusc which plugin are you using? You have to use inputs.ldap instead of inputs.openldap!

@srebhan srebhan added the waiting for response waiting for response from contributor label Jun 11, 2024
@paulusc
Author

paulusc commented Jun 15, 2024

@srebhan we are using the inputs.ldap plugin.

This is our last attempt to get it working. Sorry for the delay to respond, busy week!

[root@xxxxxxxxxxxxxx telegraf.d]# /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d -debug
WARN[0000]log.go:244 gosnowflake.(*defaultLogger).Warn DBUS_SESSION_BUS_ADDRESS envvar looks to be not set, this can lead to runaway dbus-daemon processes. To avoid this, set envvar DBUS_SESSION_BUS_ADDRESS=$XDG_RUNTIME_DIR/bus (if it exists) or DBUS_SESSION_BUS_ADDRESS=/dev/null.
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_inputs-cpu.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_inputs-disk.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_inputs-diskio.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_inputs-filestat.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_inputs-internal.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_inputs-interrupts.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_inputs-kernel-vmstat.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_inputs-kernel.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_inputs-ldap.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_inputs-mem.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_inputs-net.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_inputs-netstats.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_inputs-processes.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_inputs-swap.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_inputs-sysctl_fs.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_inputs-system.conf
2024-06-14T17:56:32Z I! Loading config: /etc/telegraf/telegraf.d/telegraf_outputs-influxdb.conf
2024-06-14T17:56:32Z I! Starting Telegraf 1.31.0-079c9d28 brought to you by InfluxData the makers of InfluxDB
2024-06-14T17:56:32Z I! Available plugins: 234 inputs, 9 aggregators, 32 processors, 26 parsers, 60 outputs, 6 secret-stores
2024-06-14T17:56:32Z I! Loaded inputs: cpu (2x) disk (2x) diskio (2x) filestat internal interrupts kernel (2x) kernel_vmstat ldap linux_sysctl_fs mem (2x) net netstat processes (2x) swap (2x) system (2x)
2024-06-14T17:56:32Z I! Loaded aggregators:
2024-06-14T17:56:32Z I! Loaded processors:
2024-06-14T17:56:32Z I! Loaded secretstores:
2024-06-14T17:56:32Z I! Loaded outputs: influxdb
2024-06-14T17:56:32Z I! Tags enabled: host=xxxxxxxxxxxxxxx.xxxxxxx.xxx
2024-06-14T17:56:32Z I! [agent] Config: Interval:1m0s, Quiet:false, Hostname:"xxxxxxxxxxxxxxx.xxxxxxx.xxx", Flush Interval:1m0s
2024-06-14T17:56:32Z D! [agent] Initializing plugins
2024-06-14T17:56:32Z W! DeprecationWarning: Value "false" for option "ignore_protocol_stats" of plugin "inputs.net" deprecated since version 1.27.3 and will be removed in 1.36.0: use the 'inputs.nstat' plugin instead for protocol stats
2024-06-14T17:56:32Z D! [agent] Connecting outputs
2024-06-14T17:56:32Z D! [agent] Attempting connection to [outputs.influxdb]
2024-06-14T17:56:32Z D! [agent] Successfully connected to outputs.influxdb
2024-06-14T17:56:32Z D! [agent] Starting service inputs
2024-06-14T17:57:00Z E! [inputs.ldap] Error in plugin: connection failed: LDAP Result Code 200 "Network Error": remote error: tls: handshake failure

using the following config file

[root@xxxxxxxxxxxxxx telegraf.d]# cat telegraf_inputs-ldap.conf
[[inputs.ldap]]
  server = "[ldaps://](ldaps://xxxxxxxxxxxxxx.xxxxxxxxxx.xxx)[xxxxxxxxxxxxxx](ldaps://xxxxxxxxxxxxxx.xxxxxxxxxx.xxx)[.xxxxxxxxxx.xxx](ldaps://xxxxxxxxxxxxxx.xxxxxxxxxx.xxx)"
  bind_dn = "cn=StatisticsAcct,ou=InternalAccount,dc=xxxxxxxxxx,dc=xxx"
  bind_password = "xxxxxxxxxxxxxxxxxxxxx"
  reverse_field_names = true
  # TLS options
  tls_min_version = "TLS10"
  tls_cipher_suites = ["TLS_AES_256_GCM_SHA384"]
  insecure_skip_verify = true

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Jun 15, 2024
@srebhan
Member

srebhan commented Jun 17, 2024

@paulusc when running openssl s_client -connect localhost:636 what are the lines below SSL-Session:?

@srebhan srebhan added the waiting for response waiting for response from contributor label Jun 17, 2024
@paulusc
Author

paulusc commented Jun 18, 2024

@srebhan please find the lines below SSL-Session:

SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : AES256-GCM-SHA384
    Session-ID: 9F27CDFEB561AEC1BDE8B8538C23AADA9CDE461F1A83D02C54B8088CC5CE953F
    Session-ID-ctx:
    Master-Key: RKM0MDBFNDRBRDVENZI3MENBRDQ3N0M3NDFCNZQWNDK0MZVFQTQ3QKQ1NTAYRDVCRUU4RJK5Q0EWMKE5NTM1QTE5NZIYQUYY
    Key-Arg   : None
    Krb5 Principal: None
    PSK identity: None
    PSK identity hint: None
    TLS session ticket lifetime hint: 300 (seconds)
    TLS session ticket:
    0000 - f5 d3 bf 7e d2 d2 a1 40-56 0f c6 40 de 13 0e 82   ...~...@V..@....
    0010 - ea 39 f3 4d c3 7f 0d bf-36 4d d2 35 30 59 a9 81   .9.M....6M.50Y..
    0020 - bc e6 49 2a 92 69 e6 a3-ef ad 98 28 ea 8f 10 7e   ..I*.i.....(...~
    0030 - bd 6f 8f da cc c3 14 f1-88 e0 65 e9 f7 6c 46 44   .o........e..lFD
    0040 - b3 19 c6 99 c0 ef 25 95-e0 51 30 22 33 3a 64 46   ......%..Q0"3:dF
    0050 - 18 d2 29 7c 34 a1 17 24-47 bc f3 c2 43 f9 4d 91   ..)|4..$G...C.M.
    0060 - 37 c3 f6 1b 00 20 42 73-66 51 f5 94 7a b2 15 a2   7.... BsfQ..z...
    0070 - 6f 3e 9e cf 7d cd ef a5-8a 06 56 72 5a 4b 0b c6   o>..}.....VrZK..
    0080 - 93 bd d0 4b 1f 75 4d 4d-2d 8a 56 16 5e b9 45 4b   ...K.uMM-.V.^.EK
    0090 - 92 95 3c 0b 68 ea 14 e7-20 41 28 20 ae 8b 40 6f   ..<.h... A( ..@o

    Start Time: 1718634179
    Timeout   : 300 (sec)
    Verify return code: 0 (ok)

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Jun 18, 2024
@srebhan
Member

srebhan commented Jun 27, 2024

Just to make sure we are not hunting ghosts, your server string looks quite strange in

[[inputs.ldap]]
  server = "[ldaps://](ldaps://xxxxxxxxxxxxxx.xxxxxxxxxx.xxx)[xxxxxxxxxxxxxx](ldaps://xxxxxxxxxxxxxx.xxxxxxxxxx.xxx)[.xxxxxxxxxx.xxx](ldaps://xxxxxxxxxxxxxx.xxxxxxxxxx.xxx)"
  bind_dn = "cn=StatisticsAcct,ou=InternalAccount,dc=xxxxxxxxxx,dc=xxx"
  bind_password = "xxxxxxxxxxxxxxxxxxxxx"
  reverse_field_names = true
  # TLS options
  tls_min_version = "TLS10"
  tls_cipher_suites = ["TLS_AES_256_GCM_SHA384"]
  insecure_skip_verify = true

it should be something like

  server = "ldaps://xxxxxxxxxxxxxx.xxxxxxxxxx.xxx:636"

@srebhan
Member

srebhan commented Jun 27, 2024

@paulusc I think I found the issue. Cipher TLS_AES_256_GCM_SHA384 is a TLS1.3 only cipher, you probably need to use TLS_RSA_WITH_AES_256_GCM_SHA384 for TLS1.2.

I put up PR #15570, which allows specifying all, secure and insecure as cipher-suite aliases, so you could try with all using the binary in the PR (available as soon as CI has finished the tests)...

Furthermore, you likely do not need to restrict the TLS minimum version as the server offers TLS1.2...

@srebhan srebhan added the waiting for response waiting for response from contributor label Jun 27, 2024
@paulusc
Author

paulusc commented Jun 29, 2024

@srebhan All right, you nailed it. With this configuration and the latest nightly build, we are all OK.
Thank you very much for your time and dedication, highly appreciated.

# rpm -q telegraf
telegraf-1.32.0-0.x86_64

[[inputs.ldap]]
server = "ldaps://xxxxxxxxx.xxxxxxxxx.com:636"
bind_dn = "cn=StatisticsAcct,ou=InternalAccount,dc=desjardins,dc=com"
bind_password = "xxxxxxxxxxxxx"
reverse_field_names = true
tls_cipher_suites = ["all"]
insecure_skip_verify = true

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Jun 29, 2024
@srebhan
Member

srebhan commented Jul 1, 2024

Closing this issue as the solution is to enable the corresponding insecure cipher. PR #15570 making this easier is already merged and will be released with v1.32.0...

@srebhan srebhan closed this as completed Jul 1, 2024