Skip to content
This repository has been archived by the owner on Oct 23, 2024. It is now read-only.

Fix cert validation #2020

Merged
merged 1 commit into from
Dec 13, 2021
Merged

Fix cert validation #2020

merged 1 commit into from
Dec 13, 2021

Conversation

dloucasfx
Copy link
Contributor

The current code fails to validate intermediate cert that has an expired root cert that is part of the trust store, example: DST root ca https://letsencrypt.org/docs/dst-root-ca-x3-expiration-september-2021/
example, run the monitor against letsencrypt.org and the cert validation will fail, whereas it passes with openssl and chrome.

The problem with the current logic is that it verifies every cert in the chain "independently", so when it reaches the last CA in the chain , in our example: "ISRG Root CA" , the verification func fails to build a chain > 0 with a cert from the trust store as the CA for ISRG is DST which is expired .

The other problem, the current logic does not count for certs with multiple intermediate chains.

I discussed this issue with @keitwb and we decided that it made sense to simply use the client to do the validation, as it covers all those corner cases.

I changed the code to count on Handshake to set the TLS stats

@xp-1000 please take a look

Signed-off-by: Dani Louca dlouca@splunk.com

Signed-off-by: Dani Louca <dlouca@splunk.com>
@dloucasfx
Copy link
Contributor Author

@xp-1000 do you have any concerns about the change?

Copy link
Contributor

@xp-1000 xp-1000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @dloucasfx, sorry for the delay here.

I right understood the problem thanks to the detailed description of the PR.
What you propose seems good from the functional point of view and this drastically simplify the code on TLS part.
However, even if my code was clumsy the goal was to always send tls metrics including the case where certificate is invalid.

Here, if understand right, when certificate is invalid you will not send the other tls metrics anymore (like expiry time).
I personally think that metrics form agent should always be consistent even if it means sending a "zero value".
Having an erratic reporting behavior could lead to undesired display on chart or worst on false alert for detectors (especially for heartbeat ones).

That said, I think in this particular case this could be acceptable because:

  • this is few probable someone built complex chart with tls metrics
  • if someone created an heartbeat detectors for TLS related metrics he still can use http.cert_valid instead of http.cert_expiry
  • in our self centered case (Claranet, the coroporate I work for) we only use http.status_code for heatbeat so it should not be problematic: https://github.com/claranet/terraform-signalfx-detectors/blob/master/modules/smart-agent_http/conf/00-heartbeat.yaml#L10
  • back to the creation of this monitor, it was to fill the gap for simple webcheck monitoring. now with Rigor and Splunk Observability this is less necessary even if it is still useful for private checks.

So I think this change is acceptable but please could you add a note on readme to explain that tls related metrics can be erratic depending on the result of the certificate validation warn the users.

@dloucasfx
Copy link
Contributor Author

@xp-1000
Thanks for looking!

Here, if understand right, when certificate is invalid you will not send the other tls metrics anymore (like expiry time).
I personally think that metrics form agent should always be consistent even if it means sending a "zero value".

The current code change in this PR will always send all TLS metrics, https://github.com/signalfx/signalfx-agent/pull/2020/files#diff-09c5869e1f3558b54e8f6e5bdda3a471477975363b94d5aed0cc0978e2ab6122R251-R253
the non-initialized of secondsLeft will be 0 when cert validation fails

I will go ahead and merge if you there is no other concerns.

@dloucasfx dloucasfx merged commit 5b271a8 into main Dec 13, 2021
@dloucasfx dloucasfx deleted the SWAT-3920 branch December 14, 2021 21:30
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants