
[kube-prometheus-stack] tlsConfig support for servicemonitors and default alertingEndpoints to work with istio strict mtls #145

Closed
infa-ddeore opened this issue Sep 27, 2020 · 27 comments
Labels: enhancement (New feature or request), lifecycle/stale

Comments


infa-ddeore commented Sep 27, 2020

Is your feature request related to a problem? Please describe.
The Prometheus stack is installed with the Istio sidecar injected and a STRICT mTLS policy:

  1. Prometheus can't scrape Grafana, Alertmanager, kube-state-metrics or prometheus-operator.
  2. Prometheus can't connect to Alertmanager.

Describe the solution you'd like

  1. Provide scheme and mTLS options in the Grafana, Alertmanager, kube-state-metrics and prometheus-operator ServiceMonitor definitions.
  2. Provide an option to update Alertmanager's scheme and tlsConfig.

Describe alternatives you've considered

  1. Manually update the default ServiceMonitors with the https scheme and a tlsConfig that uses the Istio TLS certs.
  2. Use prometheus.prometheusSpec.alertingEndpoints with https and a tlsConfig.

Additional context
Below are the override values (assuming "prom" is the installed Helm release name, e.g. helm install prom) that make Istio write the TLS certs to an in-memory volume, which Prometheus then mounts.

NOTE:

Prerequisites:

  1. Install Istio 1.6.1 or higher.
  2. Enable Istio sidecar injection (e.g. label the namespace with istio-injection=enabled, as shown below).
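
For example, assuming the stack runs in the default namespace (as the alertingEndpoints values below assume):

kubectl label namespace default istio-injection=enabled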

Istio 1.6.1 or higher is required for this to work; Prometheus will then have the Istio certs mounted under the /etc/istio-certs/ directory. Once PeerAuthentication is set to STRICT, Prometheus fails to scrape or connect to the other components such as Grafana, the operator and Alertmanager.

prometheusOperator:
  podAnnotations:
    sidecar.istio.io/rewriteAppHTTPProbers: "true"
  admissionWebhooks:
    patch:
      podAnnotations:
        sidecar.istio.io/inject: "false"

grafana:
  podAnnotations:
    sidecar.istio.io/rewriteAppHTTPProbers: "true"

prometheus:
  prometheusSpec:
    logLevel: info
    alertingEndpoints:
    - name: prom-prometheus-operator-alertmanager
      namespace: default
      port: web
      scheme: https
      pathPrefix: /
      apiVersion: v2
      tlsConfig:
        caFile: /etc/istio-certs/root-cert.pem
        keyFile: /etc/istio-certs/key.pem
        certFile: /etc/istio-certs/cert-chain.pem
        insecureSkipVerify: true    
    volumeMounts:
    - name: istio-certs
      mountPath: /etc/istio-certs/
      readOnly: true
    podMetadata:
      annotations:
        sidecar.istio.io/rewriteAppHTTPProbers: "true"
        sidecar.istio.io/userVolume: '[{"name": "istio-certs", "emptyDir": {"medium": "Memory"}}]'
        sidecar.istio.io/userVolumeMount: '[{"name": "istio-certs", "mountPath": "/etc/istio-certs/"}]'

        proxy.istio.io/config: |-
          proxyMetadata:
            OUTPUT_CERTS: /etc/istio-certs/     

alertmanager:
  alertmanagerSpec:
    podMetadata:
      annotations:
        sidecar.istio.io/rewriteAppHTTPProbers: "true"

  serviceMonitor:
    tlsConfig:
      caFile: /etc/istio-certs/root-cert.pem
      keyFile: /etc/istio-certs/key.pem
      certFile: /etc/istio-certs/cert-chain.pem
      insecureSkipVerify: true

kube-state-metrics:
  podAnnotations:
    sidecar.istio.io/rewriteAppHTTPProbers: "true"

Edit the ServiceMonitor so Prometheus scrapes the target over https with the tlsConfig below (kubectl edit servicemonitors.monitoring.coreos.com prom-prometheus-operator-alertmanager); Prometheus can then scrape the target over mTLS.

spec:
  endpoints:
  - path: /metrics
    port: web
    scheme: https
    tlsConfig:
      caFile: /etc/istio-certs/root-cert.pem
      certFile: /etc/istio-certs/cert-chain.pem
      insecureSkipVerify: true
      keyFile: /etc/istio-certs/key.pem
  namespaceSelector:
@infa-ddeore infa-ddeore added the enhancement New feature or request label Sep 27, 2020
@infa-ddeore infa-ddeore changed the title [kube-prometheus-stack] tlsConfig support for servicemonitors and default alertingEndpoints [kube-prometheus-stack] tlsConfig support for servicemonitors and default alertingEndpoints to work with istio strict mtls Sep 27, 2020

stale bot commented Oct 27, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

@infa-ddeore
Author

@bismarck @gianrubio @gkarthiks @scottrigby @vsliouniaev @Xtigyro
Could you take a look at this feature request?

It will help many who want to use Prometheus Operator with Istio STRICT mTLS between the Prometheus components.

@stale stale bot removed the lifecycle/stale label Oct 27, 2020
@bismarck
Contributor

@infa-ddeore We would be happy to accept a PR implementing this feature.


stale bot commented Nov 26, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.


stale bot commented Dec 10, 2020

This issue is being automatically closed due to inactivity.

@stale stale bot closed this as completed Dec 10, 2020
@stevehipwell
Contributor

Is nobody still interested in this?

@RenaudDemarneffe

Yes, I'm very interested in this.


uvw commented Jan 13, 2021

I ended up using the Helm post-renderer approach with kustomize to patch the chart on the fly. Here is a distilled example repo: https://github.com/uvw/kube-prometheus-stack-istio.
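
A minimal sketch of that pattern, assuming a kustomization.yaml next to a small wrapper script (file names here are illustrative; see the repo above for the complete setup):

#!/bin/sh
# kustomize-post-renderer.sh: Helm pipes the rendered chart manifests in on stdin;
# write them to a file, apply the kustomize patches, and print the result on stdout.
cat > rendered.yaml
exec kustomize build .

# kustomization.yaml
resources:
  - rendered.yaml
patchesStrategicMerge:
  - servicemonitor-tls-patch.yaml

The chart is then installed with, e.g.:

helm install prom prometheus-community/kube-prometheus-stack --post-renderer ./kustomize-post-renderer.sh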

@stevehipwell
Contributor

Looking into the Istio code and docs further, it looks like metrics merging is the supported way of handling metrics with mTLS set to STRICT. I've asked for clarification on the correct pattern to implement this.

@Crazyigor1987

@infa-ddeore have you tested it with the latest kube-prometheus-stack?
I've followed your settings, but unfortunately the Prometheus container cannot load the Istio certs because they have UID 1337 instead of 1000, so I get a "permission denied".

@stevehipwell
Contributor

@Crazyigor1987 have you set a fsGroup for the containers?
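
For reference, a sketch of where that could go in the chart values (the IDs here are illustrative; the Istio sidecar runs as UID/GID 1337):

prometheus:
  prometheusSpec:
    securityContext:
      runAsUser: 1337
      runAsGroup: 1337
      runAsNonRoot: true
      fsGroup: 1337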

@Crazyigor1987

@stevehipwell THANK YOU, that was the right keyword! Finally kube-prometheus-stack works like I wanted it to.

@kharouny

How come Prometheus needs the TLS configs for the Service Monitors if it has the side-car already injected? Shouldn't that allow scraping to happen over plain http?

@infa-ddeore
Author

How come Prometheus needs the TLS configs for the Service Monitors if it has the side-car already injected? Shouldn't that allow scraping to happen over plain http?

Prometheus scrapes pod IPs directly, and Istio doesn't interfere with connections that go directly to pod IPs.

Check this discussion, specifically this comment.

@stevehipwell
Contributor

You can use metric relabelling (we do this currently) to have the Istio proxy allow non-mTLS scraping, but you have to live with all metrics being associated with the Istio proxy pod. If Istio added originating-pod labels we could rewrite the metrics to get them back under the correct pod, which would be a pretty good solution.


shadjac commented Apr 14, 2021

You can use metric relabelling (we do this currently) to have the Istio proxy allow non-mTLS scraping, but you have to live with all metrics being associated with the Istio proxy pod. If Istio added originating-pod labels we could rewrite the metrics to get them back under the correct pod, which would be a pretty good solution.

@stevehipwell This sounds interesting. Would you please give more details about this approach? Thanks already for the idea!

@stevehipwell
Contributor

@shadjac you can set up Prometheus Operator for Istio via the control plane (istiod) service monitor and data plane (istio-proxy) pod monitor from the example template. Once you've got this set up you can enable metrics merging to collect your container metrics directly from the data plane. With merging enabled the data plane will merge pod metrics if you annotate the pod with the standard (non-operator) Prometheus annotations (prometheus.io/scrape, prometheus.io/path & prometheus.io/port).
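
A rough sketch of those annotations on an application pod, assuming its own metrics endpoint is /metrics on port 8080 (with merging enabled the sidecar exposes the combined metrics on :15020/stats/prometheus, which the data plane pod monitor scrapes):

metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/metrics"
    prometheus.io/port: "8080"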


shadjac commented Apr 15, 2021

@stevehipwell Thanks for elaborating on this. However, in our case, the application exposes its metrics on port 8080 at path /metrics/prometheus. I assume the default values of the annotations are -

prometheus.io/path: /stats/prometheus
prometheus.io/port: "15020"

and that they can't be changed.
Can I still make it work?

@stevehipwell
Contributor

@shadjac the annotations that you add to your pod should be set to your values; the istio-proxy will rewrite the annotations to its own values (which will be ignored by Prometheus Operator) but will merge the metrics from your endpoint with its own, to be collected by the pod monitor.


anoop-l commented May 4, 2021

@infa-ddeore we implemented it the way you suggested but we cannot read the mounted cert/key files. I can see them at the path /etc/istio-certs.

err="error creating HTTP client: unable to load specified CA cert /etc/istio-certs/root-cert.pem: open /etc/istio-certs/root-cert.pem: permission denied" scrape_pool=serviceMonitor/monitoring/test/0

Do we have to do something with runAsUser? It's set to 1000 now. Should it be 0?

@infa-ddeore
Author

@infa-ddeore we implemented it the way you suggested but we cannot read the mounted cert/key files. I can see them at the path /etc/istio-certs.

err="error creating HTTP client: unable to load specified CA cert /etc/istio-certs/root-cert.pem: open /etc/istio-certs/root-cert.pem: permission denied" scrape_pool=serviceMonitor/monitoring/test/0

Do we have to do something with runAsUser? It's set to 1000 now. Should it be 0?

I haven't come across such an issue; maybe the "kube-prometheus-stack" I used had a different user ID which allowed the access.
Can you try fsGroup as per #145 (comment) or https://stackoverflow.com/a/43545277/3397719?


anoop-l commented May 4, 2021

Thanks @infa-ddeore
Finally I got it working with user ID 1337, which is used by Istio:

securityContext:
  runAsGroup: 1337
  runAsNonRoot: true
  runAsUser: 1337
  fsGroup: 1337

@frankjkelly

Can someone comment why insecureSkipVerify is set to true? Is that required for this to work and if so why is that? Thanks in advance

@infa-ddeore
Author

Can someone comment why insecureSkipVerify is set to true? Is that required for this to work and if so why is that? Thanks in advance

Yes, it is required because the cert is signed by Istio and Prometheus won't trust it.

@frankjkelly

Thank you @infa-ddeore


TamasNeumer commented Aug 30, 2021

@infa-ddeore

Update:
On the Alertmanager pods I started excluding port 9094 from proxying, and this enabled the Alertmanager instances to do the initial sync among each other (a sketch of the annotations is below).
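
A sketch of that exclusion, assuming the standard Istio traffic annotations and the chart values structure used earlier in this thread (9094 is Alertmanager's cluster gossip port):

alertmanager:
  alertmanagerSpec:
    podMetadata:
      annotations:
        traffic.sidecar.istio.io/excludeInboundPorts: "9094"
        traffic.sidecar.istio.io/excludeOutboundPorts: "9094"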

Now my problem is that Prometheus tries to fire an alert by sending it to Alertmanager over HTTPS, which Alertmanager apparently doesn't like...

"https://10.9.13.20:9093/api/v2/alerts\": http: server gave HTTP response to HTTPS client" Any ideas how I can force AlertManager to cooperate?

Background: as described here, I'm trying to force Prometheus to use mTLS when communicating with Alertmanager.

====

I have followed your guide; however, the Alertmanagers don't seem to be able to discover each other. I get errors like these:

Failed to join 10.9.12.9: read tcp 10.9.11.9:56142->10.9.12.9:9094: read: connection reset by peer

I believe it happens when an Alertmanager instance (I have three running) starts up, fetches the IPs of the other two and tries syncing with them.

Any idea why this might happen?

As you previously described, pod-to-pod communication using IPs should not be interfered with by the Istio sidecar.

In the envoy proxy config of one of the AlertManagers I see the following:

10.8.30.232   9093  Trans: raw_buffer; App: HTTP  Route: alertmanager-main.monitoring.svc.cluster.local:9093
10.8.30.232   9093  ALL                           Cluster: outbound|9093||alertmanager-main.monitoring.svc.cluster.local
10.9.11.9     9093  Trans: raw_buffer; App: HTTP  Route: alertmanager-operated.monitoring.svc.cluster.local:9093
10.9.11.9     9093  ALL                           Cluster: outbound|9093||alertmanager-operated.monitoring.svc.cluster.local
10.9.12.9     9093  Trans: raw_buffer; App: HTTP  Route: alertmanager-operated.monitoring.svc.cluster.local:9093
10.9.12.9     9093  ALL                           Cluster: outbound|9093||alertmanager-operated.monitoring.svc.cluster.local
10.9.13.10    9093  Trans: raw_buffer; App: HTTP  Route: alertmanager-operated.monitoring.svc.cluster.local:9093
10.9.13.10    9093  ALL                           Cluster: outbound|9093||alertmanager-operated.monitoring.svc.cluster.local
10.9.11.9     9094  ALL                           Cluster: outbound|9094||alertmanager-operated.monitoring.svc.cluster.local
10.9.12.9     9094  ALL                           Cluster: outbound|9094||alertmanager-operated.monitoring.svc.cluster.local
10.9.13.10    9094  ALL                           Cluster: outbound|9094||alertmanager-operated.monitoring.svc.cluster.local


mjhoffman65 commented Dec 6, 2021

Can this be re-opened?
