[Meta] Istio Metricbeat Module #15505

Closed
11 tasks
ChrsMark opened this issue Jan 13, 2020 · 12 comments
Labels
enhancement, Metricbeat, Stalled, Team:Integrations, Team:Platforms

Comments

@ChrsMark
Member

ChrsMark commented Jan 13, 2020

Istio Metricbeat Module

This issue aims to track any implementation, concern, discussion, or proposal around the Istio integration with Metricbeat.

Istio: https://istio.io/
Istio Metrics:

Metricbeat Module / Dataset release checklist

This checklist is intended for developers who create or update a module, to make sure modules are consistent.

Modules

For a metricset to go GA, the following criteria should be met:

  • Supported versions are documented
  • Supported operating systems are documented (if applicable)
  • Integration tests exist
  • System tests exist
  • Automated checks that all fields are documented
  • Documentation
  • Fields follow ECS and naming conventions
  • Dashboards exist (if applicable)
  • Kibana Home Tutorial (if applicable)
    • Open PR against Kibana repo with tutorial. Examples can be found here.

Metricbeat module

  • Example data.json exists and an automated way to generate it exists (go test -data)
  • Test environment in Docker exists for integration tests

cc: @exekias

@ChrsMark ChrsMark self-assigned this Jan 13, 2020
@ChrsMark
Member Author

ChrsMark commented Jan 14, 2020

We should scrape metrics from each exporter and not from the Prometheus federate API to avoid problems with types as mentioned at #15535 (comment).

Proposed Metricsets

istio-telemetry.istio-system:42422: The istio-mesh job returns all Mixer-generated metrics.
istio-telemetry.istio-system:15014: The istio-telemetry job returns all Mixer-specific metrics. Use this endpoint to monitor Mixer itself.
istio-proxy:15090: The envoy-stats job returns raw stats generated by Envoy. Prometheus is configured to look for pods with the envoy-prom endpoint exposed. The add-on configuration filters out a large number of envoy metrics during collection in an attempt to limit the scale of data by the add-on processes.
istio-pilot.istio-system:15014: The pilot job returns the Pilot-generated metrics.
istio-galley.istio-system:15014: The galley job returns the Galley-generated metrics.
istio-policy.istio-system:15014: The istio-policy job returns all policy-related metrics.
istio-citadel.istio-system:15014: The istio-citadel job returns all Citadel-generated metrics.
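
As a rough illustration of the "one scrape target per exporter" idea, the list above can be written down as a lookup table. This is only a sketch: the short metricset names used as keys and the default `/metrics` path are assumptions made for illustration, while the hosts and ports are taken from the list. Individual exporters may serve metrics on a different path.

```python
# Hypothetical metricset names (keys) mapped to the scrape targets listed
# above. Hosts and ports come from the list; the key names are assumptions.
PROPOSED_TARGETS = {
    "mesh": "istio-telemetry.istio-system:42422",
    "mixer": "istio-telemetry.istio-system:15014",
    "proxy": "istio-proxy:15090",
    "pilot": "istio-pilot.istio-system:15014",
    "galley": "istio-galley.istio-system:15014",
    "policy": "istio-policy.istio-system:15014",
    "citadel": "istio-citadel.istio-system:15014",
}


def scrape_url(metricset: str, path: str = "/metrics") -> str:
    """Build the URL a collector would scrape for one metricset.

    The default path is an assumption; pass a different one if an
    exporter serves its metrics elsewhere.
    """
    host = PROPOSED_TARGETS[metricset]
    return f"http://{host}{path}"


if __name__ == "__main__":
    for name in PROPOSED_TARGETS:
        print(name, scrape_url(name))
```

Scraping each target directly keeps the metric type information that each exporter reports, which is the motivation for avoiding the federate API here.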

Tasks

  • Implement metricsets
  • Refactor histogram type metrics on top of new ES Histograms, issue
  • Create Dashboards
  • Explore possible integration with ES graphs

@ioandr
Contributor

ioandr commented Jan 15, 2020

Hi folks,

this issue seems very interesting and I would like to contribute some dev cycles on this. For a start, I could take on some of the proposed metricsets, e.g., the Pilot metrics.

@ChrsMark WDYT?

@ChrsMark
Member Author

This is great @ioandr, feel free to work on this!

@elasticmachine
Collaborator

Pinging @elastic/integrations-platforms (Team:Platforms)

@ChrsMark
Member Author

We may need to revisit this one after the introduction of istiod (https://istio.io/latest/news/releases/1.5.x/announcing-1.5/#introducing-istiod).

@ChrsMark
Member Author

I investigated monitoring for the new istiod. With this new monolithic approach, the standalone Prometheus exporters of the previous microservices no longer exist. Currently all metrics are exposed by a central Prometheus server, and not all of the internal endpoints that Prometheus scrapes are exposed directly to the outside.

Maybe, in order to support the newer versions, we should go with a lightweight module on top of the Prometheus module that collects all the related metrics from the Prometheus federate API.
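
For reference, a federate request is an HTTP GET against the Prometheus server's `/federate` path with one or more `match[]` selectors. The sketch below only builds such a URL. The base URL and the job names in the selectors are hypothetical placeholders, not values confirmed for any Istio release.

```python
from typing import Iterable
from urllib.parse import parse_qs, urlencode, urlsplit


def federate_url(base: str, selectors: Iterable[str]) -> str:
    """Build a Prometheus /federate URL with one match[] per selector."""
    # urlencode percent-encodes the brackets, braces, and quotes for us.
    query = urlencode([("match[]", selector) for selector in selectors])
    return f"{base.rstrip('/')}/federate?{query}"


if __name__ == "__main__":
    # Hypothetical Prometheus address and job selectors, for illustration.
    url = federate_url(
        "http://prometheus.istio-system:9090",
        ['{job="pilot"}', '{job="galley"}'],
    )
    print(url)
    # Decode the query again to show which selectors the server receives.
    print(parse_qs(urlsplit(url).query)["match[]"])
```

A lightweight module would point the Prometheus collector at a URL of this shape and restrict the selectors to the Istio-related series.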

@exekias
Contributor

exekias commented Jul 16, 2020

@ChrsMark I'm wondering... is Prometheus server a mandatory piece in Istio deployments?

@ChrsMark
Member Author

> @ChrsMark I'm wondering... is Prometheus server a mandatory piece in Istio deployments?

From https://preliminary.istio.io/latest/docs/ops/best-practices/observability/#using-prometheus-for-production-scale-monitoring, it seems so:

In default deployments of Istio, a deployment of Prometheus is provided for collecting metrics generated
for all mesh traffic. This deployment of Prometheus is intentionally deployed with a very short retention
window (6 hours). 

@ChrsMark
Member Author

However, after installing 1.7.1 (https://istio.io/latest/docs/setup/getting-started/), it seems that istiod exposes metrics on port 15014:

Port:              http-monitoring  15014/TCP
TargetPort:        15014/TCP
Endpoints:         172.17.0.14:15014

To access the metrics endpoint, run: kubectl -n istio-system port-forward svc/istiod 15014:15014
Then hit localhost:15014/metrics.
Sample metrics:
istiod.txt

Pilot, Mixer, Galley, and Citadel metrics are exposed. So, could we create a new metricset called istiod to cover the new versions?
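
A quick way to check which components show up in the istiod output is to group metric names by their prefix. The sketch below parses Prometheus text-format lines and buckets the names by the part before the first underscore. The sample lines are made up for illustration and are not copied from istiod.txt; treating the prefix as the component name is also an assumption.

```python
from collections import defaultdict

# Illustrative sample in Prometheus text format (not real istiod output).
SAMPLE = """\
# HELP pilot_xds Example help text.
# TYPE pilot_xds gauge
pilot_xds 2
pilot_xds_pushes{type="cds"} 10
galley_validation_passed 5
citadel_server_csr_count 3
"""


def group_by_component(text: str) -> dict:
    """Group metric names by the prefix before the first underscore."""
    groups = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and HELP/TYPE comments.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first '{' (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        component = name.split("_", 1)[0]
        groups[component].append(name)
    return dict(groups)


if __name__ == "__main__":
    for component, names in group_by_component(SAMPLE).items():
        print(component, names)
```

With the port-forward running, the same function could be fed the body returned by localhost:15014/metrics instead of the sample string.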

cc: @masci

@exekias
Contributor

exekias commented Sep 16, 2020

Does istiod provide the same level of detail? IIRC mixer and others were pretty verbose

@ChrsMark
Member Author

ChrsMark commented Oct 15, 2020

Latest implementations to support new versions of Istio (>=1.5):

Dashboards have been added for these new metricsets.

Note: The above metricsets are BETA. The reason is that they are implemented as lightweight modules on top of the prometheus module and make use of the use_types setting. This setting is still in BETA, so we cannot move these metricsets to GA before the setting itself moves to GA (the setting is expected to change in the near future because of elastic/elasticsearch#61939).

@botelastic

botelastic bot commented Dec 20, 2022

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:.
Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Dec 20, 2022
@botelastic botelastic bot closed this as completed Jun 18, 2023
@zube zube bot removed the [zube]: Done label Sep 17, 2023

5 participants