[Metricbeat][vSphere] Support for configurable IntervalId for performance API (#40678)

* initial commit for intervalId supports for performance metrics

* update docs and fix CI

* Add changelog entry

* fix CI

* resolve review comments

* fix loggers

* resolved review comments

* update versions

* update UTs

* update integration tests

* 10s -> 20s

* Update CHANGELOG.next.asciidoc

Co-authored-by: Aman <38116245+devamanv@users.noreply.github.com>

* Update metricbeat/docs/modules/vsphere.asciidoc

Co-authored-by: Aman <38116245+devamanv@users.noreply.github.com>

* make update

* add recover for ToMetricSeries panic

* return error instead just logging it.

* remove restriction of interval IDs

* remove unnecessary validations

* remove recover and add empty condition

* update changelog entry

* Fix wrapping of errors in loggers

* update data.json

* update data.json

* fix CI and loggers

* update changelog entries

* make update

* fix changelog entries

* update changelog entry

---------

Co-authored-by: Aman <38116245+devamanv@users.noreply.github.com>
kush-elastic and devamanv authored Sep 11, 2024
1 parent a94ff04 commit c75a7a4
Showing 20 changed files with 434 additions and 256 deletions.
13 changes: 7 additions & 6 deletions CHANGELOG.next.asciidoc
@@ -55,8 +55,6 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]
- Allow metricsets to report their status via control v2 protocol. {pull}40025[40025]
- Remove fallback to the node limit for the `kubernetes.pod.cpu.usage.limit.pct` and `kubernetes.pod.memory.usage.limit.pct` metrics calculation
- Add support for Kibana status metricset in v8 format {pull}40275[40275]
- Add new metrics for the vSphere Datastore metricset. {pull}40441[40441]
- Update metrics for the vSphere Host metricset. {pull}40429[40429]
- Mark system process metricsets as running if metrics are partially available {pull}40565[40565]
- Added back `elasticsearch.node.stats.jvm.mem.pools.*` to the `node_stats` metricset {pull}40571[40571]

@@ -293,7 +291,6 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]
- Improve logging in Okta Entity Analytics provider. {issue}40106[40106] {pull}40347[40347]
- Document `winlog` input. {issue}40074[40074] {pull}40462[40462]
- Added retry logic to websocket connections in the streaming input. {issue}40271[40271] {pull}40601[40601]
- Add new metricset cluster for the vSphere module. {pull}40536[40536]
- Disable event normalization for netflow input {pull}40635[40635]
- Allow attribute selection in the Active Directory entity analytics provider. {issue}40482[40482] {pull}40662[40662]
- Improve error quality when CEL program does not correctly return an events array. {pull}40580[40580]
@@ -324,16 +321,20 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]
- Add SSL support to mysql module {pull}37997[37997]
- Add SSL support for aerospike module {pull}38126[38126]
- Add `use_kubeadm` config option in kubernetes module in order to toggle kubeadm-config api requests {pull}40086[40086]
- Log the total time taken for GCP `ListTimeSeries` and `AggregatedList` requests {pull}40661[40661]
- Add new metrics for the vSphere Host metricset. {pull}40429[40429]
- Add new metrics for the vSphere Datastore metricset. {pull}40441[40441]
- Add new metricset cluster for the vSphere module. {pull}40536[40536]
- Add new metricset network for the vSphere module. {pull}40559[40559]
- Add new metricset resourcepool for the vSphere module. {pull}40456[40456]
- Log the total time taken for GCP `ListTimeSeries` and `AggregatedList` requests {pull}40661[40661]
- Add metrics for the vSphere Virtualmachine metricset. {pull}40485[40485]
- Add new metricset datastorecluster for vSphere module. {pull}40634[40634] {pull}40694[40694]
- Add new metrics for the vSphere Virtualmachine metricset. {pull}40485[40485]
- Add support for snapshot in vSphere virtualmachine metricset {pull}40683[40683]
- Update fields to use mapstr in vSphere virtualmachine metricset {pull}40707[40707]
- Add support for period based intervalID in vSphere host and datastore metricsets {pull}40678[40678]

*Metricbeat*

- Add support for new metrics for vSphere module datastorecluster metricset. {pull}40694[40694]

*Osquerybeat*

85 changes: 80 additions & 5 deletions metricbeat/docs/modules/vsphere.asciidoc
@@ -9,14 +9,79 @@ This file is generated! See scripts/mage/docs_collector.go
[[metricbeat-module-vsphere]]
== vSphere module

The vSphere module uses the https://github.com/vmware/govmomi[Govmomi] library to collect metrics from any Vmware SDK URL (ESXi/VCenter). This library is built for and tested against ESXi and vCenter 5.5, 6.0 and 6.5.
The vSphere module uses the https://github.com/vmware/govmomi[Govmomi] library to collect metrics from any VMware SDK URL (ESXi/VCenter).

By default it enables the metricsets `cluster`, `datastore`, `datastorecluster`, `host`, `network`, `resourcepool` and `virtualmachine`.
This module has been tested against ESXi and vCenter versions 5.5, 6.0, 6.5, and 7.0.3.

By default, the vSphere module enables the following metricsets:

1. cluster

2. datastore

3. datastorecluster

4. host

5. network

6. resourcepool

7. virtualmachine

[float]
=== Supported Periods:
The Datastore and Host metricsets support performance data collection using the vSphere performance API. Given that the performance API imposes usage restrictions based on data collection intervals, users should configure the period optimally to ensure the receipt of real-time data. This configuration can be determined based on the https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-247646EA-A04B-411A-8DD4-62A3DCFCF49B.html[Data Collection Intervals] and https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-25800DE4-68E5-41CC-82D9-8811E27924BC.html[Data Collection Levels].

[IMPORTANT]

Only host and datastore metricsets have limitation of system configured period from vSphere instance. Users can still collect summary metrics if performance metrics are not supported for the configured instance.

[float]
==== Real-time data collection default interval:
- 20s

[float]
==== Historical data collection default intervals:
- 300s
- 1800s
- 7200s
- 86400s
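
As a rough illustration of the interval lists above, the following Go sketch checks a configured `period` against the default vSphere collection intervals and derives an interval ID in seconds. The names `defaultIntervals`, `classifyPeriod`, and `intervalID` are hypothetical and are not taken from this commit. A given vSphere instance may be configured with different intervals, so the table is an assumption and not the module's actual logic.

```go
package main

import (
	"fmt"
	"time"
)

// defaultIntervals maps the default collection intervals listed above to
// their kind. Assumption: a real vSphere instance may be configured differently.
var defaultIntervals = map[time.Duration]string{
	20 * time.Second:    "real-time",
	300 * time.Second:   "historical",
	1800 * time.Second:  "historical",
	7200 * time.Second:  "historical",
	86400 * time.Second: "historical",
}

// classifyPeriod (hypothetical helper) reports which default interval kind a
// period string such as "20s" matches. ok is false for unparsable periods and
// for periods that match none of the defaults.
func classifyPeriod(period string) (kind string, ok bool) {
	d, err := time.ParseDuration(period)
	if err != nil {
		return "", false
	}
	kind, ok = defaultIntervals[d]
	return kind, ok
}

// intervalID (hypothetical helper) converts a period string to whole seconds,
// the unit the vSphere performance API uses for interval IDs.
func intervalID(period string) (int32, error) {
	d, err := time.ParseDuration(period)
	if err != nil {
		return 0, fmt.Errorf("invalid period %q: %w", period, err)
	}
	return int32(d / time.Second), nil
}

func main() {
	for _, p := range []string{"20s", "300s", "10s"} {
		kind, ok := classifyPeriod(p)
		id, err := intervalID(p)
		fmt.Println(p, id, err, kind, ok)
	}
}
```

A period such as `10s` parses correctly but matches none of the defaults, which is the case where, per the note above, only summary metrics may be available for the host and datastore metricsets.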

[float]
=== Example:
If you need to configure multiple metricsets with different periods, you can achieve this by setting up multiple vSphere modules with different metricsets as demonstrated below:

[source,yaml]
----
- module: vsphere
metricsets:
- cluster
- datastorecluster
- network
- resourcepool
- virtualmachine
period: 10s
hosts: ["https://localhost/sdk"]
username: "user"
password: "password"
insecure: false
- module: vsphere
metricsets:
- datastore
- host
period: 300s
hosts: ["https://localhost/sdk"]
username: "user"
password: "password"
insecure: false
----

[float]
=== Dashboard

The vsphere module comes with a predefined dashboard. For example:
The vSphere module includes a predefined dashboard. For example:

image::./images/metricbeat_vsphere_dashboard.png[]
image::./images/metricbeat_vsphere_vm_dashboard.png[]
@@ -36,15 +101,25 @@ metricbeat.modules:
- module: vsphere
enabled: true
metricsets: ["cluster", "datastore", "datastorecluster", "host", "network", "resourcepool", "virtualmachine"]
# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds.
# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds by default.
# Supported Periods:
# The Datastore and Host metricsets support performance data collection using the vSphere performance API.
# Since the performance API has usage restrictions based on data collection intervals,
# users should ensure that the period is configured optimally to receive real-time data.
# users can still collect summary metrics if performance metrics are not supported for the configured instance.
# This configuration can be determined based on the Data Collection Intervals and Data Collection Levels.
# Reference Links:
# Data Collection Intervals: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-247646EA-A04B-411A-8DD4-62A3DCFCF49B.html
# Data Collection Levels: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-25800DE4-68E5-41CC-82D9-8811E27924BC.html
period: 20s
hosts: ["https://localhost/sdk"]
username: "user"
password: "password"
# If insecure is true, don't verify the server's certificate chain
insecure: false
# Get custom fields when using virtualmachine metric set. Default false.
# Get custom fields when using virtualmachine metricset. Default false.
# get_custom_fields: false
----

14 changes: 12 additions & 2 deletions metricbeat/metricbeat.reference.yml
@@ -1008,15 +1008,25 @@ metricbeat.modules:
- module: vsphere
enabled: true
metricsets: ["cluster", "datastore", "datastorecluster", "host", "network", "resourcepool", "virtualmachine"]
# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds.

# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds by default.
# Supported Periods:
# The Datastore and Host metricsets support performance data collection using the vSphere performance API.
# Since the performance API has usage restrictions based on data collection intervals,
# users should ensure that the period is configured optimally to receive real-time data.
# users can still collect summary metrics if performance metrics are not supported for the configured instance.
# This configuration can be determined based on the Data Collection Intervals and Data Collection Levels.
# Reference Links:
# Data Collection Intervals: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-247646EA-A04B-411A-8DD4-62A3DCFCF49B.html
# Data Collection Levels: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-25800DE4-68E5-41CC-82D9-8811E27924BC.html
period: 20s
hosts: ["https://localhost/sdk"]

username: "user"
password: "password"
# If insecure is true, don't verify the server's certificate chain
insecure: false
# Get custom fields when using virtualmachine metric set. Default false.
# Get custom fields when using virtualmachine metricset. Default false.
# get_custom_fields: false

#------------------------------- Windows Module -------------------------------
14 changes: 12 additions & 2 deletions metricbeat/module/vsphere/_meta/config.reference.yml
@@ -1,13 +1,23 @@
- module: vsphere
enabled: true
metricsets: ["cluster", "datastore", "datastorecluster", "host", "network", "resourcepool", "virtualmachine"]
# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds.

# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds by default.
# Supported Periods:
# The Datastore and Host metricsets support performance data collection using the vSphere performance API.
# Since the performance API has usage restrictions based on data collection intervals,
# users should ensure that the period is configured optimally to receive real-time data.
# users can still collect summary metrics if performance metrics are not supported for the configured instance.
# This configuration can be determined based on the Data Collection Intervals and Data Collection Levels.
# Reference Links:
# Data Collection Intervals: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-247646EA-A04B-411A-8DD4-62A3DCFCF49B.html
# Data Collection Levels: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-25800DE4-68E5-41CC-82D9-8811E27924BC.html
period: 20s
hosts: ["https://localhost/sdk"]

username: "user"
password: "password"
# If insecure is true, don't verify the server's certificate chain
insecure: false
# Get custom fields when using virtualmachine metric set. Default false.
# Get custom fields when using virtualmachine metricset. Default false.
# get_custom_fields: false
14 changes: 12 additions & 2 deletions metricbeat/module/vsphere/_meta/config.yml
@@ -7,13 +7,23 @@
# - network
# - resourcepool
# - virtualmachine
# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds.

# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds by default.
# Supported Periods:
# The Datastore and Host metricsets support performance data collection using the vSphere performance API.
# Since the performance API has usage restrictions based on data collection intervals,
# users should ensure that the period is configured optimally to receive real-time data.
# users can still collect summary metrics if performance metrics are not supported for the configured instance.
# This configuration can be determined based on the Data Collection Intervals and Data Collection Levels.
# Reference Links:
# Data Collection Intervals: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-247646EA-A04B-411A-8DD4-62A3DCFCF49B.html
# Data Collection Levels: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-25800DE4-68E5-41CC-82D9-8811E27924BC.html
period: 20s
hosts: ["https://localhost/sdk"]

username: "user"
password: "password"
# If insecure is true, don't verify the server's certificate chain
insecure: false
# Get custom fields when using virtualmachine metric set. Default false.
# Get custom fields when using virtualmachine metricset. Default false.
# get_custom_fields: false
71 changes: 68 additions & 3 deletions metricbeat/module/vsphere/_meta/docs.asciidoc
@@ -1,11 +1,76 @@
The vSphere module uses the https://github.com/vmware/govmomi[Govmomi] library to collect metrics from any Vmware SDK URL (ESXi/VCenter). This library is built for and tested against ESXi and vCenter 5.5, 6.0 and 6.5.
The vSphere module uses the https://github.com/vmware/govmomi[Govmomi] library to collect metrics from any VMware SDK URL (ESXi/VCenter).

By default it enables the metricsets `cluster`, `datastore`, `datastorecluster`, `host`, `network`, `resourcepool` and `virtualmachine`.
This module has been tested against ESXi and vCenter versions 5.5, 6.0, 6.5, and 7.0.3.

By default, the vSphere module enables the following metricsets:

1. cluster
2. datastore
3. datastorecluster
4. host
5. network
6. resourcepool
7. virtualmachine
[float]
=== Supported Periods:
The Datastore and Host metricsets support performance data collection using the vSphere performance API. Given that the performance API imposes usage restrictions based on data collection intervals, users should configure the period optimally to ensure the receipt of real-time data. This configuration can be determined based on the https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-247646EA-A04B-411A-8DD4-62A3DCFCF49B.html[Data Collection Intervals] and https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-25800DE4-68E5-41CC-82D9-8811E27924BC.html[Data Collection Levels].

[IMPORTANT]

Only host and datastore metricsets have limitation of system configured period from vSphere instance. Users can still collect summary metrics if performance metrics are not supported for the configured instance.

[float]
==== Real-time data collection default interval:
- 20s

[float]
==== Historical data collection default intervals:
- 300s
- 1800s
- 7200s
- 86400s

[float]
=== Example:
If you need to configure multiple metricsets with different periods, you can achieve this by setting up multiple vSphere modules with different metricsets as demonstrated below:

[source,yaml]
----
- module: vsphere
metricsets:
- cluster
- datastorecluster
- network
- resourcepool
- virtualmachine
period: 10s
hosts: ["https://localhost/sdk"]
username: "user"
password: "password"
insecure: false
- module: vsphere
metricsets:
- datastore
- host
period: 300s
hosts: ["https://localhost/sdk"]
username: "user"
password: "password"
insecure: false
----

[float]
=== Dashboard

The vsphere module comes with a predefined dashboard. For example:
The vSphere module includes a predefined dashboard. For example:

image::./images/metricbeat_vsphere_dashboard.png[]
image::./images/metricbeat_vsphere_vm_dashboard.png[]
4 changes: 2 additions & 2 deletions metricbeat/module/vsphere/cluster/cluster.go
@@ -75,7 +75,7 @@ func (m *ClusterMetricSet) Fetch(ctx context.Context, reporter mb.ReporterV2) er
}
defer func() {
if err := client.Logout(ctx); err != nil {
m.Logger().Debug(fmt.Errorf("error trying to logout from vSphere: %w", err))
m.Logger().Errorf("error trying to logout from vSphere: %v", err)
}
}()

@@ -91,7 +91,7 @@ func (m *ClusterMetricSet) Fetch(ctx context.Context, reporter mb.ReporterV2) er

defer func() {
if err := v.Destroy(ctx); err != nil {
m.Logger().Errorf("error trying to destroy view from vSphere: %w", err)
m.Logger().Errorf("error trying to destroy view from vSphere: %v", err)
}
}()
