[Metricbeat][vSphere] Support for configurable IntervalId for performance API #40678

Merged
36 commits
04ae955: initial commit for intervalId supports for performance metrics (kush-elastic, Sep 3, 2024)
2e5d6a1: Merge branch 'main' into 36-vsphere-support-for-configurable-interval… (kush-elastic, Sep 4, 2024)
831b070: Merge branch 'main' into 36-vsphere-support-for-configurable-interval… (kush-elastic, Sep 5, 2024)
1704e6e: update docs and fix CI (kush-elastic, Sep 5, 2024)
2e995ba: Add changelog entry (kush-elastic, Sep 5, 2024)
3727a3b: fix CI (kush-elastic, Sep 5, 2024)
416177c: resolve review comments (kush-elastic, Sep 5, 2024)
d2ac1b6: fix loggers (kush-elastic, Sep 5, 2024)
23fd9e4: resolved review comments (kush-elastic, Sep 5, 2024)
8da1d2a: update versions (kush-elastic, Sep 6, 2024)
ab175fd: update UTs (kush-elastic, Sep 6, 2024)
6b08b73: update integration tests (kush-elastic, Sep 6, 2024)
48e72b7: 10s -> 20s (kush-elastic, Sep 6, 2024)
24bfb61: Merge branch 'main' into 36-vsphere-support-for-configurable-interval… (kush-elastic, Sep 6, 2024)
7b38b73: Update CHANGELOG.next.asciidoc (kush-elastic, Sep 6, 2024)
587aac6: Update metricbeat/docs/modules/vsphere.asciidoc (kush-elastic, Sep 6, 2024)
d5cd481: make update (kush-elastic, Sep 6, 2024)
0c4cb23: add recover for ToMetricSeries panic (kush-elastic, Sep 6, 2024)
7907f68: return error instead just logging it. (kush-elastic, Sep 6, 2024)
1806e5c: remove restriction of interval IDs (kush-elastic, Sep 9, 2024)
417703a: remove unnecessary validations (kush-elastic, Sep 9, 2024)
7fc5691: Merge branch 'main' into 36-vsphere-support-for-configurable-interval… (kush-elastic, Sep 9, 2024)
ad1c095: remove recover and add empty condition (kush-elastic, Sep 9, 2024)
c538f15: update changelog entry (kush-elastic, Sep 9, 2024)
0a156c8: Merge branch 'main' into 36-vsphere-support-for-configurable-interval… (kush-elastic, Sep 9, 2024)
9808c1e: Fix wrapping of errors in loggers (kush-elastic, Sep 9, 2024)
5a79756: update data.json (kush-elastic, Sep 10, 2024)
a2ee751: update data.json (kush-elastic, Sep 10, 2024)
09f540c: fix CI and loggers (kush-elastic, Sep 10, 2024)
d17fa70: Merge branch 'main' into 36-vsphere-support-for-configurable-interval… (kush-elastic, Sep 10, 2024)
4013033: update changelog entries (kush-elastic, Sep 10, 2024)
3edf243: make update (kush-elastic, Sep 10, 2024)
2c55bfe: Merge branch 'main' of https://github.com/kush-elastic/beats into 36-… (kush-elastic, Sep 10, 2024)
8f162b3: fix changelog entries (kush-elastic, Sep 10, 2024)
99be228: update changelog entry (kush-elastic, Sep 10, 2024)
1002544: Merge branch 'main' into 36-vsphere-support-for-configurable-interval… (kush-elastic, Sep 10, 2024)
11 changes: 6 additions & 5 deletions CHANGELOG.next.asciidoc
@@ -56,8 +56,6 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]
- Remove fallback to the node limit for the `kubernetes.pod.cpu.usage.limit.pct` and `kubernetes.pod.memory.usage.limit.pct` metrics calculation
- Add support for Kibana status metricset in v8 format {pull}40275[40275]
- Add metrics for the vSphere Virtualmachine metricset. {pull}40485[40485]
- Add new metrics for the vSphere Datastore metricset. {pull}40441[40441]
- Update metrics for the vSphere Host metricset. {pull}40429[40429]
- Mark system process metricsets as running if metrics are partially available {pull}40565[40565]
- Added back `elasticsearch.node.stats.jvm.mem.pools.*` to the `node_stats` metricset {pull}40571[40571]
- Add support for snapshot in vSphere virtualmachine metricset {pull}40683[40683]
@@ -293,7 +291,6 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]
- Improve logging in Okta Entity Analytics provider. {issue}40106[40106] {pull}40347[40347]
- Document `winlog` input. {issue}40074[40074] {pull}40462[40462]
- Added retry logic to websocket connections in the streaming input. {issue}40271[40271] {pull}40601[40601]
- Add new metricset cluster for the vSphere module. {pull}40536[40536]
- Disable event normalization for netflow input {pull}40635[40635]
- Allow attribute selection in the Active Directory entity analytics provider. {issue}40482[40482] {pull}40662[40662]
- Improve error quality when CEL program does not correctly return an events array. {pull}40580[40580]
@@ -324,13 +321,17 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]
- Add SSL support to mysql module {pull}37997[37997]
- Add SSL support for aerospike module {pull}38126[38126]
- Add `use_kubeadm` config option in kubernetes module in order to toggle kubeadm-config api requests {pull}40086[40086]
- Log the total time taken for GCP `ListTimeSeries` and `AggregatedList` requests {pull}40661[40661]
- Add new metrics for the vSphere Host metricset. {pull}40429[40429]
- Add new metrics for the vSphere Datastore metricset. {pull}40441[40441]
- Add new metricset cluster for the vSphere module. {pull}40536[40536]
- Add new metricset network for the vSphere module. {pull}40559[40559]
- Add new metricset resourcepool for the vSphere module. {pull}40456[40456]
- Log the total time taken for GCP `ListTimeSeries` and `AggregatedList` requests {pull}40661[40661]
- Add new metricset datastorecluster for vSphere module. {pull}40634[40634] {pull}40694[40694]
- Add support for period based intervalID in vSphere host and datastore metricsets {pull}40678[40678]

*Metricbeat*

- Add support for new metrics for vSphere module datastorecluster metricset. {pull}40694[40694]

*Osquerybeat*

85 changes: 80 additions & 5 deletions metricbeat/docs/modules/vsphere.asciidoc
@@ -9,14 +9,79 @@ This file is generated! See scripts/mage/docs_collector.go
[[metricbeat-module-vsphere]]
== vSphere module

The vSphere module uses the https://github.com/vmware/govmomi[Govmomi] library to collect metrics from any Vmware SDK URL (ESXi/VCenter). This library is built for and tested against ESXi and vCenter 5.5, 6.0 and 6.5.
The vSphere module uses the https://github.com/vmware/govmomi[Govmomi] library to collect metrics from any VMware SDK URL (ESXi/VCenter).

By default it enables the metricsets `cluster`, `datastore`, `datastorecluster`, `host`, `network`, `resourcepool` and `virtualmachine`.
This module has been tested against ESXi and vCenter versions 5.5, 6.0, 6.5, and 7.0.3.

By default, the vSphere module enables the following metricsets:

1. cluster

2. datastore

3. datastorecluster

4. host

5. network

6. resourcepool

7. virtualmachine

[float]
=== Supported Periods:
The Datastore and Host metricsets support performance data collection using the vSphere performance API. Given that the performance API imposes usage restrictions based on data collection intervals, users should configure the period optimally to ensure the receipt of real-time data. This configuration can be determined based on the https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-247646EA-A04B-411A-8DD4-62A3DCFCF49B.html[Data Collection Intervals] and https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-25800DE4-68E5-41CC-82D9-8811E27924BC.html[Data Collection Levels].

[IMPORTANT]

Only host and datastore metricsets have limitation of system configured period from vSphere instance. Users can still collect summary metrics if performance metrics are not supported for the configured instance.

[float]
==== Real-time data collection default interval:
- 20s

[float]
==== Historical data collection default intervals:
- 300s
- 1800s
- 7200s
- 86400s

[float]
=== Example:
If you need to configure multiple metricsets with different periods, you can achieve this by setting up multiple vSphere modules with different metricsets as demonstrated below:

[source,yaml]
----
- module: vsphere
  metricsets:
    - cluster
    - datastorecluster
    - network
    - resourcepool
    - virtualmachine
  period: 10s
  hosts: ["https://localhost/sdk"]
  username: "user"
  password: "password"
  insecure: false

- module: vsphere
  metricsets:
    - datastore
    - host
  period: 300s
  hosts: ["https://localhost/sdk"]
  username: "user"
  password: "password"
  insecure: false
----

[float]
=== Dashboard

The vsphere module comes with a predefined dashboard. For example:
The vSphere module includes a predefined dashboard. For example:

image::./images/metricbeat_vsphere_dashboard.png[]
image::./images/metricbeat_vsphere_vm_dashboard.png[]
@@ -36,15 +101,25 @@ metricbeat.modules:
- module: vsphere
enabled: true
metricsets: ["cluster", "datastore", "datastorecluster", "host", "network", "resourcepool", "virtualmachine"]
# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds.

# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds by default.
# Supported Periods:
# The Datastore and Host metricsets support performance data collection using the vSphere performance API.
Review comment (Contributor): Shall we also mention that this will not impact metrics collection other than perf metrics?

Reply (Collaborator, Author): Nice idea. Let me do that.

# Since the performance API has usage restrictions based on data collection intervals,
# users should ensure that the period is configured optimally to receive real-time data.
# users can still collect summary metrics if performance metrics are not supported for the configured instance.
# This configuration can be determined based on the Data Collection Intervals and Data Collection Levels.
# Reference Links:
# Data Collection Intervals: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-247646EA-A04B-411A-8DD4-62A3DCFCF49B.html
# Data Collection Levels: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-25800DE4-68E5-41CC-82D9-8811E27924BC.html
period: 20s
hosts: ["https://localhost/sdk"]

username: "user"
password: "password"
# If insecure is true, don't verify the server's certificate chain
insecure: false
# Get custom fields when using virtualmachine metric set. Default false.
# Get custom fields when using virtualmachine metricset. Default false.
# get_custom_fields: false
----

14 changes: 12 additions & 2 deletions metricbeat/metricbeat.reference.yml
@@ -1008,15 +1008,25 @@ metricbeat.modules:
- module: vsphere
enabled: true
metricsets: ["cluster", "datastore", "datastorecluster", "host", "network", "resourcepool", "virtualmachine"]
# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds.

# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds by default.
# Supported Periods:
# The Datastore and Host metricsets support performance data collection using the vSphere performance API.
# Since the performance API has usage restrictions based on data collection intervals,
# users should ensure that the period is configured optimally to receive real-time data.
# users can still collect summary metrics if performance metrics are not supported for the configured instance.
# This configuration can be determined based on the Data Collection Intervals and Data Collection Levels.
# Reference Links:
# Data Collection Intervals: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-247646EA-A04B-411A-8DD4-62A3DCFCF49B.html
# Data Collection Levels: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-25800DE4-68E5-41CC-82D9-8811E27924BC.html
period: 20s
hosts: ["https://localhost/sdk"]

username: "user"
password: "password"
# If insecure is true, don't verify the server's certificate chain
insecure: false
# Get custom fields when using virtualmachine metric set. Default false.
# Get custom fields when using virtualmachine metricset. Default false.
# get_custom_fields: false

#------------------------------- Windows Module -------------------------------
14 changes: 12 additions & 2 deletions metricbeat/module/vsphere/_meta/config.reference.yml
@@ -1,13 +1,23 @@
- module: vsphere
enabled: true
metricsets: ["cluster", "datastore", "datastorecluster", "host", "network", "resourcepool", "virtualmachine"]
# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds.

# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds by default.
# Supported Periods:
# The Datastore and Host metricsets support performance data collection using the vSphere performance API.
# Since the performance API has usage restrictions based on data collection intervals,
# users should ensure that the period is configured optimally to receive real-time data.
# users can still collect summary metrics if performance metrics are not supported for the configured instance.
# This configuration can be determined based on the Data Collection Intervals and Data Collection Levels.
# Reference Links:
# Data Collection Intervals: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-247646EA-A04B-411A-8DD4-62A3DCFCF49B.html
# Data Collection Levels: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-25800DE4-68E5-41CC-82D9-8811E27924BC.html
period: 20s
hosts: ["https://localhost/sdk"]

username: "user"
password: "password"
# If insecure is true, don't verify the server's certificate chain
insecure: false
# Get custom fields when using virtualmachine metric set. Default false.
# Get custom fields when using virtualmachine metricset. Default false.
# get_custom_fields: false
14 changes: 12 additions & 2 deletions metricbeat/module/vsphere/_meta/config.yml
@@ -7,13 +7,23 @@
# - network
# - resourcepool
# - virtualmachine
# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds.

# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds by default.
# Supported Periods:
# The Datastore and Host metricsets support performance data collection using the vSphere performance API.
# Since the performance API has usage restrictions based on data collection intervals,
# users should ensure that the period is configured optimally to receive real-time data.
# users can still collect summary metrics if performance metrics are not supported for the configured instance.
# This configuration can be determined based on the Data Collection Intervals and Data Collection Levels.
# Reference Links:
# Data Collection Intervals: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-247646EA-A04B-411A-8DD4-62A3DCFCF49B.html
# Data Collection Levels: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-25800DE4-68E5-41CC-82D9-8811E27924BC.html
period: 20s
hosts: ["https://localhost/sdk"]

username: "user"
password: "password"
# If insecure is true, don't verify the server's certificate chain
insecure: false
# Get custom fields when using virtualmachine metric set. Default false.
# Get custom fields when using virtualmachine metricset. Default false.
# get_custom_fields: false
71 changes: 68 additions & 3 deletions metricbeat/module/vsphere/_meta/docs.asciidoc
@@ -1,11 +1,76 @@
The vSphere module uses the https://github.com/vmware/govmomi[Govmomi] library to collect metrics from any Vmware SDK URL (ESXi/VCenter). This library is built for and tested against ESXi and vCenter 5.5, 6.0 and 6.5.
The vSphere module uses the https://github.com/vmware/govmomi[Govmomi] library to collect metrics from any VMware SDK URL (ESXi/VCenter).

By default it enables the metricsets `cluster`, `datastore`, `datastorecluster`, `host`, `network`, `resourcepool` and `virtualmachine`.
This module has been tested against ESXi and vCenter versions 5.5, 6.0, 6.5, and 7.0.3.

By default, the vSphere module enables the following metricsets:

1. cluster

2. datastore

3. datastorecluster

4. host

5. network

6. resourcepool

7. virtualmachine

[float]
=== Supported Periods:
The Datastore and Host metricsets support performance data collection using the vSphere performance API. Given that the performance API imposes usage restrictions based on data collection intervals, users should configure the period optimally to ensure the receipt of real-time data. This configuration can be determined based on the https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-247646EA-A04B-411A-8DD4-62A3DCFCF49B.html[Data Collection Intervals] and https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.monitoring.doc/GUID-25800DE4-68E5-41CC-82D9-8811E27924BC.html[Data Collection Levels].

[IMPORTANT]

Only host and datastore metricsets have limitation of system configured period from vSphere instance. Users can still collect summary metrics if performance metrics are not supported for the configured instance.

[float]
==== Real-time data collection default interval:
- 20s

[float]
==== Historical data collection default intervals:
- 300s
- 1800s
- 7200s
- 86400s

[float]
=== Example:
If you need to configure multiple metricsets with different periods, you can achieve this by setting up multiple vSphere modules with different metricsets as demonstrated below:

[source,yaml]
----
- module: vsphere
  metricsets:
    - cluster
    - datastorecluster
    - network
    - resourcepool
    - virtualmachine
  period: 10s
  hosts: ["https://localhost/sdk"]
  username: "user"
  password: "password"
  insecure: false

- module: vsphere
  metricsets:
    - datastore
    - host
  period: 300s
  hosts: ["https://localhost/sdk"]
  username: "user"
  password: "password"
  insecure: false
----

[float]
=== Dashboard

The vsphere module comes with a predefined dashboard. For example:
The vSphere module includes a predefined dashboard. For example:

image::./images/metricbeat_vsphere_dashboard.png[]
image::./images/metricbeat_vsphere_vm_dashboard.png[]
4 changes: 2 additions & 2 deletions metricbeat/module/vsphere/cluster/cluster.go
@@ -75,7 +75,7 @@ func (m *ClusterMetricSet) Fetch(ctx context.Context, reporter mb.ReporterV2) er
 	}
 	defer func() {
 		if err := client.Logout(ctx); err != nil {
-			m.Logger().Debug(fmt.Errorf("error trying to logout from vSphere: %w", err))
+			m.Logger().Errorf("error trying to logout from vSphere: %v", err)
 		}
 	}()

@@ -91,7 +91,7 @@ func (m *ClusterMetricSet) Fetch(ctx context.Context, reporter mb.ReporterV2) er

 	defer func() {
 		if err := v.Destroy(ctx); err != nil {
-			m.Logger().Errorf("error trying to destroy view from vSphere: %w", err)
+			m.Logger().Errorf("error trying to destroy view from vSphere: %v", err)
 		}
 	}()
