[Metricbeat][vSphere] Support for configurable IntervalId for performance API #40678
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
	m.Logger().Errorf("failed to convert performance data to metric series: %v", err)
}

for _, result := range results[0].Value {
We should make sure that results has at least one value, as we have already done for samples. Otherwise, indexing results[0] could panic.
There is another check before this one: if results contains zero samples, the function returns right there, and if anything goes wrong while converting to series, we return with an error.
Got your point. However, looking at the definition of ToMetricSeries, I still see two potential issues with it:
- If the type assertion s, ok := series[i].(*types.PerfEntityMetric) fails, the function panics. This could lead to a crash, unless that behavior is intended. Should we return an error instead of panicking, to handle unexpected input more gracefully?
- The line v := s.Value[j].(*types.PerfMetricIntSeries) assumes that all metrics in s.Value are of type *types.PerfMetricIntSeries. If that's not guaranteed (or there are other possible types), this will panic.
Both cases could result in a crash. Unless we intend to abort execution or expect this to happen, we should handle it.
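For illustration, returning an error from a failed type assertion instead of panicking could look like the sketch below. It does not use the govmomi types: baseSeries, intSeries, csvSeries, and toIntSeries are hypothetical stand-ins chosen only to show the comma-ok form of the assertion.

```go
package main

import "fmt"

// baseSeries is a hypothetical stand-in for an interface with several
// concrete series implementations.
type baseSeries interface{ isSeries() }

type intSeries struct{ Value []int64 }
type csvSeries struct{ Value string }

func (*intSeries) isSeries() {}
func (*csvSeries) isSeries() {}

// toIntSeries converts every element with the comma-ok form of the type
// assertion, so an unexpected concrete type produces an error rather
// than a run-time panic.
func toIntSeries(values []baseSeries) ([]*intSeries, error) {
	out := make([]*intSeries, 0, len(values))
	for i, v := range values {
		s, ok := v.(*intSeries)
		if !ok {
			return nil, fmt.Errorf("series %d: unexpected type %T", i, v)
		}
		out = append(out, s)
	}
	return out, nil
}

func main() {
	mixed := []baseSeries{
		&intSeries{Value: []int64{1, 2}},
		&csvSeries{Value: "1,2"},
	}
	if _, err := toIntSeries(mixed); err != nil {
		fmt.Println("conversion failed:", err)
	}
}
```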
I agree with Aman. The only case in which it returns an error is when CounterInfoByKey fails. Even if that call succeeds, there is no guarantee that the slice returned from ToMetricSeries will have at least one element.
So, ideally, we should handle it here just to be on the safe side.
You are right about the panic.
We should probably handle it.
Either we recover from the panic, or we convert to the series manually.
I can also create a new method, because we are already calling CounterInfoByName and I can just reuse it.
For the other cases, we can return an error instead of panicking.
What do you think?
Adding recover to make sure the code does not crash.
@kush-elastic I'm wondering whether, instead of just logging, setting the returned error value would be more useful. From the official Go blog about recover:
The convention in the Go libraries is that even when a package uses panic internally, its external API still presents explicit error return values
One such example would be this. Here it explicitly sets the error value for the caller to handle.
Either way, using recover looks like a good way to handle it. I will let you decide how to treat this error.
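The convention quoted above can be sketched as follows: a deferred function recovers from the panic and assigns it to a named error return value. The names convert and safeConvert are hypothetical; convert merely stands in for a library call that may panic.

```go
package main

import "fmt"

// convert is a hypothetical stand-in for a library call that panics on
// unexpected input (here, a failed single-value type assertion).
func convert(v interface{}) int64 {
	return v.(int64) // panics if v does not hold an int64
}

// safeConvert wraps convert. The deferred function recovers from a panic
// and stores it in the named return value err, so the caller receives an
// ordinary error instead of a crash.
func safeConvert(v interface{}) (n int64, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic during conversion: %v", r)
		}
	}()
	return convert(v), nil
}

func main() {
	if _, err := safeConvert("not a number"); err != nil {
		fmt.Println("error:", err)
	}
	n, err := safeConvert(int64(7))
	fmt.Println(n, err)
}
```

The named return value matters here: a deferred function can only change what the caller receives when the result parameters have names.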
got it.
I checked this pretty quickly; please verify it as well, since I am busy with other work:
Query calls QueryPerf, and that makes sure the types are right: https://github.com/vmware/govmomi/blob/c1151f859ba649771c4ab60992953019904355fe/simulator/performance_manager.go#L219
Except for the length, I don't think we need to check the types.
That can happen only if https://github.com/vmware/govmomi/blob/c1151f859ba649771c4ab60992953019904355fe/simulator/performance_manager.go#L249 is done. But you are not setting the format here:
spec := types.PerfQuerySpec{
	Entity:     hst.Reference(),
	MetricId:   metricIds,
	MaxSample:  1,
	IntervalId: refreshRate,
}
So it is safe.
# Real-time data collection – An ESXi Server collects data for each performance counter every 20 seconds by default.
# Supported Periods:
# The Datastore and Host metricsets support performance data collection using the vSphere performance API.
Shall we also mention that this will not affect the collection of metrics other than performance metrics?
Nice idea. Let me do that.
This pull request is now in conflicts. Could you fix it? 🙏
Once you incorporate the documentation change, GTG
…ance API (#40678)
* initial commit for intervalId supports for performance metrics
* update docs and fix CI
* Add changelog entry
* fix CI
* resolve review comments
* fix loggers
* resolved review comments
* update versions
* update UTs
* update integration tests
* 10s -> 20s
* Update CHANGELOG.next.asciidoc Co-authored-by: Aman <38116245+devamanv@users.noreply.github.com>
* Update metricbeat/docs/modules/vsphere.asciidoc Co-authored-by: Aman <38116245+devamanv@users.noreply.github.com>
* make update
* add recover for ToMetricSeries panic
* return error instead just logging it.
* remove restriction of interval IDs
* remove unnecessary validations
* remove recover and add empty condition
* update changelog entry
* Fix wrapping of errors in loggers
* update data.json
* update data.json
* fix CI and loggers
* update changelog entries
* make update
* fix changelog entries
* update changelog entry
---------
Co-authored-by: Aman <38116245+devamanv@users.noreply.github.com>
(cherry picked from commit c75a7a4)
…e IntervalId for performance API (#40833)
* [Metricbeat][vSphere] Support for configurable IntervalId for performance API (#40678)
* initial commit for intervalId supports for performance metrics
* update docs and fix CI
* Add changelog entry
* fix CI
* resolve review comments
* fix loggers
* resolved review comments
* update versions
* update UTs
* update integration tests
* 10s -> 20s
* Update CHANGELOG.next.asciidoc Co-authored-by: Aman <38116245+devamanv@users.noreply.github.com>
* Update metricbeat/docs/modules/vsphere.asciidoc Co-authored-by: Aman <38116245+devamanv@users.noreply.github.com>
* make update
* add recover for ToMetricSeries panic
* return error instead just logging it.
* remove restriction of interval IDs
* remove unnecessary validations
* remove recover and add empty condition
* update changelog entry
* Fix wrapping of errors in loggers
* update data.json
* update data.json
* fix CI and loggers
* update changelog entries
* make update
* fix changelog entries
* update changelog entry
---------
Co-authored-by: Aman <38116245+devamanv@users.noreply.github.com>
(cherry picked from commit c75a7a4)
* Update CHANGELOG.next.asciidoc
---------
Co-authored-by: Kush Rana <89848966+kush-elastic@users.noreply.github.com>
Co-authored-by: Ishleen Kaur <102962586+ishleenk17@users.noreply.github.com>
… IntervalId for performance API (#40897)
* [Metricbeat][vSphere] Support for configurable IntervalId for performance API (#40678)
* initial commit for intervalId supports for performance metrics
* update docs and fix CI
* Add changelog entry
* fix CI
* resolve review comments
* fix loggers
* resolved review comments
* update versions
* update UTs
* update integration tests
* 10s -> 20s
* Update CHANGELOG.next.asciidoc Co-authored-by: Aman <38116245+devamanv@users.noreply.github.com>
* Update metricbeat/docs/modules/vsphere.asciidoc Co-authored-by: Aman <38116245+devamanv@users.noreply.github.com>
* make update
* add recover for ToMetricSeries panic
* return error instead just logging it.
* remove restriction of interval IDs
* remove unnecessary validations
* remove recover and add empty condition
* update changelog entry
* Fix wrapping of errors in loggers
* update changelog entry
---------
Co-authored-by: Aman <38116245+devamanv@users.noreply.github.com>
(cherry picked from commit c75a7a4)
* fix changelog entry
* fix changelog entry
* remove extra entries
The vSphere Performance API offers support for Intervals and Levels, allowing users to collect various types of counters and metrics from vSphere environments. Depending on the configured Interval, the API provides detailed or aggregated performance data from vSphere.
For more information, refer to the following resources:
We can add support for a configuration option that helps users collect the specific performance data they need.
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
Related issues