Aggregate latency metrics for vm exits handling IO #4346

Merged: 4 commits, Jan 5, 2024

@wearyzen (Contributor) commented Dec 28, 2023

Changes

  • Add a new struct to record the aggregate (min/max/sum) of a time
    difference as a metric.
    Use this aggregate metrics structure to record the latency of
    IO and MMIO VM exits.
  • Until now, Firecracker metrics followed a
    "group_metrics:key_metrics:value" layout; with the new metrics for
    KVM exits this changes to
    "group_metrics:key_metrics:sub_key_metrics:value".
    Update flush_fc_metrics_to_cw() to parse the new layout and make it
    future-proof so that it supports further sub_key_metrics levels.
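
To illustrate the first bullet, here is a minimal sketch of a min/max/sum aggregate over measured durations. Firecracker itself is written in Rust; this Python version only shows the idea, and the names `LatencyAggregate`, `record`, `time_call`, and the `*_us` fields are hypothetical, not taken from the PR.

```python
import time


class LatencyAggregate:
    """Hypothetical min/max/sum aggregate of elapsed times, in microseconds."""

    def __init__(self):
        self.min_us = None  # None until the first sample is recorded
        self.max_us = 0
        self.sum_us = 0

    def record(self, elapsed_us):
        """Fold one measured duration into the aggregate."""
        if self.min_us is None or elapsed_us < self.min_us:
            self.min_us = elapsed_us
        self.max_us = max(self.max_us, elapsed_us)
        self.sum_us += elapsed_us

    def time_call(self, func, *args):
        """Run func, record how long it took, and return its result."""
        start_ns = time.monotonic_ns()
        try:
            return func(*args)
        finally:
            # Recorded even if func raises, so slow failures are still counted.
            self.record((time.monotonic_ns() - start_ns) // 1000)


agg = LatencyAggregate()
for sample_us in (120, 45, 300):
    agg.record(sample_us)
print(agg.min_us, agg.max_us, agg.sum_us)  # 45 300 465
```

Keeping only min, max, and sum means the cost per sample is constant and no individual samples are stored; an average can still be derived later if a sample count is tracked separately.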

Reason

  • Recording the latency of KVM exits improves observability.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following
Developer Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

- [ ] If a specific issue led to this PR, this PR closes the issue.
- [x] The description of changes is clear and encompassing.
- [ ] Any required documentation changes (code and docs) are included in this PR.
- [ ] API changes follow the Runbook for Firecracker API changes.
- [ ] User-facing changes are mentioned in CHANGELOG.md.
- [x] All added/changed functionality is tested.
- [ ] New TODOs link to an issue.
- [x] This functionality cannot be added in rust-vmm.

codecov bot commented Dec 28, 2023

Codecov Report

Attention: 1 line in your changes is missing coverage. Please review.

Comparison is base (a6f0b0c) 81.55% compared to head (ba77af2) 81.57%.

| Files | Patch % | Lines |
|---|---|---|
| src/vmm/src/logger/metrics.rs | 97.14% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4346      +/-   ##
==========================================
+ Coverage   81.55%   81.57%   +0.02%     
==========================================
  Files         240      240              
  Lines       29370    29409      +39     
==========================================
+ Hits        23952    23990      +38     
- Misses       5418     5419       +1     
| Flag | Coverage Δ |
|---|---|
| 4.14-c7g.metal | 76.97% <97.29%> (+0.03%) ⬆️ |
| 4.14-m5d.metal | 78.89% <97.43%> (+0.01%) ⬆️ |
| 4.14-m6a.metal | 78.00% <97.43%> (+0.02%) ⬆️ |
| 4.14-m6g.metal | 76.97% <97.29%> (+0.03%) ⬆️ |
| 4.14-m6i.metal | 78.87% <97.43%> (+0.02%) ⬆️ |
| 5.10-c7g.metal | 79.86% <97.29%> (+0.02%) ⬆️ |
| 5.10-m5d.metal | 81.54% <97.43%> (+0.02%) ⬆️ |
| 5.10-m6a.metal | 80.76% <97.43%> (+0.02%) ⬆️ |
| 5.10-m6g.metal | 79.86% <97.29%> (+0.02%) ⬆️ |
| 5.10-m6i.metal | 81.53% <97.43%> (+0.02%) ⬆️ |
| 6.1-c7g.metal | 79.86% <97.29%> (+0.02%) ⬆️ |
| 6.1-m5d.metal | 81.54% <97.43%> (+0.02%) ⬆️ |
| 6.1-m6a.metal | 80.76% <97.43%> (+0.02%) ⬆️ |
| 6.1-m6g.metal | 79.86% <97.29%> (+0.02%) ⬆️ |
| 6.1-m6i.metal | 81.53% <97.43%> (+0.02%) ⬆️ |


@wearyzen wearyzen self-assigned this Dec 28, 2023
@wearyzen wearyzen added the Status: Awaiting review Indicates that a pull request is ready to be reviewed label Dec 28, 2023
@wearyzen wearyzen force-pushed the kvm_metrics branch 2 times, most recently from 49d22ef to 6aeb5f9 Compare December 29, 2023 13:56
@wearyzen wearyzen requested a review from pb8o December 29, 2023 14:11
@wearyzen wearyzen force-pushed the kvm_metrics branch 4 times, most recently from 96a2ae1 to c57e5a9 Compare January 2, 2024 12:24
@wearyzen wearyzen force-pushed the kvm_metrics branch 3 times, most recently from 1509b2c to 74d2f98 Compare January 2, 2024 19:02
Until now, Firecracker metrics followed a "group_metrics:key_metrics:value"
layout; with the new metrics for KVM exits this changes to
"group_metrics:key_metrics:sub_key_metrics:value".
Update flush_fc_metrics_to_cw() to parse the new layout and make it
future-proof so that it supports further sub_key_metrics levels.

Signed-off-by: Sudan Landge <sudanl@amazon.com>
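
The parsing change described in this commit can be sketched as a recursive walk over the nested metrics, so that any number of sub_key_metrics levels is handled. This is a minimal illustration, not the actual body of flush_fc_metrics_to_cw(); the helper name `flatten_metrics`, the dot-joined naming, and the sample metric names are assumptions made for the example.

```python
def flatten_metrics(metrics, prefix=()):
    """Yield (name_parts, value) for every leaf of a nested metrics dict."""
    for key, value in metrics.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            # A nested dict is another sub_key_metrics level: recurse into it.
            yield from flatten_metrics(value, path)
        else:
            yield path, value


# Hypothetical sample mixing the old two-level and the new three-level layout.
sample = {
    "vcpu": {
        "exit_io_in": 3,
        "exit_io_in_agg": {"min_us": 10, "max_us": 90, "sum_us": 140},
    }
}

flat = {".".join(path): value for path, value in flatten_metrics(sample)}
for name, value in flat.items():
    print(name, value)
```

Because the recursion stops only at non-dict values, a later change that adds a fourth level would not need another update to the walker.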
@wearyzen wearyzen force-pushed the kvm_metrics branch 2 times, most recently from 2efa813 to 2ef448a Compare January 3, 2024 12:16
ShadowCurse previously approved these changes Jan 3, 2024
@wearyzen wearyzen force-pushed the kvm_metrics branch 3 times, most recently from 0535e6a to a53bd1c Compare January 3, 2024 18:19
@wearyzen wearyzen changed the title Kvm metrics Aggregate latency metrics for vm exits handling IO Jan 3, 2024
Sudan Landge added 2 commits January 3, 2024 18:59
Add a new struct to record the aggregate (min/max/sum) of a time
difference as a metric.
Use this aggregate metrics structure to record the latency of
IO and MMIO VM exits.
With this change, Firecracker emits new min/max/sum metrics for
the KVM exits that handle IO.

Signed-off-by: Sudan Landge <sudanl@amazon.com>
While running A/B performance tests with the new kvm_metrics enabled,
a new intermittent vsock CI issue was found and ping_latency
showed a regression.
The vsock iperf test fails intermittently with the error:
`/tmp/iperf3-vsock: Text file busy`
The ping_latency regression was not consistent and went away on retry.
Since the only variable here is metrics collection,
adjusting the point at which the final metrics are collected in the
performance tests fixes both issues.

Signed-off-by: Sudan Landge <sudanl@amazon.com>
Update CHANGELOG to announce that Firecracker now emits new metrics.
Also document, using pseudocode, how to extract metric units
from Firecracker metric names.

Signed-off-by: Sudan Landge <sudanl@amazon.com>
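
As a rough illustration of deriving a unit from a metric name, the sketch below maps name suffixes to units. The suffix table (`_bytes`, `_ms`, `_us`) and the `Count` fallback are assumptions made for this example; the pseudocode added by the commit is the authoritative description and may differ.

```python
# Assumed suffix-to-unit table; not copied from the Firecracker documentation.
UNIT_BY_SUFFIX = (
    ("_bytes", "Bytes"),
    ("_ms", "Milliseconds"),
    ("_us", "Microseconds"),
)


def unit_for_metric(name):
    """Return the unit implied by a metric name, defaulting to a plain count."""
    for suffix, unit in UNIT_BY_SUFFIX:
        if name.endswith(suffix):
            return unit
    return "Count"


# Hypothetical metric names, used only to exercise the function.
for metric in ("min_us", "rx_bytes", "exit_io_in"):
    print(metric, unit_for_metric(metric))
```

Matching on the end of the name rather than on any substring avoids misclassifying names that merely contain a unit-like fragment in the middle.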
@bchalios (Contributor) left a comment

LGTM. My main question is why we do this only for x86.

@wearyzen wearyzen requested a review from bchalios January 4, 2024 17:31
@wearyzen wearyzen merged commit ab3e4fe into firecracker-microvm:main Jan 5, 2024
7 checks passed
@wearyzen wearyzen deleted the kvm_metrics branch January 17, 2024 20:49
5 participants