Fix the reference time rounding on Azure Metrics #37365

Merged
merged 11 commits into from
Jan 5, 2024

@zmoog zmoog commented Dec 8, 2023

Proposed commit message

What

Change the MetricRegistry.NeedsUpdate() method to decide whether to collect the metrics by comparing the collection interval with the time grain.

If the time since the last collection is shorter than the time grain duration, the metricset skips the collection.

For example, given the following scenario:

Scenario A: collect PT1M metrics every 60s

  • time grain: PT1M (one minute, or 60s)
  • collection interval: 60s

In this case, the time since the last collection is never shorter than the time grain, so the metricset fetches metric values on every collection.

Scenario B: collect PT5M metrics every 60s

  • time grain: PT5M (five minutes, or 300s)
  • collection interval: 60s

In this case, the time since the last collection is shorter than the time grain (60s, 120s, 180s, 240s) for four collections, so the metricset fetches metric values once every five collections.
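The two scenarios can be simulated with a small sketch. It is not the code from this PR: `needsUpdate` and `fetchedCollections` are hypothetical helpers, and the simulation assumes collections are spaced exactly 60s apart and that the first collection always fetches.

```go
package main

import (
	"fmt"
	"time"
)

// Values taken from the two scenarios above.
var (
	pt1m               = 1 * time.Minute
	pt5m               = 5 * time.Minute
	collectionInterval = 60 * time.Second
)

// needsUpdate is a hypothetical stand-in for the registry check: collect only
// when at least one full time grain has passed since the last collection.
func needsUpdate(timeSinceLastCollection, timeGrain time.Duration) bool {
	return timeSinceLastCollection >= timeGrain
}

// fetchedCollections simulates n collections spaced exactly interval apart
// and returns the 1-based indexes of the collections that fetch values.
// The first collection always fetches because there is no previous one.
func fetchedCollections(timeGrain, interval time.Duration, n int) []int {
	var fetched []int
	var sinceLast time.Duration
	for i := 1; i <= n; i++ {
		if i > 1 {
			sinceLast += interval
		}
		if i == 1 || needsUpdate(sinceLast, timeGrain) {
			fetched = append(fetched, i)
			sinceLast = 0
		}
	}
	return fetched
}

func main() {
	fmt.Println("Scenario A:", fetchedCollections(pt1m, collectionInterval, 6))
	fmt.Println("Scenario B:", fetchedCollections(pt5m, collectionInterval, 11))
}
```

With these inputs, Scenario A fetches on all six collections, while Scenario B fetches on collections 1, 6, and 11.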

The jitter

During our tests, we noticed the collection scheduling varied slightly, making the time since the last collection a few milliseconds shorter than expected. To compensate, the function adds a short jitter duration (1 second) to the comparison, so these small scheduling fluctuations do not cause false positives (skipped collections).
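A minimal sketch of the jitter idea follows. How the jitter enters the comparison is an assumption here (it is added to the elapsed time before comparing with the time grain); the function names and sample durations are hypothetical and only illustrate a collection scheduled 3ms early.

```go
package main

import (
	"fmt"
	"time"
)

// jitter is the 1-second tolerance mentioned in the description.
const jitter = 1 * time.Second

// Sample values for a PT1M metric collected every 60s (illustrative only).
var (
	timeGrain     = 60 * time.Second
	slightlyEarly = 59997 * time.Millisecond // scheduler fired 3ms early
	tooEarly      = 30 * time.Second
)

// needsUpdateStrict compares the elapsed time with the time grain as-is.
func needsUpdateStrict(since, grain time.Duration) bool {
	return since >= grain
}

// needsUpdateWithJitter tolerates collections that arrive up to one
// jitter earlier than the time grain (assumed form of the comparison).
func needsUpdateWithJitter(since, grain time.Duration) bool {
	return since+jitter >= grain
}

func main() {
	fmt.Println("strict, 3ms early:     ", needsUpdateStrict(slightlyEarly, timeGrain))
	fmt.Println("with jitter, 3ms early:", needsUpdateWithJitter(slightlyEarly, timeGrain))
	fmt.Println("with jitter, 30s early:", needsUpdateWithJitter(tooEarly, timeGrain))
}
```

The strict check skips the collection that arrives 3ms early, while the jittered check accepts it and still rejects a collection that arrives 30s early.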

Why

During a testing session on 8.11.2, we noticed one out of four agents skipped some metrics collections.

The debug logs revealed the metricset skipped collections due to a 1-second difference between the reference time for the current and previous collections (299s instead of 300s).

[Screenshot: CleanShot 2023-12-08 at 20 13 19]

The 1-second difference may happen due to an inaccurate rounding in the reference time.

For example, suppose the following two events occur:

  1. Metricbeat calls Fetch() on the metricset a few milliseconds earlier than in the previous collection.
  2. The timestamp is 2023-12-08T10:58:32.999Z.

In this case, the reference time becomes 2023-12-08T10:58:32.000Z due to the truncation.
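The effect can be reproduced with a short sketch. The `current` timestamp is the one from the example above; the `previous` timestamp (five minutes earlier, a few milliseconds later within the second) and the `referenceTime` helper are assumptions made for illustration, not code from the metricset.

```go
package main

import (
	"fmt"
	"time"
)

// Two consecutive Fetch() calls for a PT5M metric.
var (
	// previous is a hypothetical timestamp for the earlier collection.
	previous = time.Date(2023, 12, 8, 10, 53, 33, 2_000_000, time.UTC) // 10:53:33.002Z
	// current is the timestamp from the example above.
	current = time.Date(2023, 12, 8, 10, 58, 32, 999_000_000, time.UTC) // 10:58:32.999Z
)

// referenceTime drops the sub-second part, as the truncation described
// above does (hypothetical helper).
func referenceTime(t time.Time) time.Time {
	return t.UTC().Truncate(time.Second)
}

func main() {
	// The real gap is only 3ms short of five minutes.
	fmt.Println("actual gap:   ", current.Sub(previous)) // 4m59.997s
	// After truncation the gap loses a whole second.
	fmt.Println("truncated gap:", referenceTime(current).Sub(referenceTime(previous))) // 4m59s
}
```

A 3ms scheduling difference becomes a full 1-second difference between reference times, which is enough to fall below the 300s time grain.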

This problem happened to one test agent. However, if it happens to one agent, it can happen to others.

Extended Structured Logging

We also added new fields to the debug structured logs:

$ cat metricbeat.log.ndjson | grep "MetricRegistry" | head -n 1 | jq
{
  "log.level": "debug",
  "@timestamp": "2024-01-05T15:03:12.235+0100",
  "log.logger": "azure monitor client",
  "log.origin": {
    "function": "github.com/elastic/beats/v7/x-pack/metricbeat/module/azure.(*MetricRegistry).NeedsUpdate",
    "file.name": "azure/metric_registry.go",
    "file.line": 80
  },
  "message": "MetricRegistry: Metric needs an update",
  "service.name": "metricbeat",
  "needs_update": true,
  "reference_time": "2024-01-05T14:03:07.197Z",
  "last_collection_time": "2024-01-05T14:02:07.199Z",
  "time_since_last_collection_seconds": 66.035681,
  "time_grain": "PT1M",
  "time_grain_duration_seconds": 60,
  "resource_id": "/subscriptions/123/resourceGroups/crest-test-lens-migration/providers/Microsoft.Compute/virtualMachines/rajvi-test-vm",
  "namespace": "Microsoft.Compute/virtualMachines",
  "aggregation": "Total",
  "names": "Network In,Network Out,Disk Read Bytes,Disk Write Bytes,Network In Total,Network Out Total",
  "ecs.version": "1.6.0"
}
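For anyone who prefers to process these entries programmatically, here is a minimal Go sketch that decodes the new fields. The `registryLogEntry` struct and `parseEntry` helper are hypothetical; the field names and the sample values come from the log entry above, trimmed to the fields of interest.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// registryLogEntry maps a subset of the debug log fields shown above.
// LastCollectionTime stays at its zero value when the field is absent.
type registryLogEntry struct {
	Message                        string    `json:"message"`
	NeedsUpdate                    bool      `json:"needs_update"`
	ReferenceTime                  time.Time `json:"reference_time"`
	LastCollectionTime             time.Time `json:"last_collection_time"`
	TimeSinceLastCollectionSeconds float64   `json:"time_since_last_collection_seconds"`
	TimeGrain                      string    `json:"time_grain"`
	TimeGrainDurationSeconds       float64   `json:"time_grain_duration_seconds"`
	Namespace                      string    `json:"namespace"`
	Aggregation                    string    `json:"aggregation"`
}

// sampleLine is the log entry above, reduced to the fields decoded here.
const sampleLine = `{"message":"MetricRegistry: Metric needs an update","needs_update":true,"reference_time":"2024-01-05T14:03:07.197Z","last_collection_time":"2024-01-05T14:02:07.199Z","time_since_last_collection_seconds":66.035681,"time_grain":"PT1M","time_grain_duration_seconds":60,"namespace":"Microsoft.Compute/virtualMachines","aggregation":"Total"}`

// parseEntry decodes one NDJSON line into a registryLogEntry.
func parseEntry(line string) (registryLogEntry, error) {
	var e registryLogEntry
	err := json.Unmarshal([]byte(line), &e)
	return e, err
}

func main() {
	e, err := parseEntry(sampleLine)
	if err != nil {
		panic(err)
	}
	fmt.Println(e.Namespace, e.Aggregation, e.NeedsUpdate)
	// Gap between the reference time and the last collection time.
	fmt.Println(e.ReferenceTime.Sub(e.LastCollectionTime))
}
```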

Here's an example using jq:

cat metricbeat.log.ndjson | grep "MetricRegistry" | jq -r  '[.namespace, .aggregation, .needs_update, .reference_time, .last_collection_time//"na", .time_since_last_collection_seconds//"na", .time_grain_duration_seconds//"na", .time_grain] | @tsv' | grep Microsoft.Compute/virtualMachines
Command output
namespace                               aggregation   needs_update    reference_time                  last_collection_time            time_since_last_collection_seconds time_grain_duration_seconds  time_grain
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:06:07.197Z        2024-01-05T14:05:07.196Z        61.00097                           60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:06:07.197Z        2024-01-05T14:05:07.196Z        62.037233                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:06:07.197Z        2024-01-05T14:05:07.196Z        62.302503                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:06:07.197Z        2024-01-05T14:05:07.196Z        63.910846                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:06:07.197Z        2024-01-05T14:05:07.196Z        64.786996                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:06:07.197Z        2024-01-05T14:05:07.196Z        66.24997                           60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:06:07.197Z        2024-01-05T14:05:07.196Z        66.914266                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:06:07.197Z        2024-01-05T14:05:07.196Z        69.199998                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:06:07.197Z        2024-01-05T14:05:07.196Z        69.869772                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:06:07.197Z        2024-01-05T14:05:07.196Z        70.96449                           60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:06:07.197Z        2024-01-05T14:05:07.196Z        71.634757                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:06:07.197Z        2024-01-05T14:05:07.196Z        72.399627                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:06:07.197Z        2024-01-05T14:05:07.196Z        72.65506                           60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:06:07.197Z        2024-01-05T14:05:07.196Z        74.554774                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:07:07.196Z        2024-01-05T14:06:07.197Z        60.999038                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:07:07.196Z        2024-01-05T14:06:07.197Z        62.568239                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:07:07.196Z        2024-01-05T14:06:07.197Z        63.009686                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:07:07.196Z        2024-01-05T14:06:07.197Z        64.735293                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:07:07.196Z        2024-01-05T14:06:07.197Z        65.406713                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:07:07.196Z        2024-01-05T14:06:07.197Z        66.811182                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:07:07.196Z        2024-01-05T14:06:07.197Z        67.217416                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:07:07.196Z        2024-01-05T14:06:07.197Z        67.946707                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:07:07.196Z        2024-01-05T14:06:07.197Z        68.25472                           60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:07:07.196Z        2024-01-05T14:06:07.197Z        70.140379                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:07:07.196Z        2024-01-05T14:06:07.197Z        70.853599                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:07:07.196Z        2024-01-05T14:06:07.197Z        71.689807                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:07:07.196Z        2024-01-05T14:06:07.197Z        71.932574                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:07:07.196Z        2024-01-05T14:06:07.197Z        73.852817                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:08:07.196Z        2024-01-05T14:07:07.196Z        64.228885                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:08:07.196Z        2024-01-05T14:07:07.196Z        65.114602                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:08:07.196Z        2024-01-05T14:07:07.196Z        66.984628                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:08:07.196Z        2024-01-05T14:07:07.196Z        67.737368                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:08:07.196Z        2024-01-05T14:07:07.196Z        67.996219                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:08:07.196Z        2024-01-05T14:07:07.196Z        69.273389                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:08:07.196Z        2024-01-05T14:07:07.196Z        70.001487                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:08:07.196Z        2024-01-05T14:07:07.196Z        70.941944                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:08:07.196Z        2024-01-05T14:07:07.196Z        71.300543                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:08:07.196Z        2024-01-05T14:07:07.196Z        73.335651                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:08:07.196Z        2024-01-05T14:07:07.196Z        73.95351                           60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:08:07.196Z        2024-01-05T14:07:07.196Z        75.584913                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:08:07.196Z        2024-01-05T14:07:07.196Z        76.420105                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:08:07.196Z        2024-01-05T14:07:07.196Z        78.096104                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:09:07.199Z        2024-01-05T14:08:07.196Z        61.003248                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:09:07.199Z        2024-01-05T14:08:07.196Z        62.715302                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:09:07.199Z        2024-01-05T14:08:07.196Z        63.60085                           60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:09:07.199Z        2024-01-05T14:08:07.196Z        64.456666                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:09:07.199Z        2024-01-05T14:08:07.196Z        65.888113                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:09:07.199Z        2024-01-05T14:08:07.196Z        68.009692                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:09:07.199Z        2024-01-05T14:08:07.196Z        68.837141                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:09:07.199Z        2024-01-05T14:08:07.196Z        70.939683                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:09:07.199Z        2024-01-05T14:08:07.196Z        71.585741                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:09:07.199Z        2024-01-05T14:08:07.196Z        73.533029                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:09:07.199Z        2024-01-05T14:08:07.196Z        74.183447                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:09:07.199Z        2024-01-05T14:08:07.196Z        74.605523                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:09:07.199Z        2024-01-05T14:08:07.196Z        76.190121                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:09:07.199Z        2024-01-05T14:08:07.196Z        76.889351                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:10:07.195Z        2024-01-05T14:09:07.199Z        60.996318                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:10:07.195Z        2024-01-05T14:09:07.199Z        62.28805                           60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:10:07.195Z        2024-01-05T14:09:07.199Z        62.909445                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:10:07.195Z        2024-01-05T14:09:07.199Z        64.624709                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:10:07.195Z        2024-01-05T14:09:07.199Z        65.431786                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:10:07.195Z        2024-01-05T14:09:07.199Z        65.847615                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:10:07.195Z        2024-01-05T14:09:07.199Z        67.033418                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:10:07.195Z        2024-01-05T14:09:07.199Z        67.739314                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:10:07.195Z        2024-01-05T14:09:07.199Z        67.906611                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:10:07.195Z        2024-01-05T14:09:07.199Z        69.50609                           60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:10:07.195Z        2024-01-05T14:09:07.199Z        70.174633                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:10:07.195Z        2024-01-05T14:09:07.199Z        72.386937                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:10:07.195Z        2024-01-05T14:09:07.199Z        114.737149                         60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:10:07.195Z        2024-01-05T14:09:07.199Z        116.805914                         60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:11:07.195Z        2024-01-05T14:10:07.195Z        60.999675                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:11:07.195Z        2024-01-05T14:10:07.195Z        62.670519                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:11:07.195Z        2024-01-05T14:10:07.195Z        63.257914                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:11:07.195Z        2024-01-05T14:10:07.195Z        63.659665                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:11:07.195Z        2024-01-05T14:10:07.195Z        64.993466                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:11:07.195Z        2024-01-05T14:10:07.195Z        65.586929                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:11:07.195Z        2024-01-05T14:10:07.195Z        65.80711                           60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:11:07.195Z        2024-01-05T14:10:07.195Z        67.910164                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:11:07.195Z        2024-01-05T14:10:07.195Z        68.562776                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:11:07.195Z        2024-01-05T14:10:07.195Z        69.946754                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:11:07.195Z        2024-01-05T14:10:07.195Z        70.287227                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:11:07.195Z        2024-01-05T14:10:07.195Z        71.683296                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:11:07.195Z        2024-01-05T14:10:07.195Z        72.071265                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:11:07.195Z        2024-01-05T14:10:07.195Z        73.588746                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:12:07.194Z        2024-01-05T14:11:07.195Z        60.998972                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:12:07.194Z        2024-01-05T14:11:07.195Z        61.40119                           60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:12:07.194Z        2024-01-05T14:11:07.195Z        62.703711                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:12:07.194Z        2024-01-05T14:11:07.195Z        63.25565                           60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:12:07.194Z        2024-01-05T14:11:07.195Z        63.419331                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:12:07.194Z        2024-01-05T14:11:07.195Z        65.507727                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:12:07.194Z        2024-01-05T14:11:07.195Z        66.112249                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:12:07.194Z        2024-01-05T14:11:07.195Z        67.153382                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:12:07.194Z        2024-01-05T14:11:07.195Z        67.447255                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:12:07.194Z        2024-01-05T14:11:07.195Z        69.450744                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:12:07.194Z        2024-01-05T14:11:07.195Z        70.236798                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:12:07.194Z        2024-01-05T14:11:07.195Z        71.908428                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:12:07.194Z        2024-01-05T14:11:07.195Z        72.327266                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:12:07.194Z        2024-01-05T14:11:07.195Z        74.403147                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:13:07.194Z        2024-01-05T14:12:07.194Z        61.000777                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:13:07.194Z        2024-01-05T14:12:07.194Z        61.833059                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:13:07.194Z        2024-01-05T14:12:07.194Z        62.149017                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:13:07.194Z        2024-01-05T14:12:07.194Z        63.787981                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:13:07.194Z        2024-01-05T14:12:07.194Z        64.452981                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:13:07.194Z        2024-01-05T14:12:07.194Z        65.40734                           60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:13:07.194Z        2024-01-05T14:12:07.194Z        65.806                             60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:13:07.194Z        2024-01-05T14:12:07.194Z        69.861397                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:13:07.194Z        2024-01-05T14:12:07.194Z        70.523869                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:13:07.194Z        2024-01-05T14:12:07.194Z        72.433781                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:13:07.194Z        2024-01-05T14:12:07.194Z        73.166025                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:13:07.194Z        2024-01-05T14:12:07.194Z        75.320218                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:13:07.194Z        2024-01-05T14:12:07.194Z        75.766171                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:13:07.194Z        2024-01-05T14:12:07.194Z        76.141732                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:14:07.193Z        2024-01-05T14:13:07.194Z        60.998842                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:14:07.193Z        2024-01-05T14:13:07.194Z        61.90092                           60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:14:07.193Z        2024-01-05T14:13:07.194Z        62.224257                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:14:07.193Z        2024-01-05T14:13:07.194Z        64.192877                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:14:07.193Z        2024-01-05T14:13:07.194Z        64.809266                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:14:07.193Z        2024-01-05T14:13:07.194Z        66.908917                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:14:07.193Z        2024-01-05T14:13:07.194Z        67.536084                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:14:07.193Z        2024-01-05T14:13:07.194Z        69.229226                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:14:07.193Z        2024-01-05T14:13:07.194Z        69.785417                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:14:07.193Z        2024-01-05T14:13:07.194Z        70.299662                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:14:07.193Z        2024-01-05T14:13:07.194Z        71.875143                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:14:07.193Z        2024-01-05T14:13:07.194Z        72.624899                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:14:07.193Z        2024-01-05T14:13:07.194Z        72.918337                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:14:07.193Z        2024-01-05T14:13:07.194Z        74.47974                           60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:15:07.193Z        2024-01-05T14:14:07.193Z        61.000513                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:15:07.193Z        2024-01-05T14:14:07.193Z        61.922146                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:15:07.193Z        2024-01-05T14:14:07.193Z        62.194077                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:15:07.193Z        2024-01-05T14:14:07.193Z        64.037054                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:15:07.193Z        2024-01-05T14:14:07.193Z        64.880603                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:15:07.193Z        2024-01-05T14:14:07.193Z        67.105248                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:15:07.193Z        2024-01-05T14:14:07.193Z        67.979603                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:15:07.193Z        2024-01-05T14:14:07.193Z        69.635532                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:15:07.193Z        2024-01-05T14:14:07.193Z        70.223733                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:15:07.193Z        2024-01-05T14:14:07.193Z        70.847764                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:15:07.193Z        2024-01-05T14:14:07.193Z        72.666689                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:15:07.193Z        2024-01-05T14:14:07.193Z        73.44926                           60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:15:07.193Z        2024-01-05T14:14:07.193Z        73.618585                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:15:07.193Z        2024-01-05T14:14:07.193Z        75.405363                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        60.999661                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        61.795341                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        62.080088                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        64.929579                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        65.632209                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        67.832918                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        68.576239                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        69.927988                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        70.351148                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        70.872058                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        72.47401                           60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        72.971242                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        73.143605                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        74.831489                          60                           PT1M

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Dec 8, 2023
@zmoog zmoog added the Team:obs-ds-hosted-services Label for the Observability Hosted Services team label Dec 8, 2023
@zmoog zmoog self-assigned this Dec 8, 2023
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Dec 8, 2023

mergify bot commented Dec 8, 2023

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @zmoog? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@zmoog zmoog added the bug label Dec 8, 2023
@elasticmachine
Collaborator

❕ Build Aborted

There is a new build on-going so the previous on-going builds have been aborted.

Build stats

  • Start Time: 2023-12-08T18:29:57.574+0000

  • Duration: 9 min 16 sec

Steps errors 1

Error signal
  • Took 0 min 0 sec
  • Description: Error 'org.jenkinsci.plugins.workflow.steps.FlowInterruptedException'

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@zmoog zmoog added backport-v8.11.0 Automated backport with mergify backport-v8.12.0 Automated backport with mergify labels Dec 8, 2023
@zmoog zmoog changed the title Fix reference time rounding in Azure Metrics Fix rounding in reference time on Azure Metrics Dec 8, 2023
@zmoog zmoog changed the title Fix rounding in reference time on Azure Metrics Fix the reference time rounding on Azure Metrics Dec 8, 2023
@elasticmachine
Collaborator

❕ Build Aborted

Either there was a build timeout or someone aborted the build.

Build stats

  • Duration: 44 min 29 sec

@elasticmachine
Collaborator

💚 Build Succeeded

Build stats

  • Duration: 51 min 51 sec

❕ Flaky test report

No test was executed to be analysed.

@zmoog zmoog marked this pull request as ready for review December 8, 2023 20:04
@zmoog zmoog requested a review from a team as a code owner December 8, 2023 20:04
@elasticmachine
Collaborator

Pinging @elastic/obs-ds-hosted-services (Team:obs-ds-hosted-services)

@zmoog zmoog requested a review from a team December 8, 2023 20:04
@elasticmachine
Collaborator

💚 Build Succeeded

Build stats

  • Start Time: 2023-12-08T20:05:12.064+0000

  • Duration: 54 min 2 sec

Test stats 🧪

Test Results
Failed 0
Passed 1558
Skipped 96
Total 1654

💚 Flaky test report

Tests succeeded.

//
// See "Round outer limits" and "Round inner limits" tests in
// the metric_registry_test.go for more information.
referenceTime := time.Now().UTC().Round(time.Second)
Contributor

to what extent do we care about duplicate collections vs skipped collections?

i.e. does it make sense to widen the window here to allow for drift in the collection period, maybe up to 5s?

Contributor Author

Yeah, the jitter I added with 72c5b69 allows the collection time to drift a little to compensate for fluctuations.

I am currently using a 1-second jitter, but we can go with 2-5s I guess.

// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

package azure
Contributor

is this a clean copy-paste?

Contributor Author

I copied and pasted the license header from another file; let me check if I picked a bad one 👀

@zmoog
Contributor Author

zmoog commented Dec 12, 2023

Hey @tommyers-elastic, with 72c5b69 I switched from truncating/rounding to using a jitter during the timestamp comparison.

The pros should be:

  • Avoid having the thresholds we have with truncating or rounding, where a 1ms difference can flip the final result to the next or previous second.
  • Using a jitter gives us more flexibility (we can make it configurable)
  • Keeping the referenceTime value intact helps with troubleshooting

In my tests, this version works as well as the previous one. There's an image available with this change and info about how to run this version on a local stack.

@zmoog
Contributor Author

zmoog commented Dec 12, 2023

Here's how the difference (distance := lastCollection.timestamp.Sub(timeGrainStartTime)) varies across collections, taken from the Metricbeat logs on my machine:

$ cat metricbeat.log.ndjson | grep "MetricRegistry" | jq -r  '[.namespace, .time_grain, .needs_update, .reference_time, .last_collection_at//"na", .time_grain_start_time//"na", .distance//"na", .elapsed//"na", .jitter//"na"] | @tsv' | grep Microsoft.Compute/virtualMachines
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:45:03.982Z        2023-12-12T13:40:03.985Z        2023-12-12T13:40:03.982Z        3.251ms 4m59.996749s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:45:03.982Z        2023-12-12T13:40:03.985Z        2023-12-12T13:40:03.982Z        3.251ms 4m59.996749s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:45:03.982Z        2023-12-12T13:40:03.985Z        2023-12-12T13:40:03.982Z        3.251ms 4m59.996749s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:45:03.982Z        2023-12-12T13:40:03.985Z        2023-12-12T13:40:03.982Z        3.251ms 4m59.996749s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:45:03.982Z        2023-12-12T13:40:03.985Z        2023-12-12T13:40:03.982Z        3.251ms 4m59.996749s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:45:03.982Z        2023-12-12T13:40:03.985Z        2023-12-12T13:40:03.982Z        3.251ms 4m59.996749s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:45:03.982Z        2023-12-12T13:40:03.985Z        2023-12-12T13:40:03.982Z        3.251ms 4m59.996749s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:50:03.980Z        2023-12-12T13:45:03.982Z        2023-12-12T13:45:03.980Z        2.171ms 4m59.997829s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:50:03.980Z        2023-12-12T13:45:03.982Z        2023-12-12T13:45:03.980Z        2.171ms 4m59.997829s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:50:03.980Z        2023-12-12T13:45:03.982Z        2023-12-12T13:45:03.980Z        2.171ms 4m59.997829s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:50:03.980Z        2023-12-12T13:45:03.982Z        2023-12-12T13:45:03.980Z        2.171ms 4m59.997829s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:50:03.980Z        2023-12-12T13:45:03.982Z        2023-12-12T13:45:03.980Z        2.171ms 4m59.997829s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:50:03.980Z        2023-12-12T13:45:03.982Z        2023-12-12T13:45:03.980Z        2.171ms 4m59.997829s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:50:03.980Z        2023-12-12T13:45:03.982Z        2023-12-12T13:45:03.980Z        2.171ms 4m59.997829s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:50:03.980Z        2023-12-12T13:45:03.982Z        2023-12-12T13:45:03.980Z        2.171ms 4m59.997829s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:50:03.980Z        2023-12-12T13:45:03.982Z        2023-12-12T13:45:03.980Z        2.171ms 4m59.997829s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:50:03.980Z        2023-12-12T13:45:03.982Z        2023-12-12T13:45:03.980Z        2.171ms 4m59.997829s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:50:03.980Z        2023-12-12T13:45:03.982Z        2023-12-12T13:45:03.980Z        2.171ms 4m59.997829s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:50:03.980Z        2023-12-12T13:45:03.982Z        2023-12-12T13:45:03.980Z        2.171ms 4m59.997829s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:50:03.980Z        2023-12-12T13:45:03.982Z        2023-12-12T13:45:03.980Z        2.171ms 4m59.997829s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:50:03.980Z        2023-12-12T13:45:03.982Z        2023-12-12T13:45:03.980Z        2.171ms 4m59.997829s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:55:03.977Z        2023-12-12T13:50:03.980Z        2023-12-12T13:50:03.977Z        3.283ms 4m59.996717s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:55:03.977Z        2023-12-12T13:50:03.980Z        2023-12-12T13:50:03.977Z        3.283ms 4m59.996717s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:55:03.977Z        2023-12-12T13:50:03.980Z        2023-12-12T13:50:03.977Z        3.283ms 4m59.996717s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:55:03.977Z        2023-12-12T13:50:03.980Z        2023-12-12T13:50:03.977Z        3.283ms 4m59.996717s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:55:03.977Z        2023-12-12T13:50:03.980Z        2023-12-12T13:50:03.977Z        3.283ms 4m59.996717s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:55:03.977Z        2023-12-12T13:50:03.980Z        2023-12-12T13:50:03.977Z        3.283ms 4m59.996717s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:55:03.977Z        2023-12-12T13:50:03.980Z        2023-12-12T13:50:03.977Z        3.283ms 4m59.996717s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:55:03.977Z        2023-12-12T13:50:03.980Z        2023-12-12T13:50:03.977Z        3.283ms 4m59.996717s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:55:03.977Z        2023-12-12T13:50:03.980Z        2023-12-12T13:50:03.977Z        3.283ms 4m59.996717s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:55:03.977Z        2023-12-12T13:50:03.980Z        2023-12-12T13:50:03.977Z        3.283ms 4m59.996717s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:55:03.977Z        2023-12-12T13:50:03.980Z        2023-12-12T13:50:03.977Z        3.283ms 4m59.996717s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:55:03.977Z        2023-12-12T13:50:03.980Z        2023-12-12T13:50:03.977Z        3.283ms 4m59.996717s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:55:03.977Z        2023-12-12T13:50:03.980Z        2023-12-12T13:50:03.977Z        3.283ms 4m59.996717s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T13:55:03.977Z        2023-12-12T13:50:03.980Z        2023-12-12T13:50:03.977Z        3.283ms 4m59.996717s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:00:03.974Z        2023-12-12T13:55:03.977Z        2023-12-12T13:55:03.974Z        2.489ms 4m59.997511s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:00:03.974Z        2023-12-12T13:55:03.977Z        2023-12-12T13:55:03.974Z        2.489ms 4m59.997511s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:00:03.974Z        2023-12-12T13:55:03.977Z        2023-12-12T13:55:03.974Z        2.489ms 4m59.997511s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:00:03.974Z        2023-12-12T13:55:03.977Z        2023-12-12T13:55:03.974Z        2.489ms 4m59.997511s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:00:03.974Z        2023-12-12T13:55:03.977Z        2023-12-12T13:55:03.974Z        2.489ms 4m59.997511s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:00:03.974Z        2023-12-12T13:55:03.977Z        2023-12-12T13:55:03.974Z        2.489ms 4m59.997511s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:00:03.974Z        2023-12-12T13:55:03.977Z        2023-12-12T13:55:03.974Z        2.489ms 4m59.997511s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:00:03.974Z        2023-12-12T13:55:03.977Z        2023-12-12T13:55:03.974Z        2.489ms 4m59.997511s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:00:03.974Z        2023-12-12T13:55:03.977Z        2023-12-12T13:55:03.974Z        2.489ms 4m59.997511s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:00:03.974Z        2023-12-12T13:55:03.977Z        2023-12-12T13:55:03.974Z        2.489ms 4m59.997511s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:00:03.974Z        2023-12-12T13:55:03.977Z        2023-12-12T13:55:03.974Z        2.489ms 4m59.997511s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:00:03.974Z        2023-12-12T13:55:03.977Z        2023-12-12T13:55:03.974Z        2.489ms 4m59.997511s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:00:03.974Z        2023-12-12T13:55:03.977Z        2023-12-12T13:55:03.974Z        2.489ms 4m59.997511s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:00:03.974Z        2023-12-12T13:55:03.977Z        2023-12-12T13:55:03.974Z        2.489ms 4m59.997511s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:05:03.977Z        2023-12-12T14:00:03.974Z        2023-12-12T14:00:03.977Z        -2.857ms        5m0.002857s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:05:03.977Z        2023-12-12T14:00:03.974Z        2023-12-12T14:00:03.977Z        -2.857ms        5m0.002857s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:05:03.977Z        2023-12-12T14:00:03.974Z        2023-12-12T14:00:03.977Z        -2.857ms        5m0.002857s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:05:03.977Z        2023-12-12T14:00:03.974Z        2023-12-12T14:00:03.977Z        -2.857ms        5m0.002857s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:05:03.977Z        2023-12-12T14:00:03.974Z        2023-12-12T14:00:03.977Z        -2.857ms        5m0.002857s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:05:03.977Z        2023-12-12T14:00:03.974Z        2023-12-12T14:00:03.977Z        -2.857ms        5m0.002857s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:05:03.977Z        2023-12-12T14:00:03.974Z        2023-12-12T14:00:03.977Z        -2.857ms        5m0.002857s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:05:03.977Z        2023-12-12T14:00:03.974Z        2023-12-12T14:00:03.977Z        -2.857ms        5m0.002857s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:05:03.977Z        2023-12-12T14:00:03.974Z        2023-12-12T14:00:03.977Z        -2.857ms        5m0.002857s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:05:03.977Z        2023-12-12T14:00:03.974Z        2023-12-12T14:00:03.977Z        -2.857ms        5m0.002857s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:05:03.977Z        2023-12-12T14:00:03.974Z        2023-12-12T14:00:03.977Z        -2.857ms        5m0.002857s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:05:03.977Z        2023-12-12T14:00:03.974Z        2023-12-12T14:00:03.977Z        -2.857ms        5m0.002857s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:05:03.977Z        2023-12-12T14:00:03.974Z        2023-12-12T14:00:03.977Z        -2.857ms        5m0.002857s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:05:03.977Z        2023-12-12T14:00:03.974Z        2023-12-12T14:00:03.977Z        -2.857ms        5m0.002857s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:10:03.975Z        2023-12-12T14:05:03.977Z        2023-12-12T14:05:03.975Z        1.953ms 4m59.998047s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:10:03.975Z        2023-12-12T14:05:03.977Z        2023-12-12T14:05:03.975Z        1.953ms 4m59.998047s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:10:03.975Z        2023-12-12T14:05:03.977Z        2023-12-12T14:05:03.975Z        1.953ms 4m59.998047s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:10:03.975Z        2023-12-12T14:05:03.977Z        2023-12-12T14:05:03.975Z        1.953ms 4m59.998047s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:10:03.975Z        2023-12-12T14:05:03.977Z        2023-12-12T14:05:03.975Z        1.953ms 4m59.998047s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:10:03.975Z        2023-12-12T14:05:03.977Z        2023-12-12T14:05:03.975Z        1.953ms 4m59.998047s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:10:03.975Z        2023-12-12T14:05:03.977Z        2023-12-12T14:05:03.975Z        1.953ms 4m59.998047s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:10:03.975Z        2023-12-12T14:05:03.977Z        2023-12-12T14:05:03.975Z        1.953ms 4m59.998047s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:10:03.975Z        2023-12-12T14:05:03.977Z        2023-12-12T14:05:03.975Z        1.953ms 4m59.998047s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:10:03.975Z        2023-12-12T14:05:03.977Z        2023-12-12T14:05:03.975Z        1.953ms 4m59.998047s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:10:03.975Z        2023-12-12T14:05:03.977Z        2023-12-12T14:05:03.975Z        1.953ms 4m59.998047s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:10:03.975Z        2023-12-12T14:05:03.977Z        2023-12-12T14:05:03.975Z        1.953ms 4m59.998047s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:10:03.975Z        2023-12-12T14:05:03.977Z        2023-12-12T14:05:03.975Z        1.953ms 4m59.998047s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:10:03.975Z        2023-12-12T14:05:03.977Z        2023-12-12T14:05:03.975Z        1.953ms 4m59.998047s    1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:15:03.976Z        2023-12-12T14:10:03.975Z        2023-12-12T14:10:03.976Z        -922µs  5m0.000922s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:15:03.976Z        2023-12-12T14:10:03.975Z        2023-12-12T14:10:03.976Z        -922µs  5m0.000922s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:15:03.976Z        2023-12-12T14:10:03.975Z        2023-12-12T14:10:03.976Z        -922µs  5m0.000922s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:15:03.976Z        2023-12-12T14:10:03.975Z        2023-12-12T14:10:03.976Z        -922µs  5m0.000922s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:15:03.976Z        2023-12-12T14:10:03.975Z        2023-12-12T14:10:03.976Z        -922µs  5m0.000922s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:15:03.976Z        2023-12-12T14:10:03.975Z        2023-12-12T14:10:03.976Z        -922µs  5m0.000922s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:15:03.976Z        2023-12-12T14:10:03.975Z        2023-12-12T14:10:03.976Z        -922µs  5m0.000922s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:15:03.976Z        2023-12-12T14:10:03.975Z        2023-12-12T14:10:03.976Z        -922µs  5m0.000922s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:15:03.976Z        2023-12-12T14:10:03.975Z        2023-12-12T14:10:03.976Z        -922µs  5m0.000922s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:15:03.976Z        2023-12-12T14:10:03.975Z        2023-12-12T14:10:03.976Z        -922µs  5m0.000922s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:15:03.976Z        2023-12-12T14:10:03.975Z        2023-12-12T14:10:03.976Z        -922µs  5m0.000922s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:15:03.976Z        2023-12-12T14:10:03.975Z        2023-12-12T14:10:03.976Z        -922µs  5m0.000922s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:15:03.976Z        2023-12-12T14:10:03.975Z        2023-12-12T14:10:03.976Z        -922µs  5m0.000922s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:15:03.976Z        2023-12-12T14:10:03.975Z        2023-12-12T14:10:03.976Z        -922µs  5m0.000922s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:20:03.976Z        2023-12-12T14:15:03.976Z        2023-12-12T14:15:03.976Z        -412µs  5m0.000412s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:20:03.976Z        2023-12-12T14:15:03.976Z        2023-12-12T14:15:03.976Z        -412µs  5m0.000412s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:20:03.976Z        2023-12-12T14:15:03.976Z        2023-12-12T14:15:03.976Z        -412µs  5m0.000412s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:20:03.976Z        2023-12-12T14:15:03.976Z        2023-12-12T14:15:03.976Z        -412µs  5m0.000412s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:20:03.976Z        2023-12-12T14:15:03.976Z        2023-12-12T14:15:03.976Z        -412µs  5m0.000412s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:20:03.976Z        2023-12-12T14:15:03.976Z        2023-12-12T14:15:03.976Z        -412µs  5m0.000412s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:20:03.976Z        2023-12-12T14:15:03.976Z        2023-12-12T14:15:03.976Z        -412µs  5m0.000412s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:20:03.976Z        2023-12-12T14:15:03.976Z        2023-12-12T14:15:03.976Z        -412µs  5m0.000412s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:20:03.976Z        2023-12-12T14:15:03.976Z        2023-12-12T14:15:03.976Z        -412µs  5m0.000412s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:20:03.976Z        2023-12-12T14:15:03.976Z        2023-12-12T14:15:03.976Z        -412µs  5m0.000412s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:20:03.976Z        2023-12-12T14:15:03.976Z        2023-12-12T14:15:03.976Z        -412µs  5m0.000412s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:20:03.976Z        2023-12-12T14:15:03.976Z        2023-12-12T14:15:03.976Z        -412µs  5m0.000412s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:20:03.976Z        2023-12-12T14:15:03.976Z        2023-12-12T14:15:03.976Z        -412µs  5m0.000412s     1s
Microsoft.Compute/virtualMachines       PT5M    true    2023-12-12T14:20:03.976Z        2023-12-12T14:15:03.976Z        2023-12-12T14:15:03.976Z        -412µs  5m0.000412s     1s

@elasticmachine
Collaborator

💚 Build Succeeded

Build stats

  • Start Time: 2023-12-12T14:08:51.306+0000

  • Duration: 52 min 32 sec

Test stats 🧪

Test Results
Failed 0
Passed 1558
Skipped 96
Total 1654

💚 Flaky test report

Tests succeeded.

@tommyers-elastic
Contributor

awesome - thanks @zmoog !

@zmoog
Contributor Author

zmoog commented Dec 29, 2023

Are these gaps a behavior specific to the Metricbeat implementation, or is it inherent to collecting metrics with a time grain equal to the collection interval (for example, collecting a PT1M metric using a 60-second collection interval)?

To answer this question, I set up an OTel Collector and tried to figure out how to collect Azure metrics using the azuremonitorreceiver.

After running the Azure Monitor Receiver for a while, I compared the metrics with the data on Azure Portal and Metricbeat (with the changes in this PR):

The gaps also appear on the Azure Monitor Receiver when time grain and collection interval have the same duration. The changes in this PR try to address this problem, avoiding the gaps.

Check zmoog/public-notes#67 (comment) to learn more.

@zmoog zmoog force-pushed the zmoog/round-reference-time-on-azure-metrics branch from 72c5b69 to fabacd6 Compare December 29, 2023 15:17
@zmoog zmoog requested a review from a team as a code owner December 29, 2023 15:17
@elasticmachine
Collaborator

💚 Build Succeeded

Build stats

  • Duration: 54 min 23 sec

❕ Flaky test report

No test was executed to be analysed.

@zmoog zmoog force-pushed the zmoog/round-reference-time-on-azure-metrics branch from 9f266b7 to 76c879a Compare January 4, 2024 19:08
@zmoog
Contributor Author

zmoog commented Jan 4, 2024

Hey @elastic/elastic-agent-data-plane, it seems you are now the only owner of x-pack/metricbeat/module/azure. Is this intentional, or is it a side-effect of something?

I wanted to include this fix in the last 8.12 BC.

Please let me know if we need to update the CODEOWNERS or if I need to ask you for a review.

@elasticmachine
Collaborator

💚 Build Succeeded

Build stats

  • Duration: 56 min 47 sec

❕ Flaky test report

No test was executed to be analysed.

Member

@rdner rdner left a comment

@zmoog it looks like the entry for /x-pack/metricbeat/module/azure was never there, only for filebeat. So, it defaults to our team. Please update the codeowners. I'm approving this PR, so you can merge it.

@zmoog
Contributor Author

zmoog commented Jan 5, 2024

it looks like the entry for /x-pack/metricbeat/module/azure was never there, only for filebeat. So, it defaults to our team.

Yep, indeed.

Previously, /x-pack/metricbeat/module/ was assigned to @elastic/integrations which allowed our team to work on this module independently.

I'll open a PR to add an entry to CODEOWNERS for each module we own.

@rdner, thanks for approving the PR in the meantime.

zmoog added 11 commits January 5, 2024 16:10
During a testing session on 8.11.2, we noticed some skipped
collections on one of the testing agents.

Debug information revealed the metricset skipped some collections due
to a 1-second difference between the reference time in the current
collection and the reference time in the previous collection, making
the collection period 1 second shorter (299s instead of 300s).

Collection skip may happen due to reference time rounding.

For example, the timestamp 2023-12-08T10:58:32.999Z may become
2023-12-08T10:58:32.000Z due to the truncation.

As of today, this problem is happening on one agent only, but the
problem is real, and we should replace the truncate(1s) with a
round(1s) to eliminate fluctuations.
Not just equal, I want to check the value is the expected one.
Instead of truncating or rounding `referenceTime` to a value, I am
opting to keep the `referenceTime` value intact and using a jitter
when comparing it with the last collected time.

Pros:

- avoid having the thresholds we have with truncating or rounding,
  where a 1ms difference can flip the final result to the next or
  previous second.
- using a jitter gives us more flexibility (we can make it
  configurable)
- keeping the `referenceTime` value intact helps with troubleshooting
Remove outdated tests
Drop another commented line of code.
@zmoog zmoog force-pushed the zmoog/round-reference-time-on-azure-metrics branch from 76c879a to cbf7167 Compare January 5, 2024 15:11
@elasticmachine
Collaborator

💚 Build Succeeded

Build stats

  • Duration: 54 min 47 sec

❕ Flaky test report

No test was executed to be analysed.

@zmoog zmoog merged commit 824dd04 into elastic:main Jan 5, 2024
25 checks passed
@zmoog zmoog deleted the zmoog/round-reference-time-on-azure-metrics branch January 5, 2024 16:30
mergify bot pushed a commit that referenced this pull request Jan 5, 2024
### What

Change the `MetricRegistry.NeedsUpdate()` method to decide whether to collect the metrics by comparing the collection interval with the time grain.

If the time since the last collection is shorter than the time grain duration, the metricset skips the collection.

For example, consider the following two scenarios:

#### Scenario A: collect PT1M metrics every 60s

- time grain: PT1M (one minute, or 60s)
- collection interval: 60s

In this case, the time since the last collection is never shorter than the time grain, so the metricset fetches metric values on every collection.

#### Scenario B: collect PT5M metrics every 60s

- time grain: PT5M (five minutes, or 300s)
- collection interval: 60s

In this case, the time since the last collection is shorter than the time grain (60s, 120s, 180s, 240s) for four consecutive collections, so the metricset fetches metric values only on every fifth collection.

#### The jitter

During our tests, we noticed small variations in collection scheduling that made the time since the last collection a few milliseconds shorter than expected. To compensate, the function applies a short jitter (1 second) to the comparison, so that a collection arriving slightly early is not skipped.

### Why

During a testing session on 8.11.2, we [noticed](#37204 (comment)) one out of four agents skipped some metrics collections.

The debug logs revealed the metricset skipped collections due to a 1-second difference between the reference time for the current and previous collections (299s instead of 300s).

![CleanShot 2023-12-08 at 20 13 19](https://github.com/elastic/beats/assets/25941/dc3d5040-c89b-47d2-a86a-124eb838ca36)

The 1-second difference may happen due to an inaccurate rounding in the reference time.

For example, suppose the following two events occur:

1. Metricbeat calls `Fetch()` on the metricset a few milliseconds earlier than in the previous collection.
2. The timestamp is 2023-12-08T10:58:32.999Z.

In this case, the reference time becomes 2023-12-08T10:58:32.000Z due to the truncation.

This problem happened to one test agent. However, if it happens to one agent, it can happen to others.

### Extended Structured Logging

We also added new fields to the debug structured logs:

```shell
$ cat metricbeat.log.ndjson | grep "MetricRegistry" | head -n 1 | jq
{
  "log.level": "debug",
  "@timestamp": "2024-01-05T15:03:12.235+0100",
  "log.logger": "azure monitor client",
  "log.origin": {
    "function": "github.com/elastic/beats/v7/x-pack/metricbeat/module/azure.(*MetricRegistry).NeedsUpdate",
    "file.name": "azure/metric_registry.go",
    "file.line": 80
  },
  "message": "MetricRegistry: Metric needs an update",
  "service.name": "metricbeat",
  "needs_update": true,
  "reference_time": "2024-01-05T14:03:07.197Z",
  "last_collection_time": "2024-01-05T14:02:07.199Z",
  "time_since_last_collection_seconds": 66.035681,
  "time_grain": "PT1M",
  "time_grain_duration_seconds": 60,
  "resource_id": "/subscriptions/123/resourceGroups/crest-test-lens-migration/providers/Microsoft.Compute/virtualMachines/rajvi-test-vm",
  "namespace": "Microsoft.Compute/virtualMachines",
  "aggregation": "Total",
  "names": "Network In,Network Out,Disk Read Bytes,Disk Write Bytes,Network In Total,Network Out Total",
  "ecs.version": "1.6.0"
}
```

Here's an example using `jq`:

```shell
$ cat metricbeat.log.ndjson | grep "MetricRegistry" | jq -r  '[.namespace, .aggregation, .needs_update, .reference_time, .last_collection_time//"na", .time_since_last_collection_seconds//"na", .time_grain_duration_seconds//"na", .time_grain] | @tsv' | grep Microsoft.Compute/virtualMachines

.namespace                              .aggregation  .needs_update   .reference_time                 .last_collection_time           .time_since_last_collection_seconds .time_grain_duration_seconds .time_grain
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        60.999661                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        61.795341                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        62.080088                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        64.929579                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        65.632209                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        67.832918                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        68.576239                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        69.927988                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        70.351148                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        70.872058                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        72.47401                           60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        72.971242                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        73.143605                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        74.831489                          60                           PT1M
```
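
The same fields can also be read programmatically. The sketch below decodes one log line with Go's `encoding/json`; the `registryLogEntry` type and `parseRegistryLogEntry` function are hypothetical names, and only the JSON keys come from the sample log above:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// registryLogEntry maps a subset of the fields shown in the sample log line.
type registryLogEntry struct {
	Namespace               string  `json:"namespace"`
	Aggregation             string  `json:"aggregation"`
	NeedsUpdate             bool    `json:"needs_update"`
	TimeGrain               string  `json:"time_grain"`
	TimeSinceLastCollection float64 `json:"time_since_last_collection_seconds"`
	TimeGrainDuration       float64 `json:"time_grain_duration_seconds"`
}

// parseRegistryLogEntry decodes one NDJSON line into a registryLogEntry.
func parseRegistryLogEntry(line string) (registryLogEntry, error) {
	var entry registryLogEntry
	err := json.Unmarshal([]byte(line), &entry)
	return entry, err
}

func main() {
	// Shortened version of the sample log line above.
	line := `{"needs_update":true,"time_since_last_collection_seconds":66.035681,"time_grain":"PT1M","time_grain_duration_seconds":60,"namespace":"Microsoft.Compute/virtualMachines","aggregation":"Total"}`

	entry, err := parseRegistryLogEntry(line)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s %s needs_update=%t since=%.3fs grain=%s\n",
		entry.Namespace, entry.Aggregation, entry.NeedsUpdate,
		entry.TimeSinceLastCollection, entry.TimeGrain)
}
```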

(cherry picked from commit 824dd04)
@zmoog zmoog added the backport-v8.11.0 Automated backport with mergify label Jan 5, 2024
mergify bot pushed a commit that referenced this pull request Jan 5, 2024
(cherry picked from commit 824dd04)
zmoog added a commit that referenced this pull request Jan 6, 2024
…ics (#37557)

* Fix the reference time rounding on Azure Metrics (#37365)

(cherry picked from commit 824dd04)

* Remove extra changelog entry

---------

Co-authored-by: Maurizio Branca <maurizio.branca@elastic.co>
zmoog added a commit that referenced this pull request Jan 6, 2024
(cherry picked from commit 824dd04)

Co-authored-by: Maurizio Branca <maurizio.branca@elastic.co>
Scholar-Li pushed a commit to Scholar-Li/beats that referenced this pull request Feb 5, 2024
### What

Change the `MetricRegistry.NeedsUpdate()` method to decide whether to collect the metrics by comparing the time since the last collection with the time grain duration.

If the time since the last collection is shorter than the time grain duration, the metricset skips the collection for those metrics.

For example, consider the following scenarios:

#### Scenario A: collect PT1M metrics every 60s

- time grain: PT1M (one minute, or 60s)
- collection interval: 60s

In this case, the time since the last collection is never shorter than the time grain, so the metricset fetches metric values on every collection.

#### Scenario B: collect PT5M metrics every 60s

- time grain: PT5M (five minutes, or 300s)
- collection interval: 60s

In this case, the time since the last collection is shorter than the time grain for four consecutive collections (60s, 120s, 180s, 240s), so the metricset skips them. The metricset fetches metric values on every fifth collection.

#### The jitter

During our tests, we noticed small variations in the collection scheduling, which made the time since the last collection a few milliseconds shorter than expected. To compensate for these fluctuations, the function also adds a short jitter duration (1 second), so that a collection arriving slightly early is not skipped by mistake.

### Why

During a testing session on 8.11.2, we [noticed](elastic#37204 (comment)) one out of four agents skipped some metrics collections.

The debug logs revealed the metricset skipped collections due to a 1-second difference between the reference time for the current and previous collections (299s instead of 300s).

![CleanShot 2023-12-08 at 20 13 19](https://github.com/elastic/beats/assets/25941/dc3d5040-c89b-47d2-a86a-124eb838ca36)

The 1-second difference may happen due to an inaccurate rounding in the reference time.

For example, suppose the following two events occur:

1. Metricbeat calls `Fetch()` on the metricset a few milliseconds earlier than in the previous collection.
2. The timestamp is 2023-12-08T10:58:32.999Z. 

In this case, the reference time becomes 2023-12-08T10:58:32.000Z due to the truncation.

We observed this problem on only one test agent, but if it can happen to one agent, it can happen to others.

### Extended Structured Logging

We also added new fields to the debug structured logs:

```shell
$ cat metricbeat.log.ndjson | grep "MetricRegistry" | head -n 1 | jq                                                                                                                                                    
{
  "log.level": "debug",
  "@timestamp": "2024-01-05T15:03:12.235+0100",
  "log.logger": "azure monitor client",
  "log.origin": {
    "function": "github.com/elastic/beats/v7/x-pack/metricbeat/module/azure.(*MetricRegistry).NeedsUpdate",
    "file.name": "azure/metric_registry.go",
    "file.line": 80
  },
  "message": "MetricRegistry: Metric needs an update",
  "service.name": "metricbeat",
  "needs_update": true,
  "reference_time": "2024-01-05T14:03:07.197Z",
  "last_collection_time": "2024-01-05T14:02:07.199Z",
  "time_since_last_collection_seconds": 66.035681,
  "time_grain": "PT1M",
  "time_grain_duration_seconds": 60,
  "resource_id": "/subscriptions/123/resourceGroups/crest-test-lens-migration/providers/Microsoft.Compute/virtualMachines/rajvi-test-vm",
  "namespace": "Microsoft.Compute/virtualMachines",
  "aggregation": "Total",
  "names": "Network In,Network Out,Disk Read Bytes,Disk Write Bytes,Network In Total,Network Out Total",
  "ecs.version": "1.6.0"
}
```

Here's an example that uses `jq` to print these fields as a table:

```shell
$ cat metricbeat.log.ndjson | grep "MetricRegistry" | jq -r  '[.namespace, .aggregation, .needs_update, .reference_time, .last_collection_time//"na", .time_since_last_collection_seconds//"na", .time_grain_duration_seconds//"na", .time_grain] | @tsv' | grep Microsoft.Compute/virtualMachines

.namespace                              .aggregation  .needs_update   .reference_time                 .last_collection_time           .time_since_last_collection_seconds .time_grain_duration_seconds .time_grain
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        60.999661                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        61.795341                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        62.080088                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        64.929579                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        65.632209                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        67.832918                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        68.576239                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        69.927988                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        70.351148                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        70.872058                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        72.47401                           60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        72.971242                          60                           PT1M
Microsoft.Compute/virtualMachines       Average       true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        73.143605                          60                           PT1M
Microsoft.Compute/virtualMachines       Total         true            2024-01-05T14:16:07.193Z        2024-01-05T14:15:07.193Z        74.831489                          60                           PT1M
```
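
The new fields can also be read programmatically. The sketch below decodes a single NDJSON debug line into a Go struct; `registryLogEntry` and `parseEntry` are hypothetical names, and the struct covers only a subset of the fields in the log entry above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// registryLogEntry holds a subset of the debug fields shown above.
// The type name is hypothetical; the JSON keys come from the log entry.
type registryLogEntry struct {
	NeedsUpdate              bool    `json:"needs_update"`
	TimeSinceLastCollection  float64 `json:"time_since_last_collection_seconds"`
	TimeGrain                string  `json:"time_grain"`
	TimeGrainDurationSeconds float64 `json:"time_grain_duration_seconds"`
	Namespace                string  `json:"namespace"`
	Aggregation              string  `json:"aggregation"`
}

// parseEntry decodes one NDJSON line into a registryLogEntry.
func parseEntry(line []byte) (registryLogEntry, error) {
	var e registryLogEntry
	err := json.Unmarshal(line, &e)
	return e, err
}

func main() {
	// Values copied from the example log entry.
	line := []byte(`{"needs_update": true, "time_since_last_collection_seconds": 66.035681, "time_grain": "PT1M", "time_grain_duration_seconds": 60, "namespace": "Microsoft.Compute/virtualMachines", "aggregation": "Total"}`)

	e, err := parseEntry(line)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s %s: needs_update=%t elapsed=%.3fs grain=%s (%.0fs)\n",
		e.Namespace, e.Aggregation, e.NeedsUpdate,
		e.TimeSinceLastCollection, e.TimeGrain, e.TimeGrainDurationSeconds)
}
```

Entries logged before a first collection may lack some of these fields, which is why the `jq` example falls back to `"na"`; in this sketch, missing fields keep their Go zero values.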
Labels
backport-v8.11.0 Automated backport with mergify backport-v8.12.0 Automated backport with mergify bug Team:Elastic-Agent Label for the Agent team Team:obs-ds-hosted-services Label for the Observability Hosted Services team
Development

Successfully merging this pull request may close these issues.

Azure Monitor may skip a metric collection depending on the timing