[Bug] Fix synchronous fetch count metric #1016

turan18 · 2023-12-29T19:05:48Z

Description

We increment a metric(SynchronousReadRegistryFetchCount) every time we need to fetch bytes of content synchronously. This isn't really indicative of registry fetches as we could have returned the content from our local cache. What we really need to know is the number of synchronous fetch operations that led to cache misses where we actually had to go over the network to get the requested bytes.

In a perfect scenario, this number should always be 0, meaning we should always have the content fetched and cached before the container needs it. In practice, this is almost always never the case, so looking at the metric should give us an idea of how well our background fetcher is able to fetch the needed content and an indication on how well optimizations like prioritized pre-fetching (eg: Load Order Documents #749) could benefit certain workloads.

Note: This isn't a perfect heuristic for determining the benefit of LOD as there could be several reasons that are unrelated to the ordering of span fetches as to why the background fetcher couldn't fetch and cache the requested content before hand. (eg: Maybe there were many sequential spans that the container needed and the background fetcher couldn't get to them in time or some transient failure occurred when trying to fetch/cache the content).

Steps to reproduce the bug

No response

Describe the results you expected

Ideally, we should increment this metric when we have to synchronously fetch the content over the network. We can move the metric increment to the end of getSpanContent. The metric itself is at the layer level and so it also requires the layer digest dimension, so we'll have to figure out a way to pass the layer digest into the span manager or just omit it entirely, although I think it may be helpful to be able to aggregate at a layer level.

Host information

OS: N/A
Snapshotter Version: v0.4.1
Containerd Version: N/A

Any additional context or information about the bug

Related to: #749 and #225

The text was updated successfully, but these errors were encountered:

turan18 added the bug Something isn't working label Dec 29, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug] Fix synchronous fetch count metric #1016

[Bug] Fix synchronous fetch count metric #1016

turan18 commented Dec 29, 2023 •

edited

Loading

[Bug] Fix synchronous fetch count metric #1016

[Bug] Fix synchronous fetch count metric #1016

Comments

turan18 commented Dec 29, 2023 • edited Loading

Description

Steps to reproduce the bug

Describe the results you expected

Host information

Any additional context or information about the bug

turan18 commented Dec 29, 2023 •

edited

Loading