Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce viewport aspect ratio validation for URL Metrics #1494

Merged
merged 7 commits into from
Aug 22, 2024

Conversation

westonruter
Copy link
Member

As mentioned in #1490, I was debugging URL Metrics on my own blog because I found that commenter avatars weren't getting loading=lazy even though they are far outside the initial viewport. When looking the URL Metrics for the URL, I was surprised to find a URL Metric with this viewport:

"viewport": {
    "width": 1024,
    "height": 12140
}

This is an aspect ratio of 0.084 which seems incredibly unusual. It turns out that Googlebot actually crawls pages with a viewport set so extremely high like this (cf. GoogleBot Crawls & Renders Tall, With 9000px High Viewport?). The result is that when computing the all_element_max_intersection_ratios, since this URL Metric from Googlebot always has the entire page in the viewport, they will always be 1. This means that loading=lazy would get removed from all images whenever there is a URL Metric stored from Googlebot for a page. Additionally, it could be that a large image outside of the viewport for normal visitors could end up getting identified as the LCP image due to it being in the viewport for Googlebot.

This is not just an issue related to Googlebot. It's also an issue with users visiting a site with unusual window sizes (e.g. split to the top half the monitor) which can skew the URL Metrics.

This PR introduces viewport aspect ratio validation when storing a URL Metric. The validation is also done at read time, so any existing URL Metrics with bad viewport aspect ratios will be immediately discarded.

The min/max aspect ratios I picked are to accommodate the device (Sony Xperia 1 ) with the most extreme aspect ratio I'm familiar with: 21:9. When in horizontal orientation, the aspect ratio is 2.33 and when in vertical orientation it is 0.43. So I picked 0.4 as the minimum and 2.5 as the maximum.

There are two new filters introduced to customize the aspect ratio range:

  • od_minimum_viewport_aspect_ratio
  • od_maximum_viewport_aspect_ratio

The od_maximum_viewport_aspect_ratio filter is particularly useful during development for example when you have DevTools open on the bottom of the screen.

When an error occurs when parsing URL Metrics out of the od_url_metrics post type, it now includes the UUID in the error message. Additionally, if the error is due to a JSON validation issue, the severity is now reduced from warning to notice, since JSON Schema changes are to be expected.

@westonruter westonruter added [Type] Bug An existing feature is broken [Plugin] Optimization Detective Issues for the Optimization Detective plugin labels Aug 22, 2024
Copy link

github-actions bot commented Aug 22, 2024

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: westonruter <westonruter@git.wordpress.org>
Co-authored-by: swissspidy <swissspidy@git.wordpress.org>
Co-authored-by: adamsilverstein <adamsilverstein@git.wordpress.org>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

$this_function = __FUNCTION__;
$trigger_warning = static function ( string $message ) use ( $this_function ): void {
wp_trigger_error( $this_function, esc_html( $message ), E_USER_WARNING );
$this_function = __METHOD__;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of __FUNCTION__ was wrong as it didn't include the class name.

@westonruter westonruter force-pushed the update/od-allowed-viewport-aspect-ratios branch from 561c018 to aa98bcf Compare August 22, 2024 16:06
Copy link
Member

@adamsilverstein adamsilverstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic. Added some doc block suggestions

@westonruter westonruter force-pushed the update/od-allowed-viewport-aspect-ratios branch from 99a3b27 to 161b201 Compare August 22, 2024 19:29
@westonruter westonruter merged commit f8dad34 into trunk Aug 22, 2024
16 checks passed
@westonruter westonruter deleted the update/od-allowed-viewport-aspect-ratios branch August 22, 2024 19:43
@westonruter
Copy link
Member Author

I deployed this to my site and immediately after deploying I saw the URL Metrics with the extreme aspect ratio removed.

This resulted in all_element_max_intersection_ratios being corrected from being all 1 to being the expected values, especially the commenter avatar images which now also are getting loading=lazy as expected.

Before:

"all_element_max_intersection_ratios": {
    "/*[0][self::HTML]/*[1][self::BODY]/*[3][self::MAIN]/*[0][self::ARTICLE]/*[1][self::FIGURE]/*[0][self::DIV]/*[0][self::IMG]": 1,
    "/*[1][self::HTML]/*[2][self::BODY]/*[4][self::MAIN]/*[1][self::ARTICLE]/*[2][self::FIGURE]/*[1][self::DIV]/*[1][self::IMG]": 1,
    "/*[1][self::HTML]/*[2][self::BODY]/*[4][self::MAIN]/*[1][self::ARTICLE]/*[6][self::DIV]/*[1][self::DIV]/*[2][self::DIV]/*[1][self::DIV]/*[1][self::ARTICLE]/*[1][self::FOOTER]/*[1][self::DIV]/*[1][self::A]/*[1][self::IMG]": 1,
    "/*[1][self::HTML]/*[2][self::BODY]/*[4][self::MAIN]/*[1][self::ARTICLE]/*[6][self::DIV]/*[1][self::DIV]/*[2][self::DIV]/*[2][self::DIV]/*[1][self::ARTICLE]/*[1][self::FOOTER]/*[1][self::DIV]/*[1][self::A]/*[1][self::IMG]": 1,
    "/*[1][self::HTML]/*[2][self::BODY]/*[4][self::MAIN]/*[1][self::ARTICLE]/*[6][self::DIV]/*[1][self::DIV]/*[2][self::DIV]/*[3][self::DIV]/*[1][self::ARTICLE]/*[1][self::FOOTER]/*[1][self::DIV]/*[1][self::A]/*[1][self::IMG]": 1
}

After:

"all_element_max_intersection_ratios": {
    "/*[1][self::HTML]/*[2][self::BODY]/*[4][self::MAIN]/*[1][self::ARTICLE]/*[2][self::FIGURE]/*[1][self::DIV]/*[1][self::IMG]": 0.9963193535804749,
    "/*[1][self::HTML]/*[2][self::BODY]/*[4][self::MAIN]/*[1][self::ARTICLE]/*[6][self::DIV]/*[1][self::DIV]/*[2][self::DIV]/*[1][self::DIV]/*[1][self::ARTICLE]/*[1][self::FOOTER]/*[1][self::DIV]/*[1][self::A]/*[1][self::IMG]": 0,
    "/*[1][self::HTML]/*[2][self::BODY]/*[4][self::MAIN]/*[1][self::ARTICLE]/*[6][self::DIV]/*[1][self::DIV]/*[2][self::DIV]/*[2][self::DIV]/*[1][self::ARTICLE]/*[1][self::FOOTER]/*[1][self::DIV]/*[1][self::A]/*[1][self::IMG]": 0,
    "/*[1][self::HTML]/*[2][self::BODY]/*[4][self::MAIN]/*[1][self::ARTICLE]/*[6][self::DIV]/*[1][self::DIV]/*[2][self::DIV]/*[3][self::DIV]/*[1][self::ARTICLE]/*[1][self::FOOTER]/*[1][self::DIV]/*[1][self::A]/*[1][self::IMG]": 0
}

@westonruter
Copy link
Member Author

Aside: I'm not sure what this element is all about:

/*[0][self::HTML]/*[1][self::BODY]/*[3][self::MAIN]/*[0][self::ARTICLE]/*[1][self::FIGURE]/*[0][self::DIV]/*[0][self::IMG]

I'm currently fetching the page on my site every 5-10 minutes with WP_DEBU on and storing the results in a directory in a timestamped HTML file. I'll look at the HTML next week to see if I'm seeing any instability in the HTML structure. This is with the Twenty Twenty theme.

@westonruter
Copy link
Member Author

westonruter commented Aug 22, 2024

Oh, I just realized what it is. It's a very old URL Metric from before the XPath indices were fixed to be 1-based instead of 0-based! See #1191. Looking at the URL Metric data, I see the timestamp is 1712354502.752218 which is for a day back in April! This URL Metric wasn't for Googlebot as it had a viewport of 412x823. It appears that URL Metrics are not being collected frequently enough, or my site is truly devoid of traffic. I think for my site it may be in part due to page caching, where the detection time window (od_detection_time_window) is too short, so it is only collecting metrics for visitors who happen to visit when there is a MISS to the page cache, and if there is a stale-while-revalidate caching strategy going on with Varnish then it is even more likely for the visitor to never be within the detection time window (currentTime - serveTime > detectionTimeWindow):

// Abort running detection logic if it was served in a cached page.
if ( currentTime - serveTime > detectionTimeWindow ) {
if ( isDebug ) {
warn(
'Aborted detection due to being outside detection time window.'
);
}
return;
}

I think this logic needs to be revisited.

Update: Filed as #1496

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Plugin] Optimization Detective Issues for the Optimization Detective plugin [Type] Bug An existing feature is broken
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants