Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[5.x]: Cache includes links with x-craft-live-preview in query params #15586

Closed
niektenhoopen opened this issue Aug 23, 2024 · 8 comments
Closed
Assignees
Labels

Comments

@niektenhoopen
Copy link

What happened?

Description

If you open preview mode on an entry that has no changes, the token is not in the URLs of links in the preview mode, but the x-craft-live-preview token is.

If you use {% cache %} in your templates, this part will be cached with the x-craft-live-preview added to all links in the cached part of code.

The reason that it's cached, is probably because there is no token:

if ($request->getHadToken()) {

This way, you will end up with x-craft-live-preview tokens in the Google index (and that's precisely what is happened now)

Steps to reproduce

  1. See above

Expected behavior

No x-craft-live-preview-URLs on a production website

Actual behavior

Craft CMS version

5.3.4

PHP version

8.3

Operating system and version

No response

Database type and version

No response

Image driver and version

No response

Installed plugins and versions

@mmikkel
Copy link
Contributor

mmikkel commented Aug 23, 2024

Worth mentioning that URLs with x-craft-preview or x-craft-live-preview params can also end up in template caches if the request URL includes an invalid or expired token, as the token param is omitted from generated URLs in such cases, but the preview param is still added.

These URLs are also eligible to be indexed, as Craft only sets the X-Robots-Tag header for tokenised requests.

@i-just
Copy link
Contributor

i-just commented Aug 23, 2024

Hi both, thanks for reporting and all the details! I raised a PR for the caching issue.

In terms of caching, a request with an invalid or expired token would still return true from the getHadToken() method, which means such a request wouldn’t be cached. The PR ensures that the template caching is also not enabled for a live preview request that doesn’t have a token (which is what will happen if you open preview mode on an entry that has no changes).

If the token param is omitted from generated URLs (e.g., if invalid or expired), but the preview param is present in the request, that preview param will be added to the generated URLs, and the x-robots-tag won’t be set. That’s something I’d like us to discuss internally before making changes.

@niektenhoopen
Copy link
Author

niektenhoopen commented Aug 23, 2024

Thanks @i-just! Any idea when someone will merge this? URLs with the preview token query parameter are now being indexed by Google and it's spreading every hour.

@nikolowry
Copy link

This was driving me crazy! I finally ended up adding the following redirect snippet to my web/index.php:

if (@$_GET['x-craft-preview']):
    unset($_GET['x-craft-preview']);

    $parsed_url = parse_url($_SERVER['REQUEST_URI']);

    $redirect = $parsed_url['path']
        . (!empty($_GET)
            ? '?' . http_build_query($_GET)
            : '');

    exit(header("Location: $redirect"));
endif;

Can't wait for this to get resolved so I can drop it.

@mmikkel
Copy link
Contributor

mmikkel commented Aug 24, 2024

I think another wrinkle to this (that should maybe be a separate issue, but here goes) is that unlike the token param (which is omitted from generated URLs if invalid) the preview param is rendered back out verbatim for all URLs. While it's great that preview requests will no longer be cached, this detail still makes me a little bit uneasy.

Considering that – AFAIK – preview param values are just random strings, would it make sense if Craft generated a new random string to use for the generated URLs' preview params, instead of using the value from the request URL?

@niektenhoopen
Copy link
Author

@nikolowry I did this and will probably have to keep this because Google already indexed 100+ pages and I will never get rid of them:

 Event::on(View::class, View::EVENT_BEGIN_PAGE, function (Event $e) {
    if (! Craft::$app->request->isPreview) {
        return;
    }
  
    if (! Craft::$app->user->isGuest) {
        return;
    }
  
    $parsedUrl = parse_url(Craft::$app->request->getAbsoluteUrl());
    parse_str($parsedUrl['query'] ?? '', $parameters);
  
    if ($parameters && isset($parameters['x-craft-live-preview'])) {
        unset($parameters['x-craft-live-preview']);
        return Craft::$app->response->redirect(UrlHelper::url($parsedUrl['path'], $parameters));
    }
});

@brandonkelly
Copy link
Member

Craft 4.11.5 and 5.3.6 are out with a fix for this, via #15589.

@brandonkelly
Copy link
Member

brandonkelly commented Sep 3, 2024

Craft 4.12.0 and 5.4.0 are out now with a related change: Craft now sends a x-robots-tag: none header on all preview requests, ensuring they don’t get indexed. (#15612)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants