[BUG] Crawler return invalid content/cache to StaticFileCache #642

Closed
tomasnorre opened this issue Oct 28, 2020 · 0 comments
@tomasnorre
Owner

Bug Report

The description of this bug is still a work in progress, but to keep track of the problem I'll write down what I have so far.

There is an issue in the Static File Cache GitHub repository about crawler compatibility:
lochmueller/staticfilecache#260

The crawler appears to have problems with the middleware handling and with the content/cache that is handed over to StaticFileCache.

This currently results in an invalid cache in Static File Cache. Since commit lochmueller/staticfilecache@975eff6, this is flagged by a warning regarding the crawler.

As we don't want to break the functionality of other extensions, and of course don't want to lose users of the Crawler, we will try to get this fixed.

If you have any information that could help solve this issue, please add a comment below and let's see how we can best solve it.

cweiske added a commit to mogic-le/t3x-crawler that referenced this issue May 25, 2023
History:
--------
Because of a problem with lochmueller/staticfilecache,
crawler issue tomasnorre#642
changed the middleware loading order to execute crawler after static file cache.
(commit 0f7cb6a)

The source of the problem was that the crawler's CrawlerInitialization middleware
overwrote the HTTP response that was generated by TYPO3.

Since commit 8a9b896
(issue tomasnorre#837)
the crawler no longer destroys/overwrites the HTTP response;
its data is instead passed in an HTTP header "X-T3Crawler-Meta".
The loading order does not influence compatibility with
static file cache anymore.
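
To illustrate the difference described above, here is a minimal sketch of a toy middleware pipeline in Python. It is not the crawler's or TYPO3's actual code: the function names, the request shape and the JSON payload are all assumptions made for illustration. Only the header name "X-T3Crawler-Meta" comes from the commit message.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Response:
    body: str
    headers: Dict[str, str] = field(default_factory=dict)


Handler = Callable[[dict], Response]


def render_page(request: dict) -> Response:
    """Stand-in for the page rendering done by the CMS."""
    return Response(body="<html>page</html>")


def crawler_overwriting(next_handler: Handler) -> Handler:
    """Old behaviour: the crawler data replaces the rendered response."""
    def handle(request: dict) -> Response:
        next_handler(request)  # the rendered page is discarded
        return Response(body=json.dumps({"log": ["indexed"]}))
    return handle


def crawler_with_header(next_handler: Handler) -> Handler:
    """New behaviour: the crawler data travels in a response header."""
    def handle(request: dict) -> Response:
        response = next_handler(request)
        response.headers["X-T3Crawler-Meta"] = json.dumps({"log": ["indexed"]})
        return response
    return handle


def static_cache(store: dict, next_handler: Handler) -> Handler:
    """Hypothetical outer cache that stores whatever body it receives."""
    def handle(request: dict) -> Response:
        response = next_handler(request)
        store[request["path"]] = response.body
        return response
    return handle


def cached_body(crawler_middleware: Callable[[Handler], Handler]) -> str:
    """Run one request through cache -> crawler -> renderer; return the cached body."""
    store: dict = {}
    pipeline = static_cache(store, crawler_middleware(render_page))
    pipeline({"path": "/"})
    return store["/"]
```

With `crawler_overwriting`, the cache ends up holding the crawler payload instead of the page; with `crawler_with_header`, it holds the rendered HTML no matter which side of the cache the crawler middleware sits on.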

Bug
---
The changed loading order in the bug fix led to the problem that
> indexed_search:TypoScriptFrontendHook
was executed before
> crawler:CrawlerInitialization

But CrawlerInitialization must be run before TypoScriptFrontendHook
because it loads request data that are needed by indexed_search.

This led to bug tomasnorre#729
- forced reindexing by the crawler did not work anymore if the
page was already in cache.

Solution
--------
Restore the HTTP middleware loading order as it was before
the fix for tomasnorre#642, so that the code path is again:

1. crawler:FrontendUserAuthenticator
   (aoe/crawler/authentication)

2. crawler:CrawlerInitialization
   (aoe/crawler/initialization)

3. indexed_search:TypoScriptFrontendHook
   (called by typo3/cms-frontend/prepare-tsfe-rendering)

Resolves: tomasnorre#729
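
The loading order listed in the commit message above can be pictured as the result of resolving "before"/"after" constraints between middlewares. The following is a rough sketch in Python using a topological sort. The identifiers are taken from the commit message, but the specific constraints and the `resolve_order` helper are assumptions for illustration, not the extension's real configuration.

```python
from graphlib import TopologicalSorter
from typing import Dict, List


def resolve_order(middlewares: Dict[str, dict]) -> List[str]:
    """Order middlewares so that every 'before'/'after' constraint holds."""
    sorter = TopologicalSorter()
    for name, rules in middlewares.items():
        # 'after' entries must run earlier than this middleware.
        sorter.add(name, *rules.get("after", []))
        # 'before' entries must run later than this middleware.
        for later in rules.get("before", []):
            sorter.add(later, name)
    return list(sorter.static_order())


# Hypothetical constraints that reproduce the restored code path.
middlewares = {
    "aoe/crawler/authentication": {},
    "aoe/crawler/initialization": {
        "after": ["aoe/crawler/authentication"],
        "before": ["typo3/cms-frontend/prepare-tsfe-rendering"],
    },
    "typo3/cms-frontend/prepare-tsfe-rendering": {},
}

order = resolve_order(middlewares)
```

Because the constraints form a single chain, only one order satisfies them: authentication, then initialization, then the step that calls the indexed_search hook.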
cweiske added a commit to mogic-le/t3x-crawler that referenced this issue May 26, 2023
tomasnorre pushed a commit that referenced this issue Feb 13, 2024
cweiske added a commit to mogic-le/t3x-crawler that referenced this issue Feb 13, 2024