[BUG] Crawler returns invalid content/cache to StaticFileCache #642
Labels: 3rd party ext (issue related to 3rd party extension, e.g. News), Bug, Priority 1, refactoring, TYPO3v9, TYPO3v10
tomasnorre added the Bug, TYPO3v9, refactoring, TYPO3v10, 3rd party ext and Priority 1 labels on Oct 28, 2020
cweiske added a commit to mogic-le/t3x-crawler that referenced this issue on May 25, 2023
History:
--------
Because of a problem with lochmueller/staticfilecache, crawler issue tomasnorre#642 changed the middleware loading order to execute crawler after static file cache. (commit 0f7cb6a)

The source of the problem was that the crawler CrawlerInitialization middleware overwrote the HTTP response that was generated by TYPO3.

Since commit 8a9b896 (issue tomasnorre#837) the HTTP response is not destroyed/overwritten by crawler anymore but moved into a HTTP header "X-T3Crawler-Meta".
The loading order does not influence compatibility with static file cache anymore.

Bug
---
The changed loading order in the bug fix led to the problem that
> indexed_search:TypoScriptFrontendHook
was executed before
> crawler:CrawlerInitialization

But CrawlerInitialization must be run before TypoScriptFrontendHook because it loads request data that are needed by indexed_search.

This led to bug tomasnorre#729 - forced reindexing by the crawler did not work anymore if the page was already in cache.

Solution
--------
Restore the HTTP middleware loading order as it was before the fix for tomasnorre#642, so that the code path is again:

1. crawler:FrontendUserAuthenticator (aoe/crawler/authentication)
2. crawler:CrawlerInitialization (aoe/crawler/initialization)
3. indexed_search:TypoScriptFrontendHook (called by typo3/cms-frontend/prepare-tsfe-rendering)

Resolves: tomasnorre#729
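To make the ordering requirement from the commit message concrete, here is a minimal Python sketch of how a middleware order can be derived from "before"/"after" constraints. It is only an illustration: the three identifiers are taken from the commit message, but the constraint table and the `resolve_order` helper are assumptions made up for this example and are not the crawler's or TYPO3's actual configuration or code.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical constraint table. The identifiers come from the commit
# message; the before/after rules are assumptions for illustration only.
MIDDLEWARES = {
    "aoe/crawler/authentication": {
        "before": ["aoe/crawler/initialization"],
    },
    "aoe/crawler/initialization": {
        "after": ["aoe/crawler/authentication"],
        "before": ["typo3/cms-frontend/prepare-tsfe-rendering"],
    },
    "typo3/cms-frontend/prepare-tsfe-rendering": {},
}


def resolve_order(middlewares):
    """Return identifiers in execution order, honouring before/after rules."""
    sorter = TopologicalSorter()
    for name, rules in middlewares.items():
        sorter.add(name)
        for earlier in rules.get("after", []):
            sorter.add(name, earlier)  # `earlier` must run before `name`
        for later in rules.get("before", []):
            sorter.add(later, name)    # `name` must run before `later`
    return list(sorter.static_order())


order = resolve_order(MIDDLEWARES)
print(order)
```

With these assumed constraints the initialization step always lands before the rendering preparation step, which is the property the commit message says indexed_search depends on. A cycle in the constraints would make `static_order()` raise `graphlib.CycleError`.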
cweiske added a commit to mogic-le/t3x-crawler that referenced this issue on May 25, 2023
cweiske added a commit to mogic-le/t3x-crawler that referenced this issue on May 26, 2023
tomasnorre pushed a commit that referenced this issue on Feb 13, 2024
cweiske added a commit to mogic-le/t3x-crawler that referenced this issue on Feb 13, 2024
tomasnorre pushed a commit that referenced this issue on Feb 13, 2024
Bug Report
The description of this bug is still a work in progress, but so that the problem does not get lost, I'll write down what I have so far.
There is an issue about crawler compatibility in the Static File Cache GitHub repository:
lochmueller/staticfilecache#260
The crawler appears to have some issues with its middleware handling and with the content/caching that is handed over to the StaticFileCache.
This currently results in an invalid cache in Static File Cache. Since commit lochmueller/staticfilecache@975eff6, this is flagged by a warning regarding the crawler.
As we don't want to break the functionality of other extensions, and of course don't want to lose users of the Crawler, we will try to get this fixed.
If you have any information that could help solve this issue, please add a comment below and let's see how we can best solve it.
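A commit message that later referenced this issue attributes the invalid cache to the crawler's CrawlerInitialization middleware overwriting the HTTP response generated by TYPO3, and says the data was later moved into an "X-T3Crawler-Meta" header instead. The Python sketch below models that difference in a deliberately simplified way. Everything in it (the `Response` class, the middleware functions, the `run` helper, the placeholder metadata string) is hypothetical and is not the real TYPO3, crawler, or Static File Cache code.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    body: str
    headers: dict = field(default_factory=dict)


def render_page(request):
    """Stand-in for the page rendering done by the CMS."""
    return Response(body="<html>real page</html>")


def crawler_overwriting(request, handler):
    """Assumed old behaviour: the generated response is replaced."""
    response = handler(request)
    if request.get("crawler"):
        return Response(body='{"crawler": "meta"}')  # placeholder payload
    return response


def crawler_with_header(request, handler):
    """Assumed later behaviour: metadata goes into a header, body is kept."""
    response = handler(request)
    if request.get("crawler"):
        response.headers["X-T3Crawler-Meta"] = '{"crawler": "meta"}'
    return response


def make_static_cache(store):
    """Build a middleware that stores every response body by request path."""
    def static_cache(request, handler):
        response = handler(request)
        store[request["path"]] = response.body
        return response
    return static_cache


def run(stack, request):
    """Run the middlewares outermost-first around render_page."""
    handler = render_page
    for middleware in reversed(stack):
        handler = (lambda mw, nxt: lambda req: mw(req, nxt))(middleware, handler)
    return handler(request)


request = {"path": "/news", "crawler": True}

broken_store = {}
run([make_static_cache(broken_store), crawler_overwriting], request)

fixed_store = {}
run([make_static_cache(fixed_store), crawler_with_header], request)

print(broken_store["/news"])  # crawler payload ends up in the cache
print(fixed_store["/news"])   # rendered page ends up in the cache
```

In this model the cache only stays valid when the crawler leaves the response body untouched, which is why moving the metadata into a header removes the dependency on where the crawler sits relative to the cache.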