In a multi-site Craft CMS Pro project with the "craftcms/redactor": "3.1.0" plugin, I am encountering an issue with content indexing. The issue seems related to invisible characters that affect search results in the admin panel and frontend.
Example Content
If I click the Source button in the Redactor field, the content is displayed as follows:
<p>More information on the projects submission can be found in the <strong>TOOLBOX</strong>.</p>
However, when I search for the word toolbox in the admin, this page does not appear in the results. Conversely, if I search for TOOL%C2%ADBOX, the page is found. The same behavior occurs in the frontend when using .search().
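For reference, `%C2%AD` is the URL-encoded UTF-8 form of the soft hyphen (U+00AD), which renders as nothing in most contexts. The short Python sketch below is only an illustration of why a plain search for toolbox misses the stored word; it is not Craft's indexing code, and the helper name `find_format_chars` is made up for this example.

```python
from urllib.parse import unquote
import unicodedata


def find_format_chars(text):
    """Return (index, code point, Unicode name) for each invisible format character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"  # "Cf" = format characters such as the soft hyphen
    ]


# Decode the query string that does match the entry.
decoded = unquote("TOOL%C2%ADBOX")

print(len(decoded))                  # 8 code points, although only 7 are visible
print(find_format_chars(decoded))    # reports a SOFT HYPHEN at index 4
print("toolbox" in decoded.lower())  # False: the hidden character splits the word
```

Running the same helper over the raw field value (for example, copied from the database rather than from the rendered editor) should show whether the saved content really contains U+00AD or some other hidden character.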
Field Configuration
Clean up HTML: Remove inline styles, Remove empty tags, Replace non-breaking spaces with regular spaces
Purify HTML: Enabled
HTML Purifier Config: Default
It seems that some invisible characters are being introduced and retained after saving. These characters interfere with the indexing process.
Steps to Reproduce
Create a field using the Redactor plugin with the configuration listed above.
Add the following content in the field:
<p>More information on the projects submission can be found in the <strong>TOOLBOX</strong>.</p>
Save the entry and perform a search for the word toolbox in the admin or frontend.
Expected Behavior
The page containing the word TOOLBOX should appear in the search results without requiring the exact invisible character sequence (TOOL%C2%ADBOX).
Actual Behavior
The page only appears in the search results if the invisible character sequence is included in the search query. Regular searches for toolbox do not return the expected result.
Additional Questions
Why are these invisible characters added and retained after saving the content?
How can I prevent such characters from being saved in the first place?
What is the recommended approach for cleaning such characters out of all existing content before re-indexing with --update-search-index?
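To make the last question concrete, here is a minimal sketch of the kind of cleanup I have in mind, written in Python purely for illustration. In a Craft project the equivalent would presumably live in PHP and run before the entries are resaved. The set of characters to strip (soft hyphen, zero-width space, byte order mark, and the soft hyphen's HTML entity spellings) and the function name `strip_invisible` are my assumptions, not something Craft or Redactor provides.

```python
import re

# Assumed list of characters to drop: soft hyphen (U+00AD), zero-width space (U+200B),
# BOM / zero-width no-break space (U+FEFF), plus entity spellings of the soft hyphen.
INVISIBLE = re.compile(r"[\u00ad\u200b\ufeff]|&shy;|&#0*173;|&#[xX]0*[aA][dD];")


def strip_invisible(html):
    """Remove the assumed invisible characters from an HTML fragment."""
    return INVISIBLE.sub("", html)


stored = "<p>... can be found in the <strong>TOOL\u00adBOX</strong>.</p>"
cleaned = strip_invisible(stored)

print("toolbox" in stored.lower())   # False: the soft hyphen splits the word
print("toolbox" in cleaned.lower())  # True once the hidden character is removed
```

The list is deliberately narrow: stripping every Unicode format character would also remove joiners that some scripts and emoji sequences rely on, so a broader filter would need more care.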
Craft CMS version
4.13.8
PHP version
No response
Operating system and version
No response
Database type and version
No response
Image driver and version
No response
Installed plugins and versions
"craftcms/redactor": "3.1.0"