Stuck in an unsaved changes loop #3798
Comments
I noticed that the Sane class converts spaces to Unicode 🤷‍♂️ My brain is burned out by the Sane/Dom/Parsley classes 🤯 |
@lukasbestle do you have an idea why spaces are converted? |
The conversion most likely happens inside Is the converted result incorrect? Or is it just about the unsaved changes loop? |
Sounds to me like our setup (e.g. the I don't know the E.g. what irritates me there is that the comment says Maybe the cause is located somewhere around there? |
The "inject a meta tag" route would probably be the "better" solution, but it would also be a hack, and I'm not sure whether it would cause other issues, e.g. if there's already a meta tag in the input. So I decided on entity escaping, as it's also suggested in the comments in the PHP docs. @afbora I don't fully understand which part of your test case causes which issue. Could you convert your test case into a test that directly uses the |
If you want to load UTF-8 content, this has worked pretty well for us over the years: `$document->loadHTML('<?xml encoding="UTF-8">' . $html);` This is how we use it: |
Oh, that looks pretty nice. Thanks, I'll try it out! |
@lukasbestle You can test the following text in the writer field:
|
@lukasbestle FYI, also when you remove the https://github.com/getkirby/kirby/blob/develop/config/fields/writer.php#L32 |
If I was able to get it right, @nilshoerrmann's suggestion didn't work :(( |
Saved without Sane sanitization: not stuck ✅
Saved with Sane sanitization: stuck 💥
|
I tried several options with the following result (simple test with just
❌ Input is parsed as
❌ As the input is encoded as entities, the output also is encoded as entities.
❌ The input doesn't contain entities, but weirdly
✅ Promising, but we need to remove the added meta tag without removing any other tag that was already in the input. |
I'm onto something here. Let's hope it will work. :) |
Modifications were needed in two different places because:
1. `libxml` needs to be told to parse the input as UTF-8. A `<meta>` tag would create an empty `<head>` and converting everything to entities is a hack. The XML declaration works great and is the easiest to remove right afterwards.
2. If the `<meta>` tag is used and kept in the document, additional processing (e.g. in `$dom->sanitize()`) will think that it's part of the input. So everything we add needs to be removed immediately after parsing.
3. On output we need the `<meta>` tag again as it's the only way to avoid an export as `ISO-8859-1` with entities.

Fixes #3798.
I think the issue is that the writer field's handling of the value/text conflicts with that of the Sane class. But I couldn't figure it out. |
Ok, I'll try to explain the underlying problem a bit to keep track of it here.

Form changes

As soon as content gets loaded from the server to fill the form, a copy of it is stored in Vuex as the original form values. Whenever a form input updates, the updated value is also stored in Vuex in a second "changes" object. The original values object and the changes object are then compared against each other to find out if something has changed. If so, the orange bar shows up. This is the basic logic behind the changes bar.

Writer

The Writer initializes ProseMirror under the hood. ProseMirror receives the HTML from the server and turns it into its own JSON document format. In order to create this format, it has to be fed with all kinds of rules about allowed block level elements and inline elements. It will automatically do all the hard work and filter out unwanted shit in the HTML: kill attributes, styles, unwanted tags, etc. It is very strict when performing this task. That JSON document format can then be turned back into clean HTML again. But that HTML might differ from the original HTML the Writer received.

For example, when you give the Writer this: `<p>Some <b>Text</b></p>` The output after the clean-up would be: `<p>Some <strong>Text</strong></p>`

This is actually really nice, but it also leads to a big problem. When the form is loaded, the original value in Vuex would be the string from the server. We already clean up the HTML there with our Sane class and the clean-up process is already very, very close to what ProseMirror is doing, but it will never be 100% the same unless we were able to let ProseMirror run on the server, which is unfortunately not possible. So the original already differs from the initial value once ProseMirror has been initialized. ProseMirror therefore already sends an input event and the changes object contains a value that's no longer the same as the original. That's why the changes bar appears immediately and can also never be fully removed. |
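The comparison described above can be sketched roughly as follows in TypeScript. This is a simplified illustration, not the Panel's actual code: `normalizeHtml` and `changedFields` are hypothetical names, the store is reduced to two plain objects, and the regex only mimics the `<b>` to `<strong>` clean-up from the example rather than what ProseMirror really does.

```typescript
// Hypothetical stand-in for the editor's clean-up step (not a ProseMirror API).
// It only rewrites <b> to <strong>, as in the example from the comment above.
function normalizeHtml(html: string): string {
  return html.replace(/<(\/?)b>/g, "<$1strong>");
}

type Values = Record<string, string>;

// Simplified change detection: a field counts as changed when its current
// value is not strictly equal to the stored original.
function changedFields(originals: Values, changes: Values): string[] {
  return Object.keys(changes).filter((key) => changes[key] !== originals[key]);
}

// Value as delivered by the server; a copy is kept as the original.
const serverHtml = "<p>Some <b>Text</b></p>";
const originals: Values = { text: serverHtml };

// On initialization the editor emits its normalized HTML as an input event,
// even though the user has not typed anything yet.
const changes: Values = { text: normalizeHtml(serverHtml) };

// The field is reported as changed purely because of the normalization.
const unsaved = changedFields(originals, changes);
```

In this sketch `unsaved` contains `text` without any user input, which is the situation that makes the changes bar show up immediately. Any difference between the server-side clean-up and the editor's clean-up, however small, has the same effect under a strict string comparison.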
After writing this all down, I might have found a solution in the PR above. |
✅ |
Related to PR #3758; the issue with Unicode characters still persists.
You can use the following content to reproduce the issue:
Kirby 3.6.0-beta.3