Stuck in an unsaved changes loop #3798

afbora · 2021-10-12T15:38:36Z

Related #3758 PR and still continue unicode characters issue.

You can use following content to reproduce the issue:

[{"attrs":{"width":"full-width","mbottom":"true","bgcolor":"colorLightGray"},"columns":[{"blocks":[{"content":{"slides":[{"title":"\u0160ildymo sistemas pri\u017ei\u016brin\u010di\u0173 profesional\u0173 ir savo \u0161ildymo sistema besir\u016bpinan\u010di\u0173 vartotoj\u0173 pam\u0117gti cheminiai priedai","bg":["slide-1.png"],"text":"<ul><li><p>Skland\u017eiam ir ekonomi\u0161kam sistemos veikimui<\/p><\/li><li><p>Efektyviam \u0161ildymo problem\u0173 sprendimui<\/p><\/li><\/ul>","button":{"type":"page","value":"contacts","text":"Daugiau informacijos"}},{"title":"Priedai sistem\u0173 apsaugai nuo korozijos, purvo, nuovir\u0173","bg":["slide-2.png"],"text":"<ul><li><p>Tausoja katil\u0105, \u0161ilumos siurbl\u012f, kitus \u0161ildymo prietaisus ir \u012frang\u0105.<\/p><\/li><li><p>I\u0161lieka veiksmingi vis\u0105 sistemos eksploatavimo laik\u0105.<\/p><\/li><\/ul>","button":{"type":"page","value":"contacts","text":"Daugiau informacijos"}},{"title":"Magnetiniai \u2013 cikloniniai \u0161ildymo sistemos filtrai","bg":["slide-3.png"],"text":"<ul><li><p>Padeda pasi\u0161alinti \u012fvairiems ter\u0161alams ir apsaugo \u0161ildymo prietaisus nuo pirmalaikio gedimo.<\/p><\/li><li><p>Modeliai kompakti\u0161kai ir erdviai katilinei.<\/p><\/li><li><p>10 \u2013 25 met\u0173 garantija.<\/p><\/li><\/ul>","button":{"type":"page","value":"products","text":"Daugiau informacijos"}},{"title":"\u0160ildymo ir \u0161aldymo sistem\u0173 plovikliai","bg":["slide-4.png"],"text":"<ul><li><p>I\u0161tirpdo \u012fvairias nuos\u0117das.<\/p><\/li><li><p>R\u016bg\u0161tiniai ir ner\u016bg\u0161tiniai.<\/p><\/li><li><p>Tinkami visiems praplovimo \u012frenginiams.<\/p><\/li><\/ul>","button":{"type":"page","text":"Daugiau informacijos","value":"products\/plovikliai"}},{"title":"\u0160ildymo sistemos prot\u0117ki\u0173 hermetizavimo priemon\u0117s","bg":["slide-5.png"],"text":"<ul><li><p>Efektyvi pagalba atsiradus mikro prot\u0117kiams.<\/p><\/li><li><p>Suveikia per kelias valandas, nereikia i\u0161leisti sistemos.<\/p><\/li><\/ul>","button":{"type":"page","value":"products\/protekiu-hermetikai","text":"Per\u017ei\u016br\u0117ti produktus"}}],"height":"h-medium","align":"align-left","valign":"valign-middle","paddingleft":null,"paddingright":52,"overlaytoggle":"false","overlaytop":"#FFFFFF","overlaybottom":"#FFFFFF","overlaymobiletoggle":"true","overlaymobiletop":"#FFFFFF82","overlaymobilebottom":"#FFFFFF82","bgmobileposition":null,"titlestyle":"hstyle2","titlecolor":"#0F4C87","tagtitle":"h2","textstyle":"tstyle2","textcolor":"#0F4C87","fontsizea":"","fontweighta":""},"id":"38e799f7-99b2-41e9-8f9c-a046b8f807f6","isHidden":false,"type":"slides"}],"id":"3f93abad-add6-4267-bc58-268f7ee5f18b","width":"1\/1"}],"id":"c35e77bb-e370-4c26-b104-8be6c1184d04"},{"attrs":{"width":"container","mbottom":"false","bgcolor":"colorLightGray"},"columns":[{"blocks":[{"content":{"level":"h2","text":"FERNOX \u2014\u00a0jau 57 metus \u2014\u00a0patikimi, profesional\u016bs, aplink\u0105 tausojantys sprendimai j\u016bs\u0173 \u0161ildymo sistemai","halign":"","styleheading":"hstyle2"},"id":"7733aede-1e08-444e-93d7-0b42b0a04924","isHidden":false,"type":"heading"},{"content":{"features":[{"icon":["network.svg"],"title":"Fernox pasaulyje","text":"<p>Fernox - pasaulyje gerai \u017einomas vandens prie\u017ei\u016bros chemikal\u0173 ir \u00a0\u012frangos gamintojas. \u012emon\u0117 prad\u0117jo veikl\u0105 1964 metais Jungtin\u0117je Karalyst\u0117je, o \u0161iandien Fernox produktais prekiaujama daugiau nei 40-yje \u0161ali\u0173 Europoje, Amerikoje ir Azijoje.<\/p>"},{"icon":["science-molecule-strucutre.svg"],"title":"Produkt\u0173 panaudojimas","text":"<p>Fernox produktai skirti tiek priva\u010di\u0173, tiek pramonini\u0173 \u0161ildymo ir \u0161aldymo sistem\u0173 vandens paruo\u0161imui, vamzdyn\u0173 ir \u012frengini\u0173 praplovimui ir nukalkinimui, prot\u0117ki\u0173 hermetizavimui, apsaugai nuo u\u017e\u0161alimo. Platus magnetini\u0173\/ciklonini\u0173 \u0161ildymo sistemos filtr\u0173 pasirinkimas. Sukurta atskira produkt\u0173 grup\u0117 atsinaujinan\u010di\u0173 i\u0161tekli\u0173 \u0161ildymo sistem\u0173 prie\u017ei\u016brai.<\/p>"},{"icon":["file-iso.svg"],"title":"Kokyb\u0117","text":"<p>\u012emon\u0117 nuolat investuoja \u012f\u00a0mokslinius tyrimus ir ie\u0161ko novatori\u0161k\u0173 bei\u00a0aplink\u0105 tausojan\u010di\u0173 sprendim\u0173. Gymybos procese \u012fdiegti ISO 9001, ISO 14001, ISO\/TS 16949 ir OHSAS 18001 standartai. Produktai sertifikuoti \u012fvairiose nepriklausomose ES laboratorijose.<\/p>"}],"cols":4},"id":"1a1daaff-8bf8-4d04-9c13-6210f20843ba","isHidden":false,"type":"features"}],"id":"3a59ae10-4149-48a9-bd38-16bbfd0179b2","width":"1\/1"}],"id":"12fbd5d0-4d90-48f6-8ce4-c221e29e2f79"},{"attrs":{"width":"container","mbottom":"false","bgcolor":"colorLightGray"},"columns":[{"blocks":[{"content":{"level":"h2","text":"Norite prad\u0117ti naudotis m\u016bs\u0173 produktais?","halign":"left","styleheading":""},"id":"8134a98d-16d6-4ce5-ac11-3f42f901c194","isHidden":false,"type":"heading"}],"id":"a47bdbd3-2ab3-4793-9e11-4a2eb60e08ef","width":"1\/1"}],"id":"05e639e9-3dde-4387-a197-128c0578e341"},{"attrs":{"width":"container","mbottom":"false","bgcolor":"colorLightGray"},"columns":[{"blocks":[{"content":{"level":"h3","text":"Skambinkite:\u00a0+370 685 06 811","halign":"left","styleheading":""},"id":"fe43f682-998e-4359-a9a3-e60a931437a7","isHidden":false,"type":"heading"},{"content":{"text":"<p><strong>Atstovas Lietuvoje UAB \"Sandomus\"<\/strong><\/p><p>\u012emon\u0117s kodas: 302594355<\/p><p>PVM kodas: LT100005964216<\/p><p>Mobilus: +370 685 06 811<\/p><p>El. pa\u0161tas: info@fernox.lt<\/p><p><\/p><p>Adresas: S. \u017dukausko g. 32-7, Vilnius<\/p>"},"id":"d89ce57d-1adf-4f4a-b082-1abebdc99b7d","isHidden":false,"type":"text"}],"id":"183a9a54-8723-4a29-8dec-4fac60f1d96b","width":"1\/2"},{"blocks":[{"content":{"title":"","name":"Vardas \/ \u012emon\u0117","email":"El. pa\u0161tas","phone":"Telefonas","message":"Papildoma informacija","buttontext":"Si\u0173sti"},"id":"2e8c1a2e-51de-4c5b-bcd1-2c50ece48df8","isHidden":false,"type":"contactform"}],"id":"99b084c3-8c9b-4999-a444-f7a46820ffb5","width":"1\/2"}],"id":"90df7d47-2761-4c74-a36e-4fcb0a641f6e"}]

Kirby 3.6.0-beta.3

The text was updated successfully, but these errors were encountered:

afbora · 2021-10-12T22:00:01Z

I noticed that the Sane class convert spaces in unicode 🤷‍♂️ My brain is burned with Sane/Dom/Parsley classes 🤯

bastianallgeier · 2021-10-18T14:12:24Z

@lukasbestle do you have an idea why spaces are converted?

lukasbestle · 2021-10-18T14:21:22Z

The conversion most likely happens inside libxml (PHP DOM classes) between parsing and conversion back to a string.

Is the converted result incorrect? Or is it just about the unsaved changes loop?

distantnative · 2021-10-31T16:08:30Z

Sounds to me like our setup (e.g. the Dom class) has issues with encodings - like it receives A in encoding X, processes it and spits out B in encoding Y. This gets passed to the Panel and browser, where automatically it converts somehow to A in X again which is compared to B in Y - it differs, so we have the changes bar. And starting from top endlessly.

I don't know the Dom class well, but I see some transformations and assumptions about very specific encodings:
https://github.com/getkirby/kirby/blob/develop/src/Toolkit/Dom.php#L73-L75

E.g. what irritates me there is that the comment says ISO-8859-1 is need, but the conversion is actually to HTML-ENTITIES.

Maybe the cause is located somewhere around there?

lukasbestle · 2021-10-31T16:48:20Z

E.g. what irritates me there is that the comment says ISO-8859-1 is need, but the conversion is actually to HTML-ENTITIES.

libxml parses the HTML as ISO-8859-1 unless there is a <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> tag in the input HTML (the new shorter HTML5 variant of this is not supported). So unless we inject such a meta tag and remove it manually again, we need to input code in ISO-8859-1 to the loadHTML() method. And that is done by converting everything to entities. As the entity names only use ASCII characters, the entities are valid in ISO-8859-1 and therefore parsed correctly. If we converted directly to ISO-8859-1 instead of to HTML-ENTITIES, every UTF-8 char without an equivalent would be converted to a question mark, so content would get lost.

The "inject a meta tag" route would probably be the "better" solution, but also be a hack and I'm not sure if it will cause other issues, e.g. if there's already a meta tag in the input. So I decided for entity escaping as it's also suggested in the comments in the PHP docs.

@afbora I don't fully understand which part of your test case causes which issue. Could you convert your test case into a test that directly uses the Dom class? I.e. an HTML string that is as short as possible but exhibits the wrong behavior when parsed and printed by the Dom class.

nilshoerrmann · 2021-10-31T17:41:55Z

If you want to load UTF-8 content, this has worked for us pretty well over the years:

$document->loadHTML('<?xml encoding="UTF-8">' . $html);

This is how we use it:
https://github.com/hananils/kirby-tree-methods/blob/09d25aae7fadb403984c02282349b387299b4ea9/lib/tree.php#L39

lukasbestle · 2021-10-31T20:07:27Z

Oh, that looks pretty nice. Thanks, I'll try it out!

afbora · 2021-10-31T20:08:37Z

@lukasbestle You can test following text on writer field:

FERNOX — jau 57 metus — patikimi, profesionalūs, aplinką tausojantys sprendimai jūsų šildymo sistemai

afbora · 2021-10-31T20:10:08Z

@lukasbestle FYI, also when you remove the Sane::sanitize(), works great!

https://github.com/getkirby/kirby/blob/develop/config/fields/writer.php#L32

afbora · 2021-10-31T20:19:53Z

If I was able to get it right, @nilshoerrmann's suggestion didn't work :((

afbora · 2021-10-31T20:24:09Z

Saved without sane sanitize and no stucks ✅

<p>FERNOX —&nbsp;jau 57 metus —&nbsp;patikimi, profesionalūs, aplinką tausojantys sprendimai jūsų šildymo sistemai</p>

Saved with sane sanitize and stuck 💥

<p>FERNOX — jau 57 metus — patikimi, profesionalūs, aplinką tausojantys sprendimai jūsų šildymo sistemai</p>

lukasbestle · 2021-10-31T20:54:55Z

I tried several options with the following result (simple test with just new Dom($string)->toString(); input = expected result):

$this->doc->loadHTML($code)

--- Expected
+++ Actual
@@ @@
-'<html><body><p>TEST — jūsų šildymo sistemai</p></body></html>
+'<html><body><p>TEST &acirc;&#128;&#148;&Acirc;&nbsp;j&Aring;&laquo;s&Aring;&sup3; &Aring;&iexcl;ildymo sistemai</p></body></html>

❌ Input is parsed as ISO-8859-1, which completely destroys the Unicode multibyte sequences.

$this->doc->loadHTML(mb_convert_encoding($code, 'HTML-ENTITIES', 'UTF-8'))

--- Expected
+++ Actual
@@ @@
-'<html><body><p>TEST — jūsų šildymo sistemai</p></body></html>
+'<html><body><p>TEST &mdash;&nbsp;j&#363;s&#371; &scaron;ildymo sistemai</p></body></html>

❌ As the input is encoded as entities, the output also is encoded as entities.

$this->doc->loadHTML('<?xml encoding="UTF-8">' . $code)

--- Expected
+++ Actual
@@ @@
-'<html><body><p>TEST — jūsų šildymo sistemai</p></body></html>
+'<?xml encoding="UTF-8"><html><body><p>TEST &mdash;&nbsp;j&#363;s&#371; &scaron;ildymo sistemai</p></body></html>

❌ The input doesn't contain entities, but weirdly libxml still converts everything to entities on output. Seems like it exports as ISO-8859-1 even though the XML declaration says the document is UTF-8.

$this->doc->loadHTML('<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $code)

--- Expected
+++ Actual
@@ @@
-'<html><body><p>TEST — jūsų šildymo sistemai</p></body></html>
+'<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body><p>TEST — jūsų šildymo sistemai</p></body></html>

✅ Promising, but we need to remove the added meta tag without removing any other tag that was already in the input.

lukasbestle · 2021-10-31T21:06:51Z

I'm onto something here. Let's hope it will work. :)

Modifications were needed in two different places because: 1. `libxml` needs to be told to parse the input as UTF-8. A `<meta>` tag would create an empty `<head>` and converting everything to entities is a hack. The XML declaration works great and is the easiest to remove right afterwards. 2. If the `<meta>` tag is used and kept in the document, additional processing (e.g. in `$dom->sanitize()`) will think that it's part of the input. So everything we add needs to be removed immediately after parsing. 3. On output we need the `<meta>` tag again as it's the only way to avoid an export as `ISO-8859-1` with entities. Fixes #3798.

afbora · 2021-11-01T20:36:24Z

I think the issue is that the writer field's handling of the value/text conflicts with that of the sane class. But I couldn't figure out.

bastianallgeier · 2021-11-02T20:27:12Z

Ok, I'll try to explain the underlying problem a bit to keep track of it here.

Form changes

As soon as content gets loaded from the server to fill the form, a copy of it is stored in Vuex as original form values.

Whenever a form input updates, the updated value is also stored in Vuex in a second "changes" object. The original values object and the changes object are then compared against each other to find out of something has changed. In this case, the orange bar shows up.

This is the basic logic behind the changes bar.

Writer

The Writer initializes ProseMirror under the hood. ProseMirror receives the HTML from the server and turns it into its own JSON document format. In order to create this format, it has to be fed with all kinds of rules about allowed block level elements and inline elements. It will automatically do all the hard work and filter out unwanted shit in the HTML – kill attributes, styles, unwanted tags, etc. It is very strict when performing this task.

That JSON document format can then be turned back into clean HTML again. But that HTML might differ from the original HTML the Writer received. For example, when you give the Writer this:

<p>Some&nbsp;<b>Text</b></p>

The output after the clean-up would be:

<p>Some <strong>Text</strong></p>

This is actually really nice, but it also leads to a big problem.

When the form is loaded, the original value in Vuex would be the string from the server. We already clean up the HTML there with our Sane class and the clean up process is already very very close to what ProseMirror is doing, but it will never be 100% the same unless we'd be able to let ProseMirror run on the server, which is unfortunately not possible.

So the original already differs from the initial value once ProseMirror has been initalized. ProseMirror therefor already sends an input event and the changes object contains a value that's no longer the same than the original. That's why the changes bar appears immediately and can also never be fully removed.

bastianallgeier · 2021-11-02T20:57:53Z

After writing this all down, I might have found a solution in the PR above.

bastianallgeier · 2021-11-03T09:37:11Z

✅

afbora added type: bug 🐛 labels Oct 12, 2021

afbora added this to the 3.6.0-beta.4 milestone Oct 12, 2021

afbora assigned bastianallgeier Oct 12, 2021

bastianallgeier changed the title ~~[3.6.0-beta.3] Stuck in an unsaved changes loop~~ Stuck in an unsaved changes loop Oct 15, 2021

bastianallgeier modified the milestones: 3.6.0-rc.1, 3.6.0-rc.2 Oct 19, 2021

bastianallgeier modified the milestones: 3.6.0-rc.2, 3.6.0-rc.3 Oct 27, 2021

lukasbestle assigned lukasbestle and unassigned bastianallgeier Oct 31, 2021

lukasbestle mentioned this issue Oct 31, 2021

Dom: Proper handling for Unicode in HTML and XML documents; more consistent export #3887

Merged

5 tasks

lukasbestle linked a pull request Oct 31, 2021 that will close this issue

Dom: Proper handling for Unicode in HTML and XML documents; more consistent export #3887

Merged

5 tasks

lukasbestle removed their assignment Nov 1, 2021

bastianallgeier modified the milestones: 3.6.0-rc.3, 3.6.0-rc.4 Nov 2, 2021

bastianallgeier mentioned this issue Nov 2, 2021

Compare JSON docs to avoid unsaved changes loop #3899

Merged

afbora linked a pull request Nov 2, 2021 that will close this issue

Compare JSON docs to avoid unsaved changes loop #3899

Merged

bastianallgeier closed this as completed Nov 3, 2021

afbora mentioned this issue Aug 5, 2022

First steps code splitting with Vite 3 #4504

Merged

5 tasks

igregson mentioned this issue Feb 22, 2024

Writer: non-breaking spaces cause endless "Revert" state #6285

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stuck in an unsaved changes loop #3798

Stuck in an unsaved changes loop #3798

afbora commented Oct 12, 2021

afbora commented Oct 12, 2021

bastianallgeier commented Oct 18, 2021

lukasbestle commented Oct 18, 2021

distantnative commented Oct 31, 2021 •

edited

Loading

lukasbestle commented Oct 31, 2021 •

edited

Loading

nilshoerrmann commented Oct 31, 2021

lukasbestle commented Oct 31, 2021

afbora commented Oct 31, 2021

afbora commented Oct 31, 2021

afbora commented Oct 31, 2021 •

edited

Loading

afbora commented Oct 31, 2021 •

edited

Loading

lukasbestle commented Oct 31, 2021

lukasbestle commented Oct 31, 2021

afbora commented Nov 1, 2021

bastianallgeier commented Nov 2, 2021

bastianallgeier commented Nov 2, 2021

bastianallgeier commented Nov 3, 2021

Stuck in an unsaved changes loop #3798

Stuck in an unsaved changes loop #3798

Comments

afbora commented Oct 12, 2021

afbora commented Oct 12, 2021

bastianallgeier commented Oct 18, 2021

lukasbestle commented Oct 18, 2021

distantnative commented Oct 31, 2021 • edited Loading

lukasbestle commented Oct 31, 2021 • edited Loading

nilshoerrmann commented Oct 31, 2021

lukasbestle commented Oct 31, 2021

afbora commented Oct 31, 2021

afbora commented Oct 31, 2021

afbora commented Oct 31, 2021 • edited Loading

afbora commented Oct 31, 2021 • edited Loading

lukasbestle commented Oct 31, 2021

lukasbestle commented Oct 31, 2021

afbora commented Nov 1, 2021

bastianallgeier commented Nov 2, 2021

Form changes

Writer

bastianallgeier commented Nov 2, 2021

bastianallgeier commented Nov 3, 2021

distantnative commented Oct 31, 2021 •

edited

Loading

lukasbestle commented Oct 31, 2021 •

edited

Loading

afbora commented Oct 31, 2021 •

edited

Loading

afbora commented Oct 31, 2021 •

edited

Loading