Blocks: Remember raw source block for invalid blocks. #38923

dmsnell · 2022-02-18T23:15:56Z

Status

While this PR looks like it introduces a property that doesn't get much
use (source) I'd like to go ahead and merge it in to get the most basic
fix (invalid block previews are notoriously broken) while setting the stage
for the more thorough fixes to come later.

Everything that relies on originalContent and most of what relies on
innerHTML is broken but fixing those issues is so expansive that I cannot
focus on them in one mega PR. Thankfully, because they are all separate
problems I do believe that we can fix them piecemeal and let "fixes one
thing" take priority over "fixes everything."

Description

Part of #38922
Replaces #38922

When the editor is unable to validate a block it should preserve the
broken content in the post and show an accurate representation of that
underlying markup in the absence of being able to interact with it.

Currently when showing a preview of an invalid block in the editor we
attempt to re-generate the save output for a block given the attributes
we originally parsed. This is a flawed approach, however, because by
the nature of being invalid we know that there is a problem with those
attributes as they are.

In this patch we're introducing the source attribute on a block which
only exists for invalid blocks at the time of this patch. That source
carries the original un-processed data for a block node and can be used
to reconstruct the original markup without using garbage data and without
inadvertently changing it through the series of autofixes, deprecations,
and the like that happen during normal block loading.

The noticable change is in block-list/block where we will be showing
that reconstruction rather than the re-generated block content. Previously
it was the case that the preview might represent a corrupted version of the
block or show the block as if emptied of all its content. Now, however,
the preview sould accurately reflect the HTML in the source post even
when it's invalid or unrecognized according to the editor.

Further work should take advantage of the source property to provide
a more consistent and trusting experience for working with unrecognized
content.

Testing Instructions

To test this you will want to compare the block preview for invalid
blocks in the editor. This change limits itself to that preview and
leaves other broken UI bits broken, to be addressed in follow-up work
for the sake of focus during review and testing.

Two primary cases to asses are when the block attributes fail to
properly parse and when there are inner blocks involved.

Failed attribute parsing

We've removed the <ul> tag from the following list. The three
bullet points render fine as HTML but are hidden in the editor
because it loads an empty string for the values attribute, since
it was unable to find LI content inside a UL as the attribute
sourcing describes it should.

<!-- wp:list -->
<li>one</li><li>two</li><li>three</li>
<!-- /wp:list -->

`trunk`	this branch

Inner blocks on invalid block

Because inner blocks are ignored by the existing originalContent attribute they will disappear in the preview for invalid blocks even if they are themselves valid.

<!-- wp:group -->
<div class="wp-block-group"><!-- wp:columns -->
<div class="wp-block-columns"><!-- wp:column -->
<div class="wp-block-column"></div>
<!-- /wp:column -->

<!-- wp:column -->
<div class="wp-block-column"><!-- wp:group {"layout":{"type":"flerx","allowOrientation":false}} -->
<div class="wp-block-group"><!-- wp:paragraph -->
<p>Nested!</p>
<!-- /wp:paragraph --></div>
<!-- /wp:group --></div>
<!-- /wp:column --></div>
<!-- /wp:columns --></div>
<!-- /wp:group -->

`trunk`	this branch

Types of changes

Introduces a new property on a loaded block which refers to the
un-processed block data when it was originally loaded and which
is used to better represent the unrecognized content in the editor.

Checklist:

My code is tested.
My code follows the WordPress code style.
My code follows the accessibility standards.
I've tested my changes with keyboard and screen readers.
My code has proper inline documentation.
I've included developer documentation if appropriate.
I've updated all React Native files affected by any refactorings/renamings in this PR (please manually search all *.native.js files for terms that need renaming or removal).
I've updated related schemas if appropriate.

github-actions · 2022-02-18T23:23:44Z

Size Change: +55 B (0%)

Total Size: 1.16 MB

Filename	Size	Change
`build/block-editor/index.min.js`	144 kB	+34 B (0%)
`build/blocks/index.min.js`	46.4 kB	+21 B (0%)

ℹ️ View Unchanged

Filename	Size
`build/a11y/index.min.js`	993 B
`build/admin-manifest/index.min.js`	1.24 kB
`build/annotations/index.min.js`	2.77 kB
`build/api-fetch/index.min.js`	2.27 kB
`build/autop/index.min.js`	2.15 kB
`build/blob/index.min.js`	487 B
`build/block-directory/index.min.js`	6.49 kB
`build/block-directory/style-rtl.css`	1.01 kB
`build/block-directory/style.css`	1.01 kB
`build/block-editor/default-editor-styles-rtl.css`	378 B
`build/block-editor/default-editor-styles.css`	378 B
`build/block-editor/style-rtl.css`	14.8 kB
`build/block-editor/style.css`	14.8 kB
`build/block-library/blocks/archives/editor-rtl.css`	61 B
`build/block-library/blocks/archives/editor.css`	60 B
`build/block-library/blocks/archives/style-rtl.css`	65 B
`build/block-library/blocks/archives/style.css`	65 B
`build/block-library/blocks/audio/editor-rtl.css`	150 B
`build/block-library/blocks/audio/editor.css`	150 B
`build/block-library/blocks/audio/style-rtl.css`	111 B
`build/block-library/blocks/audio/style.css`	111 B
`build/block-library/blocks/audio/theme-rtl.css`	125 B
`build/block-library/blocks/audio/theme.css`	125 B
`build/block-library/blocks/block/editor-rtl.css`	161 B
`build/block-library/blocks/block/editor.css`	161 B
`build/block-library/blocks/button/editor-rtl.css`	445 B
`build/block-library/blocks/button/editor.css`	445 B
`build/block-library/blocks/button/style-rtl.css`	560 B
`build/block-library/blocks/button/style.css`	560 B
`build/block-library/blocks/buttons/editor-rtl.css`	292 B
`build/block-library/blocks/buttons/editor.css`	292 B
`build/block-library/blocks/buttons/style-rtl.css`	275 B
`build/block-library/blocks/buttons/style.css`	275 B
`build/block-library/blocks/calendar/style-rtl.css`	207 B
`build/block-library/blocks/calendar/style.css`	207 B
`build/block-library/blocks/categories/editor-rtl.css`	84 B
`build/block-library/blocks/categories/editor.css`	83 B
`build/block-library/blocks/categories/style-rtl.css`	79 B
`build/block-library/blocks/categories/style.css`	79 B
`build/block-library/blocks/code/style-rtl.css`	103 B
`build/block-library/blocks/code/style.css`	103 B
`build/block-library/blocks/code/theme-rtl.css`	124 B
`build/block-library/blocks/code/theme.css`	124 B
`build/block-library/blocks/columns/editor-rtl.css`	108 B
`build/block-library/blocks/columns/editor.css`	108 B
`build/block-library/blocks/columns/style-rtl.css`	406 B
`build/block-library/blocks/columns/style.css`	406 B
`build/block-library/blocks/comment-author-avatar/editor-rtl.css`	125 B
`build/block-library/blocks/comment-author-avatar/editor.css`	125 B
`build/block-library/blocks/comment-template/style-rtl.css`	127 B
`build/block-library/blocks/comment-template/style.css`	127 B
`build/block-library/blocks/comments-pagination-numbers/editor-rtl.css`	123 B
`build/block-library/blocks/comments-pagination-numbers/editor.css`	121 B
`build/block-library/blocks/comments-pagination/editor-rtl.css`	222 B
`build/block-library/blocks/comments-pagination/editor.css`	209 B
`build/block-library/blocks/comments-pagination/style-rtl.css`	235 B
`build/block-library/blocks/comments-pagination/style.css`	231 B
`build/block-library/blocks/comments-query-loop/editor-rtl.css`	95 B
`build/block-library/blocks/comments-query-loop/editor.css`	95 B
`build/block-library/blocks/cover/editor-rtl.css`	546 B
`build/block-library/blocks/cover/editor.css`	547 B
`build/block-library/blocks/cover/style-rtl.css`	1.56 kB
`build/block-library/blocks/cover/style.css`	1.56 kB
`build/block-library/blocks/embed/editor-rtl.css`	293 B
`build/block-library/blocks/embed/editor.css`	293 B
`build/block-library/blocks/embed/style-rtl.css`	417 B
`build/block-library/blocks/embed/style.css`	417 B
`build/block-library/blocks/embed/theme-rtl.css`	124 B
`build/block-library/blocks/embed/theme.css`	124 B
`build/block-library/blocks/file/editor-rtl.css`	300 B
`build/block-library/blocks/file/editor.css`	300 B
`build/block-library/blocks/file/style-rtl.css`	255 B
`build/block-library/blocks/file/style.css`	255 B
`build/block-library/blocks/file/view.min.js`	353 B
`build/block-library/blocks/freeform/editor-rtl.css`	2.44 kB
`build/block-library/blocks/freeform/editor.css`	2.44 kB
`build/block-library/blocks/gallery/editor-rtl.css`	965 B
`build/block-library/blocks/gallery/editor.css`	967 B
`build/block-library/blocks/gallery/style-rtl.css`	1.61 kB
`build/block-library/blocks/gallery/style.css`	1.61 kB
`build/block-library/blocks/gallery/theme-rtl.css`	122 B
`build/block-library/blocks/gallery/theme.css`	122 B
`build/block-library/blocks/group/editor-rtl.css`	159 B
`build/block-library/blocks/group/editor.css`	159 B
`build/block-library/blocks/group/style-rtl.css`	57 B
`build/block-library/blocks/group/style.css`	57 B
`build/block-library/blocks/group/theme-rtl.css`	78 B
`build/block-library/blocks/group/theme.css`	78 B
`build/block-library/blocks/heading/style-rtl.css`	114 B
`build/block-library/blocks/heading/style.css`	114 B
`build/block-library/blocks/html/editor-rtl.css`	332 B
`build/block-library/blocks/html/editor.css`	333 B
`build/block-library/blocks/image/editor-rtl.css`	731 B
`build/block-library/blocks/image/editor.css`	730 B
`build/block-library/blocks/image/style-rtl.css`	529 B
`build/block-library/blocks/image/style.css`	535 B
`build/block-library/blocks/image/theme-rtl.css`	124 B
`build/block-library/blocks/image/theme.css`	124 B
`build/block-library/blocks/latest-comments/style-rtl.css`	284 B
`build/block-library/blocks/latest-comments/style.css`	284 B
`build/block-library/blocks/latest-posts/editor-rtl.css`	199 B
`build/block-library/blocks/latest-posts/editor.css`	198 B
`build/block-library/blocks/latest-posts/style-rtl.css`	447 B
`build/block-library/blocks/latest-posts/style.css`	446 B
`build/block-library/blocks/list/style-rtl.css`	94 B
`build/block-library/blocks/list/style.css`	94 B
`build/block-library/blocks/media-text/editor-rtl.css`	266 B
`build/block-library/blocks/media-text/editor.css`	263 B
`build/block-library/blocks/media-text/style-rtl.css`	493 B
`build/block-library/blocks/media-text/style.css`	490 B
`build/block-library/blocks/more/editor-rtl.css`	431 B
`build/block-library/blocks/more/editor.css`	431 B
`build/block-library/blocks/navigation-link/editor-rtl.css`	708 B
`build/block-library/blocks/navigation-link/editor.css`	706 B
`build/block-library/blocks/navigation-link/style-rtl.css`	94 B
`build/block-library/blocks/navigation-link/style.css`	94 B
`build/block-library/blocks/navigation-submenu/editor-rtl.css`	299 B
`build/block-library/blocks/navigation-submenu/editor.css`	299 B
`build/block-library/blocks/navigation-submenu/view.min.js`	375 B
`build/block-library/blocks/navigation/editor-rtl.css`	2.03 kB
`build/block-library/blocks/navigation/editor.css`	2.04 kB
`build/block-library/blocks/navigation/style-rtl.css`	1.89 kB
`build/block-library/blocks/navigation/style.css`	1.88 kB
`build/block-library/blocks/navigation/view.min.js`	2.85 kB
`build/block-library/blocks/nextpage/editor-rtl.css`	395 B
`build/block-library/blocks/nextpage/editor.css`	395 B
`build/block-library/blocks/page-list/editor-rtl.css`	363 B
`build/block-library/blocks/page-list/editor.css`	363 B
`build/block-library/blocks/page-list/style-rtl.css`	175 B
`build/block-library/blocks/page-list/style.css`	175 B
`build/block-library/blocks/paragraph/editor-rtl.css`	157 B
`build/block-library/blocks/paragraph/editor.css`	157 B
`build/block-library/blocks/paragraph/style-rtl.css`	273 B
`build/block-library/blocks/paragraph/style.css`	273 B
`build/block-library/blocks/post-author/style-rtl.css`	175 B
`build/block-library/blocks/post-author/style.css`	176 B
`build/block-library/blocks/post-comments-form/style-rtl.css`	446 B
`build/block-library/blocks/post-comments-form/style.css`	446 B
`build/block-library/blocks/post-comments/style-rtl.css`	521 B
`build/block-library/blocks/post-comments/style.css`	521 B
`build/block-library/blocks/post-excerpt/editor-rtl.css`	73 B
`build/block-library/blocks/post-excerpt/editor.css`	73 B
`build/block-library/blocks/post-excerpt/style-rtl.css`	69 B
`build/block-library/blocks/post-excerpt/style.css`	69 B
`build/block-library/blocks/post-featured-image/editor-rtl.css`	721 B
`build/block-library/blocks/post-featured-image/editor.css`	721 B
`build/block-library/blocks/post-featured-image/style-rtl.css`	153 B
`build/block-library/blocks/post-featured-image/style.css`	153 B
`build/block-library/blocks/post-template/editor-rtl.css`	99 B
`build/block-library/blocks/post-template/editor.css`	98 B
`build/block-library/blocks/post-template/style-rtl.css`	323 B
`build/block-library/blocks/post-template/style.css`	323 B
`build/block-library/blocks/post-terms/style-rtl.css`	73 B
`build/block-library/blocks/post-terms/style.css`	73 B
`build/block-library/blocks/post-title/style-rtl.css`	80 B
`build/block-library/blocks/post-title/style.css`	80 B
`build/block-library/blocks/preformatted/style-rtl.css`	103 B
`build/block-library/blocks/preformatted/style.css`	103 B
`build/block-library/blocks/pullquote/editor-rtl.css`	198 B
`build/block-library/blocks/pullquote/editor.css`	198 B
`build/block-library/blocks/pullquote/style-rtl.css`	389 B
`build/block-library/blocks/pullquote/style.css`	388 B
`build/block-library/blocks/pullquote/theme-rtl.css`	167 B
`build/block-library/blocks/pullquote/theme.css`	167 B
`build/block-library/blocks/query-pagination-numbers/editor-rtl.css`	122 B
`build/block-library/blocks/query-pagination-numbers/editor.css`	121 B
`build/block-library/blocks/query-pagination/editor-rtl.css`	221 B
`build/block-library/blocks/query-pagination/editor.css`	211 B
`build/block-library/blocks/query-pagination/style-rtl.css`	234 B
`build/block-library/blocks/query-pagination/style.css`	231 B
`build/block-library/blocks/query/editor-rtl.css`	131 B
`build/block-library/blocks/query/editor.css`	132 B
`build/block-library/blocks/quote/style-rtl.css`	201 B
`build/block-library/blocks/quote/style.css`	201 B
`build/block-library/blocks/quote/theme-rtl.css`	223 B
`build/block-library/blocks/quote/theme.css`	226 B
`build/block-library/blocks/read-more/style-rtl.css`	132 B
`build/block-library/blocks/read-more/style.css`	132 B
`build/block-library/blocks/rss/editor-rtl.css`	202 B
`build/block-library/blocks/rss/editor.css`	204 B
`build/block-library/blocks/rss/style-rtl.css`	289 B
`build/block-library/blocks/rss/style.css`	288 B
`build/block-library/blocks/search/editor-rtl.css`	165 B
`build/block-library/blocks/search/editor.css`	165 B
`build/block-library/blocks/search/style-rtl.css`	397 B
`build/block-library/blocks/search/style.css`	398 B
`build/block-library/blocks/search/theme-rtl.css`	64 B
`build/block-library/blocks/search/theme.css`	64 B
`build/block-library/blocks/separator/editor-rtl.css`	99 B
`build/block-library/blocks/separator/editor.css`	99 B
`build/block-library/blocks/separator/style-rtl.css`	233 B
`build/block-library/blocks/separator/style.css`	233 B
`build/block-library/blocks/separator/theme-rtl.css`	172 B
`build/block-library/blocks/separator/theme.css`	172 B
`build/block-library/blocks/shortcode/editor-rtl.css`	474 B
`build/block-library/blocks/shortcode/editor.css`	474 B
`build/block-library/blocks/site-logo/editor-rtl.css`	744 B
`build/block-library/blocks/site-logo/editor.css`	744 B
`build/block-library/blocks/site-logo/style-rtl.css`	181 B
`build/block-library/blocks/site-logo/style.css`	181 B
`build/block-library/blocks/site-tagline/editor-rtl.css`	86 B
`build/block-library/blocks/site-tagline/editor.css`	86 B
`build/block-library/blocks/site-title/editor-rtl.css`	84 B
`build/block-library/blocks/site-title/editor.css`	84 B
`build/block-library/blocks/social-link/editor-rtl.css`	177 B
`build/block-library/blocks/social-link/editor.css`	177 B
`build/block-library/blocks/social-links/editor-rtl.css`	674 B
`build/block-library/blocks/social-links/editor.css`	673 B
`build/block-library/blocks/social-links/style-rtl.css`	1.37 kB
`build/block-library/blocks/social-links/style.css`	1.36 kB
`build/block-library/blocks/spacer/editor-rtl.css`	332 B
`build/block-library/blocks/spacer/editor.css`	332 B
`build/block-library/blocks/spacer/style-rtl.css`	48 B
`build/block-library/blocks/spacer/style.css`	48 B
`build/block-library/blocks/table/editor-rtl.css`	471 B
`build/block-library/blocks/table/editor.css`	472 B
`build/block-library/blocks/table/style-rtl.css`	481 B
`build/block-library/blocks/table/style.css`	481 B
`build/block-library/blocks/table/theme-rtl.css`	188 B
`build/block-library/blocks/table/theme.css`	188 B
`build/block-library/blocks/tag-cloud/style-rtl.css`	226 B
`build/block-library/blocks/tag-cloud/style.css`	227 B
`build/block-library/blocks/template-part/editor-rtl.css`	235 B
`build/block-library/blocks/template-part/editor.css`	235 B
`build/block-library/blocks/template-part/theme-rtl.css`	101 B
`build/block-library/blocks/template-part/theme.css`	101 B
`build/block-library/blocks/text-columns/editor-rtl.css`	95 B
`build/block-library/blocks/text-columns/editor.css`	95 B
`build/block-library/blocks/text-columns/style-rtl.css`	166 B
`build/block-library/blocks/text-columns/style.css`	166 B
`build/block-library/blocks/verse/style-rtl.css`	87 B
`build/block-library/blocks/verse/style.css`	87 B
`build/block-library/blocks/video/editor-rtl.css`	571 B
`build/block-library/blocks/video/editor.css`	572 B
`build/block-library/blocks/video/style-rtl.css`	173 B
`build/block-library/blocks/video/style.css`	173 B
`build/block-library/blocks/video/theme-rtl.css`	124 B
`build/block-library/blocks/video/theme.css`	124 B
`build/block-library/common-rtl.css`	934 B
`build/block-library/common.css`	932 B
`build/block-library/editor-rtl.css`	9.96 kB
`build/block-library/editor.css`	9.96 kB
`build/block-library/index.min.js`	168 kB
`build/block-library/reset-rtl.css`	474 B
`build/block-library/reset.css`	474 B
`build/block-library/style-rtl.css`	11.4 kB
`build/block-library/style.css`	11.4 kB
`build/block-library/theme-rtl.css`	665 B
`build/block-library/theme.css`	670 B
`build/block-serialization-default-parser/index.min.js`	1.12 kB
`build/block-serialization-spec-parser/index.min.js`	2.83 kB
`build/components/index.min.js`	217 kB
`build/components/style-rtl.css`	15.6 kB
`build/components/style.css`	15.6 kB
`build/compose/index.min.js`	11.2 kB
`build/core-data/index.min.js`	14.1 kB
`build/customize-widgets/index.min.js`	11.2 kB
`build/customize-widgets/style-rtl.css`	1.39 kB
`build/customize-widgets/style.css`	1.39 kB
`build/data-controls/index.min.js`	663 B
`build/data/index.min.js`	8.05 kB
`build/date/index.min.js`	31.9 kB
`build/deprecated/index.min.js`	518 B
`build/dom-ready/index.min.js`	336 B
`build/dom/index.min.js`	4.53 kB
`build/edit-navigation/index.min.js`	16.1 kB
`build/edit-navigation/style-rtl.css`	4.04 kB
`build/edit-navigation/style.css`	4.05 kB
`build/edit-post/classic-rtl.css`	546 B
`build/edit-post/classic.css`	547 B
`build/edit-post/index.min.js`	29.8 kB
`build/edit-post/style-rtl.css`	7.07 kB
`build/edit-post/style.css`	7.07 kB
`build/edit-site/index.min.js`	43.9 kB
`build/edit-site/style-rtl.css`	7.44 kB
`build/edit-site/style.css`	7.42 kB
`build/edit-widgets/index.min.js`	16.5 kB
`build/edit-widgets/style-rtl.css`	4.39 kB
`build/edit-widgets/style.css`	4.39 kB
`build/editor/index.min.js`	38.4 kB
`build/editor/style-rtl.css`	3.71 kB
`build/editor/style.css`	3.71 kB
`build/element/index.min.js`	4.29 kB
`build/escape-html/index.min.js`	548 B
`build/format-library/index.min.js`	6.62 kB
`build/format-library/style-rtl.css`	571 B
`build/format-library/style.css`	571 B
`build/hooks/index.min.js`	1.66 kB
`build/html-entities/index.min.js`	454 B
`build/i18n/index.min.js`	3.79 kB
`build/is-shallow-equal/index.min.js`	535 B
`build/keyboard-shortcuts/index.min.js`	1.83 kB
`build/keycodes/index.min.js`	1.41 kB
`build/list-reusable-blocks/index.min.js`	1.75 kB
`build/list-reusable-blocks/style-rtl.css`	838 B
`build/list-reusable-blocks/style.css`	838 B
`build/media-utils/index.min.js`	2.94 kB
`build/notices/index.min.js`	957 B
`build/nux/index.min.js`	2.12 kB
`build/nux/style-rtl.css`	751 B
`build/nux/style.css`	749 B
`build/plugins/index.min.js`	1.98 kB
`build/preferences/index.min.js`	1.2 kB
`build/primitives/index.min.js`	949 B
`build/priority-queue/index.min.js`	611 B
`build/react-i18n/index.min.js`	704 B
`build/react-refresh-entry/index.min.js`	8.44 kB
`build/react-refresh-runtime/index.min.js`	7.31 kB
`build/redux-routine/index.min.js`	2.69 kB
`build/reusable-blocks/index.min.js`	2.24 kB
`build/reusable-blocks/style-rtl.css`	256 B
`build/reusable-blocks/style.css`	256 B
`build/rich-text/index.min.js`	11.1 kB
`build/server-side-render/index.min.js`	1.61 kB
`build/shortcode/index.min.js`	1.52 kB
`build/token-list/index.min.js`	668 B
`build/url/index.min.js`	1.99 kB
`build/viewport/index.min.js`	1.08 kB
`build/warning/index.min.js`	280 B
`build/widgets/index.min.js`	7.21 kB
`build/widgets/style-rtl.css`	1.16 kB
`build/widgets/style.css`	1.16 kB
`build/wordcount/index.min.js`	1.07 kB

_{compressed-size-action}

dmsnell · 2022-02-23T21:08:11Z

cc: @youknowriad thanks for reviewing the first draft of this. this one I believe is ready for merging plus proposes a more stable API that will serve us better as we clean up all the places that corrupt or erase existing content.

youknowriad · 2022-02-24T10:40:14Z

packages/blocks/src/api/parser/index.js

+		// Preserve the original unprocessed version of the block
+		// that we received so that we can preserve it in its
+		// existing state when we save.
+		updatedBlock.source = rawBlock;


The impact here is that we're making the raw block a public API, something we never did in the client and may need deeper thoughts about impact.

cc @mtias @mcsf @WordPress/gutenberg-core

In what other places we will need the rawBlock? Could we just pass the serialized raw block here?

One thing I am not understanding is: how is this different from updatedBlock.originalContent? Indeed, this value is already available in BlockListBlock at props.block.originalContent.

my first attempts were to pass the serialized raw block but we actually use various forms of the original block when performing comparisons of an "invalid block" and various ways of resolving it. further we have various ways of resolving it, some involving splitting that block into multiple blocks.

originalContent is fundamentally flawed in that it only looks at the innerHTML property of the raw block (to the neglect of inner blocks) which is the cause for a host of data loss and corruption issues and is also one of the reasons we can't convert non-nested blocks to nested blocks without creating those opportunities for data loss. technically it's the combination of our reliance on originalContent, innerHTML, and the way we call getSaveContent(…) to re-generate markup for invalid blocks; this PR is one step in undoing that so that we can preserve that unrecognized content and resolve issues in less risky ways. in other words the difference is that this is fix to a defect that originalContent introduced and unfortunately we can't fix originalContent without changing everything that depends on it.

I understand it's making this part of exposed surface of the block but we use this information, as we do with originalContent, in different places around the editor and I've personally concluded we have to carry it around if we want to avoid the corruption, data loss, and UX problems (such as demonstrated in the screenshot above where it looks like the list lost all its bullet points). Ultimately I would like to get rid of originalContent but as noted in #38922 (and in the PR description above) that's a much broader change and I couldn't get it in a comprehensive yet focused PR.

The only way to access rawBlocks on save if it's stored somewhere. Let's make the following distinction: It could be a part of the block object, or it could live outside of the block object.

I pondered a different variations of the latter, such as saving rawBlocks in a redux store, but I don't see a reliable way to retrieve them later. We would need a function such as ( block ) => lookupKey that returns the same result throughout the entire lifecycle of the block object. However, the clientId may change just like the block props and about any other detail. I don't think such function exists at the moment.

Generating a stable id like block.stableLifecycleId = uuid() would make it possible, but it would then require additional machinery like maintaining a lookup table, garbage collection, and such. It would buy us some memory, but the rawBlock would still become a public API.

And so, to me it seems like a sensible start. Perhaps we could call it __unstableSource instead of source to make the stakes lower in case it doesn't pan out somewhere down the road, but other than that I like it.

sorry I missed this comment @adamziel.

honestly I am fairly persuaded by the source/__unstableSource method, but I want to make sure we exhaust our search of other options that I may be less persuaded at initially.

in this discussion I think it's become even more clear that the block node itself is the place where this should live, and I'll rename to __unstableSource

scruffian · 2022-03-02T12:10:21Z

I gave this one a spin. The approach makes sense, but the results I saw don't match the description. It works for the list but not for the columns block:

dmsnell · 2022-03-02T17:21:16Z

Thanks for testing @scruffian. can you share the markup you used to break those columns? I recognized what's happening, but at least in my head it's a different problem. I believe that the block in the picture is throwing a JS error (can you check the console). If it's related to the new fallback serialization I bet I'm assuming a field is present that isn't.

scruffian · 2022-03-02T17:24:30Z

The markup I used was the markup you shared in the PR description for testing. I do see these JS errors:

…lue. While working on #38923 with invalid blocks it happened that a JavaScript error surfaced when working with a block that specified an incorrect layout type. In the case at hand the block asked for the `flerx` alignment instead of the `flex` alignment. It's reasonable to expect blocks to call for unrecognized layouts, especially if a plugin introduces new layouts and then blocks are loaded without that plugin's support. When `getLayoutType` is unable to find a recognized layout it returns `undefined` which causes numerous issues with calling code that expects a valid layout. In this patch we're guarding for those cases and if no known layout can be found for the specified layout type then we return the default layout instead. This introduces data corruption so it could be that this solution is more of a coverup than a fix. There is no way however to ignore that layout and preserve the behavior while still editing the block. On the other hand a block should not fail validation due to an implementation bug in the core editor and if this fix truly resolves the issue we should find invalidated blocks and prevent editing anyway (unfortunately this is not the case in this patch).

dmsnell · 2022-03-04T23:25:53Z

Thanks for your review and patience with me @scruffian; it took me a while to get back and investigate this. I think I did see this on my own when testing but I overlooked it because it wasn't related to the breakage that this PR addresses.

In fact that does already exist in trunk and appears to come from useAvailableAlignments and my invalid layout type of flerx. Either way it's a good point to also try and fix these cases but I don't think this patch will address that kind of failure and I think that's okay. That's a JS bug in the implementation of getLayoutType, specifically in how we assume that the layouts specified will exist. I've submitted #39232 to start addressing that particular bug, but would rather follow-up to this PR with a separate fix that displays the static preview in cases where JS errors break the block rendering.

dmsnell · 2022-03-04T23:34:35Z

Actually it's fairly trivial to go ahead and show the preview even with JS crashes but it does require that we always save a copy of source for a block even if not invalid. cc: @youknowriad - I think there may be a point where we want to consider just going for it and seeing if memory use is significantly impacted by that change.

Right now there's no way of telling at the parsing stage if a JS error will later hide a block preview. We can either show that preview anyway (and keep around a reference to the raw parsed block) or maintain the current behavior which is to hide everything in that block (and let the garbage collector get around to clearing up that now-unused reference - it's always located when parsing anyways).

youknowriad · 2022-03-08T16:58:10Z

I'm still unclear about how this approach is better than #38460

In terms of performance (memory), both add bytes to the block object, and if we're keeping the property for just invalid block (which I think is the right thing to do), I'd argue that preserving the string is probably better than preserving the objects (more data).

In terms of APIs, I think keeping the original HTML makes sense, keeping an intermediary block format that is confusing because we can end up with two different block formats in different parts of our code base. If that's limited to the parsing function, that's fine.

For the use-case of the JS error of a valid block, I don't think we should anything special there, the block author needs to be aware of the JS error to solve it, and I'm not sure that showing a preview is a good idea. In all cases, I still prefer to only add the property for invalid blocks to keep the impact small.

adamziel · 2022-03-09T18:32:17Z

I wonder if #39183 would help with the UI aspect of making the invalid blocks non-editable

dmsnell · 2022-03-09T23:29:50Z

Thanks @youknowriad for the review!

I'm still unclear about how this approach is better than #38460

The text-only approach led to multiple dead-ends particularly around resolving block errors. It's complicated to describe right now because so many pieces are broken (for example, we currently wipe out inner blocks when running certain block resolutions).

There's certain need in comparing while ignoring inner blocks, but only if we ignore them after re-generating their content. I did explore a number of options including passing the raw strings around and then running the parser on those snippets as needed, but it definitely wasn't that elegant and I feel like it exposed even more new API surface area.

Ultimately the reason it's not just better, but workable where #38460 isn't, is that we use the structural form of a broken block for various contexts where we'd end up needing to store multiple text copies of the block in memory if that's all we stored:

Just the original innerHTML
The entire raw text from the post
The entire raw text from the post except for the original top-most block comment delimiter
The array of inner blocks associated with that text, in text or parsed form.

Further now we have the recursive question: what to do about those inner blocks. We descend and store all these strings and inner blocks and then recurse in another way if any of those inner blocks also happens to be broken.

By storing the raw parsed content we get this:

The information is already there, we're holding onto allocated memory and deferring the processing of that until we need it, vs. allocating new strings and printing those raw blocks into raw strings.
A simpler raw source that raises fewer questions about what it should be. It's a block node as parsed, and in this case, does not particularly need to come from the text.
The more raw form of a block that we can use in many different ways from one chunk of data. We can build anything we need from the raw block node but it's not easy to build all the different forms from the text without redoing multiple stages of processing.

I'm still not sold on the performance argument without any measurement. Let's zoom out a bit and ask where the impact is likely to accrue: only in posts with broken blocks; only in posts with lots of broken blocks.

Put in context:

This shouldn't impact most posts at all.
Wherever this does have a performance impact, it will likely be because the post is very broken and there's a high chance the editor will be corrupting the document or erasing content without some safeguard.

So I view this as the cost of that safety. Is an object we already have worse than a string we have to generate and process? I don't know but I'm open to discuss any data we can find that speaks to it.

keeping an intermediary block format that is confusing because we can end up with two different block formats in different parts of our code base

I hear that but I think we can't avoid this. We'll carry around that string and have different forms of the string plus inner blocks if we go the text route.

Notable points of safety here:

If you clone a block you lose the source because it's only assigned in the parser.
If you create a block it won't have a source, essentially the source is fiat and there's no history of that block which could be broken now.
The source is not an attribute on the block; it's generally not accessible inside the block implementation (such as in edit). If someone does reach in there with our imperative APIs to get it, well, people already do all sorts of stuff and break things anyway so I don't see this being a special exception where we prevent people from doing that.

For the use-case of the JS error of a valid block, I don't think we should anything special there, the block author needs to be aware of the JS error to solve it, and I'm not sure that showing a preview is a good idea.

We don't have to swallow the error, but I'm curious why we wouldn't want to show a preview of the block. There's nothing wrong with the blocks in these cases and the markup is possibly valid. If we don't show a preview we uphold the current state of showing very different versions of the post in the editor vs. when rendered. Would be very easy to mess up and publish something you didn't mean to because large amounts of content was hidden inside a block whose JS threw.

I still prefer to only add the property for invalid blocks to keep the impact small.

Regardless this PR is only addressing the case for blocks that fail to validate during parse. I think eventually though we will have to decide if we want to preserve blocks whose implementations are buggy or let one error cascade into data loss. At that point we'll probably have to made a tradeoff between potential performance impact and trust and data security.

I wonder if #39183 would help with the UI aspect of making the invalid blocks non-editable

"invalid" blocks are currently non-editable. there are a couple of problems though revolving around the fact that corruption is introduced during the loading of blocks out of the parser (meaning the defects are introduced before locking) and also that for some "invalid" blocks the editor doesn't realize they're invalid because it ignores inner blocks and so it wouldn't have any reason to lock them either.

youknowriad · 2022-03-10T05:42:14Z

Thanks for expanding on your reasoning @dmsnell I feel like there's no perfect solution here, maybe the "source" is just something internal to the framework and not exposed to the user (at least for now) which would ease some of my concerns. In that sense, doing what @adamziel suggest could be a potential approach, aka __unstableSource or

We don't have to swallow the error, but I'm curious why we wouldn't want to show a preview of the block.

I might have misunderstood the solution proposed, I think the error should be prominent as the JS error can be anything and we shouldn't just replace it with the preview at the risk of confusing the user.

dmsnell · 2022-03-11T03:55:15Z

maybe the "source" is just something internal to the framework and not exposed to the user

One thing I thought about but didn't explore further is storing the source separately, either in a global or in the data stores. A block implementation could request it, or the block-list/block wrapper, and we could associate them by clientId.

The problem with that approach in my head was just the complexity it introduced in comparison to the more pragmatic approach of storing this data on the block node itself.

It doesn't seem unreasonable to me to communicate this source information through some global in the parser module, but then we'd have to be more careful about purging that store once a block is deleted, something we don't have to do when storing it on the block itself.

Happy to explore other approaches if you want me to. At least at this point I know how to start fitting it together once we track the source in whichever manner we do.

there's no perfect solution here

yeah I agree, and I look at this from the perspective that we're already in a broken condition by the point where this code matters. things are already broken, how can we stop the bleeding?

maybe the "source" is just something internal to the framework and not exposed to the user (at least for now) which would ease some of my concerns.

what were you thinking of for this? something like I mentioned with globals/stores? or just __unstableSource as the property name?

adamziel · 2022-03-11T13:04:14Z

One thing I thought about but didn't explore further is storing the source separately, either in a global or in the data stores. A block implementation could request it, or the block-list/block wrapper, and we could associate them by clientId.

@dmsnell I believe that the clientId could change, making this approach even more complex – this is what I was getting at in my comment above:

We would need a function such as ( block ) => lookupKey that returns the same result throughout the entire lifecycle of the block object. However, the clientId may change just like the block props and about any other detail. I don't think such function exists at the moment.

This bit us in the widgets editor, see #28379:

because #27885 makes it so that when you give BlockEditor a list of blocks, it changes the client IDs of those blocks before storing them in its local block-editor state. It turns out client IDs weren’t ever supposed to be “stable” outside of a block editor instance. They can change. This means we can’t use them how we’re using them, as a foreign key

It says "outside of a block editor instance," so maybe it wouldn't be a problem for this PR after all – I'm not sure, more investigation is needed to decide. Either way, this PR is the simplest way forward, and if the API is unstable then the door to using a global store remain open.

adamziel

One more change is needed (__unstableSource) to get this PR across the finish line, but since the ideas and the implementation seem solid, I'm approving preemptively.

dmsnell · 2022-03-11T21:00:43Z

Updated to __unstableBlockSource (added the "block" because with the prefix it seemed a little more helpful to bring in the additional context).

While doing this I wondered about our usage of the __unstable prefix and realized that this doesn't particularly fit the guidelines for __unstable or for __experimental. It's not meant to be unstable or to disappear, even though it's not ever supposed to be used. It's required because the internal aspects of block preservation are stretched across package boundaries.

We have a few instances of __unsafe and __internal prefixes but there aren't guidelines for their use. I wonder if it would be worth introducing __internalBlockSource here instead, though I also wonder if we try hard to avoid setting an example of naming things __internal. Any thoughts?

Other than that, any objections to this PR? After this one we can start improving the handling of block resolution, JS errors in the block implementations, and invaliding based on innerBlock changes.

youknowriad · 2022-03-14T08:45:49Z

__internal suffix need came up a few times before yeah and It seems we should document it as another option for APIs not allowed to be used by theme authors. I think we used __unstable before for some of the internal APIs as well but it seems having a dedicated suffix is clearer.

No objections left for me here 👍

Part of #38922 When the editor is unable to validate a block it should preserve the broken content in the post and show an accurate representation of that underlying markup in the absence of being able to interact with it. Currently when showing a preview of an invalid block in the editor we attempt to re-generate the save output for a block given the attributes we originally parsed. This is a flawed approach, however, because by the nature of being invalid we know that there is a problem with those attributes as they are. In this patch we're introducing the `source` attribute on a block which only exists for invalid blocks at the time of this patch. That `source` carries the original un-processed data for a block node and can be used to reconstruct the original markup without using garbage data and without inadvertently changing it through the series of autofixes, deprecations, and the like that happen during normal block loading. The noticable change is in `block-list/block` where we will be showing that reconstruction rather than the re-generated block content. Previously it was the case that the preview might represent a corrupted version of the block or show the block as if emptied of all its content. Now, however, the preview sould accurately reflect the HTML in the source post even when it's invalid or unrecognized according to the editor. Further work should take advantage of the `source` property to provide a more consistent and trusting experience for working with unrecognized content.

Potentially this is a conflict with #38794 and once `isValidBlockContent` is gone we might be able to remove this change.

Following the work of #38923 where we started preserving the source content for blocks that the editor fails to recognize, this patch uses that information in the Code Editor to ensure that we don't accidentally corrupt or lose content when making edits there. Previously the `serializeBlock` function, which generates the HTML for the Code Editor, would try to re-generated the saved content for blocks that we know are invalid. Because we know the input is invalid we can guarantee that the output will have even more problems. Now that `serializeBlock` function is aware of the source content for a block, and if the block is marked invalid _and_ that source exists it will pass that along to the output and skip the processing and re-generation of the saved content. This ensures that errors don't cascade and that unrelated edits to other blocks from the Code Editor won't destroy content in the marked-invalid blocks.

…lue. While working on #38923 with invalid blocks it happened that a JavaScript error surfaced when working with a block that specified an incorrect layout type. In the case at hand the block asked for the `flerx` alignment instead of the `flex` alignment. It's reasonable to expect blocks to call for unrecognized layouts, especially if a plugin introduces new layouts and then blocks are loaded without that plugin's support. When `getLayoutType` is unable to find a recognized layout it returns `undefined` which causes numerous issues with calling code that expects a valid layout. In this patch we're guarding for those cases and if no known layout can be found for the specified layout type then we return the default layout instead. This introduces data corruption so it could be that this solution is more of a coverup than a fix. There is no way however to ignore that layout and preserve the behavior while still editing the block. On the other hand a block should not fail validation due to an implementation bug in the core editor and if this fix truly resolves the issue we should find invalidated blocks and prevent editing anyway (unfortunately this is not the case in this patch).

Following the work of #38923 where we started preserving the source content for blocks that the editor fails to recognize, this patch uses that information in the Code Editor to ensure that we don't accidentally corrupt or lose content when making edits there. Previously the `serializeBlock` function, which generates the HTML for the Code Editor, would try to re-generated the saved content for blocks that we know are invalid. Because we know the input is invalid we can guarantee that the output will have even more problems. Now that `serializeBlock` function is aware of the source content for a block, and if the block is marked invalid _and_ that source exists it will pass that along to the output and skip the processing and re-generation of the saved content. This ensures that errors don't cascade and that unrelated edits to other blocks from the Code Editor won't destroy content in the marked-invalid blocks.

…#39523) Following the work of #38923 where we started preserving the source content for blocks that the editor fails to recognize, this patch uses that information in the Code Editor to ensure that we don't accidentally corrupt or lose content when making edits there. Previously the `serializeBlock` function, which generates the HTML for the Code Editor, would try to re-generated the saved content for blocks that we know are invalid. Because we know the input is invalid we can guarantee that the output will have even more problems. Now that `serializeBlock` function is aware of the source content for a block, and if the block is marked invalid _and_ that source exists it will pass that along to the output and skip the processing and re-generation of the saved content. This ensures that errors don't cascade and that unrelated edits to other blocks from the Code Editor won't destroy content in the marked-invalid blocks.

Previously when running "Convert to Blocks" on the invalid-block resolution dialog we have been converting only a portion of the invalid block due to the way we examine the `originalContent.` Since #38923 we have had `__unstableBlockSource` available which tracks the entire contents of the original block that failed to validate. In this patch we're using that source information in order to split apart the invalid block and then separately parse each of its constituent components. The result of this change is that we're able to preserve more block content when resolving an invalid block than we were before. For example, supposing we have a broken container block full of valid inner blocks, we are now able to extract all of those inner blocks and preserve them whereas before we would lose all block information and the stack would turn into an empty classic block.

…lue. While working on #38923 with invalid blocks it happened that a JavaScript error surfaced when working with a block that specified an incorrect layout type. In the case at hand the block asked for the `flerx` alignment instead of the `flex` alignment. It's reasonable to expect blocks to call for unrecognized layouts, especially if a plugin introduces new layouts and then blocks are loaded without that plugin's support. When `getLayoutType` is unable to find a recognized layout it returns `undefined` which causes numerous issues with calling code that expects a valid layout. In this patch we're guarding for those cases and if no known layout can be found for the specified layout type then we return the default layout instead. This introduces data corruption so it could be that this solution is more of a coverup than a fix. There is no way however to ignore that layout and preserve the behavior while still editing the block. On the other hand a block should not fail validation due to an implementation bug in the core editor and if this fix truly resolves the issue we should find invalidated blocks and prevent editing anyway (unfortunately this is not the case in this patch).

dmsnell requested review from mcsf, youknowriad and vcanales February 18, 2022 23:15

dmsnell requested a review from ellatrix as a code owner February 18, 2022 23:15

This was referenced Feb 18, 2022

Unrecognized Blocks: Preserve content of invalid and missing blocks #38922

Open

[Blocks] Preserve source markup in invalid blocks #38460

Closed

dmsnell force-pushed the blocks/preserve-source-markup-when-invalid-2 branch 2 times, most recently from a779af7 to 8a73c1c Compare February 23, 2022 21:07

dmsnell requested a review from gziolo February 23, 2022 21:07

gziolo added [Feature] Block API API that allows to express the block paradigm. [Type] New API New API to be used by plugin developers or package users. [Feature] Block Validation/Deprecation Handling block validation to determine accuracy and deprecation labels Feb 24, 2022

youknowriad reviewed Feb 24, 2022

View reviewed changes

dmsnell mentioned this pull request Feb 24, 2022

Quote: use nested blocks #25892

Merged

dmsnell force-pushed the blocks/preserve-source-markup-when-invalid-2 branch from 8a73c1c to ee6b540 Compare March 4, 2022 22:19

dmsnell mentioned this pull request Mar 4, 2022

Block Editor: Use default layout type when block specifies unknown value #39232

Open

8 tasks

dmsnell force-pushed the blocks/preserve-source-markup-when-invalid-2 branch from ee6b540 to 22cb97e Compare March 9, 2022 23:32

adamziel approved these changes Mar 11, 2022

View reviewed changes

dmsnell force-pushed the blocks/preserve-source-markup-when-invalid-2 branch from 22cb97e to 602c44f Compare March 11, 2022 20:16

dmsnell added 5 commits March 14, 2022 14:27

Fix test case given an invalid block without a source.

9011272

Potentially this is a conflict with #38794 and once `isValidBlockContent` is gone we might be able to remove this change.

Replace source with __unstableBlockSource

116c81d

fixup! Replace source with __unstableBlockSource

54bcfe3

fixup! fixup! Replace source with __unstableBlockSource

0b721f8

dmsnell force-pushed the blocks/preserve-source-markup-when-invalid-2 branch from 14a881a to 0b721f8 Compare March 14, 2022 21:27

dmsnell merged commit 3ea2d42 into trunk Mar 14, 2022

dmsnell deleted the blocks/preserve-source-markup-when-invalid-2 branch March 14, 2022 22:42

github-actions bot added this to the Gutenberg 12.9 milestone Mar 14, 2022

dmsnell mentioned this pull request Mar 17, 2022

Blocks: Preserves source of unrecognized blocks inside of Code Editor #39523

Merged

glendaviesnz mentioned this pull request Mar 21, 2022

Test/optional ignore style validation #39417

Closed

bph added the Needs Dev Note Requires a developer note for a major WordPress release cycle label Mar 23, 2022

bph mentioned this pull request Apr 5, 2022

Dev Notes Tracking Issue for WordPress 6.0 #39654

Closed

37 tasks

dmsnell mentioned this pull request Apr 18, 2022

Blocks: Examine source block when converted invalid block into blocks. #40429

Draft

SiobhyB mentioned this pull request Sep 20, 2022

[RNMobile] Invalid blocks: Display raw source block #44295

Open

dmsnell mentioned this pull request Jun 1, 2023

Prevent freeform and shortcode blocks from converting HTML entities #51123

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Blocks: Remember raw source block for invalid blocks. #38923

Blocks: Remember raw source block for invalid blocks. #38923

dmsnell commented Feb 18, 2022 •

edited

Loading

github-actions bot commented Feb 18, 2022 •

edited

Loading

dmsnell commented Feb 23, 2022

youknowriad Feb 24, 2022

ntsekouras Feb 24, 2022

mcsf Feb 24, 2022

dmsnell Feb 24, 2022 •

edited

Loading

adamziel Mar 8, 2022 •

edited

Loading

dmsnell Mar 11, 2022

scruffian commented Mar 2, 2022

dmsnell commented Mar 2, 2022

scruffian commented Mar 2, 2022

dmsnell commented Mar 4, 2022

dmsnell commented Mar 4, 2022

youknowriad commented Mar 8, 2022

adamziel commented Mar 9, 2022

dmsnell commented Mar 9, 2022

youknowriad commented Mar 10, 2022

dmsnell commented Mar 11, 2022

adamziel commented Mar 11, 2022 •

edited

Loading

adamziel left a comment •

edited

Loading

dmsnell commented Mar 11, 2022

youknowriad commented Mar 14, 2022

Blocks: Remember raw source block for invalid blocks. #38923

Blocks: Remember raw source block for invalid blocks. #38923

Conversation

dmsnell commented Feb 18, 2022 • edited Loading

Status

Description

Testing Instructions

Failed attribute parsing

Inner blocks on invalid block

Types of changes

Checklist:

github-actions bot commented Feb 18, 2022 • edited Loading

dmsnell commented Feb 23, 2022

youknowriad Feb 24, 2022

Choose a reason for hiding this comment

ntsekouras Feb 24, 2022

Choose a reason for hiding this comment

mcsf Feb 24, 2022

Choose a reason for hiding this comment

dmsnell Feb 24, 2022 • edited Loading

Choose a reason for hiding this comment

adamziel Mar 8, 2022 • edited Loading

Choose a reason for hiding this comment

dmsnell Mar 11, 2022

Choose a reason for hiding this comment

scruffian commented Mar 2, 2022

dmsnell commented Mar 2, 2022

scruffian commented Mar 2, 2022

dmsnell commented Mar 4, 2022

dmsnell commented Mar 4, 2022

youknowriad commented Mar 8, 2022

adamziel commented Mar 9, 2022

dmsnell commented Mar 9, 2022

youknowriad commented Mar 10, 2022

dmsnell commented Mar 11, 2022

adamziel commented Mar 11, 2022 • edited Loading

adamziel left a comment • edited Loading

Choose a reason for hiding this comment

dmsnell commented Mar 11, 2022

youknowriad commented Mar 14, 2022

dmsnell commented Feb 18, 2022 •

edited

Loading

github-actions bot commented Feb 18, 2022 •

edited

Loading

dmsnell Feb 24, 2022 •

edited

Loading

adamziel Mar 8, 2022 •

edited

Loading

adamziel commented Mar 11, 2022 •

edited

Loading

adamziel left a comment •

edited

Loading