Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Blocks: Remember raw source block for invalid blocks. #38923

Merged
merged 5 commits into from
Mar 14, 2022

Conversation

dmsnell
Copy link
Member

@dmsnell dmsnell commented Feb 18, 2022

Status

While this PR looks like it introduces a property that doesn't get much
use (source) I'd like to go ahead and merge it in to get the most basic
fix (invalid block previews are notoriously broken) while setting the stage
for the more thorough fixes to come later.

Everything that relies on originalContent and most of what relies on
innerHTML is broken but fixing those issues is so expansive that I cannot
focus on them in one mega PR. Thankfully, because they are all separate
problems I do believe that we can fix them piecemeal and let "fixes one
thing" take priority over "fixes everything."

Description

Part of #38922
Replaces #38922

When the editor is unable to validate a block it should preserve the
broken content in the post and show an accurate representation of that
underlying markup in the absence of being able to interact with it.

Currently when showing a preview of an invalid block in the editor we
attempt to re-generate the save output for a block given the attributes
we originally parsed. This is a flawed approach, however, because by
the nature of being invalid we know that there is a problem with those
attributes as they are.

In this patch we're introducing the source attribute on a block which
only exists for invalid blocks at the time of this patch. That source
carries the original un-processed data for a block node and can be used
to reconstruct the original markup without using garbage data and without
inadvertently changing it through the series of autofixes, deprecations,
and the like that happen during normal block loading.

The noticable change is in block-list/block where we will be showing
that reconstruction rather than the re-generated block content. Previously
it was the case that the preview might represent a corrupted version of the
block or show the block as if emptied of all its content. Now, however,
the preview sould accurately reflect the HTML in the source post even
when it's invalid or unrecognized according to the editor.

Further work should take advantage of the source property to provide
a more consistent and trusting experience for working with unrecognized
content.

Testing Instructions

To test this you will want to compare the block preview for invalid
blocks in the editor. This change limits itself to that preview and
leaves other broken UI bits broken, to be addressed in follow-up work
for the sake of focus during review and testing.

Two primary cases to asses are when the block attributes fail to
properly parse and when there are inner blocks involved.

Failed attribute parsing

We've removed the <ul> tag from the following list. The three
bullet points render fine as HTML but are hidden in the editor
because it loads an empty string for the values attribute, since
it was unable to find LI content inside a UL as the attribute
sourcing describes it should.

<!-- wp:list -->
<li>one</li><li>two</li><li>three</li>
<!-- /wp:list -->
trunk this branch
Screen Shot 2022-02-18 at 4 11 58 PM Screen Shot 2022-02-18 at 4 08 28 PM

Inner blocks on invalid block

Because inner blocks are ignored by the existing originalContent attribute they will disappear in the preview for invalid blocks even if they are themselves valid.

<!-- wp:group -->
<div class="wp-block-group"><!-- wp:columns -->
<div class="wp-block-columns"><!-- wp:column -->
<div class="wp-block-column"></div>
<!-- /wp:column -->

<!-- wp:column -->
<div class="wp-block-column"><!-- wp:group {"layout":{"type":"flerx","allowOrientation":false}} -->
<div class="wp-block-group"><!-- wp:paragraph -->
<p>Nested!</p>
<!-- /wp:paragraph --></div>
<!-- /wp:group --></div>
<!-- /wp:column --></div>
<!-- /wp:columns --></div>
<!-- /wp:group -->
trunk this branch
Screen Shot 2022-02-18 at 4 12 01 PM Screen Shot 2022-02-18 at 4 08 32 PM

Types of changes

Introduces a new property on a loaded block which refers to the
un-processed block data when it was originally loaded and which
is used to better represent the unrecognized content in the editor.

Checklist:

  • My code is tested.
  • My code follows the WordPress code style.
  • My code follows the accessibility standards.
  • I've tested my changes with keyboard and screen readers.
  • My code has proper inline documentation.
  • I've included developer documentation if appropriate.
  • I've updated all React Native files affected by any refactorings/renamings in this PR (please manually search all *.native.js files for terms that need renaming or removal).
  • I've updated related schemas if appropriate.

@github-actions
Copy link

github-actions bot commented Feb 18, 2022

Size Change: +55 B (0%)

Total Size: 1.16 MB

Filename Size Change
build/block-editor/index.min.js 144 kB +34 B (0%)
build/blocks/index.min.js 46.4 kB +21 B (0%)
ℹ️ View Unchanged
Filename Size
build/a11y/index.min.js 993 B
build/admin-manifest/index.min.js 1.24 kB
build/annotations/index.min.js 2.77 kB
build/api-fetch/index.min.js 2.27 kB
build/autop/index.min.js 2.15 kB
build/blob/index.min.js 487 B
build/block-directory/index.min.js 6.49 kB
build/block-directory/style-rtl.css 1.01 kB
build/block-directory/style.css 1.01 kB
build/block-editor/default-editor-styles-rtl.css 378 B
build/block-editor/default-editor-styles.css 378 B
build/block-editor/style-rtl.css 14.8 kB
build/block-editor/style.css 14.8 kB
build/block-library/blocks/archives/editor-rtl.css 61 B
build/block-library/blocks/archives/editor.css 60 B
build/block-library/blocks/archives/style-rtl.css 65 B
build/block-library/blocks/archives/style.css 65 B
build/block-library/blocks/audio/editor-rtl.css 150 B
build/block-library/blocks/audio/editor.css 150 B
build/block-library/blocks/audio/style-rtl.css 111 B
build/block-library/blocks/audio/style.css 111 B
build/block-library/blocks/audio/theme-rtl.css 125 B
build/block-library/blocks/audio/theme.css 125 B
build/block-library/blocks/block/editor-rtl.css 161 B
build/block-library/blocks/block/editor.css 161 B
build/block-library/blocks/button/editor-rtl.css 445 B
build/block-library/blocks/button/editor.css 445 B
build/block-library/blocks/button/style-rtl.css 560 B
build/block-library/blocks/button/style.css 560 B
build/block-library/blocks/buttons/editor-rtl.css 292 B
build/block-library/blocks/buttons/editor.css 292 B
build/block-library/blocks/buttons/style-rtl.css 275 B
build/block-library/blocks/buttons/style.css 275 B
build/block-library/blocks/calendar/style-rtl.css 207 B
build/block-library/blocks/calendar/style.css 207 B
build/block-library/blocks/categories/editor-rtl.css 84 B
build/block-library/blocks/categories/editor.css 83 B
build/block-library/blocks/categories/style-rtl.css 79 B
build/block-library/blocks/categories/style.css 79 B
build/block-library/blocks/code/style-rtl.css 103 B
build/block-library/blocks/code/style.css 103 B
build/block-library/blocks/code/theme-rtl.css 124 B
build/block-library/blocks/code/theme.css 124 B
build/block-library/blocks/columns/editor-rtl.css 108 B
build/block-library/blocks/columns/editor.css 108 B
build/block-library/blocks/columns/style-rtl.css 406 B
build/block-library/blocks/columns/style.css 406 B
build/block-library/blocks/comment-author-avatar/editor-rtl.css 125 B
build/block-library/blocks/comment-author-avatar/editor.css 125 B
build/block-library/blocks/comment-template/style-rtl.css 127 B
build/block-library/blocks/comment-template/style.css 127 B
build/block-library/blocks/comments-pagination-numbers/editor-rtl.css 123 B
build/block-library/blocks/comments-pagination-numbers/editor.css 121 B
build/block-library/blocks/comments-pagination/editor-rtl.css 222 B
build/block-library/blocks/comments-pagination/editor.css 209 B
build/block-library/blocks/comments-pagination/style-rtl.css 235 B
build/block-library/blocks/comments-pagination/style.css 231 B
build/block-library/blocks/comments-query-loop/editor-rtl.css 95 B
build/block-library/blocks/comments-query-loop/editor.css 95 B
build/block-library/blocks/cover/editor-rtl.css 546 B
build/block-library/blocks/cover/editor.css 547 B
build/block-library/blocks/cover/style-rtl.css 1.56 kB
build/block-library/blocks/cover/style.css 1.56 kB
build/block-library/blocks/embed/editor-rtl.css 293 B
build/block-library/blocks/embed/editor.css 293 B
build/block-library/blocks/embed/style-rtl.css 417 B
build/block-library/blocks/embed/style.css 417 B
build/block-library/blocks/embed/theme-rtl.css 124 B
build/block-library/blocks/embed/theme.css 124 B
build/block-library/blocks/file/editor-rtl.css 300 B
build/block-library/blocks/file/editor.css 300 B
build/block-library/blocks/file/style-rtl.css 255 B
build/block-library/blocks/file/style.css 255 B
build/block-library/blocks/file/view.min.js 353 B
build/block-library/blocks/freeform/editor-rtl.css 2.44 kB
build/block-library/blocks/freeform/editor.css 2.44 kB
build/block-library/blocks/gallery/editor-rtl.css 965 B
build/block-library/blocks/gallery/editor.css 967 B
build/block-library/blocks/gallery/style-rtl.css 1.61 kB
build/block-library/blocks/gallery/style.css 1.61 kB
build/block-library/blocks/gallery/theme-rtl.css 122 B
build/block-library/blocks/gallery/theme.css 122 B
build/block-library/blocks/group/editor-rtl.css 159 B
build/block-library/blocks/group/editor.css 159 B
build/block-library/blocks/group/style-rtl.css 57 B
build/block-library/blocks/group/style.css 57 B
build/block-library/blocks/group/theme-rtl.css 78 B
build/block-library/blocks/group/theme.css 78 B
build/block-library/blocks/heading/style-rtl.css 114 B
build/block-library/blocks/heading/style.css 114 B
build/block-library/blocks/html/editor-rtl.css 332 B
build/block-library/blocks/html/editor.css 333 B
build/block-library/blocks/image/editor-rtl.css 731 B
build/block-library/blocks/image/editor.css 730 B
build/block-library/blocks/image/style-rtl.css 529 B
build/block-library/blocks/image/style.css 535 B
build/block-library/blocks/image/theme-rtl.css 124 B
build/block-library/blocks/image/theme.css 124 B
build/block-library/blocks/latest-comments/style-rtl.css 284 B
build/block-library/blocks/latest-comments/style.css 284 B
build/block-library/blocks/latest-posts/editor-rtl.css 199 B
build/block-library/blocks/latest-posts/editor.css 198 B
build/block-library/blocks/latest-posts/style-rtl.css 447 B
build/block-library/blocks/latest-posts/style.css 446 B
build/block-library/blocks/list/style-rtl.css 94 B
build/block-library/blocks/list/style.css 94 B
build/block-library/blocks/media-text/editor-rtl.css 266 B
build/block-library/blocks/media-text/editor.css 263 B
build/block-library/blocks/media-text/style-rtl.css 493 B
build/block-library/blocks/media-text/style.css 490 B
build/block-library/blocks/more/editor-rtl.css 431 B
build/block-library/blocks/more/editor.css 431 B
build/block-library/blocks/navigation-link/editor-rtl.css 708 B
build/block-library/blocks/navigation-link/editor.css 706 B
build/block-library/blocks/navigation-link/style-rtl.css 94 B
build/block-library/blocks/navigation-link/style.css 94 B
build/block-library/blocks/navigation-submenu/editor-rtl.css 299 B
build/block-library/blocks/navigation-submenu/editor.css 299 B
build/block-library/blocks/navigation-submenu/view.min.js 375 B
build/block-library/blocks/navigation/editor-rtl.css 2.03 kB
build/block-library/blocks/navigation/editor.css 2.04 kB
build/block-library/blocks/navigation/style-rtl.css 1.89 kB
build/block-library/blocks/navigation/style.css 1.88 kB
build/block-library/blocks/navigation/view.min.js 2.85 kB
build/block-library/blocks/nextpage/editor-rtl.css 395 B
build/block-library/blocks/nextpage/editor.css 395 B
build/block-library/blocks/page-list/editor-rtl.css 363 B
build/block-library/blocks/page-list/editor.css 363 B
build/block-library/blocks/page-list/style-rtl.css 175 B
build/block-library/blocks/page-list/style.css 175 B
build/block-library/blocks/paragraph/editor-rtl.css 157 B
build/block-library/blocks/paragraph/editor.css 157 B
build/block-library/blocks/paragraph/style-rtl.css 273 B
build/block-library/blocks/paragraph/style.css 273 B
build/block-library/blocks/post-author/style-rtl.css 175 B
build/block-library/blocks/post-author/style.css 176 B
build/block-library/blocks/post-comments-form/style-rtl.css 446 B
build/block-library/blocks/post-comments-form/style.css 446 B
build/block-library/blocks/post-comments/style-rtl.css 521 B
build/block-library/blocks/post-comments/style.css 521 B
build/block-library/blocks/post-excerpt/editor-rtl.css 73 B
build/block-library/blocks/post-excerpt/editor.css 73 B
build/block-library/blocks/post-excerpt/style-rtl.css 69 B
build/block-library/blocks/post-excerpt/style.css 69 B
build/block-library/blocks/post-featured-image/editor-rtl.css 721 B
build/block-library/blocks/post-featured-image/editor.css 721 B
build/block-library/blocks/post-featured-image/style-rtl.css 153 B
build/block-library/blocks/post-featured-image/style.css 153 B
build/block-library/blocks/post-template/editor-rtl.css 99 B
build/block-library/blocks/post-template/editor.css 98 B
build/block-library/blocks/post-template/style-rtl.css 323 B
build/block-library/blocks/post-template/style.css 323 B
build/block-library/blocks/post-terms/style-rtl.css 73 B
build/block-library/blocks/post-terms/style.css 73 B
build/block-library/blocks/post-title/style-rtl.css 80 B
build/block-library/blocks/post-title/style.css 80 B
build/block-library/blocks/preformatted/style-rtl.css 103 B
build/block-library/blocks/preformatted/style.css 103 B
build/block-library/blocks/pullquote/editor-rtl.css 198 B
build/block-library/blocks/pullquote/editor.css 198 B
build/block-library/blocks/pullquote/style-rtl.css 389 B
build/block-library/blocks/pullquote/style.css 388 B
build/block-library/blocks/pullquote/theme-rtl.css 167 B
build/block-library/blocks/pullquote/theme.css 167 B
build/block-library/blocks/query-pagination-numbers/editor-rtl.css 122 B
build/block-library/blocks/query-pagination-numbers/editor.css 121 B
build/block-library/blocks/query-pagination/editor-rtl.css 221 B
build/block-library/blocks/query-pagination/editor.css 211 B
build/block-library/blocks/query-pagination/style-rtl.css 234 B
build/block-library/blocks/query-pagination/style.css 231 B
build/block-library/blocks/query/editor-rtl.css 131 B
build/block-library/blocks/query/editor.css 132 B
build/block-library/blocks/quote/style-rtl.css 201 B
build/block-library/blocks/quote/style.css 201 B
build/block-library/blocks/quote/theme-rtl.css 223 B
build/block-library/blocks/quote/theme.css 226 B
build/block-library/blocks/read-more/style-rtl.css 132 B
build/block-library/blocks/read-more/style.css 132 B
build/block-library/blocks/rss/editor-rtl.css 202 B
build/block-library/blocks/rss/editor.css 204 B
build/block-library/blocks/rss/style-rtl.css 289 B
build/block-library/blocks/rss/style.css 288 B
build/block-library/blocks/search/editor-rtl.css 165 B
build/block-library/blocks/search/editor.css 165 B
build/block-library/blocks/search/style-rtl.css 397 B
build/block-library/blocks/search/style.css 398 B
build/block-library/blocks/search/theme-rtl.css 64 B
build/block-library/blocks/search/theme.css 64 B
build/block-library/blocks/separator/editor-rtl.css 99 B
build/block-library/blocks/separator/editor.css 99 B
build/block-library/blocks/separator/style-rtl.css 233 B
build/block-library/blocks/separator/style.css 233 B
build/block-library/blocks/separator/theme-rtl.css 172 B
build/block-library/blocks/separator/theme.css 172 B
build/block-library/blocks/shortcode/editor-rtl.css 474 B
build/block-library/blocks/shortcode/editor.css 474 B
build/block-library/blocks/site-logo/editor-rtl.css 744 B
build/block-library/blocks/site-logo/editor.css 744 B
build/block-library/blocks/site-logo/style-rtl.css 181 B
build/block-library/blocks/site-logo/style.css 181 B
build/block-library/blocks/site-tagline/editor-rtl.css 86 B
build/block-library/blocks/site-tagline/editor.css 86 B
build/block-library/blocks/site-title/editor-rtl.css 84 B
build/block-library/blocks/site-title/editor.css 84 B
build/block-library/blocks/social-link/editor-rtl.css 177 B
build/block-library/blocks/social-link/editor.css 177 B
build/block-library/blocks/social-links/editor-rtl.css 674 B
build/block-library/blocks/social-links/editor.css 673 B
build/block-library/blocks/social-links/style-rtl.css 1.37 kB
build/block-library/blocks/social-links/style.css 1.36 kB
build/block-library/blocks/spacer/editor-rtl.css 332 B
build/block-library/blocks/spacer/editor.css 332 B
build/block-library/blocks/spacer/style-rtl.css 48 B
build/block-library/blocks/spacer/style.css 48 B
build/block-library/blocks/table/editor-rtl.css 471 B
build/block-library/blocks/table/editor.css 472 B
build/block-library/blocks/table/style-rtl.css 481 B
build/block-library/blocks/table/style.css 481 B
build/block-library/blocks/table/theme-rtl.css 188 B
build/block-library/blocks/table/theme.css 188 B
build/block-library/blocks/tag-cloud/style-rtl.css 226 B
build/block-library/blocks/tag-cloud/style.css 227 B
build/block-library/blocks/template-part/editor-rtl.css 235 B
build/block-library/blocks/template-part/editor.css 235 B
build/block-library/blocks/template-part/theme-rtl.css 101 B
build/block-library/blocks/template-part/theme.css 101 B
build/block-library/blocks/text-columns/editor-rtl.css 95 B
build/block-library/blocks/text-columns/editor.css 95 B
build/block-library/blocks/text-columns/style-rtl.css 166 B
build/block-library/blocks/text-columns/style.css 166 B
build/block-library/blocks/verse/style-rtl.css 87 B
build/block-library/blocks/verse/style.css 87 B
build/block-library/blocks/video/editor-rtl.css 571 B
build/block-library/blocks/video/editor.css 572 B
build/block-library/blocks/video/style-rtl.css 173 B
build/block-library/blocks/video/style.css 173 B
build/block-library/blocks/video/theme-rtl.css 124 B
build/block-library/blocks/video/theme.css 124 B
build/block-library/common-rtl.css 934 B
build/block-library/common.css 932 B
build/block-library/editor-rtl.css 9.96 kB
build/block-library/editor.css 9.96 kB
build/block-library/index.min.js 168 kB
build/block-library/reset-rtl.css 474 B
build/block-library/reset.css 474 B
build/block-library/style-rtl.css 11.4 kB
build/block-library/style.css 11.4 kB
build/block-library/theme-rtl.css 665 B
build/block-library/theme.css 670 B
build/block-serialization-default-parser/index.min.js 1.12 kB
build/block-serialization-spec-parser/index.min.js 2.83 kB
build/components/index.min.js 217 kB
build/components/style-rtl.css 15.6 kB
build/components/style.css 15.6 kB
build/compose/index.min.js 11.2 kB
build/core-data/index.min.js 14.1 kB
build/customize-widgets/index.min.js 11.2 kB
build/customize-widgets/style-rtl.css 1.39 kB
build/customize-widgets/style.css 1.39 kB
build/data-controls/index.min.js 663 B
build/data/index.min.js 8.05 kB
build/date/index.min.js 31.9 kB
build/deprecated/index.min.js 518 B
build/dom-ready/index.min.js 336 B
build/dom/index.min.js 4.53 kB
build/edit-navigation/index.min.js 16.1 kB
build/edit-navigation/style-rtl.css 4.04 kB
build/edit-navigation/style.css 4.05 kB
build/edit-post/classic-rtl.css 546 B
build/edit-post/classic.css 547 B
build/edit-post/index.min.js 29.8 kB
build/edit-post/style-rtl.css 7.07 kB
build/edit-post/style.css 7.07 kB
build/edit-site/index.min.js 43.9 kB
build/edit-site/style-rtl.css 7.44 kB
build/edit-site/style.css 7.42 kB
build/edit-widgets/index.min.js 16.5 kB
build/edit-widgets/style-rtl.css 4.39 kB
build/edit-widgets/style.css 4.39 kB
build/editor/index.min.js 38.4 kB
build/editor/style-rtl.css 3.71 kB
build/editor/style.css 3.71 kB
build/element/index.min.js 4.29 kB
build/escape-html/index.min.js 548 B
build/format-library/index.min.js 6.62 kB
build/format-library/style-rtl.css 571 B
build/format-library/style.css 571 B
build/hooks/index.min.js 1.66 kB
build/html-entities/index.min.js 454 B
build/i18n/index.min.js 3.79 kB
build/is-shallow-equal/index.min.js 535 B
build/keyboard-shortcuts/index.min.js 1.83 kB
build/keycodes/index.min.js 1.41 kB
build/list-reusable-blocks/index.min.js 1.75 kB
build/list-reusable-blocks/style-rtl.css 838 B
build/list-reusable-blocks/style.css 838 B
build/media-utils/index.min.js 2.94 kB
build/notices/index.min.js 957 B
build/nux/index.min.js 2.12 kB
build/nux/style-rtl.css 751 B
build/nux/style.css 749 B
build/plugins/index.min.js 1.98 kB
build/preferences/index.min.js 1.2 kB
build/primitives/index.min.js 949 B
build/priority-queue/index.min.js 611 B
build/react-i18n/index.min.js 704 B
build/react-refresh-entry/index.min.js 8.44 kB
build/react-refresh-runtime/index.min.js 7.31 kB
build/redux-routine/index.min.js 2.69 kB
build/reusable-blocks/index.min.js 2.24 kB
build/reusable-blocks/style-rtl.css 256 B
build/reusable-blocks/style.css 256 B
build/rich-text/index.min.js 11.1 kB
build/server-side-render/index.min.js 1.61 kB
build/shortcode/index.min.js 1.52 kB
build/token-list/index.min.js 668 B
build/url/index.min.js 1.99 kB
build/viewport/index.min.js 1.08 kB
build/warning/index.min.js 280 B
build/widgets/index.min.js 7.21 kB
build/widgets/style-rtl.css 1.16 kB
build/widgets/style.css 1.16 kB
build/wordcount/index.min.js 1.07 kB

compressed-size-action

@dmsnell dmsnell force-pushed the blocks/preserve-source-markup-when-invalid-2 branch 2 times, most recently from a779af7 to 8a73c1c Compare February 23, 2022 21:07
@dmsnell dmsnell requested a review from gziolo February 23, 2022 21:07
@dmsnell
Copy link
Member Author

dmsnell commented Feb 23, 2022

cc: @youknowriad thanks for reviewing the first draft of this. this one I believe is ready for merging plus proposes a more stable API that will serve us better as we clean up all the places that corrupt or erase existing content.

@gziolo gziolo added [Feature] Block API API that allows to express the block paradigm. [Type] New API New API to be used by plugin developers or package users. [Feature] Block Validation/Deprecation Handling block validation to determine accuracy and deprecation labels Feb 24, 2022
// Preserve the original unprocessed version of the block
// that we received so that we can preserve it in its
// existing state when we save.
updatedBlock.source = rawBlock;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The impact here is that we're making the raw block a public API, something we never did in the client and may need deeper thoughts about impact.

cc @mtias @mcsf @WordPress/gutenberg-core

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what other places we will need the rawBlock? Could we just pass the serialized raw block here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I am not understanding is: how is this different from updatedBlock.originalContent? Indeed, this value is already available in BlockListBlock at props.block.originalContent.

Copy link
Member Author

@dmsnell dmsnell Feb 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my first attempts were to pass the serialized raw block but we actually use various forms of the original block when performing comparisons of an "invalid block" and various ways of resolving it. further we have various ways of resolving it, some involving splitting that block into multiple blocks.

originalContent is fundamentally flawed in that it only looks at the innerHTML property of the raw block (to the neglect of inner blocks) which is the cause for a host of data loss and corruption issues and is also one of the reasons we can't convert non-nested blocks to nested blocks without creating those opportunities for data loss. technically it's the combination of our reliance on originalContent, innerHTML, and the way we call getSaveContent(…) to re-generate markup for invalid blocks; this PR is one step in undoing that so that we can preserve that unrecognized content and resolve issues in less risky ways. in other words the difference is that this is fix to a defect that originalContent introduced and unfortunately we can't fix originalContent without changing everything that depends on it.

I understand it's making this part of exposed surface of the block but we use this information, as we do with originalContent, in different places around the editor and I've personally concluded we have to carry it around if we want to avoid the corruption, data loss, and UX problems (such as demonstrated in the screenshot above where it looks like the list lost all its bullet points). Ultimately I would like to get rid of originalContent but as noted in #38922 (and in the PR description above) that's a much broader change and I couldn't get it in a comprehensive yet focused PR.

Copy link
Contributor

@adamziel adamziel Mar 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way to access rawBlocks on save if it's stored somewhere. Let's make the following distinction: It could be a part of the block object, or it could live outside of the block object.

I pondered a different variations of the latter, such as saving rawBlocks in a redux store, but I don't see a reliable way to retrieve them later. We would need a function such as ( block ) => lookupKey that returns the same result throughout the entire lifecycle of the block object. However, the clientId may change just like the block props and about any other detail. I don't think such function exists at the moment.

Generating a stable id like block.stableLifecycleId = uuid() would make it possible, but it would then require additional machinery like maintaining a lookup table, garbage collection, and such. It would buy us some memory, but the rawBlock would still become a public API.

And so, to me it seems like a sensible start. Perhaps we could call it __unstableSource instead of source to make the stakes lower in case it doesn't pan out somewhere down the road, but other than that I like it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry I missed this comment @adamziel.

honestly I am fairly persuaded by the source/__unstableSource method, but I want to make sure we exhaust our search of other options that I may be less persuaded at initially.

in this discussion I think it's become even more clear that the block node itself is the place where this should live, and I'll rename to __unstableSource

@scruffian
Copy link
Contributor

I gave this one a spin. The approach makes sense, but the results I saw don't match the description. It works for the list but not for the columns block:
Screenshot 2022-03-02 at 12 09 40

@dmsnell
Copy link
Member Author

dmsnell commented Mar 2, 2022

Thanks for testing @scruffian. can you share the markup you used to break those columns? I recognized what's happening, but at least in my head it's a different problem. I believe that the block in the picture is throwing a JS error (can you check the console). If it's related to the new fallback serialization I bet I'm assuming a field is present that isn't.

@scruffian
Copy link
Contributor

The markup I used was the markup you shared in the PR description for testing. I do see these JS errors:
Screenshot 2022-03-02 at 17 23 46

@dmsnell dmsnell force-pushed the blocks/preserve-source-markup-when-invalid-2 branch from 8a73c1c to ee6b540 Compare March 4, 2022 22:19
dmsnell added a commit that referenced this pull request Mar 4, 2022
…lue.

While working on #38923 with invalid blocks it happened that a JavaScript
error surfaced when working with a block that specified an incorrect layout
type. In the case at hand the block asked for the `flerx` alignment instead
of the `flex` alignment.

It's reasonable to expect blocks to call for unrecognized layouts,
especially if a plugin introduces new layouts and then blocks are loaded
without that plugin's support.

When `getLayoutType` is unable to find a recognized layout it returns
`undefined` which causes numerous issues with calling code that expects
a valid layout.

In this patch we're guarding for those cases and if no known layout can
be found for the specified layout type then we return the default layout
instead. This introduces data corruption so it could be that this solution
is more of a coverup than a fix. There is no way however to ignore that
layout and preserve the behavior while still editing the block. On the
other hand a block should not fail validation due to an implementation bug
in the core editor and if this fix truly resolves the issue we should find
invalidated blocks and prevent editing anyway (unfortunately this is not
the case in this patch).
@dmsnell
Copy link
Member Author

dmsnell commented Mar 4, 2022

Thanks for your review and patience with me @scruffian; it took me a while to get back and investigate this. I think I did see this on my own when testing but I overlooked it because it wasn't related to the breakage that this PR addresses.

In fact that does already exist in trunk and appears to come from useAvailableAlignments and my invalid layout type of flerx. Either way it's a good point to also try and fix these cases but I don't think this patch will address that kind of failure and I think that's okay. That's a JS bug in the implementation of getLayoutType, specifically in how we assume that the layouts specified will exist. I've submitted #39232 to start addressing that particular bug, but would rather follow-up to this PR with a separate fix that displays the static preview in cases where JS errors break the block rendering.

@dmsnell
Copy link
Member Author

dmsnell commented Mar 4, 2022

Actually it's fairly trivial to go ahead and show the preview even with JS crashes but it does require that we always save a copy of source for a block even if not invalid. cc: @youknowriad - I think there may be a point where we want to consider just going for it and seeing if memory use is significantly impacted by that change.

Right now there's no way of telling at the parsing stage if a JS error will later hide a block preview. We can either show that preview anyway (and keep around a reference to the raw parsed block) or maintain the current behavior which is to hide everything in that block (and let the garbage collector get around to clearing up that now-unused reference - it's always located when parsing anyways).

@youknowriad
Copy link
Contributor

I'm still unclear about how this approach is better than #38460

In terms of performance (memory), both add bytes to the block object, and if we're keeping the property for just invalid block (which I think is the right thing to do), I'd argue that preserving the string is probably better than preserving the objects (more data).

In terms of APIs, I think keeping the original HTML makes sense, keeping an intermediary block format that is confusing because we can end up with two different block formats in different parts of our code base. If that's limited to the parsing function, that's fine.

For the use-case of the JS error of a valid block, I don't think we should anything special there, the block author needs to be aware of the JS error to solve it, and I'm not sure that showing a preview is a good idea. In all cases, I still prefer to only add the property for invalid blocks to keep the impact small.

@adamziel
Copy link
Contributor

adamziel commented Mar 9, 2022

I wonder if #39183 would help with the UI aspect of making the invalid blocks non-editable

@dmsnell
Copy link
Member Author

dmsnell commented Mar 9, 2022

Thanks @youknowriad for the review!

I'm still unclear about how this approach is better than #38460

The text-only approach led to multiple dead-ends particularly around resolving block errors. It's complicated to describe right now because so many pieces are broken (for example, we currently wipe out inner blocks when running certain block resolutions).

There's certain need in comparing while ignoring inner blocks, but only if we ignore them after re-generating their content. I did explore a number of options including passing the raw strings around and then running the parser on those snippets as needed, but it definitely wasn't that elegant and I feel like it exposed even more new API surface area.

Ultimately the reason it's not just better, but workable where #38460 isn't, is that we use the structural form of a broken block for various contexts where we'd end up needing to store multiple text copies of the block in memory if that's all we stored:

  • Just the original innerHTML
  • The entire raw text from the post
  • The entire raw text from the post except for the original top-most block comment delimiter
  • The array of inner blocks associated with that text, in text or parsed form.

Further now we have the recursive question: what to do about those inner blocks. We descend and store all these strings and inner blocks and then recurse in another way if any of those inner blocks also happens to be broken.

By storing the raw parsed content we get this:

  • The information is already there, we're holding onto allocated memory and deferring the processing of that until we need it, vs. allocating new strings and printing those raw blocks into raw strings.
  • A simpler raw source that raises fewer questions about what it should be. It's a block node as parsed, and in this case, does not particularly need to come from the text.
  • The more raw form of a block that we can use in many different ways from one chunk of data. We can build anything we need from the raw block node but it's not easy to build all the different forms from the text without redoing multiple stages of processing.

I'm still not sold on the performance argument without any measurement. Let's zoom out a bit and ask where the impact is likely to accrue: only in posts with broken blocks; only in posts with lots of broken blocks.

Put in context:

  • This shouldn't impact most posts at all.
  • Wherever this does have a performance impact, it will likely be because the post is very broken and there's a high chance the editor will be corrupting the document or erasing content without some safeguard.

So I view this as the cost of that safety. Is an object we already have worse than a string we have to generate and process? I don't know but I'm open to discuss any data we can find that speaks to it.

keeping an intermediary block format that is confusing because we can end up with two different block formats in different parts of our code base

I hear that but I think we can't avoid this. We'll carry around that string and have different forms of the string plus inner blocks if we go the text route.

Notable points of safety here:

  • If you clone a block you lose the source because it's only assigned in the parser.
  • If you create a block it won't have a source, essentially the source is fiat and there's no history of that block which could be broken now.
  • The source is not an attribute on the block; it's generally not accessible inside the block implementation (such as in edit). If someone does reach in there with our imperative APIs to get it, well, people already do all sorts of stuff and break things anyway so I don't see this being a special exception where we prevent people from doing that.

For the use-case of the JS error of a valid block, I don't think we should anything special there, the block author needs to be aware of the JS error to solve it, and I'm not sure that showing a preview is a good idea.

We don't have to swallow the error, but I'm curious why we wouldn't want to show a preview of the block. There's nothing wrong with the blocks in these cases and the markup is possibly valid. If we don't show a preview we uphold the current state of showing very different versions of the post in the editor vs. when rendered. Would be very easy to mess up and publish something you didn't mean to because large amounts of content was hidden inside a block whose JS threw.

I still prefer to only add the property for invalid blocks to keep the impact small.

Regardless this PR is only addressing the case for blocks that fail to validate during parse. I think eventually though we will have to decide if we want to preserve blocks whose implementations are buggy or let one error cascade into data loss. At that point we'll probably have to made a tradeoff between potential performance impact and trust and data security.

I wonder if #39183 would help with the UI aspect of making the invalid blocks non-editable

"invalid" blocks are currently non-editable. there are a couple of problems though revolving around the fact that corruption is introduced during the loading of blocks out of the parser (meaning the defects are introduced before locking) and also that for some "invalid" blocks the editor doesn't realize they're invalid because it ignores inner blocks and so it wouldn't have any reason to lock them either.

@dmsnell dmsnell force-pushed the blocks/preserve-source-markup-when-invalid-2 branch from ee6b540 to 22cb97e Compare March 9, 2022 23:32
@youknowriad
Copy link
Contributor

Thanks for expanding on your reasoning @dmsnell I feel like there's no perfect solution here, maybe the "source" is just something internal to the framework and not exposed to the user (at least for now) which would ease some of my concerns. In that sense, doing what @adamziel suggest could be a potential approach, aka __unstableSource or

We don't have to swallow the error, but I'm curious why we wouldn't want to show a preview of the block.

I might have misunderstood the solution proposed, I think the error should be prominent as the JS error can be anything and we shouldn't just replace it with the preview at the risk of confusing the user.

@dmsnell
Copy link
Member Author

dmsnell commented Mar 11, 2022

maybe the "source" is just something internal to the framework and not exposed to the user

One thing I thought about but didn't explore further is storing the source separately, either in a global or in the data stores. A block implementation could request it, or the block-list/block wrapper, and we could associate them by clientId.

The problem with that approach in my head was just the complexity it introduced in comparison to the more pragmatic approach of storing this data on the block node itself.

It doesn't seem unreasonable to me to communicate this source information through some global in the parser module, but then we'd have to be more careful about purging that store once a block is deleted, something we don't have to do when storing it on the block itself.

Happy to explore other approaches if you want me to. At least at this point I know how to start fitting it together once we track the source in whichever manner we do.

there's no perfect solution here

yeah I agree, and I look at this from the perspective that we're already in a broken condition by the point where this code matters. things are already broken, how can we stop the bleeding?

maybe the "source" is just something internal to the framework and not exposed to the user (at least for now) which would ease some of my concerns.

what were you thinking of for this? something like I mentioned with globals/stores? or just __unstableSource as the property name?

@adamziel
Copy link
Contributor

adamziel commented Mar 11, 2022

One thing I thought about but didn't explore further is storing the source separately, either in a global or in the data stores. A block implementation could request it, or the block-list/block wrapper, and we could associate them by clientId.

@dmsnell I believe that the clientId could change, making this approach even more complex – this is what I was getting at in my comment above:

We would need a function such as ( block ) => lookupKey that returns the same result throughout the entire lifecycle of the block object. However, the clientId may change just like the block props and about any other detail. I don't think such function exists at the moment.

This bit us in the widgets editor, see #28379:

because #27885 makes it so that when you give BlockEditor a list of blocks, it changes the client IDs of those blocks before storing them in its local block-editor state. It turns out client IDs weren’t ever supposed to be “stable” outside of a block editor instance. They can change. This means we can’t use them how we’re using them, as a foreign key

It says "outside of a block editor instance," so maybe it wouldn't be a problem for this PR after all – I'm not sure, more investigation is needed to decide. Either way, this PR is the simplest way forward, and if the API is unstable then the door to using a global store remain open.

Copy link
Contributor

@adamziel adamziel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more change is needed (__unstableSource) to get this PR across the finish line, but since the ideas and the implementation seem solid, I'm approving preemptively.

@dmsnell dmsnell force-pushed the blocks/preserve-source-markup-when-invalid-2 branch from 22cb97e to 602c44f Compare March 11, 2022 20:16
@dmsnell
Copy link
Member Author

dmsnell commented Mar 11, 2022

Updated to __unstableBlockSource (added the "block" because with the prefix it seemed a little more helpful to bring in the additional context).

While doing this I wondered about our usage of the __unstable prefix and realized that this doesn't particularly fit the guidelines for __unstable or for __experimental. It's not meant to be unstable or to disappear, even though it's not ever supposed to be used. It's required because the internal aspects of block preservation are stretched across package boundaries.

We have a few instances of __unsafe and __internal prefixes but there aren't guidelines for their use. I wonder if it would be worth introducing __internalBlockSource here instead, though I also wonder if we try hard to avoid setting an example of naming things __internal. Any thoughts?

Other than that, any objections to this PR? After this one we can start improving the handling of block resolution, JS errors in the block implementations, and invaliding based on innerBlock changes.

@youknowriad
Copy link
Contributor

__internal suffix need came up a few times before yeah and It seems we should document it as another option for APIs not allowed to be used by theme authors. I think we used __unstable before for some of the internal APIs as well but it seems having a dedicated suffix is clearer.

No objections left for me here 👍

dmsnell added 5 commits March 14, 2022 14:27
Part of #38922

When the editor is unable to validate a block it should preserve the
broken content in the post and show an accurate representation of that
underlying markup in the absence of being able to interact with it.

Currently when showing a preview of an invalid block in the editor we
attempt to re-generate the save output for a block given the attributes
we originally parsed. This is a flawed approach, however, because by
the nature of being invalid we know that there is a problem with those
attributes as they are.

In this patch we're introducing the `source` attribute on a block which
only exists for invalid blocks at the time of this patch. That `source`
carries the original un-processed data for a block node and can be used
to reconstruct the original markup without using garbage data and without
inadvertently changing it through the series of autofixes, deprecations,
and the like that happen during normal block loading.

The noticable change is in `block-list/block` where we will be showing
that reconstruction rather than the re-generated block content. Previously
it was the case that the preview might represent a corrupted version of the
block or show the block as if emptied of all its content. Now, however,
the preview sould accurately reflect the HTML in the source post even
when it's invalid or unrecognized according to the editor.

Further work should take advantage of the `source` property to provide
a more consistent and trusting experience for working with unrecognized
content.
Potentially this is a conflict with #38794 and once `isValidBlockContent`
is gone we might be able to remove this change.
@dmsnell dmsnell force-pushed the blocks/preserve-source-markup-when-invalid-2 branch from 14a881a to 0b721f8 Compare March 14, 2022 21:27
@dmsnell dmsnell merged commit 3ea2d42 into trunk Mar 14, 2022
@dmsnell dmsnell deleted the blocks/preserve-source-markup-when-invalid-2 branch March 14, 2022 22:42
@github-actions github-actions bot added this to the Gutenberg 12.9 milestone Mar 14, 2022
dmsnell added a commit that referenced this pull request Mar 17, 2022
Following the work of #38923 where we started preserving the source
content for blocks that the editor fails to recognize, this patch
uses that information in the Code Editor to ensure that we don't
accidentally corrupt or lose content when making edits there.

Previously the `serializeBlock` function, which generates the HTML
for the Code Editor, would try to re-generated the saved content
for blocks that we know are invalid. Because we know the input is
invalid we can guarantee that the output will have even more problems.

Now that `serializeBlock` function is aware of the source content for
a block, and if the block is marked invalid _and_ that source exists
it will pass that along to the output and skip the processing and
re-generation of the saved content. This ensures that errors don't
cascade and that unrelated edits to other blocks from the Code Editor
won't destroy content in the marked-invalid blocks.
@bph bph added the Needs Dev Note Requires a developer note for a major WordPress release cycle label Mar 23, 2022
dmsnell added a commit that referenced this pull request Mar 24, 2022
…lue.

While working on #38923 with invalid blocks it happened that a JavaScript
error surfaced when working with a block that specified an incorrect layout
type. In the case at hand the block asked for the `flerx` alignment instead
of the `flex` alignment.

It's reasonable to expect blocks to call for unrecognized layouts,
especially if a plugin introduces new layouts and then blocks are loaded
without that plugin's support.

When `getLayoutType` is unable to find a recognized layout it returns
`undefined` which causes numerous issues with calling code that expects
a valid layout.

In this patch we're guarding for those cases and if no known layout can
be found for the specified layout type then we return the default layout
instead. This introduces data corruption so it could be that this solution
is more of a coverup than a fix. There is no way however to ignore that
layout and preserve the behavior while still editing the block. On the
other hand a block should not fail validation due to an implementation bug
in the core editor and if this fix truly resolves the issue we should find
invalidated blocks and prevent editing anyway (unfortunately this is not
the case in this patch).
dmsnell added a commit that referenced this pull request Mar 24, 2022
Following the work of #38923 where we started preserving the source
content for blocks that the editor fails to recognize, this patch
uses that information in the Code Editor to ensure that we don't
accidentally corrupt or lose content when making edits there.

Previously the `serializeBlock` function, which generates the HTML
for the Code Editor, would try to re-generated the saved content
for blocks that we know are invalid. Because we know the input is
invalid we can guarantee that the output will have even more problems.

Now that `serializeBlock` function is aware of the source content for
a block, and if the block is marked invalid _and_ that source exists
it will pass that along to the output and skip the processing and
re-generation of the saved content. This ensures that errors don't
cascade and that unrelated edits to other blocks from the Code Editor
won't destroy content in the marked-invalid blocks.
dmsnell added a commit that referenced this pull request Mar 25, 2022
Following the work of #38923 where we started preserving the source
content for blocks that the editor fails to recognize, this patch
uses that information in the Code Editor to ensure that we don't
accidentally corrupt or lose content when making edits there.

Previously the `serializeBlock` function, which generates the HTML
for the Code Editor, would try to re-generated the saved content
for blocks that we know are invalid. Because we know the input is
invalid we can guarantee that the output will have even more problems.

Now that `serializeBlock` function is aware of the source content for
a block, and if the block is marked invalid _and_ that source exists
it will pass that along to the output and skip the processing and
re-generation of the saved content. This ensures that errors don't
cascade and that unrelated edits to other blocks from the Code Editor
won't destroy content in the marked-invalid blocks.
dmsnell added a commit that referenced this pull request Mar 28, 2022
…#39523)

Following the work of #38923 where we started preserving the source
content for blocks that the editor fails to recognize, this patch
uses that information in the Code Editor to ensure that we don't
accidentally corrupt or lose content when making edits there.

Previously the `serializeBlock` function, which generates the HTML
for the Code Editor, would try to re-generated the saved content
for blocks that we know are invalid. Because we know the input is
invalid we can guarantee that the output will have even more problems.

Now that `serializeBlock` function is aware of the source content for
a block, and if the block is marked invalid _and_ that source exists
it will pass that along to the output and skip the processing and
re-generation of the saved content. This ensures that errors don't
cascade and that unrelated edits to other blocks from the Code Editor
won't destroy content in the marked-invalid blocks.
dmsnell added a commit that referenced this pull request Apr 18, 2022
Previously when running "Convert to Blocks" on the invalid-block resolution dialog
we have been converting only a portion of the invalid block due to the way we examine
the `originalContent.`

Since #38923 we have had `__unstableBlockSource` available which tracks the entire
contents of the original block that failed to validate.

In this patch we're using that source information in order to split apart the invalid
block and then separately parse each of its constituent components.

The result of this change is that we're able to preserve more block content when
resolving an invalid block than we were before. For example, supposing we have a broken
container block full of valid inner blocks, we are now able to extract all of those
inner blocks and preserve them whereas before we would lose all block information and
the stack would turn into an empty classic block.
dmsnell added a commit that referenced this pull request May 4, 2022
…lue.

While working on #38923 with invalid blocks it happened that a JavaScript
error surfaced when working with a block that specified an incorrect layout
type. In the case at hand the block asked for the `flerx` alignment instead
of the `flex` alignment.

It's reasonable to expect blocks to call for unrecognized layouts,
especially if a plugin introduces new layouts and then blocks are loaded
without that plugin's support.

When `getLayoutType` is unable to find a recognized layout it returns
`undefined` which causes numerous issues with calling code that expects
a valid layout.

In this patch we're guarding for those cases and if no known layout can
be found for the specified layout type then we return the default layout
instead. This introduces data corruption so it could be that this solution
is more of a coverup than a fix. There is no way however to ignore that
layout and preserve the behavior while still editing the block. On the
other hand a block should not fail validation due to an implementation bug
in the core editor and if this fix truly resolves the issue we should find
invalidated blocks and prevent editing anyway (unfortunately this is not
the case in this patch).
dmsnell added a commit that referenced this pull request Dec 16, 2022
…lue.

While working on #38923 with invalid blocks it happened that a JavaScript
error surfaced when working with a block that specified an incorrect layout
type. In the case at hand the block asked for the `flerx` alignment instead
of the `flex` alignment.

It's reasonable to expect blocks to call for unrecognized layouts,
especially if a plugin introduces new layouts and then blocks are loaded
without that plugin's support.

When `getLayoutType` is unable to find a recognized layout it returns
`undefined` which causes numerous issues with calling code that expects
a valid layout.

In this patch we're guarding for those cases and if no known layout can
be found for the specified layout type then we return the default layout
instead. This introduces data corruption so it could be that this solution
is more of a coverup than a fix. There is no way however to ignore that
layout and preserve the behavior while still editing the block. On the
other hand a block should not fail validation due to an implementation bug
in the core editor and if this fix truly resolves the issue we should find
invalidated blocks and prevent editing anyway (unfortunately this is not
the case in this patch).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Feature] Block API API that allows to express the block paradigm. [Feature] Block Validation/Deprecation Handling block validation to determine accuracy and deprecation Needs Dev Note Requires a developer note for a major WordPress release cycle [Type] New API New API to be used by plugin developers or package users.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants