On Block Invalidation and Transformations #11440

mtias · 2018-11-02T21:55:34Z

One of the goals of Gutenberg was to drastically reduce the cases where users would inadvertently end up with the infamous HTML soup — nested spans and divs with inline styles that could break the layouts or reduce a website's usability — regardless of how the user introduces content (pasting from external sources, writing, creating blocks, installing blocks, etc). This has also been a goal for WordPress since the time it introduced the first WYSIWYG editor: to ensure semantic and clean markup. Balancing this aim while also empowering users to express themselves freely is one of the fundamental challenges of the WordPress publishing platform.

A good example might be the case of including a cite attribution in a Quote block. With Gutenberg, the resulting markup is semantic and the block is also easy to use. There's no need to understand the underlying HTML, yet the semantics are enforced without any manual effort.

One principle that emerges is that blocks are generally opinionated about the markup produced, discarding or invalidating a block if something doesn't fit its requirements.

This is the context from which to look at the levers and tools in place for handling conflicts or block conversion issues.

Invalidation

Invalidation is the process by which a block's source is compared with its output before the user interacts with the block. When this comparison fails, for whatever reason, the block is considered invalid. This has been an extremely useful mechanism during the development process, highlighting issues with blocks, plugin compatibilities, and so on.

It's important to clarify that this is not about whether the markup is "valid" in the sense of being HTML spec-compliant. It is about whether the editor knows how to create such markup: its inability to reproduce an identical result can be a strong indicator of potential data loss, so the invalidation acts as a protective measure.
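As an illustration only (this is not Gutenberg's actual implementation), the comparison described above can be sketched in TypeScript. The `parseQuote`, `saveQuote`, and `isValidQuote` helpers are hypothetical, and the regular expressions stand in for a real HTML parser:

```typescript
// Hypothetical, simplified Quote block: attributes are read from the saved
// markup, and saveQuote() regenerates markup from those attributes.
interface QuoteAttributes {
  value: string;
  citation: string;
}

// Reads only what the block knows about; anything else is silently ignored.
function parseQuote(source: string): QuoteAttributes {
  const value = /<p>(.*?)<\/p>/.exec(source);
  const citation = /<cite>(.*?)<\/cite>/.exec(source);
  return {
    value: value?.[1] ?? "",
    citation: citation?.[1] ?? "",
  };
}

function saveQuote(attrs: QuoteAttributes): string {
  const cite = attrs.citation !== "" ? `<cite>${attrs.citation}</cite>` : "";
  return `<blockquote><p>${attrs.value}</p>${cite}</blockquote>`;
}

// The block is valid only when re-saving the parsed attributes reproduces
// the source exactly. A mismatch means something would be lost on save.
function isValidQuote(source: string): boolean {
  return saveQuote(parseQuote(source)) === source;
}
```

In this sketch, `<blockquote class="fancy"><p>Hi</p></blockquote>` is perfectly good HTML but still fails the check, because the hypothetical block has no way to regenerate the `class` attribute.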

It goes without saying that the general expectation for the user experience is that invalidation doesn't happen and that, when it does, it requires as little user intervention as possible. When an invalidation does occur, however, there are a few cases that need to be separated:

Things that should not invalidate a block in the first place (HTML attributes like classes, ids, or even data-attributes), even if they were to be discarded after a save cycle.
Things that cannot be reconciled and need a decision (like adding an entirely new paragraph between a figure and img tags within an Image block) given the potential for losing content.
The user experience of handling conflicts.

The invalidation process can also be deconstructed in phases:

Validate the block exists.
Validate source matches output.
Validate source matches deprecated outputs.
Validate significance of differences.

These are stacked in a way that favors performance and optimizes for the majority of cases. That is to say, the evaluation logic can become more sophisticated the further down the process it goes. The first few checks have to be extremely efficient since they will be run for all valid blocks. However, once a block is detected as invalid (failing the first three steps), it is alright to spend more time determining validity before falling back to the user's decision.
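The four phases can be read as a short-circuiting pipeline, cheapest check first. The sketch below is an assumption about how such a pipeline might look, not the editor's real code; `BlockType`, `Verdict`, `validateBlock`, and the `demo/paragraph` block are hypothetical names:

```typescript
type BlockAttrs = Record<string, string>;
type Save = (attrs: BlockAttrs) => string;

interface BlockType {
  save: Save;         // output of the current version
  deprecated: Save[]; // outputs of earlier versions
}

type Verdict =
  | "missing"
  | "valid"
  | "valid-deprecated"
  | "valid-insignificant"
  | "invalid";

function validateBlock(
  registry: Map<string, BlockType>,
  name: string,
  attrs: BlockAttrs,
  source: string,
  isInsignificant: (source: string, output: string) => boolean
): Verdict {
  // Phase 1: the block type must exist.
  const type = registry.get(name);
  if (type === undefined) return "missing";
  // Phase 2: cheap exact comparison, run for every block.
  const output = type.save(attrs);
  if (output === source) return "valid";
  // Phase 3: try the outputs of older versions.
  if (type.deprecated.some((save) => save(attrs) === source)) {
    return "valid-deprecated";
  }
  // Phase 4: only now pay for a more expensive significance check.
  if (isInsignificant(source, output)) return "valid-insignificant";
  return "invalid";
}

// Toy registry and a toy significance rule (outer whitespace is ignored).
const registry = new Map<string, BlockType>([
  [
    "demo/paragraph",
    {
      save: (a) => `<p>${a["content"] ?? ""}</p>`,
      deprecated: [(a) => `<p class="old">${a["content"] ?? ""}</p>`],
    },
  ],
]);
const ignoreOuterWhitespace = (source: string, output: string): boolean =>
  source.trim() === output.trim();
```

Only blocks that fail phases 2 and 3 ever reach the `isInsignificant` callback, which is why that callback can afford to be slow.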

Validate significance of differences

This is the area that could use improvement going forward. Most of the currently reported issues come from differences that should not be significant yet produce an invalidation. There are generally two ways to approach this:

Revise block saving to allow for these differences (like HTML attributes).
Overwrite differences that fall under a threshold as insignificant.
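The second approach could look roughly like the sketch below, which treats `class`, `id`, and `data-*` attributes as insignificant. That choice of attributes, the `IGNORED` pattern, and both function names are assumptions made for illustration; a real implementation would compare parsed trees instead of using a regular expression:

```typescript
// Attributes considered insignificant in this sketch (an assumption).
// The pattern is naive: it would also match look-alike text outside tags.
const IGNORED = /\s+(?:class|id|data-[\w-]+)="[^"]*"/g;

function stripInsignificant(html: string): string {
  return html.replace(IGNORED, "");
}

// Two strings differ significantly only if they still differ once the
// ignored attributes have been removed from both.
function differsSignificantly(source: string, output: string): boolean {
  return stripInsignificant(source) !== stripInsignificant(output);
}
```

With this rule, `<p class="lead" data-x="1">Hi</p>` and `<p>Hi</p>` are treated as equivalent, while an added paragraph or a changed `href` still counts as a significant difference.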

Related issues:

There are also intricacies that surface once blocks are extended.

Transformations

Another case of data transformation is present in the mechanism for switching a block to another block type. Transforming a block into another block can be destructive, depending on the heuristics established by the two blocks, the source and the destination. Block transformations also come in two shapes:

Registering a transformation for a block into another.
Using raw-handling / pasting for conversion.

The first case knows about the block's attributes and is the one used in the main block transformation menu. It allows the most knowledge transfer when mapping attributes. Issues in this conversion should be assigned to the individual blocks responsible for it (for example, mapping a quote's cite to a plain paragraph).
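The quote-to-paragraph case can be sketched as an attribute mapping. This is a simplified stand-in, not the registered transform that ships with core; `QuoteAttrs`, `ParagraphAttrs`, and `quoteToParagraphs` are hypothetical:

```typescript
interface QuoteAttrs {
  value: string;
  citation: string;
}

interface ParagraphAttrs {
  content: string;
}

// Because the transform knows the source block's attributes, it can decide
// what happens to the citation instead of silently dropping it. Here the
// citation becomes a second paragraph.
function quoteToParagraphs(quote: QuoteAttrs): ParagraphAttrs[] {
  const paragraphs: ParagraphAttrs[] = [{ content: quote.value }];
  if (quote.citation.trim() !== "") {
    paragraphs.push({ content: quote.citation });
  }
  return paragraphs;
}
```

Whether the citation should be kept as its own paragraph, merged into the text, or discarded is exactly the kind of decision that belongs to the block defining the transform.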

The second case is used for extracting blocks out of a Classic block, or converting an HTML block into validated core blocks.

This process is built on the same handlers used for pasting, which is why it generally removes elements as part of its cleanup process (#6102). The intention behind pasting is to clean up the source without losing meaningful information. However, given an existing chunk of Classic content, the editor could arguably be more lenient in the conversion. One way of handling this is to separate the two operations, pasting and raw conversion: #6878. Another possibility is to alert the user when something is removed.
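Both ideas (separate rules for paste and raw conversion, and reporting what was removed) can be combined in one small sketch. The allowlists below are invented for illustration and do not reflect the editor's real rules; `cleanMarkup`, `Mode`, and `CleanResult` are hypothetical names:

```typescript
type Mode = "paste" | "raw";

// Hypothetical allowlists: raw conversion is more lenient than paste.
const ALLOWED: Record<Mode, ReadonlySet<string>> = {
  paste: new Set(["p", "a", "strong", "em"]),
  raw: new Set(["p", "a", "strong", "em", "span", "div", "sup", "sub"]),
};

interface CleanResult {
  html: string;
  removed: string[]; // tag names that were stripped, for warning the user
}

// Unwraps every tag that is not allowed in the given mode, keeping its text
// content, and records which tag names were dropped.
function cleanMarkup(html: string, mode: Mode): CleanResult {
  const removed: string[] = [];
  const cleaned = html.replace(
    /<\/?([a-z][a-z0-9]*)\b[^>]*>/gi,
    (tag: string, name: string): string => {
      const lower = name.toLowerCase();
      if (ALLOWED[mode].has(lower)) return tag;
      if (removed.indexOf(lower) === -1) removed.push(lower);
      return "";
    }
  );
  return { html: cleaned, removed };
}
```

A caller could show a notice whenever `removed` is non-empty, so that a conversion never discards markup silently.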

Related issues:

Potential Tasks

The aim of this issue is to provide enough context for all these related problems so that any improvements can be discussed holistically. Some examples:

Capture unexpected top-level attributes and re-apply them without causing an invalidation.
Distinguish between pasting and raw conversion so that different elements can be preserved.
Use a visual diff check after a source invalidation.
Improve the user experience of handling conflicts: Block Validation, Deprecation and Migration Experience #7604
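The first task in the list might work along these lines: read the attributes on the top-level element of the source, and re-apply any that the block's own output lacks. This is a sketch under simplifying assumptions (double-quoted attribute values only, regular expressions in place of a DOM parser); `readTopLevelAttrs` and `reapplyExtras` are hypothetical helpers:

```typescript
// Collects name/value pairs from the first opening tag of an HTML string.
function readTopLevelAttrs(html: string): Map<string, string> {
  const attrs = new Map<string, string>();
  const open = /^<[a-z][a-z0-9]*((?:\s+[\w-]+="[^"]*")*)\s*>/i.exec(html);
  if (open === null) return attrs;
  const list = open[1] ?? "";
  const pair = /([\w-]+)="([^"]*)"/g;
  let match: RegExpExecArray | null = pair.exec(list);
  while (match !== null) {
    attrs.set(match[1] ?? "", match[2] ?? "");
    match = pair.exec(list);
  }
  return attrs;
}

// Copies attributes found on the source's top-level element, but missing
// from the regenerated output, back onto the output.
function reapplyExtras(source: string, output: string): string {
  const wanted = readTopLevelAttrs(source);
  const present = readTopLevelAttrs(output);
  const extras: string[] = [];
  wanted.forEach((value, name) => {
    if (!present.has(name)) extras.push(` ${name}="${value}"`);
  });
  if (extras.length === 0) return output;
  return output.replace(
    /^(<[^>]*?)(\s*>)/,
    (_all: string, head: string, tail: string): string =>
      head + extras.join("") + tail
  );
}
```

If the patched output then matches the source, the block could be accepted without prompting the user.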


hos-shams · 2018-11-05T02:47:56Z

The invalidation process is a real problem for block/theme developers who keep updating their blocks over time, either to fix issues or to add new features. We have to create lots of entries in the deprecated array in this case.

Please provide a skipValidation setting for blocks so that the HTML is generated entirely from the attribute values. There's a discussion around this in #10444.
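To make the cost concrete, here is a simplified model of how deprecated entries accumulate: every past version of the markup needs its own `save` plus a `migrate` step. The shapes are loosely inspired by the block `deprecated` array but are not the real API; `ServiceBlock`, `Deprecation`, and `upgrade` are hypothetical:

```typescript
type ServiceAttrs = Record<string, string>;

interface Deprecation {
  save: (attrs: ServiceAttrs) => string;          // markup of an old version
  migrate: (attrs: ServiceAttrs) => ServiceAttrs; // bring attrs up to date
}

interface ServiceBlock {
  save: (attrs: ServiceAttrs) => string;
  deprecated: Deprecation[];
}

const serviceBlock: ServiceBlock = {
  // Current version: title plus description.
  save: (a) => `<h3>${a["title"] ?? ""}</h3><p>${a["description"] ?? ""}</p>`,
  deprecated: [
    {
      // Version 1 only had a title.
      save: (a) => `<h3>${a["title"] ?? ""}</h3>`,
      migrate: (a) => ({ ...a, description: "" }),
    },
  ],
};

// Returns up-to-date markup, or null when no known version reproduces the
// source (the block would then be flagged as invalid).
function upgrade(
  block: ServiceBlock,
  attrs: ServiceAttrs,
  source: string
): string | null {
  if (block.save(attrs) === source) return source;
  for (const old of block.deprecated) {
    if (old.save(attrs) === source) return block.save(old.migrate(attrs));
  }
  return null;
}
```

Each change to the markup adds one more entry to `deprecated`, which is the maintenance burden described above. A setting like the requested skipValidation would, in this model, amount to always returning `block.save(attrs)` without comparing it to the source.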

danielbachhuber · 2018-11-06T02:29:35Z

From today's Slack thread, I think there are a few specific tasks that would help move this issue forward:

Allow a limited set of attributes (based on HTML5 spec) for HTML elements in all Blocks that are only editable through Code Editor (i.e. no UI for them). These should also persist through the save cycle.
- Force a download inside a link #11471 is another more recent example.
~~Allow a limited set of inline HTML elements in RichText that don't have UI for enabling/disabling. Same deal: only editable through code.~~
- As it turns out, this is already possible. The remaining issue is Classic + Custom HTML blocks: Convert to Blocks removes valid inline formatting #6102
Persist classes in Raw block conversion.
In all scenarios except Paste, warn the end user if Gutenberg detects that some of their data may be stripped in the conversion or save process. Gutenberg's current behavior is too silent for data that has semantic value.

susanpaigen · 2018-11-16T01:00:12Z

Comments/requests from an end user -
@mtias Thank you for the overview. It helped me understand the concepts here.
@danielbachhuber Thanks for the reference from #11539 and the list above, which I think would resolve at least some of the issues I'm worried about.

"Allow a limited set of attributes ... in all Blocks."
Why limit/exclude valid attributes in the interests of "ensure semantic and clean markup"? Many of these affect functionality. And please don't forget the millions (billions?) of instances of 'style = "text-align:right;"' that have been written by the WP visual editor.
"Persist classes in Raw block conversion."
Yes please! And maybe this is self-evident, but this would apply to both classes on the Block level element (p, h, ul, table, etc) and the contents (li, span, tr, td, etc)? For me, the whole point of classes is "clean markup".
"In all scenarios except Paste, warn the end user if Gutenberg detects that some of their data may be stripped in the conversion or save process."
Yes please! I think there also has to be a way to paste without losing markup, and at least the option of getting a warning if anything will get lost. Possibilities could be a user setting; paste into visual view vs. paste into html view; a button in the editing bar (sort of the reverse of "Paste as text"). Paste into visual view (clean) vs. paste into html view (keep) seems straightforward for the user.

Another issue is the fate of the <div>. I know they're often misused, but on our website we use them a lot, almost always for one of the following two reasons:

To contain floats. The most common situation is a floated (L or R) image with some text which wraps or not depending on the copy length and window width.
As the container tags for layout elements from a framework like Bootstrap or UIkit. Examples: columns, accordions, tabs.
It would be wonderful if there was an option for Convert to Blocks to put divs into HTML blocks with their content intact. Testing today with the Gutenberg plugin, choosing 'Convert to Blocks' for a Classic block with div sections completely deletes the divs and all their content. The same is true if I replace
with
.

reneedobbs · 2019-01-17T20:59:47Z

Valid html tags should not be stripped when converting from Classic to Blocks.

Specifically for the image block: title and data-pin-description. Stripping those two attributes means WordPress is deleting content. Valuable, valid content. People and businesses have spent lots of time, effort, and resources to have those attributes on their images.

fklein-lu · 2019-01-23T11:19:58Z

[Invalidation] [is] the process with which a block source is compared with its output before the user interacts with a block. When this fails, for whatever reason, the block is considered invalid. This has been an extremely useful mechanism during the development process, highlighting issues with blocks, plugin compatibilities, and so on.

In my view, the current block invalidation does not work for any type of block beyond the extremely basic, completely static blocks that ship with Core, as outlined in #12708 (comment).

Imagine a client comes to you “Hey I want to present our services on our website, you know just a title and an image”.
Me: “Alright, here’s your block.”

Two weeks later
Client: “A description for our services would be good. Can we add a text to the block?”
Me: 😐 “Okay…” (writes deprecation).

Another two weeks later
Client: “Can we add a link to the services block? You know to link to detail pages?”
Me: 😕 “Okay…” (writes another deprecation).

Yet another two weeks later
Client: “We want to add a contact person to the services block. A photo, name, and phone number. Can you do that?”
Me: 😖 “Sure, will take a bit.“ (Installs the Classic Editor plugin, and dusts off the good old pagebuilder).

In addition I doubt the usefulness of the current validation during development, see #10444.

youknowriad · 2019-02-14T10:44:44Z

Closing as a duplicate of #7604

mtias added the [Feature] Parsing, [Feature] Paste, and [Feature] Block Transforms labels Nov 2, 2018

danielbachhuber mentioned this issue Nov 5, 2018: Filterable behavior for Image Block raw transformations #8473 (Open)

mtias added the [Type] Overview label Nov 6, 2018

danielbachhuber mentioned this issue Nov 7, 2018: Separate Paste Handler #11539 (Merged)

chriscct7 mentioned this issue Nov 7, 2018: No way to extend the link popup overlay #11599 (Open)

mcsf mentioned this issue Nov 9, 2018: Quote block: Using cite elements in quoted content causes block to become invalid. #11652 (Closed)

mtias mentioned this issue Nov 20, 2018: Better handling for plugin modification of core blocks #10204 (Open)

designsimply mentioned this issue Feb 12, 2019: Block validation failed error in console, but block working in editor... #13021 (Closed)

youknowriad closed this as completed Feb 14, 2019
