On Block Invalidation and Transformations #11440

Closed
mtias opened this issue Nov 2, 2018 · 6 comments
Labels
  • [Feature] Block Transforms: Block transforms from one block to another
  • [Feature] Parsing: Related to efforts to improving the parsing of a string of data and converting it into a different f
  • [Feature] Paste
  • [Type] Overview: Comprehensive, high level view of an area of focus often with multiple tracking issues

@mtias
Member

mtias commented Nov 2, 2018

One of the goals of Gutenberg was to drastically reduce the cases where users would inadvertently end up with the infamous HTML soup — nested spans and divs with inline styles that could break the layouts or reduce a website's usability — regardless of how the user introduces content (pasting from external sources, writing, creating blocks, installing blocks, etc). This has also been a goal for WordPress since the time it introduced the first WYSIWYG editor: to ensure semantic and clean markup. Balancing this aim while also empowering users to express themselves freely is one of the fundamental challenges of the WordPress publishing platform.

A good example might be the case of including a cite attribution in a Quote block. With Gutenberg, the resulting markup is semantic and the block is also easy to use. There's no need to understand the underlying HTML, yet it is semantically enforced without manual requirements.

One principle that emerges is that blocks are generally opinionated about the markup produced, discarding or invalidating a block if something doesn't fit its requirements.

This is the context from which to look at the levers and tools in place for handling conflicts or block conversion issues.

Invalidation

Invalidation is the process by which a block's source is compared with its output before the user interacts with the block. When this comparison fails, for whatever reason, the block is considered invalid. This has been an extremely useful mechanism during development, highlighting issues with blocks, plugin compatibility, and so on.

It's important to clarify that this is not about whether the markup is "valid" in the sense of being HTML spec-compliant. It is about whether the editor knows how to create such markup: its inability to reproduce an identical result can be a strong indicator of potential data loss, so the invalidation acts as a protective measure.

It goes without saying that the general expectation for the user experience is that invalidation doesn't happen and, when it does, that the amount of user intervention needed is minimal. When an invalidation does occur, there are a few cases that need to be separated:

  • Things that should not invalidate a block in the first place (HTML attributes like classes, ids, or even data-attributes), even if they were to be discarded after a save cycle.
  • Things that cannot be reconciled and need a decision (like adding an entirely new paragraph between a figure and img tags within an Image block) given the potential for losing content.
  • The user experience of handling conflicts.

The invalidation process can also be deconstructed in phases:

  1. Validate the block exists.
  2. Validate source matches output.
  3. Validate source matches deprecated outputs.
  4. Validate significance of differences.

These are stacked in a way that favors performance and optimizes for the majority of cases. That is to say, the evaluation logic can become more sophisticated the further down the process it goes. The first few checks have to be extremely efficient, since they run for every block, including all valid ones. However, once a block is detected as invalid (failing the first three steps), it is acceptable to spend more time determining validity before falling back to the user's decision.
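
The stacking of these phases can be sketched as a short-circuiting pipeline. This is only an illustration under stated assumptions, not Gutenberg's actual implementation: `BlockDefinition`, `validateBlock`, and `isInsignificantDifference` are hypothetical names, and the whitespace-only comparison in the last phase is an assumed stand-in for a real significance check.

```typescript
// Hypothetical types for illustration; this is not the real @wordpress/blocks API.
type Attributes = { [key: string]: string };

interface BlockDefinition {
  // Current serializer: attributes -> markup.
  save: (attrs: Attributes) => string;
  // Older serializers (the "deprecated outputs").
  deprecated: Array<(attrs: Attributes) => string>;
}

type ValidationResult =
  | 'unknown-block'
  | 'valid'
  | 'valid-deprecated'
  | 'valid-insignificant'
  | 'invalid';

// Assumed stand-in for phase 4: whitespace-only differences are insignificant.
function isInsignificantDifference(source: string, output: string): boolean {
  const normalize = (s: string): string => s.replace(/\s+/g, ' ').trim();
  return normalize(source) === normalize(output);
}

function validateBlock(
  registry: { [name: string]: BlockDefinition },
  name: string,
  attrs: Attributes,
  source: string
): ValidationResult {
  // Phase 1: the block type must exist.
  const definition = registry[name];
  if (definition === undefined) {
    return 'unknown-block';
  }
  // Phase 2: cheap exact comparison; this is the path every valid block takes.
  const output = definition.save(attrs);
  if (output === source) {
    return 'valid';
  }
  // Phase 3: compare against each deprecated output.
  for (const oldSave of definition.deprecated) {
    if (oldSave(attrs) === source) {
      return 'valid-deprecated';
    }
  }
  // Phase 4: the costlier significance check only runs for blocks that failed above.
  if (isInsignificantDifference(source, output)) {
    return 'valid-insignificant';
  }
  return 'invalid';
}

// Usage with a made-up block whose markup changed from <h3> to <h2>.
const demoRegistry: { [name: string]: BlockDefinition } = {
  'demo/title': {
    save: (attrs) => '<h2>' + attrs['title'] + '</h2>',
    deprecated: [(attrs) => '<h3>' + attrs['title'] + '</h3>'],
  },
};
const demoAttrs: Attributes = { title: 'Hi' };

console.log(validateBlock(demoRegistry, 'demo/title', demoAttrs, '<h2>Hi</h2>'));
console.log(validateBlock(demoRegistry, 'demo/title', demoAttrs, '<h3>Hi</h3>'));
console.log(validateBlock(demoRegistry, 'demo/title', demoAttrs, '<h2>Hi</h2><p>extra</p>'));
```

Only a block that falls through the first three returns reaches the significance check, which is why that last step can afford to be slower.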

Validate significance of differences

This is the area that could use improvement going forward. Most of the currently reported issues come from differences that should not be significant yet still produce an invalidation. There are generally two ways to approach this:

  • Revise block saving to allow for these differences (like HTML attributes).
  • Overwrite differences that fall under a threshold as insignificant.
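
As a rough illustration of the second approach, the sketch below compares two already-parsed elements and reports a difference as significant only when it involves something outside an ignore list. The `ParsedElement` shape, the `INSIGNIFICANT_ATTRIBUTES` list, and `hasSignificantDifference` are all assumptions made for this example; a real comparison would work on the full markup tree and would need a more careful definition of the threshold.

```typescript
// Hypothetical, simplified view of one parsed element.
interface ParsedElement {
  tag: string;
  attributes: { [name: string]: string };
  text: string;
}

// Assumed ignore list, taken from the examples above: classes, ids, data-attributes.
const INSIGNIFICANT_ATTRIBUTES: string[] = ['class', 'id'];

function isInsignificantAttribute(name: string): boolean {
  return INSIGNIFICANT_ATTRIBUTES.indexOf(name) !== -1 || name.indexOf('data-') === 0;
}

// Keep only the attributes that should count toward validity.
function significantAttributes(el: ParsedElement): { [name: string]: string } {
  const kept: { [name: string]: string } = {};
  for (const name of Object.keys(el.attributes)) {
    if (!isInsignificantAttribute(name)) {
      kept[name] = el.attributes[name];
    }
  }
  return kept;
}

function hasSignificantDifference(source: ParsedElement, output: ParsedElement): boolean {
  // A different tag or different text is always treated as significant.
  if (source.tag !== output.tag || source.text !== output.text) {
    return true;
  }
  const a = significantAttributes(source);
  const b = significantAttributes(output);
  const names = Object.keys(a);
  if (names.length !== Object.keys(b).length) {
    return true;
  }
  for (const name of names) {
    if (a[name] !== b[name]) {
      return true;
    }
  }
  return false;
}

// Usage: extra class and data-attribute in the source, same content otherwise.
const sourceEl: ParsedElement = { tag: 'p', attributes: { class: 'lead', 'data-track': '1' }, text: 'Hello' };
const outputEl: ParsedElement = { tag: 'p', attributes: {}, text: 'Hello' };
const editedEl: ParsedElement = { tag: 'p', attributes: {}, text: 'Hello world' };

console.log(hasSignificantDifference(sourceEl, outputEl)); // false
console.log(hasSignificantDifference(sourceEl, editedEl)); // true
```

Note that ignoring an attribute during comparison does not preserve it: unless saving is also revised, the class and data-attribute in this example would still be discarded on the next save cycle.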

Related issues:

There are also intricacies that surface once blocks are extended.

Transformations

Another case of data transformation is present in the mechanism for switching a block to another block type. Transforming a block into another block can be destructive, depending on the heuristics established by the two blocks, the source and the destination. Block transformations also come in two shapes:

  • Registering a transformation for a block into another.
  • Using raw-handling / pasting for conversion.

The first case knows about the block's attributes and is the one used in the main block transformation menu. It allows the most knowledge transfer in the mapping of attributes. Issues in this conversion should be assigned to the individual blocks responsible for it (for example, mapping a quote's cite to a plain paragraph).
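
To make the attribute mapping concrete, here is a standalone sketch of a registered transformation from a quote to a paragraph. It deliberately does not use Gutenberg's transforms API: `Block`, `Transform`, `transformBlock`, the `demo/*` block names, and the choice to fold the citation into the paragraph text are all hypothetical and only illustrate where a lossy or lossless mapping gets decided.

```typescript
// Hypothetical shapes; real blocks carry richer attribute types.
type BlockAttributes = { [key: string]: string };

interface Block {
  name: string;
  attributes: BlockAttributes;
}

interface Transform {
  from: string;
  to: string;
  // The mapping is owned by the block that registers the transform.
  convert: (attributes: BlockAttributes) => BlockAttributes;
}

const quoteToParagraph: Transform = {
  from: 'demo/quote',
  to: 'demo/paragraph',
  // Assumed mapping: append the citation to the text so it is not silently dropped.
  convert: (attributes) => {
    const value = attributes['value'] || '';
    const citation = attributes['citation'] || '';
    return { content: citation ? value + ' (' + citation + ')' : value };
  },
};

function transformBlock(block: Block, transform: Transform): Block {
  if (block.name !== transform.from) {
    throw new Error('Transform does not apply to ' + block.name);
  }
  return { name: transform.to, attributes: transform.convert(block.attributes) };
}

// Usage
const demoQuote: Block = {
  name: 'demo/quote',
  attributes: { value: 'Less is more.', citation: 'A. Author' },
};
const demoParagraph = transformBlock(demoQuote, quoteToParagraph);
console.log(demoParagraph.name);                  // demo/paragraph
console.log(demoParagraph.attributes['content']); // Less is more. (A. Author)
```

A `convert` function that returned only `value` would be the destructive variant of the same transform, which is why issues of this kind belong to the block that defines the mapping.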

The second case is used for extracting blocks out of a Classic block, or converting an HTML block into validated core blocks.

This process is built on the same handlers as pasting, which is why it generally removes elements as part of its cleanup process (#6102). The intention behind pasting is to clean up the source without losing meaningful information. However, given an existing chunk of Classic content, the editor could arguably be more lenient in the conversion. One way of handling this is to separate the two operations, pasting and raw conversion: #6878. Another possibility is to alert the user when something is removed.

Related issues:

Potential Tasks

The aim of this issue is to provide enough context for all these related problems so that any improvements can be discussed holistically. Some examples:

  • Capture unexpected top-level attributes and re-apply them without causing an invalidation.
  • Distinguish between pasting and raw conversion so that different elements can be preserved.
  • Use a visual diff check after a source invalidation.
  • Improve the user experience of handling conflicts: Block Validation, Deprecation and Migration Experience #7604
@mtias mtias added [Feature] Parsing Related to efforts to improving the parsing of a string of data and converting it into a different f [Feature] Paste [Feature] Block Transforms Block transforms from one block to another labels Nov 2, 2018
@hos-shams

The invalidation process is a real problem for block and theme developers who keep updating their blocks over time, whether to fix issues or to add new features. In this case we have to create lots of entries inside the deprecated array.

Please provide a skipValidation setting for blocks so that the HTML is generated entirely from the attribute values. There's a discussion around this issue in #10444.

@danielbachhuber
Member

From today's Slack thread, I think there are a few specific tasks that would help move this issue forward:

  1. Allow a limited set of attributes (based on HTML5 spec) for HTML elements in all Blocks that are only editable through Code Editor (i.e. no UI for them). These should also persist through the save cycle.
  2. Allow a limited set of inline HTML elements in RichText that don't have UI for enabling/disabling. Same deal: only editable through code.
  3. Persist classes in Raw block conversion.
  4. In all scenarios except Paste, warn the end user if Gutenberg detects that some of their data may be stripped in the conversion or save process. Gutenberg's current behavior is too silent for data that has semantic value.
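
Tasks 1 and 4 could be combined roughly as follows: keep attributes that are on an allowlist, record anything that gets stripped, and only flag a warning outside of paste. This is a sketch under assumptions, not a proposal for the actual API: `sanitizeAttributes`, `SanitizeResult`, and the contents of `ALLOWED_ATTRIBUTES` are hypothetical, and the real list would need to be derived from the HTML5 spec as described in task 1.

```typescript
interface SanitizeResult {
  kept: { [name: string]: string };
  stripped: string[];
  // True when the user should be told that data was removed.
  shouldWarn: boolean;
}

// Assumed allowlist for illustration only.
const ALLOWED_ATTRIBUTES: string[] = ['class', 'id', 'title', 'lang', 'dir'];

function isAllowedAttribute(name: string): boolean {
  return ALLOWED_ATTRIBUTES.indexOf(name) !== -1 || name.indexOf('data-') === 0;
}

function sanitizeAttributes(
  attributes: { [name: string]: string },
  context: 'paste' | 'conversion' | 'save'
): SanitizeResult {
  const kept: { [name: string]: string } = {};
  const stripped: string[] = [];
  for (const name of Object.keys(attributes)) {
    if (isAllowedAttribute(name)) {
      kept[name] = attributes[name];
    } else {
      stripped.push(name);
    }
  }
  // Paste stays silent; conversion and save surface what was lost.
  return { kept, stripped, shouldWarn: context !== 'paste' && stripped.length > 0 };
}

// Usage
const demoInput = { class: 'hero', onclick: 'track()', 'data-pin-description': 'Pin me' };
const converted = sanitizeAttributes(demoInput, 'conversion');
console.log(converted.stripped);   // [ 'onclick' ]
console.log(converted.shouldWarn); // true
console.log(sanitizeAttributes(demoInput, 'paste').shouldWarn); // false
```

Returning the stripped names rather than just a boolean leaves room for the interface to say exactly what was removed.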

@mtias mtias added the [Type] Overview Comprehensive, high level view of an area of focus often with multiple tracking issues label Nov 6, 2018
@susanpaigen

Comments/requests from an end user -
@mtias Thank you for the overview. It helped me understand the concepts here.
@danielbachhuber Thanks for the reference from #11539 and the list above, which I think would resolve at least some of the issues I'm worried about.

  1. "Allow a limited set of attributes ... in all Blocks."
    Why limit/exclude valid attributes in the interests of "ensure semantic and clean markup"? Many of these affect functionality. And please don't forget the millions (billions?) of instances of 'style = "text-align:right;"' that have been written by the WP visual editor.
  2. "Persist classes in Raw block conversion."
    Yes please! And maybe this is self-evident, but this would apply to both classes on the Block level element (p, h, ul, table, etc) and the contents (li, span, tr, td, etc)? For me, the whole point of classes is "clean markup".
  3. "In all scenarios except Paste, warn the end user if Gutenberg detects that some of their data may be stripped in the conversion or save process."
    Yes please! I think there also has to be a way to paste without losing markup, and at least the option of getting a warning if anything will get lost. Possibilities could be a user setting; paste into visual view vs. paste into html view; a button in the editing bar (sort of the reverse of "Paste as text"). Paste into visual view (clean) vs. paste into html view (keep) seems straightforward for the user.

Another issue is the fate of the <div>. I know they're often misused, but on our website we use them a lot, almost always for one of the following two reasons:

  1. To contain floats. The most common situation is a floated (L or R) image with some text which wraps or not depending on the copy length and window width.
  2. As the container tags for layout elements from a framework like Bootstrap or UIkit. Examples: columns, accordions, tabs.
    It would be wonderful if there was an option for Convert to Blocks to put divs into HTML blocks with their content intact. Testing today with the Gutenberg plugin, choosing 'Convert to Blocks' for a Classic block with div sections completely deletes the divs and all their content. The same is true if I replace
    with
    .

@reneedobbs

Valid html tags should not be stripped when converting from Classic to Blocks.

Specifically for the image block: title and data-pin-description. Stripping those two attributes means WordPress is deleting content. Valuable, valid content. People and businesses have spent lots of time, effort, and resources to have those attributes on their images.

@fklein-lu
Contributor

[Invalidation] [is] the process with which a block source is compared with its output before the user interacts with a block. When this fails, for whatever reason, the block is considered invalid. This has been an extremely useful mechanism during the development process, highlighting issues with blocks, plugin compatibilities, and so on.

In my view, the current block invalidation does not work for any type of block beyond the extremely basic, completely static blocks that ship with Core, as outlined in #12708 (comment).

Imagine a client comes to you “Hey I want to present our services on our website, you know just a title and an image”.
Me: “Alright, here’s your block.”

Two weeks later
Client: “A description for our services would be good. Can we add a text to the block?”
Me: 😐 “Okay…” (writes deprecation).

Another two weeks later
Client: “Can we add a link to the services block? You know to link to detail pages?”
Me: 😕 “Okay…” (writes another deprecation).

Yet another two weeks later
Client: “We want to add a contact person to the services block. A photo, name, and phone number. Can you do that?”
Me: 😖 “Sure, will take a bit.“ (Installs the Classic Editor plugin, and dusts off the good old pagebuilder).

In addition I doubt the usefulness of the current validation during development, see #10444.

@youknowriad
Contributor

Closing as a duplicate of #7604
