Parser: Client-side wpautop #4005

aduth · 2017-12-14T15:39:05Z

This pull request seeks to explore solutions for migrating classic content transparently, ~~while also considering paragraph block serialization omitting comment demarcations~~. In #3745, it was found that the server serialization behavior was insufficient to respect nested blocks. This was originally implemented to selectively apply wpautop to non-block content. The approach here instead assumes this to occur within Gutenberg when loading a post with non-block content. Simultaneously it adds parser awareness of the paragraph tag to upgrade to a paragraph block.

An advantage of implementing wpautop upgrading in the Gutenberg editor is that it allows the the_content filter for automatic paragraphs to be skipped altogether for posts containing blocks (already exists) and eliminates the need for server processing to this effect on save.

Caveats:

There are a few uncertainties with the changes proposed here:

We don't want to call on the wp.oldEditor.autop global for applying autop, not just because it's a global, but because it is not functional in test environments. Instead, we should consider extracting these utilities (autop, removep) to a separate @wordpress/autop package. Edit: Updated to use the new @wordpress/autop package
To this point, I had started the work here, and found in porting the PHP wpautop tests to JavaScript that there are many differences between the PHP and JavaScript implementations. We should seek to improve this. Edit: Accounted for in the new @wordpress/autop package
- Related: https://core.trac.wordpress.org/ticket/42531#comment:23
The undemarcated paragraph block is certainly less verbose, but I worry that this is a post-grammar-parse step to convert a freeform string including paragraph tags to a segmented set of freeform and paragraph blocks. This would also mean the server parser would not detect undemarcated paragraphs as paragraph blocks, even if within the Gutenberg editor they were treated as such. Edit: No longer pursuing the undemarcated paragraphs as part of the changes here

Testing instructions:

Verify that saving a post in the classic editor with removep'd content can be loaded into Gutenberg respecting paragraph separations. Example:

classic text

extra p

<div>foo</div>

more p

~~This text, when loaded in Gutenberg, should produce the following sequence: 2 paragraph blocks, 1 freeform block (the div), and a last paragraph block. Saving the post should save with <p> tags.~~ Edit: No longer pursuing the undemarcated paragraphs as part of the changes here

~~Applying background color or alignment to a paragraph should apply comment demarcations, since these attributes are serialized in comment syntax.~~Edit: No longer pursuing the undemarcated paragraphs as part of the changes here

Ensure that this content loads into the Gutenberg editor with paragraph blocks applied, and that even after inserting new block content and saving the post that it appears on the front-end with paragraphs still existing.

aduth · 2018-01-04T14:15:32Z

cc @dmsnell re: The note/worry about parsing undemarcated paragraphs outside of the grammar itself. I might plan to split this out into a separate pull request all the same.

aduth · 2018-01-04T21:35:55Z

Saving commit references planned to be dropped in rebase:

aduth · 2018-01-05T14:49:47Z

This is now ready for a proper review. I've updated the intent of the pull request to work exclusively toward porting wpautop reapplication into the client. Work toward undemarcated paragraphs and fragmenting a freeform block into paragraph blocks will be pursued separately.

youknowriad · 2018-01-08T14:28:39Z

Code changes look good to me. Is this supposed to fix the switching between "classic editor" and "gutenberg"?

If that's the case, it's working properly, aside one issue:

This adds empty paragraphs (free form block) between Gutenberg blocks when:
writing in Gutenberg
Updating in classic editor

aduth · 2018-01-08T14:48:03Z

@youknowriad I wouldn't expect this should fix any outstanding issues so much as recreating the existing behavior in a way that accommodates block nesting, considered as the favorable alternative to server-side parsing enhancements proposed in #3545.

aduth · 2018-01-08T15:01:06Z

If that's the case, it's working properly, aside one issue:

I'm seeing this in master as well. Can you confirm? Investigating why this might be happening. It's creating a new separate paragraph tag (presumably based on the double newline between blocks), which I wouldn't expect to happen as of #2708.

aduth · 2018-01-08T15:10:01Z

Related: #3901

youknowriad

LGTM 👍

About the empty paragraphs issue, one way to solve it is to drop the empty lines between blocks. Granted it makes the generated HTML less readable but I prefer less readable over broken when switching between classic and Gutenberg

mcsf

LGTM

dmsnell · 2018-01-14T23:14:06Z

@aduth I hear your concerns about a post-grammar parse but I can also sympathize here the same way I sympathize with the more tag. What I mean is that I think we're hitting a legacy wall where we have no where further to run to escape the past. That being said the code looks fine to me and the comments are clear.

One idea I had long ago but it never picked up much momentum was a pre-parse converter so that we could handle inconsistencies in a more isolated way. We could run the document through the preprocessor which would look for very specific and searchable patterns then replace them. In the case of the  tag we could search for )*--> and replace it with a full more block. This would leave the gutenberg parser purer because it wouldn't have to make those exceptions. To date we've had very few exceptions though so it's not terrible.

The same could apply to autop and freeform HTML and maybe could help with the known performance issue (#3799) which chokes on long sequences of non-block content. The preprocessor could quickly scan the document and look for anything not a proper block then perform its transformation - maybe turning everything into big free-form HTML blocks, maybe splitting by paragraph via autop, maybe something else.

Easier said than done I guess but it's an idea at least.

In the meantime I support this PR especially if it removes an obstacle to nested blocks.

A parsed block has no awareness of where inner blocks exist in its innerHTML, so it cannot safely reserialize. There are a few options: - Since we merely skip wpautop for known blocks, we could avoid reserialization and return the block's original HTML verbatim if we had access to its outerHTML. See nylen/phpegjs#3 - Move wpautop behavior for freeform content to the editor client. This may align well with desires to transparently upgrade legacy paragraph content to paragraph blocks. This would also allow the server to avoid any preprocessing before showing a post on the front-end, assuming that the saved content has had wpautop applied already. Acknowledging that this effectively reverts large parts of #2806

aduth added [Feature] Parsing Related to efforts to improving the parsing of a string of data and converting it into a different f [Status] In Progress Tracking issues with work in progress labels Dec 14, 2017

This was referenced Dec 15, 2017

Add package: autop WordPress/packages#53

Merged

1.9.1: Classic "Visual" editor litters Gutenberg markup w/ superfluous `<p>´ tags #3901

Closed

aduth mentioned this pull request Jan 4, 2018

Parsing: Replace server serialization by parsed outerHTML #3545

Closed

aduth force-pushed the try/client-wpautop branch from dfc872e to 344a0e7 Compare January 5, 2018 14:44

aduth changed the title ~~Try: Client-side wpautop, undemarcated paragraph blocks~~ Try: Client-side wpautop Jan 5, 2018

aduth changed the title ~~Try: Client-side wpautop~~ Parser: Client-side wpautop Jan 5, 2018

aduth removed the [Status] In Progress Tracking issues with work in progress label Jan 5, 2018

aduth requested a review from mtias January 5, 2018 14:51

youknowriad approved these changes Jan 9, 2018

View reviewed changes

mcsf approved these changes Jan 9, 2018

View reviewed changes

aduth force-pushed the try/client-wpautop branch from 344a0e7 to 30f6063 Compare January 9, 2018 19:33

aduth mentioned this pull request Jan 16, 2018

Content with only "More" block reloads as invalid #4038

Closed

aduth mentioned this pull request Jan 24, 2018

Blocks: Support block nesting (Add Columns block, InnerBlocks component) #3745

Merged

aduth added 2 commits January 30, 2018 14:20

Parser: Apply autop to fallback block content

222068e

aduth force-pushed the try/client-wpautop branch from 30f6063 to 222068e Compare January 30, 2018 21:12

aduth merged commit c6e8798 into master Jan 30, 2018

aduth deleted the try/client-wpautop branch January 30, 2018 21:19

aduth mentioned this pull request Feb 15, 2018

Tests: Remove unused wpautop fixture files #5086

Merged

ntwb mentioned this pull request Mar 6, 2018

Update package-lock.json #5427

Closed

3 tasks

ntwb mentioned this pull request Mar 9, 2018

Update package-lock.json #5525

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Parser: Client-side wpautop #4005

Parser: Client-side wpautop #4005

aduth commented Dec 14, 2017 •

edited

Loading

aduth commented Jan 4, 2018

aduth commented Jan 4, 2018 •

edited

Loading

aduth commented Jan 5, 2018

youknowriad commented Jan 8, 2018

aduth commented Jan 8, 2018

aduth commented Jan 8, 2018

aduth commented Jan 8, 2018

youknowriad left a comment

mcsf left a comment

dmsnell commented Jan 14, 2018

Parser: Client-side wpautop #4005

Parser: Client-side wpautop #4005

Conversation

aduth commented Dec 14, 2017 • edited Loading

aduth commented Jan 4, 2018

aduth commented Jan 4, 2018 • edited Loading

aduth commented Jan 5, 2018

youknowriad commented Jan 8, 2018

aduth commented Jan 8, 2018

aduth commented Jan 8, 2018

aduth commented Jan 8, 2018

youknowriad left a comment

Choose a reason for hiding this comment

mcsf left a comment

Choose a reason for hiding this comment

dmsnell commented Jan 14, 2018

aduth commented Dec 14, 2017 •

edited

Loading

aduth commented Jan 4, 2018 •

edited

Loading