Rich Text: Formalize RichText children value abstraction #7934

aduth · 2018-07-12T15:40:43Z

This pull request seeks to work toward detaching the dependency between RichText value and the @wordpress/element React element shape. It formalizes what we already offer in the children source as a proper value shape of its own.

Implementation notes:

Some observations made in the implementation herein:

"Element" is an implementation detail. The only requirements of our RichText implementation is being able to source from and write to HTML. Representing the value as an object shape (vs. a plain markup string) has been the subject of much past discussion (Try: Block API: Remove children source in favor of HTML string #5380, Editable: Treat value as HTML string, eliminate children source #2750) where the conclusion is that we want some of the benefits which could be offered through element (e.g. safety, content analysis) but it does not strictly need to be element itself. Our current need for manipulation is concatenation (merging), which can be offered through an abstraction native to the "children" type.
- Our need for dom-react hinges on this fact that we expect to need to convert a DOM node back to an element for serialization. But our custom serializer (Framework: Use custom serializer for texturize compatibility #5897) is more tolerant to non-"element" props (e.g. class, style). It is fine if this becomes non-true in the future, so long as we have the abstractions in place for children and node source types to work from and to the HTML.
This does not intend to propose any specific object shape for the RichText value. Initial explorations here did, and they introduced needless complexity, when the most critical piece is providing the consistent abstraction. I'll leave value shape to efforts like in RichText state structure for value manipulation #7890 (issue at RichText State Structure #771).
The implementation here includes some more stability:
- Concatenating two children types consisting only of strings has a more consistent result...
  - In master, we produce [ 'First paragraph', 'Second paragraph' ] where here we generate the simpler and consistent [ 'First paragraphSecond paragraph' ]
- Empty node deletion traverses deeply, resolving more issues with zero-width space character presence (see included E2E test case)
I wasn't entirely sure about RichText manually crafting its own BlockChildren objects, vs. something like with dom-react in passing a map/reduce callback (or object of "visitors") into the block API itself.
- Many complications:
  - We may need to clean either TEXT_NODE and ELEMENT_NODE
  - We may need to omit nodes altogether
  - Doing so in a way that is performant (i.e. single pass)
- Since this value is aimed at RichText, it seemed reasonable that it is fully aware of the object shape

Testing instructions:

Verify there are no regressions in the display, manipulation (splitting, merging, transforming), and save+reload of blocks leveraging RichText (e.g. paragraph, quote). A particular focus should be made on areas of low test coverage (e.g. block transforms).

aduth · 2018-07-12T16:19:30Z

packages/element/src/serialize.js

@@ -618,6 +626,11 @@ export function renderAttributes( props ) {
 * @return {string} Style attribute value.
 */
 export function renderStyle( style ) {
+	// Only generate from object, e.g. tolerate string value.


I was conflicted about this change, since it's largely aimed at the use-case being proposed here with children sourcing from DOM. In general, our custom serializer is much more tolerant to things which React would otherwise barf about. This has made the implementation slightly simpler in some regards too (though arguably not in this instance). Here, I saw no reason why style must be an object, toward a goal where the fallback behavior is to always assign an attribute as given (and if it's given as a string, great: write it as-is). Other similar projects seem tolerant as well, e.g. preact-render-to-string, which happily handles the string style. The main downside is a mixed expectation of what could be passed to renderToString vs. render alone.

If it's an issue, there's an alternative: We can detect the string from the child API's fromDOM and parse it back into an object.

hypest · 2018-07-16T14:52:34Z

... we want some of the benefits which could be offered through element (e.g. safety, content analysis) but it does not strictly need to be element itself. Our current need for manipulation is concatenation (merging), which can be offered through an abstraction native to the "children" type.

This is most probably a naive question, stemming from lack of insight: Could we perhaps move the analysis step closer to the operations that need it? Like, instead of always keeping an already analyzed tree in memory (and in state), only keep the serialized html and produce the tree on the spot, when merging needs to happen.

aduth · 2018-07-16T17:13:32Z

Like, instead of always keeping an already analyzed tree in memory (and in state)

What do you have in mind as representing the analyzed tree? Would something like what's being explored in #7890 satisfy what you're talking about?

gziolo · 2018-07-16T20:40:41Z

Aztec takes HTML as an input (similar to TinyMCE) and also generates HTML as an output (TinyMCE is different). I think, question is if we could use HTML string to store value and parse it to the proposed format, when doing all transformations and convertions.

youknowriad · 2018-07-17T09:12:02Z

While I see the value of this abstraction/type and I like how this makes the distinction between the element abstraction and the rich text value format, I have a question, could we guarantee that the abstraction will cover all the use-cases of transforming the rich text value?

My hunch is "no". For example we need to transform li to p when transforming from lists to paragraphs or get only the content of li etc... So I still think we need to formalize the shape of the value somehow and not treat it as a black box.

youknowriad · 2018-07-17T09:13:54Z

(It looks like there's unrelated changes in the diff, probably need a rebase or something)

hypest · 2018-07-17T12:55:47Z

What do you have in mind as representing the analyzed tree?

Nothing in particular @aduth , I'm still trying to understand if the in-memory object tree is just a nice-to-have (but super useful) that enables some features (like merging) or it is perhaps super-needed for the baseline functionality Gutenberg offers.

If nice-to-have, perhaps it would make sense to only perform the conversion on-demand, and keep passing/saving html from/to the state.

Besides, as @gziolo mentioned in his comment, there's an incentive to avoid requiring Aztec (the current native mobile editor) to expose an object model instead of just html. If we can find a way to continue using html, at least as the interface deeper in the RichText component, it will be welcomed for this first attempt at integrating Aztec.

aduth · 2018-07-17T16:06:33Z

I think I should reiterate that this pull request does not intend to express any proposal for the presence of or a particular shape of the RichText value. It primarily seeks to stop presenting it as though it were the @wordpress/element value.

I'm still trying to understand if the in-memory object tree is just a nice-to-have (but super useful) that enables some features (like merging) or it is perhaps super-needed for the baseline functionality Gutenberg offers.

There's substantial context / prior discussion for this in #5380 and #2750.

There's nothing that technically forbids us from tracking the state value as an HTML string, but for all of the reasons mentioned previously (operations upon state value primarily), the abstraction has stuck. Performance is also a very real consideration, as I would expect that on-demand transformations would be very costly.

I'm open to reconsidering #2750, but we need movement now, and I'd still consider this to be an incremental step forward.

tofumatt · 2018-07-19T18:29:15Z

Looks like there's still a lot of stuff in this branch that is unrelated, which makes it tough to review. Maybe it needs another rebase?

aduth · 2018-07-19T20:44:51Z

@tofumatt It's not a small pull request in its own right, but yes, there was something awry with a previous rebase that pulled in some unrelated commits unintentionally. Should be cleared up now.

youknowriad · 2018-07-20T10:07:44Z

blocks/api/children.js

+/**
+ * Internal dependencies
+ */
+import * as node from './node';


Seems like there's a cyclic dependency between node and children, Should we just merge them in the same file?

Seems like there's a cyclic dependency between node and children, Should we just merge them in the same file?

Well-spotted. There is, though Webpack seems to handle it without issue. I personally found some elegance in having independent files with overlapping APIs (matcher, fromDOM, toHTML) mapping one-to-one with the offered source type. I don't feel too strongly about it.

youknowriad · 2018-07-20T10:08:18Z

blocks/api/children.js

+		try {
+			result.push( node.fromDOM( domNodes[ i ] ) );
+		} catch ( error ) {
+			// Simply ignore if DOM node could not be converted.


When can this happen?

When can this happen?

The child matcher throws on anything other than nodeType of TEXT_NODE or ELEMENT_NODE, so if a block uses a children selector on an element which includes a comment node, for example, this would prevent its inclusion.

<div class="my-block">hello world</div>

{ "type": "array", "source": "children", "selector": "div" },

https://developer.mozilla.org/en-US/docs/Web/API/Node/nodeType#Constants

youknowriad

Tested this, it works well. I also like that it includes the shape simplification from #7476

Let's get this in, it's definitely an improvement.

aduth added [Feature] Block API API that allows to express the block paradigm. Framework Issues related to broader framework topics, especially as it relates to javascript [Feature] Rich Text Related to the Rich Text component that allows developers to render a contenteditable labels Jul 12, 2018

aduth requested review from youknowriad and ellatrix July 12, 2018 15:40

aduth commented Jul 12, 2018

View reviewed changes

gziolo assigned aduth Jul 13, 2018

aduth requested a review from a team July 16, 2018 14:36

gziolo requested a review from hypest July 16, 2018 14:38

aduth added this to the 3.3 milestone Jul 18, 2018

aduth force-pushed the try/rich-text-block-node branch from 72c2f51 to 7c85135 Compare July 19, 2018 20:43

youknowriad reviewed Jul 20, 2018

View reviewed changes

youknowriad approved these changes Jul 20, 2018

View reviewed changes

youknowriad mentioned this pull request Jul 20, 2018

RichText: Use a simplified format for rich text values #7476

Closed

aduth force-pushed the try/rich-text-block-node branch from 7c85135 to af09027 Compare July 20, 2018 12:52

aduth added 3 commits July 20, 2018 09:58

Element: Support plain-string style prop

c042d8e

Rich Text: Formalize RichText children value abstraction

f2f8bd5

Testing: Add E2E case to ensure cleaned TinyMCE content

1a637ee

aduth force-pushed the try/rich-text-block-node branch from af09027 to 1a637ee Compare July 20, 2018 13:58

aduth merged commit 1a0a3ba into master Jul 20, 2018

aduth deleted the try/rich-text-block-node branch July 20, 2018 14:15

noisysocks mentioned this pull request Jul 24, 2018

Document Outline: Exception caused by headings with formatting #8160

Closed

aduth mentioned this pull request Jul 25, 2018

Invariant Violation/Minified React Error #31 on Welcome to The Gutenberg Editor #8150

Closed

aduth mentioned this pull request Aug 6, 2018

Fix: Transforms: Audio and Video to file transform bug; Add: getTextElements function: #7830

Closed

aduth mentioned this pull request Sep 13, 2018

Remove React-related attributes from pullquote multi-paragraph block fixtures #7448

Closed

3 tasks

ellatrix mentioned this pull request Sep 25, 2018

RichText state structure for value manipulation #7890

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rich Text: Formalize RichText children value abstraction #7934

Rich Text: Formalize RichText children value abstraction #7934

aduth commented Jul 12, 2018

aduth Jul 12, 2018

hypest commented Jul 16, 2018 •

edited

Loading

aduth commented Jul 16, 2018

gziolo commented Jul 16, 2018

youknowriad commented Jul 17, 2018

youknowriad commented Jul 17, 2018

hypest commented Jul 17, 2018

aduth commented Jul 17, 2018

tofumatt commented Jul 19, 2018

aduth commented Jul 19, 2018

youknowriad Jul 20, 2018

aduth Jul 20, 2018

youknowriad Jul 20, 2018

aduth Jul 20, 2018

youknowriad left a comment

Rich Text: Formalize RichText children value abstraction #7934

Rich Text: Formalize RichText children value abstraction #7934

Conversation

aduth commented Jul 12, 2018

aduth Jul 12, 2018

Choose a reason for hiding this comment

hypest commented Jul 16, 2018 • edited Loading

aduth commented Jul 16, 2018

gziolo commented Jul 16, 2018

youknowriad commented Jul 17, 2018

youknowriad commented Jul 17, 2018

hypest commented Jul 17, 2018

aduth commented Jul 17, 2018

tofumatt commented Jul 19, 2018

aduth commented Jul 19, 2018

youknowriad Jul 20, 2018

Choose a reason for hiding this comment

aduth Jul 20, 2018

Choose a reason for hiding this comment

youknowriad Jul 20, 2018

Choose a reason for hiding this comment

aduth Jul 20, 2018

Choose a reason for hiding this comment

youknowriad left a comment

Choose a reason for hiding this comment

hypest commented Jul 16, 2018 •

edited

Loading