Skip to content
This repository has been archived by the owner on Jul 28, 2023. It is now read-only.

Experiment with WP_HTML_Tag_Processor #118

Closed
wants to merge 54 commits into from
Closed

Conversation

ockham
Copy link
Collaborator

@ockham ockham commented Dec 13, 2022

Superseded by #125.

Early-stage experiments, mostly in order to discover our requirements from WP_HTML_Tag_Processor.

Doesn't work yet!

PR based on #100 -- I've taken the example blocks from there and removed the extra build step that was introduced there. Setting the base branch to #100's for now so it's clearer what I've changed.

@ockham ockham self-assigned this Dec 13, 2022
@ockham
Copy link
Collaborator Author

ockham commented Dec 13, 2022

Here's a preliminary TODO list, based on my inline comments in the code so far. I have some questions (in italics); if you have any answers (or opinions), please comment below! 😊

  • Allow wrapping a tag's inner contents. E.g. <wp-show when="context.myblock.open">abcd</wp-show> becomes <wp-show when="context.myblock.open"><template>abcd</template></wp-show> (if context.myblock.open evaluates to false).
  • Properly parse expressions in attributes like when.
    • Includes nested JS-style objects (e.g. context.myblock.open).
    • Basic logic operations (negation, and, or)? Comparison? cc/ @WordPress/frontend-dx
  • Allow replacing a tag that has a directive attached.
    • What would the API for something like that look like? Assuming that we allow registration of custom directives -- do the directive "processors" get access to the tag processor?
  • Add some directive/components registration mechanism.
    • Maybe similar to blocks, patterns, etc? register_directive, takes a directory with a directive.json which includes information such as attributes, render, context, ...?
  • How do the APIs of signatures relate to their counterpart components? Can there be a 1:1 mapping (i.e. an adapter)?
  • Allow getting attributes that match a certain pattern. E.g. wp-bind actually looks something like <img wp-bind:src="selectors.favorites.isPostIncluded" /> in practice (to bind the value of the image src attribute to the expression). This means we need syntax like getAttribute( 'wp-bind:*' ).

@ockham
Copy link
Collaborator Author

ockham commented Dec 13, 2022

Would it make sense to add a dedicated method for wrapping a tag's inner contents?

I forgot that @dmsnell had shared with me a possible syntax for this:

$content = tags->get_content_inside_balanced_tags();
$tags->set_content_inside_balanced_tags(
    "<template>{$content}</template>"
);

This uses get_content_inside_balanced_tags, which Dennis has already implemented in WordPress/gutenberg#46345 🎉

wp-directives.php Show resolved Hide resolved
if ( 'wp-show' === $tag_name ) {
$attributes['when'] = $tags->get_attribute( 'when' );
}
call_user_func_array( $directives[$tag_name], array( $attributes, &$context ) );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the way I've been exploring this is by passing the WP_HTML_Tag_Processor around to functions, so here that might be call_user_func_array( $directives[ $tag_name ], array( $tags, $attributes, &$context ) )

of course, the Tag Processor will not be able to modify the markup. eventually this might need to be created as the WP_HTML_Processor instead, or some other thing. I've toyed with two ideas:

  • $tags = new WP_HTML_Processor( $html ) instead of creating the tag processor
  • new WP_HTML_Processor( $tags ) to wrap the tag processor

in the first method it suggests we might see people stop ever invoking WP_HTML_Tag_processor() directly, but avoids problems with encapsulation of the tag processor.

in the second method there's an extra step one has to take before walking into the danger zone where performance and reliability drop off, but it allows for a stronger separation of those zones.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, IMO, this is one of the more complex (and interesting) problems here 😅

Given the two options you're suggesting, I'm currently leaning towards the second one (because of danger zone separation).

I'm a bit concerned with the potential leakage of giving each directive processor access to the "entire" Tag Processor. Ideally, I'd love if a directive processor could only operate on its "own" (outer) HTML. I'm not sure how feasible that would be, since we'd still have to "stitch" the modified HTML back into the surrounding parts 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how feasible that would be, since we'd still have to "stitch" the modified HTML back into the surrounding parts 🤔

I think this will end up being one of those fundamental limitations of the system. What we have is HTML, whereas what we don't have is DOM.

So while what you are talking about is possible, it's not currently possible inside the tag processor, and that may never change.

foreach ( … ) {
	$inner_html = $tags->get_content_inside_balanced_tags();
	$inner_tags = new WP_HTML_Tag_Processor( $inner_html );

	$replacement_html = process_directive( …, $inner_tags );
	$tags->set_content_outside_balanced_tags( $replacement_html );
}

Copy link
Collaborator Author

@ockham ockham Dec 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So while what you are talking about is possible, it's not currently possible inside the tag processor, and that may never change.

Maybe we could do a variant of the code you shared with a method that directly returns a WP_HTML_Tag_Processor (or WP_HTML_Processor) which will include all bookmarks that apply to it and which can then be used to modify it in a number of defined ways before stitching it back in?

foreach ( … ) {
	$inner_tags = $tags->get_tags_inside_balanced_tags(); // Probably needs to be WP_HTML_Processor.

	$process_directive( …, $inner_tags );
	$tags->set_tags_inside_balanced_tags( $inner_tags );
}

Among the limited ways of modifying WP_HTML_Processor could be something like "wrap in tag". This would cover our need to wrap things in <template>s.

Would a mechanism like that be able to preserve the integrity of the "outer" WP_HTML_Processor? Am I missing something?


(If we limit the ways of modifying that strongly, it's arguable that we could simply introduce a shorthand method to wrap_content_inside_balanced_tags().)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the absence of more confidence I'm way more comfortable with user-level shorthands than framework level shorthands.

which will include all bookmarks that apply to it and which can then be used to modify it in a number of defined ways before stitching it back in?

Even this just screams for data sync issues. With all of the bookmarks and pointers contained within the WP_HTML_Tag_Processor everything stays in harmony. Granted, bookmarks can disappear and that's another form of the same problem, but at least then we're looking at one object which we have modified vs. birthing a new object, doing stuff to it, and then finding out later that the first object is now different and our updates invalid.

Here's one thing that's fine though: if you extract that inner content, do stuff to it, and then stitch it back in there's no problem and no cross-talk and no interference and no coupling.

That method hasn't been working for me with CSS selection because I need bookmarks back into the outer/parent region of the HTML, but for modifying content as I think you're doing here I don't think there are any issues.

Copy link
Member

@luisherranz luisherranz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exciting 🙂👏

wp-directives.php Outdated Show resolved Hide resolved
wp-directives.php Outdated Show resolved Hide resolved
wp-directives.php Outdated Show resolved Hide resolved
Co-authored-by: Luis Herranz <luisherranz@gmail.com>
@ockham
Copy link
Collaborator Author

ockham commented Dec 14, 2022

@luisherranz Thank you for your feedback! While you're here, can I ask your opinion about this question:

How do the APIs of signatures relate to their counterpart components? Can there be a 1:1 mapping (i.e. an adapter)?

E.g. for context, we have for the directive:

<div wp-context='{ "myblock": { "open": false } }'>

and for the component:

<wp-context data='{ "myblock": { "open": false } }'>

Can we always assume that the component will have one named attribute that maps to what would be the value of the corresponding directive? (E.g. for wp-context, that would be data in the example above.) (You might have a more informed opinion through your familiarity with Alpine than I do 😊 )

In that case, we probably need some way to set that attribute's name in the directive declaration.

@ockham
Copy link
Collaborator Author

ockham commented Dec 14, 2022

@dmsnell I added another item to my comment above:

Allow getting attributes that match a certain pattern. E.g. wp-bind actually looks something like <img wp-bind:src="selectors.favorites.isPostIncluded" /> in practice (to bind the value of the image src attribute to the expression). This means we need syntax like getAttribute( 'wp-bind:*' ).

Curious to hear your thoughts. Maybe the alternative would be a get_all_attributes method, as Luis suggested?

@ockham
Copy link
Collaborator Author

ockham commented Dec 14, 2022

I've updated the code to:

  • Use the latest WP_HTML_Tag_Processor -- and WP_HTML_Processor -- from WIP: Introduce class for sourcing block attributes from HTML gutenberg#46345.
  • Unify the directive processors so they can be used for both (attribute) directives and components ("HTML directives").
  • Changed the directive processors to accept their inner HTML content (as obtained by get_content_inside_balanced_tags) as their first argument, and return the (potentially modified) content.

What's still missing is a method to write the modified content back to WP_HTML_Processor -- see #118 (comment), or #118 (comment).

Comment on lines 183 to 187
$bookmark = "{$tag_name}_start";
$tags->set_bookmark( $bookmark );
$content = $tags->get_content_inside_balanced_tags();
$tags->seek( $bookmark );
$tags->release_bookmark( $bookmark );
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit verbose -- especially considering that get_content_inside_balanced_tags actually saves bookmarks that delimit the inner content.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a related note here: WordPress/gutenberg#46345 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything in the linked PR is still preliminary and all names and classes might change drastically; not moving the pointer is reasonable and I'll add that in the branch.

@luisherranz
Copy link
Member

These are my opinions about your questions 🙂


Basic logic operations (negation, and, or)? Comparison?

It is true that PHP and JavaScript have the same syntax for most logical operations and building a small parser sounds feasible, but I would try to avoid it, at least at first.

The one we could include initially is negation (!) which is the only operator that doesn't need two variables and therefore the simplest. It will also be the most common since there is no directive for else:

<wp-show when="context.open">
  <span>I'm open</span>
</wp-show>
<wp-show when="!context.open">
  <span>I'm closed</span>
</wp-show>

Allow replacing a tag that has a directive attached

I'm not sure I understand this question. Do you mean something like this?

<div wp-change-tag="span"></div>
<!-- Result: -->
<span wp-change-tag="span"></span>

Maybe similar to blocks, patterns, etc? register_directive, takes a directory with a directive.json which includes information such as attributes, render, context, ...?

That's actually a great idea! Although I guess it depends on what information we need. If it's just a name, a PHP function and a JS file, it may be a bit overkill to have to declare that in a JSON file 🤔

What other information do you think will be needed?

How do the APIs of signatures relate to their counterpart components? Can there be a 1:1 mapping (i.e. an adapter)? [...] Can we always assume that the component will have one named attribute that maps to what would be the value of the corresponding directive?

I think that is more of a question about promoting some type of naming convention/standardization than forcing a strict rule.

I've actually been thinking lately about the fact that components can read multiple values (one per attribute) and directives only one (the value of its own attribute) and what syntax we should promote to standardize multi-value directives to keep the ecosystem consistent.

Imagine a directive/component that needs a handler and has optional options. That's straightforward in a component:

<wp-swipe handler="myblock.swipeHandler" options="myblock.swipeOptions">
  ...
</wp-swipe>

But not so in the directive:

<div wp-swipe="myblock.swipeHandler" options??>...</div>

There are several syntaxes we could promote, but I'm not convinced about any of them. We can open a separate issue to discuss it, although it's not urgent as it'd be only for standardization and not part of the API per se.

<div
  wp-swipe="myblock.swipeHandler"
  wp-swipe-options="myblock.swipeOptions"
  wp-swipe:options="myblock.swipeOptions"
  wp-options:swipe="myblock.swipeOptions"
  wp-swipe="myblock.swipeHandler, myblock.swipeOptions"
  wp-swipe='{ "handler": "myblock.swipeHandler", "options": "myblock.swipeOptions" }'
  wp-swipe.option1.value1.option2.value2="myblock.swipeHandler"
>
  ...
</div>

@luisherranz
Copy link
Member

luisherranz commented Dec 15, 2022

This means we need syntax like getAttribute( 'wp-bind:*' )

Oh, that's true. We need something like that. Although there can be multiple of them in a tag and we need all of them.

By the way, the way this is currently coded in JavaScript is:

const wpDirectives = {};
for (let i = 0; i < attributes.length; i++) {
  if (attributes[i].name.startsWith("wp-")) {
    const [, prefix, suffix] = /wp-([^:]+):?(.*)$/.exec(attributes[i].name);
    wpDirectives[prefix] = wpDirectives[prefix] || {};
    wpDirectives[prefix][suffix || "default"] = attributes[i].value;
  }
}

All the directive attributes end up in an object with this shape:

<div
  wp-effect="myblock.someEffect"
  wp-effect:other-stuff="myblock.otherEffect"
  wp-effect:yet-another="myblock.yetAnotherEffect"
  wp-class:open="myblock.toggleOpen"
  wp-context='{"open": false}'
>
  ...
</div>
{
  wp: {
    effect: {
      default: "myblock.someEffect",
      ["other-stuff"]: "myblock.otherEffect",
      ["yet-another"]: "myblock.yetAnotherEffect",
    },
    class: {
      open: "myblock.toggleOpen"
    },
    context: {
      default: '{"open": false}'
    }
  }
}

We don't need to extract the attributes for them if they have access to a Tag Processor instance, but they need to be able to do it themselves with a method.

@ockham
Copy link
Collaborator Author

ockham commented Dec 20, 2022

Update:

  • I've added a get_attributes_by_prefix method (counterpart GB PR here; alternative get_attribute names here).
  • As the directive processor functions were getting more and more arguments, I decided to change their signature to accept a WP_HTML_Tag_Processor object reference instead (plus $context).
  • The above two items then allowed me to implement (a very basic version of) wp-bind (751027c).

ockham added a commit that referenced this pull request Dec 22, 2022
ockham added a commit that referenced this pull request Dec 22, 2022
@ockham
Copy link
Collaborator Author

ockham commented Dec 22, 2022

This branch is based on #100 and thus includes some fairly unrelated stuff (sample blocks etc). Furthermore, the files in its src/html/ directory are in a bit of a chaotic state (combined from different open Gutenberg PRs).

I've thus decided to move development over to a fresh branch: #125, to make it easier to track and document any changes.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants