Optimize the Recursive Matching #68

Open
wants to merge 3 commits into
base: dev

Conversation

eric-hemasystems

I was trying to use arrive in combination with the Turbolinks library and the webpage was crashing about every other page change in IE 11.

I found this happened even when the "arrive" callback registered was a no-op function (i.e. function() {}) and a non-existent node. So the crash wasn't due to something in what we were doing with arrive but due to something arrive was doing itself.

Since the entire tab was crashing it made it difficult to debug so I started just using the MutationObserver API directly slowly building up to some of what arrive does. Early in the process I found that if I spent too long in the MutationObserver callback I would get this crash.

This made sense, as Turbolinks replaces the entire body, meaning that the entire DOM is traversed looking for matches. I know Stimulus (also from the Rails universe) is often paired with Turbolinks, and Stimulus operates via the MutationObserver. This caused me to investigate how they found matching elements without crashing IE when a large number of elements change.

Found the answer at:

https://github.com/stimulusjs/stimulus/blob/8cb58d1f8a4ac83a875c7a1fdc7a28f21869efd6/packages/%40stimulus/mutation-observers/src/attribute_observer.ts#L52-L56

They use matchElement, like arrive does, on the node actually reported by the mutation observer. But then to search the child nodes they use querySelectorAll. This brings the matching and recursing into native code, which is much faster.

For arrive I decided to just use querySelectorAll as an initial filter and to handle the recursing. Once we get the list, it still runs through the matchFunc so that it still does the tracking that avoids repeating for the same element.

Even if not using Turbolinks (or similar library) this should optimize any site using arrive where a large number of nodes change.
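
A minimal sketch of the approach described above, not the actual patch. `collectMatches` and `seen` are hypothetical names; arrive's real code routes candidates through its `matchFunc`, which also does the tracking that avoids repeats. The sketch assumes each added node is a DOM node: elements expose `querySelectorAll`, while text and comment nodes do not.

```javascript
// Hypothetical helper: collect every added element that matches `selector`.
// `seen` stands in for the tracking arrive does inside its matchFunc.
function collectMatches(addedNodes, selector, seen) {
  var matched = [];

  function consider(el) {
    if (seen.indexOf(el) === -1) {
      seen.push(el);
      matched.push(el);
    }
  }

  for (var i = 0; i < addedNodes.length; i++) {
    var node = addedNodes[i];

    // Text and comment nodes have no querySelectorAll; skip them.
    if (!node.querySelectorAll) continue;

    // Check the node the observer actually reported.
    // IE 11 only has the prefixed msMatchesSelector.
    var matches = node.matches || node.msMatchesSelector;
    if (matches && matches.call(node, selector)) consider(node);

    // Let the browser find matching descendants natively instead of
    // walking childNodes in JavaScript.
    var descendants = node.querySelectorAll(selector);
    for (var j = 0; j < descendants.length; j++) consider(descendants[j]);
  }
  return matched;
}
```

Inside a MutationObserver callback this would be called once per mutation record, passing `mutation.addedNodes` as the first argument.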

Eric Anderson added 2 commits August 30, 2018 14:41
I was trying to use arrive in combination with the Turbolinks
library[1] and the webpage was crashing about every other page change
in IE 11.

I found this happened even when the "arrive" callback registered was
a no-op function (i.e. `function() {}`) and a non-existent node. So
the crash wasn't due to something in what we were doing with arrive but
due to something arrive was doing itself.

Since the entire tab was crashing it made it difficult to debug so I
started just using the MutationObserver API directly slowly building up
to some of what arrive does. Early in the process I found that if I
spent too long in the MutationObserver callback I would get this
crash.

This made sense as Turbolinks replaces the entire `body` meaning that
the entire DOM is traversed looking for matches. I know Stimulus
(also from the Rails universe) is often paired with Turbolinks and
Stimulus operates via the MutationObserver. This caused me to investigate
how they found matching elements without crashing IE when a large number
of elements change.

Found the answer at:

    https://github.com/stimulusjs/stimulus/blob/8cb58d1f8a4ac83a875c7a1fdc7a28f21869efd6/packages/%40stimulus/mutation-observers/src/attribute_observer.ts#L52-L56

They use `matchElement`, like arrive does, on the node actually reported
by the mutation observer. But then to search the child nodes they use
`querySelectorAll`. This brings the matching and recursing into native
code, which is much faster.

For `arrive` I decided to just use `querySelectorAll` as an initial
filter and to handle the recursing. Once we get the list, it still
runs through the `matchFunc` so that it still does the tracking that
avoids repeating for the same element.

Even if not using Turbolinks (or similar library) this should optimize
any site using arrive where a large number of nodes change.

1. https://github.com/turbolinks/turbolinks
This was left off the original commit but is in our tested code
(just a copy/paste error). Not all nodes can have children and therefore
don't have this method. This means we need a guard to prevent the
attempt to use this method.
Owner

@uzairfarooq uzairfarooq left a comment

Appreciate the time you spent but the issue I mentioned is a blocker. Let me know if you think there's a solution available otherwise I'll close the pull request.

if (node.childNodes.length > 0) {
utils.checkChildNodesRecursively(node.childNodes, registrationData, matchFunc, callbacksToBeCalled);
if( !node.querySelectorAll ) continue;
var matchedDescendents = node.querySelectorAll(registrationData.selector);
Owner

There are cases where querySelectorAll() won't match all elements. Consider this example:

<div class='parent-elem'>
  <div class='current-node'> <!-- Suppose node points to this element -->
    <div class='child'></div>
  </div>
</div>

Suppose arrive is called with following:

document.arrive('.parent-elem .child', (e) => {
  console.log(e);
})

querySelectorAll() won't match the child element because we are calling it from the .current-node element.

Author

Good catch! I think all my selectors are simpler than that (usually just looking for a certain class on a certain element). Sounds like we have a choice between:

  1. crashing on IE 11 if a large number of nodes change
  2. complex selectors not being supported in certain edge cases

Either kinda sucks.

I wonder if we could detect if a simple selector is being used and if so go native, if not, use the non-native strategy? That would seem to complicate things though. Maybe make it developer choice? Have an option to enable the more efficient route but with the caveat that certain complex selectors wouldn't be supported in certain edge cases. That would add less complexity and allow folks that don't have the complex selectors but do need to support IE to not need to use a fork of this library.

If either of these options sound like too much complication I would lean towards just closing this and I'll just have my own local fork so I can support IE 11. Hopefully in Oct 2025 when IE 11 is EOL we can move back to the mainline fork. :)

Owner

> I wonder if we could detect if a simple selector is being used and if so go native, if not, use the non-native strategy?

This seems like a good option.

> Maybe make it developer choice? Have an option to enable the more efficient route but with the caveat that certain complex selectors wouldn't be supported in certain edge cases.

Hmm. This option would be a bit hard to explain to normal users. I wonder if we can detect simple selector using a regex maybe? I think in a lot of cases users are using simple selectors and if we can detect it intelligently, it would make significant performance improvements for a lot of users.

Author

I just checked and all of my selectors are a single ID or class on an element. Something like #users_index. The only exception is I use the comma to specify multiple selectors on the same line (but again we are just looking for a single class or ID for each sub-selector). For example: .form-checkbox, .form-radio

It seems that if we take the selector, split on comma, trim any whitespace before and after each split part, and then check that each split part does not still contain a space, that would satisfy all my needs. If someone later comes up with a more complicated selector that does not satisfy that criterion but still could use the native search, we could always augment, but that would be a good start.
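
A minimal sketch of that check, assuming the rule is exactly as stated: split on commas, trim, and reject any part that still contains whitespace. The name `isSimpleSelector` is hypothetical and is not part of arrive.

```javascript
// Hypothetical check: true when every comma-separated part of the selector
// is free of whitespace, i.e. it contains no descendant combinator.
function isSimpleSelector(selector) {
  var parts = selector.split(',');
  for (var i = 0; i < parts.length; i++) {
    var part = parts[i].trim();
    // An empty part (e.g. a trailing comma) is treated as not simple.
    if (part === '' || /\s/.test(part)) return false;
  }
  return true;
}

console.log(isSimpleSelector('#users_index'));                // true
console.log(isSimpleSelector('.form-checkbox, .form-radio')); // true
console.log(isSimpleSelector('.parent-elem .child'));         // false
```

Note that a rule based only on whitespace would still treat something like `.a>.b` as simple, since that selector contains no space.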

If that sounds good I'll update this PR to use this strategy.

Owner

Yeah, this seems like a good start. We can intelligently handle cases where an attribute value contains spaces, e.g. '.test[title="some title"]', but this does not necessarily have to be part of this pull request; we can handle it later.

Also, please send the pull request to dev branch. I’ll merge to master after release.
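
One possible way to handle the attribute-value case mentioned above, sketched under assumptions: attribute blocks are collapsed to `[]` before the whitespace check, so spaces or commas inside them are ignored. The function name is hypothetical, and the regular expression does not cope with a `]` inside a quoted value.

```javascript
// Hypothetical variant of the simple-selector check that ignores
// whitespace (and commas) inside [attr="value"] blocks.
function isSimpleSelectorIgnoringAttributes(selector) {
  // Collapse each [ ... ] block to an empty pair of brackets.
  var stripped = selector.replace(/\[[^\]]*\]/g, '[]');
  var parts = stripped.split(',');
  for (var i = 0; i < parts.length; i++) {
    var part = parts[i].trim();
    if (part === '' || /\s/.test(part)) return false;
  }
  return true;
}

console.log(isSimpleSelectorIgnoringAttributes('.test[title="some title"]')); // true
console.log(isSimpleSelectorIgnoringAttributes('.a .b[title="x y"]'));        // false
```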

Author

Not sure if this project is still active, but I finally had a few hours to look into closing this issue out.

Turns out my original code should account for the scenario you uncovered. querySelectorAll does not work like jQuery.find. See this note on MDN for more info.

This was news to me also. I was trying to write a test case where the native path would fail on a nested selector but the original code path would pass. But since querySelectorAll doesn't behave the way we both assumed, the test actually passed either way, and my original code will handle the case you describe above.
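
The behavior can be illustrated with a toy model, which is not the browser's implementation: the selector is matched against an element's full ancestor chain in the document, and the element on which `querySelectorAll` was called only limits which elements are returned. The tree builder `el`, `matchesChain`, and `scopedQuery` are all hypothetical and handle only class names joined by descendant combinators.

```javascript
// Toy element: a class name, a parent pointer, and children.
function el(className, children) {
  var node = { className: className, parent: null, children: children || [] };
  node.children.forEach(function (c) { c.parent = node; });
  return node;
}

// Does `node` match a chain of classes such as ['parent-elem', 'child']?
// Ancestors are searched all the way to the root, not just up to a scope.
function matchesChain(node, classes) {
  var last = classes.length - 1;
  if (node.className !== classes[last]) return false;
  var ancestor = node.parent;
  for (var i = last - 1; i >= 0; i--) {
    while (ancestor && ancestor.className !== classes[i]) ancestor = ancestor.parent;
    if (!ancestor) return false;
    ancestor = ancestor.parent;
  }
  return true;
}

// Model of scope.querySelectorAll: return descendants of `scope` that match.
function scopedQuery(scope, classes) {
  var found = [];
  (function walk(node) {
    node.children.forEach(function (c) {
      if (matchesChain(c, classes)) found.push(c);
      walk(c);
    });
  })(scope);
  return found;
}

// Same structure as the HTML example earlier in this thread.
var child = el('child');
var currentNode = el('current-node', [child]);
var parentElem = el('parent-elem', [currentNode]);

// '.parent-elem .child' queried from .current-node still finds the child,
// because .parent-elem is found among the child's ancestors in the document.
console.log(scopedQuery(currentNode, ['parent-elem', 'child']).length); // 1
```

In a real browser the equivalent call would be `document.querySelector('.current-node').querySelectorAll('.parent-elem .child')`.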

Given this, if you are still interested this PR should be good to merge. I have re-pointed the PR to the dev branch.

@eric-hemasystems eric-hemasystems changed the base branch from master to dev July 14, 2021 20:19
@eric-hemasystems
Author

Just adding a note to this. I updated this branch to the latest from the dev branch since I was testing out this optimization on another app so it should be ok to merge.

Regarding the discussion of the impl, I think when we last left it off the existing code already addressed your concern so it should be 100% compatible with the existing impl.
