Optimize the Recursive Matching #68

eric-hemasystems · 2018-08-30T19:06:07Z

I was trying to use arrive in combination with the Turbolinks library (https://github.com/turbolinks/turbolinks) and the webpage was crashing about every other page change in IE 11.

I found this happened even when the registered "arrive" callback was a no-op function (i.e. function() {}) and the selector targeted a non-existent node. So the crash wasn't caused by anything we were doing with arrive but by something arrive was doing itself.

Since the entire tab was crashing, it was difficult to debug, so I started using the MutationObserver API directly, slowly building up to some of what arrive does. Early in the process I found that if I spent too long in the MutationObserver callback I would get this crash.

This made sense, as Turbolinks replaces the entire body, meaning that the entire DOM is traversed looking for matches. I know Stimulus (also from the Rails universe) is often paired with Turbolinks, and Stimulus operates via the MutationObserver. This led me to investigate how they find matching elements without crashing IE when a large number of elements change.

Found the answer at:

https://github.com/stimulusjs/stimulus/blob/8cb58d1f8a4ac83a875c7a1fdc7a28f21869efd6/packages/%40stimulus/mutation-observers/src/attribute_observer.ts#L52-L56

They use matchElement, like arrive does, on the node actually reported by the mutation observer. But to search the child nodes they use querySelectorAll. This brings the matching and recursing into native code, which is much faster.

For arrive I decided to just use querySelectorAll as an initial filter and to handle the recursing. Once we get the list, it still runs through the matchFunc so that it still does the tracking that avoids repeating for the same element.
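
As a rough sketch of the approach described above (not the code in this PR): `findMatches` is an illustrative name, and `matchFunc` is simplified here to a single-argument predicate, whereas arrive's real internals pass more state around.

```javascript
// Sketch only: findMatches and its parameters are hypothetical names,
// not arrive's actual internals.
function findMatches(addedNodes, selector, matchFunc) {
  var matched = [];
  for (var i = 0; i < addedNodes.length; i++) {
    var node = addedNodes[i];

    // Check the node reported by the MutationObserver itself.
    if (matchFunc(node)) matched.push(node);

    // Text and comment nodes have no querySelectorAll, so skip them.
    if (!node.querySelectorAll) continue;

    // Let the browser do the recursive descent in native code.
    var descendants = node.querySelectorAll(selector);
    for (var j = 0; j < descendants.length; j++) {
      // Every candidate still goes through matchFunc so that any
      // "already handled this element" tracking keeps working.
      if (matchFunc(descendants[j])) matched.push(descendants[j]);
    }
  }
  return matched;
}
```

The native call only narrows down the candidates; the decision to fire a callback stays with `matchFunc`.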

Even if not using Turbolinks (or a similar library), this should optimize any site using arrive where a large number of nodes change.


uzairfarooq

Appreciate the time you spent but the issue I mentioned is a blocker. Let me know if you think there's a solution available otherwise I'll close the pull request.

uzairfarooq · 2020-06-20T11:34:52Z

src/arrive.js

-          if (node.childNodes.length > 0) {
-            utils.checkChildNodesRecursively(node.childNodes, registrationData, matchFunc, callbacksToBeCalled);
+          if( !node.querySelectorAll ) continue;
+          var matchedDescendents = node.querySelectorAll(registrationData.selector);


There are cases where querySelectorAll() won't match all elements. Consider this example:

<div class='parent-elem'> <div class='current-node'> <div class='child'></div> </div> </div>

Suppose arrive is called with the following:

document.arrive('.parent-elem .child', (e) => { console.log(e); })

querySelectorAll() won't match the child element because we are calling it from the .current-node element.

eric-hemasystems · 2020-06-22

Good catch! I think all my selectors are simpler than that (usually just looking for a certain class on a certain element). Sounds like we have a choice between:

- crashing on IE 11 if a large number of nodes change
- complex selectors not being supported in certain edge cases

Either kinda sucks.

I wonder if we could detect if a simple selector is being used and if so go native, if not, use the non-native strategy? That would seem to complicate things though. Maybe make it the developer's choice? Have an option to enable the more efficient route but with the caveat that certain complex selectors wouldn't be supported in certain edge cases. That would add less complexity and allow folks who don't have complex selectors but do need to support IE to avoid using a fork of this library.

If either of these options sound like too much complication I would lean towards just closing this and I'll just have my own local fork so I can support IE 11. Hopefully in Oct 2025 when IE 11 is EOL we can move back to the mainline fork. :)

uzairfarooq · 2020-06-26

> I wonder if we could detect if a simple selector is being used and if so go native, if not, use the non-native strategy?

This seems like a good option.

> Maybe make it developer choice? Have an option to enable the more efficient route but with the caveat that certain complex selectors wouldn't be supported in certain edge cases.

Hmm. This option would be a bit hard to explain to normal users. I wonder if we can detect a simple selector using a regex, maybe? I think in a lot of cases users are using simple selectors, and if we can detect that intelligently, it would make a significant performance improvement for a lot of users.

eric-hemasystems · 2020-06-26

I just checked and all of my selectors are a single ID or class on an element. Something like #users_index. The only exception is that I use the comma to specify multiple selectors on the same line (but again we are just looking for a single class or ID for each sub-selector). For example: .form-checkbox, .form-radio

It seems that if we take the selector, split on comma, trim any whitespace before and after each split part, and then check that no split part still contains a space, that would satisfy all my needs. If someone later comes up with a more complicated selector that does not satisfy those criteria but could still use the native search, we could always augment the check, but that would be a good start.
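
A minimal sketch of that check, assuming a hypothetical helper named `isSimpleSelector` (it is not part of arrive and is not the code in this PR):

```javascript
// Hypothetical helper: returns true when every comma-separated part of the
// selector is non-empty and contains no whitespace once trimmed.
function isSimpleSelector(selector) {
  var parts = selector.split(',');
  for (var i = 0; i < parts.length; i++) {
    var part = parts[i].trim();
    // An empty part means a malformed selector; whitespace inside a part
    // means a descendant combinator (or something else we do not handle).
    if (part === '' || /\s/.test(part)) return false;
  }
  return true;
}

isSimpleSelector('#users_index');                // true
isSimpleSelector('.form-checkbox, .form-radio'); // true
isSimpleSelector('.parent-elem .child');         // false
```

Note that a selector written without spaces, such as `.a>.b`, would pass this check, so a stricter version might also want to reject combinator characters.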

If that sounds good I'll update this PR to use this strategy.

uzairfarooq · 2020-06-26

Yeah, this seems like a good start. We can intelligently handle cases where an attribute value contains spaces, e.g. '.test[title="some title"]', but this does not necessarily have to be part of this pull request; we can handle it later.
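
One possible way to handle the attribute-value case later, sketched under assumptions: the function name is hypothetical, quoted strings are assumed not to contain escaped quotes, and nothing here comes from arrive itself.

```javascript
// Hypothetical refinement: ignore whitespace (and commas) that appear inside
// quoted attribute values before applying the "no whitespace" test.
function isSimpleSelectorIgnoringAttributes(selector) {
  var stripped = selector
    // Drop quoted strings so their spaces, commas and brackets are ignored.
    .replace(/"[^"]*"|'[^']*'/g, '')
    // Collapse each remaining [...] block to a placeholder. A placeholder
    // (rather than nothing) keeps ".a [b]" distinguishable from ".a".
    .replace(/\[[^\]]*\]/g, '[]');

  var parts = stripped.split(',');
  for (var i = 0; i < parts.length; i++) {
    var part = parts[i].trim();
    if (part === '' || /\s/.test(part)) return false;
  }
  return true;
}

isSimpleSelectorIgnoringAttributes('.test[title="some title"]'); // true
isSimpleSelectorIgnoringAttributes('.parent [title="x"]');       // false
```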

Also, please send the pull request to the dev branch. I'll merge to master after release.

eric-hemasystems · 2021-07-14

Not sure if this project is still active, but I finally had a few hours to look into closing this issue out.

Turns out my original code should account for the scenario you uncovered. querySelectorAll does not work like jQuery.find: the whole selector is matched against the entire document, and only then are the results limited to descendants of the element it was called on. So calling it from .current-node with '.parent-elem .child' still finds the child. See this note on MDN for more info.

This was news to me also. I was trying to write a test case where taking the native path on a nested selector would fail but taking the original code path would pass. But since querySelectorAll doesn't behave like we both assumed, the test was actually passing either way, and my original code will handle the case you describe above.

Given this, if you are still interested this PR should be good to merge. I have re-pointed the PR to the dev branch.

eric-hemasystems · 2024-05-23T21:12:44Z

Just adding a note to this. I updated this branch to the latest from the dev branch since I was testing out this optimization on another app so it should be ok to merge.

Regarding the discussion of the impl, I think when we last left it off the existing code already addressed your concern so it should be 100% compatible with the existing impl.

Eric Anderson added 2 commits August 30, 2018 14:41

Add left off method check

158c771

This was left off the original commit but is in our tested code (just a copy/paste error). Not all nodes can have children and therefore don't have this method. This means we need a guard to prevent the attempt to use this method.

uzairfarooq requested changes Jun 20, 2020


eric-hemasystems changed the base branch from master to dev July 14, 2021 20:19

Merge branch 'dev' into optimize-recursive-matching

f09f004


