fix(td-has-headers): greatly improve performance of td-has-headers rule #1887
`td-has-headers` was a slow-running rule for large tables. On my favorite benchmark site (web archive because it's now gone), https://web.archive.org/web/20190613132353/https://giveawaylisting.com/, it never completed (after waiting for 5 minutes). After digging into the code and perf traces, I noticed there were a few problems with the code:

1. Every cell computed the `tableGrid` by calling `findUp` and then `toGrid`: `axe.commons.table.toGrid(axe.utils.findUp(cell, 'table'))`. Creating a grid of 4147 rows was terribly slow, and having to do it multiple times for every cell was worse. So we memoized the function. I also needed to clear the memoized function cache at the end of a run, so I needed a way to track which functions have been memoized, as there is no global call to clear the cache of all memoized functions.
2. The `getHeaders` function called `traverse`, which had to gather every cell in a given direction before it could process them for results. This was also terribly slow, as every cell would have to gather every cell above it and then loop through those cells to determine if they were headers. That's approximately (1 + 2 + … + 4147) * 2 ≈ 17 million iterations. The generic `traverse` method doesn't really have any way to cache results, so I had to create a function that caches row and column headers on each node. Now the function only needs to look at 4147 * 3 cells (each cell is checked three times: once for the cell itself, once for the cell below it, and once for the cell to the right). We should remove the `traverse` method so no one uses it, as it is terrible for performance on large tables.
3. Even after `toGrid` was memoized, looking up the memoized cache took ~1.5ms every time. Doing so for every cell means it takes 1.5ms * 4147 ≈ 6s to process the large table in just that one call. So I added the ability to pass the grid from the check, which already has the `table` and can call `toGrid` just once. I did the same for other table functions when the `tableGrid` has already been calculated by the calling function.

The end result of all this work is that the rule now runs in 1499.20ms (average of 5 runs) on giveawaylisting.com.
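
To illustrate the memoization and header-caching changes described above, here is a minimal sketch. It assumes plain arrays of `{ text, isHeader }` objects in place of DOM cells, and the names `memoizedCaches`, `clearMemoizedCaches`, and `collectHeaders` are hypothetical; this is not the code in the PR, only one way the same ideas could look.

```javascript
// --- Part 1: memoize with a registry so every cache can be cleared ---
// Hypothetical helper names; not axe-core's actual internals.
const memoizedCaches = [];

function memoize(fn) {
  const cache = new Map();
  memoizedCaches.push(cache); // remember the cache for end-of-run cleanup
  return arg => {
    if (!cache.has(arg)) {
      cache.set(arg, fn(arg));
    }
    return cache.get(arg);
  };
}

function clearMemoizedCaches() {
  memoizedCaches.forEach(cache => cache.clear());
}

// Stand-in for an expensive grid builder; counts how often it really runs.
let gridBuilds = 0;
const toGrid = memoize(table => {
  gridBuilds++;
  return table.rows;
});

// --- Part 2: cache headers per cell instead of re-walking rows/columns ---
// Each cell reuses the result already stored for the cell above it and the
// cell to its left, so no cell has to gather its whole column or row again.
function collectHeaders(grid) {
  const cached = [];
  for (let y = 0; y < grid.length; y++) {
    cached.push([]);
    for (let x = 0; x < grid[y].length; x++) {
      const above = y > 0 ? grid[y - 1][x] : undefined;
      const left = x > 0 ? grid[y][x - 1] : undefined;
      const col = above ? [...cached[y - 1][x].col] : [];
      const row = left ? [...cached[y][x - 1].row] : [];
      if (above && above.isHeader) col.push(above.text);
      if (left && left.isHeader) row.push(left.text);
      cached[y].push({ col, row });
    }
  }
  return cached;
}

// Illustrative data only.
const table = {
  rows: [
    [{ text: 'Name', isHeader: true }, { text: 'Price', isHeader: true }],
    [{ text: 'Apple', isHeader: true }, { text: '1.00', isHeader: false }],
    [{ text: 'Pear', isHeader: true }, { text: '2.00', isHeader: false }]
  ]
};

// Build the grid once in the caller and pass it along, rather than
// looking it up again for every cell.
const grid = toGrid(table);
toGrid(table); // second call is served from the cache
const headers = collectHeaders(grid);
```

Here `headers[2][1]` holds the column header `Price` and the row header `Pear` for the last data cell, and calling `clearMemoizedCaches()` at the end of a run forces the next `toGrid(table)` call to rebuild the grid.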
Closes issue: #908
Reviewer checks
Required fields, to be filled out by PR reviewer(s)