fix(td-has-headers): greatly improve performance of td-has-headers rule #1887

Merged 4 commits on Nov 14, 2019

@straker straker commented Nov 7, 2019

The td-has-headers rule ran slowly on large tables. On my favorite benchmark site (linked through the Web Archive because the site is now gone), https://web.archive.org/web/20190613132353/https://giveawaylisting.com/, it never completed, even after waiting 5 minutes. After digging into the code and perf traces, I found a few problems:

  1. All the table functions recreated the tableGrid by calling findUp and then toGrid: axe.commons.table.toGrid(axe.utils.findUp(cell, 'table')). Creating a grid of 4147 rows was terribly slow, and doing it multiple times for every cell was worse, so we memoized the function. I also needed to clear the memoized cache at the end of a run, which meant keeping track of which functions have been memoized, since there is no global call that clears the cache of all memoized functions.
  2. The getHeaders function called traverse, which had to gather every cell in a given direction before it could process them for results. This was also terribly slow, as every cell had to gather every cell above it and then loop through those cells to determine whether they were headers. That's on the order of (1 + 2 + ... + 4147) * 2 iterations, which grows quadratically with the number of rows. The generic traverse method doesn't really have any way to cache results, so I created a function that caches row and column headers on each node. Now the function only needs to look at 4147 * 3 cells (each cell is checked three times: once for the cell itself, once for the cell below it, and once for the cell to the right). We should remove the traverse method so no one uses it, as it is terrible for performance on large tables.
  3. Even though toGrid was memoized, looking up the memoized cache took ~1.5ms every time. Doing so for every cell means 1.5ms * 4147 ≈ 6s to process the large table in just that one call. So I added the ability to pass the grid in from the check, which already has the table and can call toGrid just once. I did the same for other table functions when the calling function has already computed the tableGrid.
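
As a rough illustration of items 1 and 3, the sketch below shows one way to memoize a grid builder, register every memoized function so all caches can be cleared at the end of a run, and let a caller pass in a grid it has already built. The names `memoizedFns`, `clearMemoizedCaches`, `countCells`, and the stand-in `toGrid` are hypothetical and are not the code in this PR.

```javascript
// Hypothetical registry of every memoized function, so all caches
// can be cleared in one call at the end of a run.
const memoizedFns = [];

function memoize(fn) {
  const cache = new Map(); // keyed by argument identity
  const memoized = function (arg) {
    if (!cache.has(arg)) {
      cache.set(arg, fn(arg));
    }
    return cache.get(arg);
  };
  memoized.clear = () => cache.clear();
  memoizedFns.push(memoized);
  return memoized;
}

function clearMemoizedCaches() {
  memoizedFns.forEach(fn => fn.clear());
}

// Stand-in for an expensive grid builder; counts how often it really runs.
let buildCount = 0;
const toGrid = memoize(table => {
  buildCount++;
  return table.rows.map(row => row.slice());
});

// Hypothetical helper that accepts an already-built grid, so a caller
// that has one can skip even the cache lookup.
function countCells(table, grid = toGrid(table)) {
  return grid.reduce((total, row) => total + row.length, 0);
}

const table = { rows: [['a', 'b'], ['c', 'd']] };
const grid = toGrid(table);  // builds the grid (buildCount is now 1)
countCells(table, grid);     // reuses the grid passed in, no lookup
toGrid(table);               // served from the cache
clearMemoizedCaches();       // end of run: drop every cached grid
toGrid(table);               // rebuilt after the clear (buildCount is now 2)
```

Keeping the registry next to `memoize` means a new memoized function is cleared automatically, without a separate list to maintain.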

The end result of all this work is that the rule now runs in 1499.20ms (average of 5 runs) on giveawaylisting.com.
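
Much of that gain comes from the header caching described in item 2. A minimal sketch of the idea, assuming a hypothetical grid whose cells look like `{ isHeader: boolean }` and tracking only whether a header exists rather than the actual list of headers: each cell reuses the result already stored for the cell above it and the cell to its left, so no cell re-walks its whole column or row.

```javascript
// grid[row][col] is a hypothetical cell object: { isHeader: boolean }.
// Returns a parallel grid of { hasColHeader, hasRowHeader } flags.
function findHeaderInfo(grid) {
  const info = grid.map(row => row.map(() => null));
  // Walk top to bottom, left to right, so the neighbors above and to
  // the left have always been processed before the current cell.
  for (let r = 0; r < grid.length; r++) {
    for (let c = 0; c < grid[r].length; c++) {
      const above = r > 0 ? grid[r - 1][c] : undefined;
      const left = c > 0 ? grid[r][c - 1] : undefined;
      info[r][c] = {
        // A column header exists if the cell above is a header,
        // or if the cell above already found one further up.
        hasColHeader:
          Boolean(above) && (above.isHeader || info[r - 1][c].hasColHeader),
        // Same reasoning for row headers, looking left.
        hasRowHeader:
          Boolean(left) && (left.isHeader || info[r][c - 1].hasRowHeader)
      };
    }
  }
  return info;
}

const H = { isHeader: true };
const D = { isHeader: false };
const sampleGrid = [
  [H, H, H],
  [H, D, D],
  [H, D, D]
];
const info = findHeaderInfo(sampleGrid);
// info[2][2] reports both a column header and a row header, found by
// reading only its two neighbors' cached results.
```

The work per cell is constant, so the total grows linearly with the number of cells instead of quadratically with the number of rows.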

Closes issue: #908

Reviewer checks

Required fields, to be filled out by PR reviewer(s)

  • Follows the commit message policy, appropriate for next version
  • Code is reviewed for security

@straker straker requested a review from a team as a code owner November 7, 2019 18:20
Resolved review threads: lib/core/utils/memoize.js, lib/core/public/run-rules.js, lib/commons/table/get-headers.js
@straker straker merged commit a550309 into develop Nov 14, 2019
@straker straker deleted the perfTdHasHeaders branch November 14, 2019 15:15
straker added a commit that referenced this pull request Dec 11, 2019
…le (#1887)

* fix(td-has-headers): greatly improve performance of td-has-headers rule

* remove console.log

* add tests