feat: implement `--shard` option #12546

marionebl · 2022-03-04T02:57:29Z

Summary

Add a --shard=n/m option, allowing for parallel execution on multiple machines
Use new option in jest's Github workflows
Use new option in jest's CircleCI jobs
More context in Parallelizing across machines in CI #6270

Fixes #6270

Test plan

Run Node.js workflow (e.g. https://github.com/marionebl/jest/actions/runs/1931641255)
Run CircleCI jobs (e.g. https://app.circleci.com/pipelines/github/facebook/jest/7410/workflows/430ee033-9d57-4949-a205-76e047004a99/jobs/120530)
Run integration test suite (includes a new shard test)
Run unit test suite (covers new validation logic)

SimenB

super exciting!

can you add an integration test using this?

and a changelog entry 🙂

packages/jest-config/src/normalize.ts

packages/jest-core/src/runJest.ts

packages/jest-test-sequencer/src/index.ts

Co-authored-by: Simen Bekkhus <sbekkhus91@gmail.com>

marionebl · 2022-03-04T10:04:32Z

marionebl · 2022-03-04T11:42:31Z

@SimenB I applied all requested changes and changed the CircleCI config to make use of the new feature. Let me know :)

nickofthyme

Well done, this is incredibly thorough and quick!

From a quick review, it appears that this only shards the spec files themselves and not the tests within the suites. This implementation would be better than nothing, but to me the true solution would be to shard the total test count between machines. This is how the --shard option works in @playwright/test for single or many test suites.

# shards all tests within a single test file
npx playwright test --shard=1/3 tests/one.test.ts
# shards all tests across 2 test files
npx playwright test --shard=1/3 tests/one.test.ts tests/two.test.ts
# shards all tests across all project test files
npx playwright test --shard=1/3

So if you had 2 test files one with 20 tests and the other with 80, both commands below would run 50 tests, never the same test twice.

 npx playwright test --shard=1/2 # 50 tests run
 npx playwright test --shard=2/2 # 50 tests run

I imagine most jest setups have good distribution and separation of test files such that the test suite sharding solution would still be incredibly useful. I'm not aware if the jest runner setup is suitable for sharding on a per-test basis (i.e. running a consistent subset of tests for for a single test file/suite), maybe a jest maintainer can opine.

I'm just a spectator so feel free to ignore my thoughts. 👍🏼

marionebl · 2022-03-04T23:55:28Z

This implementation would be better than nothing, but to me the true solution would be to shard the total test count between machines.

Good shout! I've toyed with the concept and discarded if after initial tests in a code base I maintain. My observations:

For Jest transpilation cost is proportional to the amount of code under test
For Playwright transpilation cost is proportional to the amount of test code

This assumes we need full blown transpilation before listing test cases which I'm not 100% confident about maybe @SimenB can chime in.

SimenB · 2022-03-05T08:42:42Z

Yeah, we don't want to execute the test files before splitting up as that would require transpilation as mentioned, but also actually running the test file (but not the describes and its) which will cause imports etc. negating much, if not all, of the benefit. That just doesn't fit with Jest's model. The approach made in this PR is the correct one.

SimenB · 2022-03-05T08:46:47Z

One thing to explore for this repo is to shard in a way that mostly splits off e2e/** to separate runs as those are the slow ones

So we should probably, funnily enough, implement our of test scheduler and not use the default one. That said of course, the gains here are already quite massive 😅

SimenB

the numbers reported from CI makes me quite excited about this! 😀

.circleci/config.yml

.github/workflows/nodejs.yml

docs/CLI.md

e2e/__tests__/shard.test.ts

packages/jest-cli/src/cli/args.ts

packages/jest-config/src/parseShardPair.ts

packages/jest-core/src/runJest.ts

SimenB · 2022-03-05T09:09:53Z

packages/jest-test-sequencer/src/index.ts

+    const shardEnd = shardSize * options.shardIndex;
+
+    return [...tests]
+      .sort((a, b) => (a.path > b.path ? 1 : -1))


I wonder if we should be more clever here. Thoughts on calling await this.sort(tests) and then splitting off from there? and then using index % options.shardIndex === 0 or something (not that exactly since it won't work 😅) since sort by default tries to schedule slowest/failing/largest first.

Might be overkill, dunno

I've thought about this one a bit - my team will use a jump consistent hash (https://arxiv.org/pdf/1406.2294.pdf) based on the filename, sort based on that and then apply modulo. That should make for nicely distributed and very stable shards.

In a later iteration I believe we could try and balance the shards to contain a similar amount of predicted test run time based on historical data, not sure yet what that would like yet though 😅

TL;DR If you don't object I'll give a smarter shard algorithm a crack; thoughts on pulling in a jump consistent hashing lib as dependency vs. having it in source?

Worth a go! Might be best to do that in a separate PR though, just to not block this one 🙂 I'm very interested in seeing that happen, though!

I went ahead and added simple hashing of the test path, results are encouraging. The distribution of e2e test went from

108 102 0 0

to

51 48 62 49

Command used:

λ yarn jest --listTests --shard='1/4' | grep e2e | wc -l && yarn jest --listTests --shard='2/4' | grep e2e | wc -l && yarn jest --listTests --shard='3/4' | grep e2e | wc -l && yarn jest --listTests --shard='4/4' | grep e2e | wc -l

For CI this means:

Oh yeah, this is awesome! 8 minutes CI is a dream 😀

packages/jest-config/src/parseShardPair.ts

Co-authored-by: Simen Bekkhus <sbekkhus91@gmail.com>

SimenB · 2022-03-05T18:42:14Z

packages/jest-test-sequencer/src/index.ts

+        const relativeTestPath = path.relative(
+          test.context.config.rootDir,
+          test.path,
+        );


a change in the logic to make the sharding stable between machines with different paths (test.path is an absolute path, so the hash differs between machines). Not really an issue except that we test the sharding in this repo, so we need to be stable 🙂

SimenB · 2022-03-05T19:05:50Z

4 minute CI, gotta love it

EDIT: or even 3 😅

SimenB · 2022-03-05T21:16:19Z

Huh, test is failing on windows node@14, but not 12, 16 or 17...

marionebl · 2022-03-05T22:36:42Z

packages/jest-test-sequencer/src/index.ts

+          test,
+        };
+      })
+      .sort((a, b) => a.hash.localeCompare(b.hash))


Will produce inconsistent results across machines with different locale settings - maybe forcing it to a specific locale would be the best option here?

would have thought that didn't impact file paths, but I've been surprised by FS behaviour before 😛

marionebl · 2022-03-05T23:54:19Z

Test failures:

Node v12.x on macOS-latest (2/4)
- notify › does not report --notify flag
Node v12.x on macOS-latest (3/4)
- on node >=12.17.0 › typescript › reads config from ts file when package.json#type=module
Node v14.x on macOS-latest (3/4)
- on node >=12.17.0 › typescript › reads config from ts file when package.json#type=module
Node v14.x on windows-latest (1/4)
- --shard=1/2 custom sharding test sequencer
Node_ v16.x on macOS-latest (3/4)
- on node >=12.17.0 › typescript › reads config from ts file when package.json#type=module
Node LTS on macOS-latest using jest-jasmine2 (3/4)
- on node >=12.17.0 › typescript › reads config from ts file when package.json#type=module

4 seem to be a recurring timeout issue that only affects macOS. (typescript › reads config)
1 needs investigation but I can't see how it relates to the changes at hand (notify › does not report)
1 seems to be related, possibly caused by localCompare differences (--shard=1/2 custom sharding test sequencer)

SimenB · 2022-03-06T00:25:35Z

This time it failed with jest jasmine, not circus... something is off. I only have access to node 12 on windows, but I'll try dig

SimenB · 2022-03-06T09:13:53Z

Thank you again @marionebl!

marionebl · 2022-03-06T10:02:05Z

Thank you for getting it across the line! 🎉

therynamo · 2022-03-14T17:32:22Z

QQ for you both @SimenB @marionebl - when running coverage with shards there is a discrepancy because coverage doesn't account for the tests being sliced. So the coverage report comes back with a lot of failed coverage. Is there a way to use shards with coverage right now? Or should I raise an issue that can be worked on?

llwt · 2022-03-15T19:07:37Z

@therynamo ~~This package is old, but still does the trick for me https://www.npmjs.com/package/istanbul-combine~~

Istanbul report supports it natively now: jamestalmage/istanbul-combine#2

istanbul report --dir /path/to/coverage/dir --include **/*coverage.json json

therynamo · 2022-03-15T19:28:09Z

@llwt - woah! Nice! So I just need to make sure that jest allows for the --include flag on --coverage. If it isn't there - that'd likely be an easy add to the CLI/Config options.

Wow - great find! Thank you for the reply too 👍

SimenB · 2022-03-15T22:35:58Z

You'll need to merge the reports manually, yeah. Or use a CI/reporter that does it for you (e.g. coveralls)

github-actions · 2022-04-15T00:07:39Z

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Please note this issue tracker is not a help forum. We recommend using StackOverflow or our discord channel for questions.

facebook-github-bot added the cla signed label Mar 4, 2022

feat: implement --shard option jestjs#6270

3052d3b

marionebl force-pushed the 6270 branch 2 times, most recently from 7934540 to 3052d3b Compare March 4, 2022 05:34

marionebl mentioned this pull request Mar 4, 2022

Parallelizing across machines in CI #6270

Closed

SimenB requested changes Mar 4, 2022

View reviewed changes

fix: remove unneeded optional chain

9a02991

Co-authored-by: Simen Bekkhus <sbekkhus91@gmail.com>

marionebl added 3 commits March 4, 2022 21:26

fix: validate shard option in runJest too

1bdab88

fix: simplify .shard control flow

edcd51b

test: add shard e2e test

8a6d66f

marionebl force-pushed the 6270 branch from 0b5b53e to 5ff3de4 Compare March 4, 2022 11:02

ci: try dogfooding on circleci

fc7574d

marionebl force-pushed the 6270 branch from 5ff3de4 to fc7574d Compare March 4, 2022 11:03

marionebl added 3 commits March 4, 2022 22:07

test: fix failing test

d524ecf

docs: add changelog entry

392f2f2

ci: simplify circleci config

d507ae5

marionebl force-pushed the 6270 branch from ec06241 to d507ae5 Compare March 4, 2022 11:22

nickofthyme reviewed Mar 4, 2022

View reviewed changes

marionebl requested a review from SimenB March 4, 2022 23:58

SimenB requested changes Mar 5, 2022

View reviewed changes

SimenB reviewed Mar 5, 2022

View reviewed changes

packages/jest-config/src/parseShardPair.ts Outdated Show resolved Hide resolved

marionebl and others added 4 commits March 5, 2022 20:29

Apply formatting suggestion

002e9d5

Co-authored-by: Simen Bekkhus <sbekkhus91@gmail.com>

Grammar fix

17a2f41

Co-authored-by: Simen Bekkhus <sbekkhus91@gmail.com>

fix: validate only once

c2cc758

test: cover negative number validation

9136939

SimenB reviewed Mar 5, 2022

View reviewed changes

SimenB added 3 commits March 5, 2022 19:44

localeCompare, not number conversion

74b1a0c

unit test

798fd40

type errors in test

128047f

oops

6facf86

SimenB changed the title ~~Implement shard option #6270~~ Implement shard option Mar 5, 2022

SimenB added 2 commits March 5, 2022 20:15

move docs and mention in troubleshooting

be17873

maybe?

70d073f

marionebl commented Mar 5, 2022

View reviewed changes

compare manually

3748808

SimenB added 2 commits March 6, 2022 09:31

maybe

75014e9

shard coverage run

a2bb729

SimenB changed the title ~~Implement shard option~~ feat: implement --shard option Mar 6, 2022

SimenB merged commit 54eab57 into jestjs:main Mar 6, 2022

noseworthy mentioned this pull request Mar 7, 2022

Use default export of jest-environment if present shelfio/jest-mongodb#331

Merged

swissspidy mentioned this pull request Mar 8, 2022

Update to Jest v28 GoogleForCreators/web-stories-wp#10843

Closed

F3n67u pushed a commit to F3n67u/jest that referenced this pull request Mar 9, 2022

feat: implement --shard option (jestjs#12546)

73def6d

lemonmade mentioned this pull request Mar 10, 2022

Bring in Jest's new --shard flag lemonmade/quilt#175

Closed

github-actions bot locked as resolved and limited conversation to collaborators Apr 15, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: implement `--shard` option #12546

feat: implement `--shard` option #12546

marionebl commented Mar 4, 2022 •

edited by SimenB

Loading

SimenB left a comment

marionebl commented Mar 4, 2022 •

edited

Loading

marionebl commented Mar 4, 2022

nickofthyme left a comment •

edited

Loading

marionebl commented Mar 4, 2022

SimenB commented Mar 5, 2022

SimenB commented Mar 5, 2022

SimenB left a comment

SimenB Mar 5, 2022

marionebl Mar 5, 2022

marionebl Mar 5, 2022

SimenB Mar 5, 2022

marionebl Mar 5, 2022

SimenB Mar 5, 2022

SimenB Mar 5, 2022

SimenB commented Mar 5, 2022 •

edited

Loading

SimenB commented Mar 5, 2022

marionebl Mar 5, 2022

SimenB Mar 5, 2022

marionebl commented Mar 5, 2022

SimenB commented Mar 6, 2022

SimenB commented Mar 6, 2022

marionebl commented Mar 6, 2022

therynamo commented Mar 14, 2022

llwt commented Mar 15, 2022 •

edited

Loading

therynamo commented Mar 15, 2022

SimenB commented Mar 15, 2022

github-actions bot commented Apr 15, 2022

feat: implement --shard option #12546

feat: implement --shard option #12546

Conversation

marionebl commented Mar 4, 2022 • edited by SimenB Loading

Summary

Test plan

SimenB left a comment

Choose a reason for hiding this comment

marionebl commented Mar 4, 2022 • edited Loading

Todo

marionebl commented Mar 4, 2022

nickofthyme left a comment • edited Loading

Choose a reason for hiding this comment

marionebl commented Mar 4, 2022

SimenB commented Mar 5, 2022

SimenB commented Mar 5, 2022

SimenB left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

SimenB commented Mar 5, 2022 • edited Loading

SimenB commented Mar 5, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

marionebl commented Mar 5, 2022

SimenB commented Mar 6, 2022

SimenB commented Mar 6, 2022

marionebl commented Mar 6, 2022

therynamo commented Mar 14, 2022

llwt commented Mar 15, 2022 • edited Loading

therynamo commented Mar 15, 2022

SimenB commented Mar 15, 2022

github-actions bot commented Apr 15, 2022

feat: implement `--shard` option #12546

feat: implement `--shard` option #12546

marionebl commented Mar 4, 2022 •

edited by SimenB

Loading

marionebl commented Mar 4, 2022 •

edited

Loading

nickofthyme left a comment •

edited

Loading

SimenB commented Mar 5, 2022 •

edited

Loading

llwt commented Mar 15, 2022 •

edited

Loading