Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: speed up windows jobs using ReFS #3522

Merged
merged 1 commit into from
May 13, 2024

Conversation

samypr100
Copy link
Collaborator

@samypr100 samypr100 commented May 11, 2024

Summary

Switch to using Virtual HD + ReFS to speed up windows jobs.

Closes #3508

@zanieb
Copy link
Member

zanieb commented May 11, 2024

So promising, thanks for exploring! You can push an empty commit to test the cached test time.

@samypr100
Copy link
Collaborator Author

samypr100 commented May 11, 2024

Stats from first run...

Step Before After
Install Toolchain 1m ~ 11s~
Install nextest 15s ~ 8s~
Cargo Tests 5m 25s ~ (cached) 5m 44s ~ (uncached) -- TODO check cached speeds

Follow up updated run, will do a few more to try get the cached versions of some of these

Steps Before After
Clippy 30s~ (cached) 2m ~ (uncached)
Cache Restoration 1.5m ~ 12s ~
Installing Toolchain 1m ~ 11s~
Install nextest 15s ~ 8s~
Cargo Tests 5m 25s ~ (cached) 3m 59s ~ (cached)
Trampoline 24s ~ (cached) 25s ~ (uncached)
build binary 2m ~ (cached) 2m 30s ~ (uncached)

@samypr100
Copy link
Collaborator Author

More Step Totals

Steps Before After
Clippy step 30s~ (cached) 30s ~ (cached)
Cache Restoration 1.5m ~ 12s ~
Installing Toolchain 1m ~ 11s~
Install nextest 15s ~ 8s~
Cargo Tests 5m 25s ~ (cached) 3m 59s ~ (cached)
trampoline build step 24s ~ (cached) 24s ~ (cached)
binary build step 2m ~ (cached) 50s ~ (cached)

Job Totals (looking at a few runs)

Job Before After
Clippy 1m-1.5m ~ 1m-1.5m ~
Cargo Tests 8m 4m-5m ~
Trampoline build 4m ~ 2m ~
build binary 4m 1m-2m ~

Other notes might be worth adjusting the vhdx default size to speed up it's creation as it can take between 10s to 30s. I'll reduce to 5GB next.

@samypr100
Copy link
Collaborator Author

samypr100 commented May 11, 2024

Reduced to 5GB clippy & trampoline, left test & build binary 10GB since it ran out of space otherwise. Made no substantial difference, left at 10GB.

@samypr100 samypr100 changed the title feat: speed up windows tests using ReFS feat: speed up windows jobs using ReFS May 11, 2024
@samypr100
Copy link
Collaborator Author

samypr100 commented May 11, 2024

In summary, migrated windows jobs got 1.5x-2.x ~ speed up except clippy which stayed at 1x. Clippy job can be a bit faster or slower depending on how fast the VHDX gets created. Maybe switching to cargo-xwin as described in #3507 along with VHDX from this PR can unlock faster speeds.

Note, I did not get data for uncached before. Only cached before and cached/uncached after. So it's possible when comparing clippy before and after uncached it might be faster.

@samypr100 samypr100 marked this pull request as ready for review May 11, 2024 15:22
@charliermarsh
Copy link
Member

I'll defer to @konstin and @zanieb on the review but based on the numbers, this seems really good! You're awesome!

@zanieb
Copy link
Member

zanieb commented May 13, 2024

Should we skip it on Clippy for now then? I don't know if it merits the complexity there if it's both slower/faster.

@charliermarsh
Copy link
Member

I don't mind keeping it for consistency but defer to you.

@zanieb
Copy link
Member

zanieb commented May 13, 2024

I don't know, we need to split the job out of the matrix and maybe we're going to switch to xwin later? 🤷‍♀️ I don't have strong opinions though.

@charliermarsh
Copy link
Member

Sadly the decision falls upon you...

@samypr100
Copy link
Collaborator Author

samypr100 commented May 13, 2024

Note clippy might still be faster in purely uncached times, I haven't compared though. I'd say worth trying it out for a bit and we can change later easily.

Agreed, changing to xwin will require a split from the matrix regardless.

@zanieb zanieb added testing Internal testing of behavior internal A refactor or improvement that is not user-facing labels May 13, 2024
@zanieb zanieb merged commit 790206b into astral-sh:main May 13, 2024
44 checks passed
@samypr100 samypr100 deleted the windows-fs-speedup branch May 14, 2024 02:11
@zooba
Copy link

zooba commented May 14, 2024

This is awesome to see!

You might need some error handling in case you land on an older system, but there's a -DevDrive option to Format-Volume that ought to improve things further. It may not be hugely significant since this is already a somewhat stripped back CI system rather than an OS install.

Putting the VHDX in $env:GITHUB_WORKSPACE might also help (I don't know exactly how the GHA setup looks, but it's based on Azure Pipelines which puts the OS on a slower disk than the workspace), and passing -Fixed to New-VHD may also help (but I'm less confident about that one).

@samypr100
Copy link
Collaborator Author

samypr100 commented May 15, 2024

Thanks! @zooba

there's a -DevDrive option to Format-Volume that ought to improve things further

Great point! I did try out -DevDrive, but got a Format-Volume: A parameter cannot be found that matches parameter name 'DevDrive'. since it seems it works on windows versions 10.0.22621 or greater and runners seem to be a bit below that ~10.0.20348.

I did add that check to the new action I created out of this PR that uses -DevDrive when it meets the minimum os version (e.g. self hosted runners) and also uses the same drive ${{ github.workspace }} uses. Feedback welcome of course :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internal A refactor or improvement that is not user-facing testing Internal testing of behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Explore using a dev drive for windows CI
4 participants