Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: Cache config.cache across runs to speed up build #104800

Merged
merged 4 commits into from
May 25, 2023

Conversation

hugovk
Copy link
Member

@hugovk hugovk commented May 23, 2023

Run ./configure with the --config-cache to create a cache file on Ubuntu and macOS jobs, and use https://github.com/actions/cache to re-use it between runs:

uses: actions/cache@v3
with:
  path: config.cache
  key: ${{ github.job }}-${{ hashFiles('configure') }}-${{ hashFiles('configure.ac') }}

When the key changes, the action will upload a fresh cache file.

So we have a unique cache file per job ID (github.job, for example: build_macos, build_ubuntu_ssltests), and we'll also get a fresh cache file whenever the configure or configure.ac files change.

Seconds to download cache + configure:

Job Without cache With cache Times faster Seconds saved
check_generated_files 22 4 5.5 18
build_macos 96 17 5.6 79
build_ubuntu 23 4 5.8 19
build_ubuntu_ssltests (1.1.1t) 24 4 6.0 20
build_ubuntu_ssltests (3.0.8) 25 8 3.1 17
build_ubuntu_ssltests (3.0.8) 25 7 3.6 18
test_hypothesis 24 4 6.0 20
build_asan 29 5 5.8 24
         
Total: 268 53 5.1 215
  link link

@hugovk hugovk marked this pull request as ready for review May 23, 2023 15:44
@hugovk hugovk requested a review from ezio-melotti as a code owner May 23, 2023 15:44
@itamaro
Copy link
Contributor

itamaro commented May 23, 2023

where is the cache file stored? is it using per-repo or per-org storage?
wondering about that, thinking how it will behave in forks.

I'm not familiar with config.cache - are the configure and configure.ac files the only relevant inputs?
what's the impact on total CI runtime when there's a cache hit? cache miss?

I also tried thinking about cross-branch and cross-PR interactions, but since the key uses the hash of the input files then if the hashes are identical across branches and PRs then it should be ok to reuse the cached file.

one concern is implicit inputs that are not captured by the current key:

  • the "job name" is a proxy to flags that affect the result, but it is possible that flags change and the job name stays the same
  • the versions of the github actions used may change and implicitly affect the result, but afaik that is not captured in the cache key

@hugovk
Copy link
Member Author

hugovk commented May 23, 2023

where is the cache file stored? is it using per-repo or per-org storage? wondering about that, thinking how it will behave in forks.

https://github.com/actions/cache does it per-org, so it won't be shared with forks. Forks will have their own copies.

I'm not familiar with config.cache - are the configure and configure.ac files the only relevant inputs?

Let's ask @erlend-aasland :)

what's the impact on total CI runtime when there's a cache hit? cache miss?

I ran this PR twice: once with no cache, and second time with cache present.

  • See the "Post Restore config.cache" steps at the end for uploading a cache in the first run: they take either 0 or 1 seconds.

  • See the "Restore config.cache" steps at the start for restoring a cache in the second run: they take either 0 or 1 seconds.

I also tried thinking about cross-branch and cross-PR interactions, but since the key uses the hash of the input files then if the hashes are identical across branches and PRs then it should be ok to reuse the cached file.

one concern is implicit inputs that are not captured by the current key:

  • the "job name" is a proxy to flags that affect the result, but it is possible that flags change and the job name stays the same

Could you give an example?

  • the versions of the github actions used may change and implicitly affect the result, but afaik that is not captured in the cache key

I think the action will invalidate the cache itself if they change something, but I couldn't find anything to confirm. I also don't think the action's version is exposed, anyway.

@arhadthedev
Copy link
Member

arhadthedev commented May 23, 2023

Could you give an example?

  • check_generated_files: --config-cache --with-pydebug --enable-shared
  • build_macos: --config-cache --with-pydebug --prefix=/opt/python-dev --with-openssl="$(brew --prefix openssl@1.1)"
  • build_ubuntu, build_ubuntu_ssltests, test_hypothesis: --config-cache --with-pydebug --with-openssl=$OPENSSL_DIR
  • build_asan: --config-cache --with-address-sanitizer --without-pymalloc

Also, environment variables:

  • build_macos:
    • HOMEBREW_NO_ANALYTICS: 1
    • HOMEBREW_NO_AUTO_UPDATE: 1
    • HOMEBREW_NO_INSTALL_CLEANUP: 1
    • PYTHONSTRICTEXTENSIONBUILD: 1
  • build_ubuntu, test_hypothesis:
    • OPENSSL_VER: 1.1.1t
    • PYTHONSTRICTEXTENSIONBUILD: 1
  • build_ubuntu_ssltests:
    • OPENSSL_VER: ${{ matrix.openssl_ver }}
    • MULTISSL_DIR: ${{ github.workspace }}/multissl
    • OPENSSL_DIR: ${{ github.workspace }}/multissl/openssl/${{ matrix.openssl_ver }}
    • LD_LIBRARY_PATH: ${{ github.workspace }}/multissl/openssl/${{ matrix.openssl_ver }}/lib
  • build_asan:
    • OPENSSL_VER: 1.1.1t
    • PYTHONSTRICTEXTENSIONBUILD: 1
    • ASAN_OPTIONS: detect_leaks=0:allocator_may_return_null=1:handle_segv=0

@itamaro
Copy link
Contributor

itamaro commented May 23, 2023

@arhadthedev provided examples :)

the result can also be affected by the exact OS version (I don't know how granular runner.os is), which toolchain is used (gcc/clang), toolchain versions, and host architecture. any changes in any of these should invalidate the cache.

regarding the implementation - hashFiles can operate on multiple files, so use hashFiles(file1, file2) instead.
also for a given commit, the hash of these files is constant - would it be possible to extract the hash computation to perform it once and share between jobs?

@hugovk
Copy link
Member Author

hugovk commented May 23, 2023

@arhadthedev provided examples :)

👍

the result can also be affected by the exact OS version (I don't know how granular runner.os is), which toolchain is used (gcc/clang), toolchain versions, and host architecture. any changes in any of these should invalidate the cache.

runner.os: "The operating system of the runner executing the job. Possible values are LinuxWindows, or macOS."

https://docs.github.com/en/actions/learn-github-actions/contexts#runner-context

I added the workflow file to hashFiles, so it will invalidate and make a new cache when things defined in the file change, for example the env vars and CLI flags.

There's a runner.arch: "The architecture of the runner executing the job. Possible values are X86X64ARM, or ARM64" but we're using defaults and don't expect these will change?

regarding the implementation - hashFiles can operate on multiple files, so use hashFiles(file1, file2) instead.

Combined.

also for a given commit, the hash of these files is constant - would it be possible to extract the hash computation to perform it once and share between jobs?

Possibly but I'm not sure if it's worth the complexity: we can just have hashFiles computed each time.

@itamaro
Copy link
Contributor

itamaro commented May 23, 2023

LG!

I added the workflow file to hashFiles, so it will invalidate and make a new cache when things defined in the file change, for example the env vars and CLI flags.

this will be over-conservative, invalidating for any change in the workflow file, even if it doesn't affect the result (which is probably most changes). probably a fair trade off (preferring correctness over CI latency).

Possibly but I'm not sure if it's worth the complexity: we can just have hashFiles computed each time.

agreed. I'd suggest at least refactoring the list of files to be a shared constant.

Copy link
Contributor

@erlend-aasland erlend-aasland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. With the workflow file in the cache key, we're pretty safe regarding any environment change.

@erlend-aasland
Copy link
Contributor

I guess we want this CI speedup in the other branches as well.

@hugovk
Copy link
Member Author

hugovk commented May 24, 2023

I'd suggest at least refactoring the list of files to be a shared constant.

Do you know how to do this? I tried a few things, but didn't get it working.

@itamaro
Copy link
Contributor

itamaro commented May 24, 2023

Do you know how to do this? I tried a few things, but didn't get it working.

I think workflow-level environment variable can work for this

https://docs.github.com/en/actions/learn-github-actions/variables

@hugovk
Copy link
Member Author

hugovk commented May 24, 2023

I tried that like:

env:
  CONFIG_CACHE_HASH_FILES: ${{ hashFiles('configure', 'configure.ac', '.github/workflows/build.yml') }}

and:

          key: ${{ github.job }}-${{ runner.os }}-${{ env.CONFIG_CACHE_HASH_FILES }}

But it said "The workflow is not valid ", "Unrecognized function: 'hashFiles'"

I also tried:

env:
  CONFIG_CACHE_HASH_FILES: ('configure', 'configure.ac', '.github/workflows/build.yml')

and:

          key: ${{ github.job }}-${{ runner.os }}-${{ hashFiles(env.CONFIG_CACHE_HASH_FILES) }}

But it couldn't read the env var:

Cache not found for input keys: check_generated_files-Linux-

Let me try again with format():

https://github.com/orgs/community/discussions/25761#discussioncomment-4626908

@hugovk
Copy link
Member Author

hugovk commented May 24, 2023

Hmm, no luck, because it formats into strings, and we need a sort of tuple of strings.

We might be able to do something like:

env:
  CONFIG_CACHE_HASH_FILE1: 'configure'
  CONFIG_CACHE_HASH_FILE2: 'configure.ac'
  CONFIG_CACHE_HASH_FILE3: '.github/workflows/build.yml'

And then format it something like this:

          key: ${{ github.job }}-${{ runner.os }}-${{ hashFiles(format('{0}', env.CONFIG_CACHE_HASH_FILE1))-${{ hashFiles(format('{0}', env.CONFIG_CACHE_HASH_FILE2))-${{ format('{0}', env.CONFIG_CACHE_HASH_FILE3)) }}

But it feels like a backward refactor...

@itamaro
Copy link
Contributor

itamaro commented May 24, 2023

But it feels like a backward refactor...

yeah, I agree, this doesn't improve anything. let's leave it as explicit lists of files, I might give it another try separately after this merges.

@hugovk hugovk merged commit 1080c43 into python:main May 25, 2023
@hugovk hugovk deleted the ci-config.cache branch May 25, 2023 11:09
@erlend-aasland
Copy link
Contributor

Backport through all security branches?

@hugovk hugovk added needs backport to 3.11 only security fixes needs backport to 3.12 bug and security fixes labels May 26, 2023
@miss-islington
Copy link
Contributor

Sorry @hugovk, I had trouble checking out the 3.12 backport branch.
Please retry by removing and re-adding the "needs backport to 3.12" label.
Alternatively, you can backport using cherry_picker on the command line.
cherry_picker 1080c4386dd3beb9ee808acdf3c3f01835f73860 3.12

hugovk added a commit to hugovk/cpython that referenced this pull request May 26, 2023
…-104800)

(cherry picked from commit 1080c43)

Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
@bedevere-bot
Copy link

GH-104967 is a backport of this pull request to the 3.12 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.12 bug and security fixes label May 26, 2023
@bedevere-bot
Copy link

GH-104968 is a backport of this pull request to the 3.11 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.11 only security fixes label May 26, 2023
hugovk added a commit to hugovk/cpython that referenced this pull request May 26, 2023
…-104800).\n(cherry picked from commit 1080c43)\n\n\nCo-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
@miss-islington
Copy link
Contributor

Thanks @hugovk for the PR 🌮🎉.. I'm working now to backport this PR to: 3.7.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Thanks @hugovk for the PR 🌮🎉.. I'm working now to backport this PR to: 3.8.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Thanks @hugovk for the PR 🌮🎉.. I'm working now to backport this PR to: 3.9.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Thanks @hugovk for the PR 🌮🎉.. I'm working now to backport this PR to: 3.10.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Sorry, @hugovk, I could not cleanly backport this to 3.7 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker 1080c4386dd3beb9ee808acdf3c3f01835f73860 3.7

@miss-islington
Copy link
Contributor

Sorry @hugovk, I had trouble checking out the 3.9 backport branch.
Please retry by removing and re-adding the "needs backport to 3.9" label.
Alternatively, you can backport using cherry_picker on the command line.
cherry_picker 1080c4386dd3beb9ee808acdf3c3f01835f73860 3.9

@miss-islington
Copy link
Contributor

Sorry, @hugovk, I could not cleanly backport this to 3.8 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker 1080c4386dd3beb9ee808acdf3c3f01835f73860 3.8

@miss-islington
Copy link
Contributor

Sorry @hugovk, I had trouble checking out the 3.10 backport branch.
Please retry by removing and re-adding the "needs backport to 3.10" label.
Alternatively, you can backport using cherry_picker on the command line.
cherry_picker 1080c4386dd3beb9ee808acdf3c3f01835f73860 3.10

@hugovk
Copy link
Member Author

hugovk commented May 26, 2023

Actually, maybe we should skip the security branches?

They've already diverged quite a bit, and don't get built (or backported) as often, so there's less benefit.

@erlend-aasland
Copy link
Contributor

Actually, maybe we should skip the security branches?

They've already diverged quite a bit, and don't get built (or backported) as often, so there's less benefit.

Yeah, makes sense.

hugovk added a commit that referenced this pull request May 26, 2023
… (#104968)

Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
@itamaro
Copy link
Contributor

itamaro commented May 27, 2023

I might give it another try separately after this merges.

I think it worked out nicely in gh-105008

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants