Is there a way to use parallelism when building the wheels? #1088
Comments
You can use the GitHub Actions matrix, or some equivalent feature in the CI system you use. I did that recently on setproctitle and the build went down from hours to 10 minutes. The result is some 30 parallel jobs.
That does help, but the individual jobs still take 10 minutes to complete. When I build the wheel locally using Bazel, after clearing Bazel's cache, the build and tests finish in 1 minute. So there's still something making things very slow in cibuildwheel compared to other strategies for building the wheel. A likely candidate is that each worker is only using one thread to build the C++ code, so it goes one... file... at... a... time... 10 minutes on GitHub (the worst one takes 20 minutes, but 10 seems to be the common value); 1 minute locally (on a 12-core machine).
CI runners have at most 2 cores on most CIs, and run on shared resources. So that's 6x slower than local, or the equivalent of 6 minutes of local time. Plus cibuildwheel is downloading Python, setting up multiple virtual environments, downloading dependencies, running tests, etc. - that can easily account for the remaining 4 minutes. So that sounds perfectly reasonable. Your build should be using both available cores as long as you've set it up to do that. Running cibuildwheel itself in parallel would not give you much at all. Though I guess you could do it yourself with
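As a minimal sketch (not from this thread) of what "setting it up to do that" could look like in a GitHub Actions step: cibuildwheel's CIBW_ENVIRONMENT option can forward a job count into the build environment. Which variable actually takes effect depends on your own build backend, so treat the values below as illustrative.

- uses: pypa/cibuildwheel@v2.10.0
  env:
    # MAKEFLAGS is read by make-driven builds, CMAKE_BUILD_PARALLEL_LEVEL by
    # CMake 3.12+; both are just passed through to the build that cibuildwheel runs.
    CIBW_ENVIRONMENT: "MAKEFLAGS=-j2 CMAKE_BUILD_PARALLEL_LEVEL=2"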
If you actually look at what is taking time, it is not downloading dependencies or running tests. In one of my "fast" cases, the setup and teardown take about a minute and the tests also take about a minute. Which leaves 8 minutes of building. I'm already doing only one wheel build per worker, so I'll get no benefit from two invocations. What I need is for cibuildwheel to use two cores for one build, so that it is not processing one C++ file at a time. I did realize that my worst offenders are actually building numpy and pandas, in addition to my wheel, which is why they take something like 40 minutes instead of 10-20. The actual time spent building my wheel is 12.5 minutes. Maybe I should go poke numpy and pandas to have wheels for
At least numpy and pandas do build in parallel. Setuptools doesn't directly support it (unless you have multiple extensions), but it's easy to patch in; pybind11 and numpy both have utilities for it (and of course CMake via scikit-build, etc. all support it). As long as you are doing that, you aren't wasting that much time by not running cibuildwheel in two threads. Yes, if you provide wheels for a platform your dependencies don't, I'm not sure it's very helpful: users will have to build numpy & pandas to use your packaged binary anyway. Also make sure you are using the same manylinux family they are using, or newer (like manylinux2014 for Python 3.10, which we default to these days). If you go older, you'll have to build them from source if you use them.
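For reference, a minimal setup.py sketch of the pybind11 utility mentioned above; the package and source names are placeholders, not anything from this thread.

# setup.py - sketch of per-extension parallel compilation.
# ParallelCompile patches distutils' compile step so the source files of a
# single extension are compiled concurrently; NPY_NUM_BUILD_JOBS is the
# environment variable it consults for the job count.
from setuptools import setup
from pybind11.setup_helpers import Pybind11Extension, ParallelCompile

ParallelCompile("NPY_NUM_BUILD_JOBS").install()

setup(
    name="mypackage",  # placeholder project name
    ext_modules=[
        Pybind11Extension("mypackage._core", ["src/core.cpp"]),  # placeholder sources
    ],
)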
I don't currently think it makes sense to provide build parallelism within cibuildwheel. There are already options to do it at a lower level (compiler flags) or at a higher level (CI build matrices). Adding it in cibuildwheel is probably not going to be worth the complexity.
One thing that could potentially improve build performance would be to be a bit cleverer about the network I/O. E.g. we could download the next Docker image or Python version while the previous build is running. That might save a little time. But again, would it be worth the added complexity? Not sure...
I like the high-level (CI build matrices) solution; I can split out arch and Linux flavour like below. But since QEMU is painfully slow for aarch64, and cibuildwheel runs a bunch of tests on the wheels after building, the wait for aarch64 to finish at a concurrency of 2 still results in hours of CI. The solution is to additionally split out Python versions. I would however prefer not to hardcode Python versions in CI, rather to let the cibuildwheel
Edit: found a more or less neat way; the updated snippet below evenly distributes current and future Python releases over 5 build jobs, assuming a package that supports 3.6+ (and stable cibuildwheel currently distributing up to ...).
# gh actions
jobs:
  build-wheels:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        include:
          - {os: macos-latest, arch: x86_64, build: "*"}
          - {os: macos-latest, arch: arm64, build: "*"}
          - {os: windows-latest, arch: AMD64, build: "*"}
          - {os: windows-latest, arch: x86, build: "*"}
          - {os: ubuntu-latest, arch: x86_64, build: "*"}
          - {os: ubuntu-latest, arch: aarch64, build: "*[61]-manylinux*"}
          - {os: ubuntu-latest, arch: aarch64, build: "*[72]-manylinux*"}
          - {os: ubuntu-latest, arch: aarch64, build: "*[83]-manylinux*"}
          - {os: ubuntu-latest, arch: aarch64, build: "*[94]-manylinux*"}
          - {os: ubuntu-latest, arch: aarch64, build: "*[05]-manylinux*"}
          - {os: ubuntu-latest, arch: aarch64, build: "*[61]-musllinux*"}
          - {os: ubuntu-latest, arch: aarch64, build: "*[72]-musllinux*"}
          - {os: ubuntu-latest, arch: aarch64, build: "*[83]-musllinux*"}
          - {os: ubuntu-latest, arch: aarch64, build: "*[94]-musllinux*"}
          - {os: ubuntu-latest, arch: aarch64, build: "*[05]-musllinux*"}
    steps:
      - uses: docker/setup-qemu-action@v2
        if: matrix.os == 'ubuntu-latest'
      - uses: pypa/cibuildwheel@v2.10.0
        env:
          CIBW_BUILD_VERBOSITY: 1
          CIBW_ARCHS: ${{ matrix.arch }}
          CIBW_BUILD: ${{ matrix.build }}
(Though that's kind of neat too)
Waiting for cibuildwheel to finish is really painful, even for a single wheel. Is there a flag I can use in order to force it to use multiple threads when building? For example, when building code with make I can specify --jobs 8 to build 8 source files at a time instead of 1 at a time. For scale, cibuildwheel is one hundred (!!!!!) times slower than all the other checks I do. Being able to reduce that to 10x slower by using parallelism would be hugely helpful.