Figure out better way for dealing with CPU arches #23

h-vetinari · 2020-12-22T10:31:58Z

While debugging #22, I found out that the upstream build-refactor that I tried to follow in #17 changed things so that upstream always compiles for both AVX2 & default profiles, and then switches at runtime using:
https://github.com/facebookresearch/faiss/blob/v1.6.5/faiss/python/loader.py#L27-L39
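For readers who do not want to follow the link, the runtime switch can be pictured roughly as below. This is a minimal sketch rather than the upstream loader: the helper names are hypothetical, the module names `swigfaiss_avx2` and `swigfaiss` are assumed, and only a Linux-style `/proc/cpuinfo` check is shown.

```python
import importlib


def has_avx2_linux(cpuinfo_text):
    """Return True if a 'flags' line of /proc/cpuinfo-style text lists avx2."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "avx2" in value.split():
            return True
    return False


def candidate_modules(avx2_supported):
    """Order the (assumed) extension module names, most optimized first."""
    names = ["swigfaiss"]
    if avx2_supported:
        names.insert(0, "swigfaiss_avx2")
    return names


def load_first_available(names, importer=importlib.import_module):
    """Import the first module that loads, falling back down the list."""
    for name in names:
        try:
            return importer(name)
        except ImportError:
            continue
    raise ImportError("none of the candidate modules could be imported: %r" % (names,))
```

The fallback matters because the AVX2 build may simply not be shipped on a given platform, in which case the default build should still load.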

Of course we could mimic that, but given the infrastructure added in conda/conda#9930 by @isuruf, I'm thinking we could do better. Not sure if this is ready for primetime yet -- as in, does CF infra support this already?

I see the following todos (restricted to this feedstock):

- Figure out if we want to build for AVX2 and/or other profiles
- If yes, decide the trade-off between the package size impact of shipping several builds (probably negligible) and the multiplication of CI jobs
- Patch the upstream loader depending on the outcome
  - If we keep the upstream approach, add windows detection (how?) and upstream the patch

CC @beckermr @mbargull @jakirkham

isuruf · 2020-12-22T21:58:41Z

I'd suggest going with upstream and shipping both. archspec based packages are not ready yet.

h-vetinari · 2020-12-22T22:26:03Z

Thanks @isuruf for that update! Is there any place to track the archspec-based approach so I can subscribe (or possibly help) and test/apply it once done?

Summary: In the context of conda-forge/faiss-split-feedstock#23, I discussed with some of the conda-folks how we should support AVX2 (and potentially other builds) for faiss. In the meantime, we'd like to follow the model that faiss itself is using (i.e. build with AVX2 and without and then load the corresponding library at runtime depending on CPU capabilities).

Since windows support for this is missing (and the other stuff is also non-portable in `loader.py`), I chased down `numpy.distutils.cpuinfo`, which is pretty outdated, and opened: numpy/numpy#18058

While the [private API](numpy/numpy#18058 (comment)) is obviously something that _could_ change at any time, I still think it's better than platform-dependent shenanigans. Opening this here to ideally upstream this right away, rather than carrying patches in the conda-forge feedstock.

TODO:
* [ ] adapt conda recipe for windows in this repo to also build avx2 version

Pull Request resolved: #1600
Reviewed By: beauby
Differential Revision: D25994705
Pulled By: mdouze
fbshipit-source-id: 9986bcfd4be0f232a57c0a844c72ec0e308fff19
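A platform-independent check along the lines described above could look roughly like this. It is a sketch under assumptions: it relies on NumPy's private `__cpu_features__` table (which may move or disappear, as noted), the two module paths tried are the ones used by NumPy 2.x and 1.x respectively, and the function names are hypothetical.

```python
import importlib


def numpy_cpu_features():
    """Best-effort read of NumPy's private CPU feature table; {} if unavailable."""
    # Private locations (assumption): numpy._core in NumPy 2.x, numpy.core in 1.x.
    for modname in ("numpy._core._multiarray_umath", "numpy.core._multiarray_umath"):
        try:
            module = importlib.import_module(modname)
        except ImportError:
            continue
        features = getattr(module, "__cpu_features__", None)
        if isinstance(features, dict):
            return features
    return {}


def supports_avx2(features=None):
    """True only if the feature table explicitly reports AVX2 support."""
    if features is None:
        features = numpy_cpu_features()
    return bool(features.get("AVX2", False))
```

Returning an empty table when the private attribute is missing means the caller falls back to the default build instead of failing at import time.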

h-vetinari · 2021-04-10T10:35:20Z

Fortunately, there's now an issue to track progress on this: conda-forge/conda-forge.github.io#1261

ngam · 2022-04-29T02:40:23Z

> I'm thinking we could do better

In what way? Size and unnecessary complexity?

I am often tempted to think about this in general like the noarch situation --- one package for all, but maybe that's the naive way to think about this 😅

h-vetinari · 2022-04-29T02:51:58Z

> I'm thinking we could do better
>
> In what way? Size and unnecessary complexity?

Maximize performance & minimize binary size, by building different package variants (e.g. for the x86-v2 - x86-v4 levels) and having each user automatically download the highest-supported binary according to their CPU, via the __archspec virtual package.
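To make the selection idea concrete, here is a small sketch of mapping CPU flags to x86-64 microarchitecture levels and picking the best matching variant. It is illustrative only and is not how conda or archspec implement this: the flag names follow Linux `/proc/cpuinfo` spelling, and `highest_level` and `pick_variant` are hypothetical helpers.

```python
# Flags required by each x86-64 level, in Linux /proc/cpuinfo spelling
# (e.g. "pni" is SSE3, "abm" covers LZCNT).
LEVEL_FLAGS = {
    2: {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"},
    3: {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"},
    4: {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"},
}


def highest_level(flags):
    """Return the highest level (1-4) whose requirements, and all lower ones, are met."""
    flags = set(flags)
    level = 1  # baseline x86-64
    for candidate in (2, 3, 4):
        if LEVEL_FLAGS[candidate] <= flags:
            level = candidate
        else:
            break  # levels are cumulative, so stop at the first gap
    return level


def pick_variant(available_levels, cpu_level):
    """Choose the most optimized package variant the CPU can actually run."""
    usable = [lvl for lvl in available_levels if lvl <= cpu_level]
    return max(usable) if usable else None
```

With variants built for levels 1, 3 and 4, a CPU at level 3 would get the level-3 build, and each user downloads only that one binary.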

> one package for all, but maybe that's the naive way to think about this 😅

Doing that doubles (or triples, if building for AVX512 as well) build time & binary size for no reason other than that we cannot (currently) do better infrastructurally, and don't want to completely forgo the performance benefits that newer CPU arches have vis-à-vis SSE4.

h-vetinari · 2024-08-08T22:27:18Z

Now that the feedstock has been revived, this can be tackled. Should be easy to do (cf. the docs), but we need to figure out how to do this on windows. We'll also need to patch the upstream builds a bit, but we were already doing that anyway (and there's a bunch of patches to drop as well...)

h-vetinari mentioned this issue Dec 22, 2020

Rebuild for pypy (redux) #22

Closed

5 tasks

h-vetinari mentioned this issue Dec 22, 2020

numpy.distutils.cpuinfo does not support modern featuresets numpy/numpy#18058

Closed

h-vetinari mentioned this issue Dec 30, 2020

make AVX2-detection platform-independent facebookresearch/faiss#1600

Closed

1 task

h-vetinari mentioned this issue Feb 4, 2021

Build with AVX2 support #27

Merged

h-vetinari mentioned this issue Feb 14, 2021

Handling various special compilation optimizations/architectures conda-forge/conda-forge.github.io#49

Closed

This was referenced May 7, 2021

archspec-enabled packages conda-forge/conda-forge.github.io#1261

Open

Enable building libfaiss just for avx2 facebookresearch/faiss#1877

Closed

h-vetinari mentioned this issue Apr 28, 2022

Unbundling oneDNN conda-forge/tensorflow-feedstock#183

Open

h-vetinari mentioned this issue Jan 6, 2023

Arch Migrator #58

Merged

h-vetinari mentioned this issue Jul 16, 2023

Supporting microarchitecture-specific builds conda/ceps#59

Open

h-vetinari mentioned this issue Aug 8, 2024

NEW: Add batch activation script for Windows conda-forge/microarch-level-feedstock#3

Draft

5 tasks
