Figure out better way for dealing with CPU arches #23
Comments
I'd suggest going with upstream and shipping both.
Thanks @isuruf for that update! Is there any place to track the
Summary: In the context of conda-forge/faiss-split-feedstock#23, I discussed with some of the conda-folks how we should support AVX2 (and potentially other builds) for faiss.

In the meantime, we'd like to follow the model that faiss itself is using (i.e. build with AVX2 and without and then load the corresponding library at runtime depending on CPU capabilities). Since Windows support for this is missing (and the other stuff is also non-portable in `loader.py`), I chased down `numpy.distutils.cpuinfo`, which is pretty outdated, and opened: numpy/numpy#18058

While the [private API](numpy/numpy#18058 (comment)) is obviously something that _could_ change at any time, I still think it's better than platform-dependent shenanigans.

Opening this here to ideally upstream this right away, rather than carrying patches in the conda-forge feedstock.

TODO:
* [ ] adapt conda recipe for Windows in this repo to also build avx2 version

Pull Request resolved: #1600
Reviewed By: beauby
Differential Revision: D25994705
Pulled By: mdouze
fbshipit-source-id: 9986bcfd4be0f232a57c0a844c72ec0e308fff19
Fortunately, there's now an issue to track progress on this: conda-forge/conda-forge.github.io#1261
In what way? Size and unnecessary complexity? I am often tempted to think about this in general like the noarch situation --- one package for all, but maybe that's the naive way to think about this 😅
Maximize performance & minimize binary size, by building different package variants (e.g. for the x86-v2 - x86-v4 levels) and having each user automatically download the highest-supported binary according to their CPU, via the
Doing that doubles (or triples, if building for AVX512 as well) build time & binary size for no reason other than that we cannot (currently) do better infrastructurally, and don't want to completely forgo the performance benefits that newer CPU arches have vis-à-vis SSE4.
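For illustration, here is a rough sketch of what "download the highest-supported binary" could mean in terms of the x86-64 levels. This is not how conda does it; `LEVEL_FLAGS`, `x86_64_level` and `best_variant` are hypothetical names, and the flag sets use the Linux `/proc/cpuinfo` spellings as far as I know them, so they should be checked against the x86-64 psABI before being relied on.

```python
# Hypothetical mapping from x86-64 microarchitecture level to the CPU flags
# it requires on top of the previous level (Linux /proc/cpuinfo spellings;
# an assumption to be verified against the psABI document).
LEVEL_FLAGS = [
    (2, {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}),
    (3, {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}),
    (4, {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}),
]


def x86_64_level(flags):
    """Return the highest level whose requirements (and all lower ones) are met."""
    level = 1  # baseline x86-64
    for candidate, required in LEVEL_FLAGS:
        if not required <= flags:
            break  # levels are cumulative, so stop at the first gap
        level = candidate
    return level


def best_variant(flags, built_levels):
    """Pick the highest built variant the CPU can run, or None if there is none."""
    supported = x86_64_level(flags)
    usable = [lvl for lvl in built_levels if lvl <= supported]
    return max(usable) if usable else None
```

With variants built for levels 2 and 3 only, a CPU that reports all v3 flags would get the v3 package, while a CPU that stops at v2 would get the v2 one.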
Now that the feedstock has been revived, this can be tackled. Should be easy to do (cf. the docs), but we need to figure out how to do this on Windows. We'll also need to patch the upstream builds a bit, but we were already doing that anyway (and there's a bunch of patches to drop as well...)
While debugging #22, I found out that the upstream build-refactor that I tried to follow in #17 changed things so that upstream always compiles for both AVX2 & default profiles, and then switches at runtime using:
https://github.com/facebookresearch/faiss/blob/v1.6.5/faiss/python/loader.py#L27-L39
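To make the runtime-switching idea concrete, a minimal sketch (not the upstream `loader.py`) could look like the following. It assumes Linux, where CPU flags can be read from `/proc/cpuinfo`; the module names `mylib` and `mylib_avx2` and all function names are hypothetical placeholders for the two compiled variants.

```python
import importlib
import platform


def read_cpu_flags(cpuinfo_path="/proc/cpuinfo"):
    """Return the set of CPU feature flags on Linux, or an empty set elsewhere."""
    if platform.system() != "Linux":
        return set()  # no portable detection in this sketch
    try:
        with open(cpuinfo_path) as fh:
            for line in fh:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


def pick_module_name(flags, base="mylib"):
    """Choose the AVX2 variant's (hypothetical) module name if the CPU supports it."""
    return f"{base}_avx2" if "avx2" in flags else base


def load_best(flags=None, base="mylib"):
    """Import the best variant, falling back to the generic build if it is missing."""
    flags = read_cpu_flags() if flags is None else flags
    name = pick_module_name(flags, base)
    try:
        return importlib.import_module(name)
    except ImportError:
        return importlib.import_module(base)
```

The fallback in `load_best` means a package that ships only the generic build still imports fine. The Windows gap mentioned in this thread is visible here too: without a portable way to query CPU flags, the sketch always ends up with the generic variant outside Linux.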
Of course we could mimic that, but given the infrastructure added in conda/conda#9930 by @isuruf, I'm thinking we could do better. Not sure if this is ready for primetime yet -- as in, does CF infra support this already?
I see the following todos (restricted to this feedstock):
CC @beckermr @mbargull @jakirkham