Customize Docker build image #103
thanks for this! |
How are these docker images built? Again, I maintain that you don't need to maintain a separate manylinux image. |
@skvark , could you comment? You can indeed package whatever additional libraries you need as RPMs or something. Why is that not an option? |
The idea is that anyone can set up multibuild for a package in thirty minutes due to strong conventions, namely the fact that library_builders will install any library that you need in a reasonable time-frame. If we allow third-party docker images, we're solving one person's problem, but we're not providing a solution that's going to work for everyone. |
My own feeling is that it might be better to avoid a custom docker image, but we shouldn't legislate against it. Maybe just add a comment in the patch that it's nearly always possible to avoid a custom image with caching. |
The problem is that if we allow a custom docker image, we'll end up with configuration over convention, which in the case of building manylinux wheels, doesn't work out of the box. If we don't allow it, then people will come here thinking that they want to use their own docker image, but then we can inform them that they should actually be using |
Could you explain what you mean by "configuration over convention"? As long as we explain right next to the configuration option that this is not the configuration option you're looking for, and maybe add "please add your favorite library recipe to library_builders", it seems to me that we'll likely be OK. |
What I mean is that instead of adding a function to library builders when you need a library (or just using the one that's already there), you'll have to find a docker image that already has the software that you want. This doesn't work if a docker image has some software but not others -- it results in far more maintenance collectively than having a single docker image and some functions in The underlying assumption here is that there's a need for a custom docker image, but there's not. It breaks the entire model of pushing all of the functions upstream so that someone only needs to write a function to build a library once. |
I looked into this further, and it isn't possible to compile Qt from source on Travis-CI because it takes too long. Ideally we would compile and host a .tar.gz, as is done for openblas. |
Long build times (Qt) are one of the reasons why I extended the official manylinux images for opencv-python. There are also other dependencies which will make the combined build time even longer. I don't want to host or create separate packages and maintain them separately. It was just a lot easier and faster to write a couple of Dockerfiles. @xoviat I understand your concern about the forked images but I believe it's not a very common use case. I was also thinking (I don't know if this is possible) that multibuild could somehow create its own custom Docker image from recipes (Dockerfiles) here in the repo. Just an idea. Then everything would be centralized and precompiled. |
That could be one option. If you (and @matthew-brett ) are okay with this, we could just switch the default image to your docker, but there are a few considerations:
Another option is to add |
There's no difference in principle. It all boils down to which kind of hosting you're able to get -- RPM hosting, WHL hosting, Docker hosting, ccache hosting (a comparatively recent feature of Travis) etc. The manylinux project itself uses RPMs from CentOS Vault and Tru Huynh's private CentOS hosting. I didn't see anything that would hint they provide hosting for built libraries that aren't available from these two. |
Anyway, I believe it has been demonstrated that a derived image is a viable way to cache additional libraries, with its pros and cons on par with other methods. I don't think that using a random custom image by default is a wise move unless the |
Yeah, I don't recommend using my image. The Instead of uploading single library artifacts to some location I would suggest that the separate repository (for example When new build scripts are added or the official The image doesn't need to contain all possible libraries. For example every library which takes over 10 minutes to build would be a good start. How does this sound? This would reduce Linux build times across different projects significantly. |
Which libraries do you specifically need that take a long time to build? https://github.com/dockcross provides docker images as well; I (or you) can ask them if they're willing to add these libraries to their docker image. The goal here is to avoid duplicating effort so that this problem only needs to be solved once across the entire ecosystem. |
Qt and FFmpeg (depends on the machine which builds it but usually around 5 - 10 minutes, takes probably longer on Travis). Also |
@njsmith Thoughts on adding these libraries to the manylinux image? |
We don't have any official policy on what goes into the manylinux image,
but we've tended to push back on adding lots of libraries. The manylinux
audience is very large, and everyone has some libraries/tools they need, so
if it tries to work out of the box for everyone then it'll quickly
become a huge download and impossible to maintain. Qt and ffmpeg seem way
out of scope to me.
|
So after some consideration, I think the best approach is actually to upload the libraries to pypi. This can be automated so that there is minimal boilerplate for each new library added. However, the biggest issue is that |
Example packages are
Essentially the |
Interesting, I’d like to know more. You’re suggesting one could make a distribution package with no Python content, only some shared libraries. But then where should the libraries be installed inside the Python prefix? Using the data_files keyword in setup.py with some relative path? I imagine this would be different for each platform. And how would other packages make use of these installed libraries? E.g. how can a ctypes or cffi wrapper find these libraries? Or a Cython one, which would need to link at compile time? |
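To make the ctypes part of that question concrete, here is a minimal sketch of how a wrapper might locate a shared library shipped inside an installed package directory. The `.libs` subdirectory, the file naming, and both function names are assumptions for illustration only; nothing in this thread (or in the wheel format) prescribes them.

```python
import ctypes
import sys
from pathlib import Path


def bundled_lib_path(package_dir, stem):
    """Return the expected path of a shared library shipped in package_dir.

    Assumption: libraries live in a '.libs' subdirectory and follow the
    usual per-platform naming. This layout is hypothetical.
    """
    if sys.platform.startswith("win"):
        filename = f"{stem}.dll"
    elif sys.platform == "darwin":
        filename = f"lib{stem}.dylib"
    else:
        filename = f"lib{stem}.so"
    return Path(package_dir) / ".libs" / filename


def load_bundled_lib(package_dir, stem):
    """Load the bundled library with ctypes, failing clearly if it is absent."""
    path = bundled_lib_path(package_dir, stem)
    if not path.is_file():
        raise FileNotFoundError(f"bundled library not found: {path}")
    return ctypes.CDLL(str(path))
```

This only covers run-time loading; it does not answer the compile-time linking question raised for Cython.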
I understand this is a bit out of scope for the current issue, but
it’d be nice to see some examples (the intel ones you linked don’t
provide sources, or I couldn’t find them).
It's not out of scope 'cuz it casts doubt on the validity of the suggestion. From what I know, `site-packages` machinery, including wheels, does not, and is not designed to, support the use of anything but Python packages.
Packages like `NumPy` that support C linking against them include Python modules with custom functions to get include and lib directories that a dependent package's build has to manually retrieve and append to compiler options. AFAICS (https://docs.scipy.org/doc/numpy-1.13.0/reference/c-api.coremath.html#linking-against-the-core-math-library-in-an-extension) , only static linking is supported, too.
|
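The pattern described above, where a package exposes helper functions and the dependent build appends the results to its compiler options, might look roughly like the sketch below. The package layout (`include/` and `lib/` subdirectories) and all function names are hypothetical; only the option names `include_dirs`, `library_dirs` and `libraries` correspond to real keyword arguments of `setuptools.Extension`.

```python
from pathlib import Path


def get_include(package_dir):
    """Header directory of a hypothetical library-only package."""
    return str(Path(package_dir) / "include")


def get_lib_dir(package_dir):
    """Library directory of the same hypothetical package."""
    return str(Path(package_dir) / "lib")


def compiler_options(package_dir, lib_name):
    """Collect the options a dependent build would pass to its extension."""
    return {
        "include_dirs": [get_include(package_dir)],
        "library_dirs": [get_lib_dir(package_dir)],
        "libraries": [lib_name],
    }
```

A dependent `setup.py` would have to call something like `compiler_options(...)` by hand and unpack the result into its extension definition, which is the manual step the comment above points out.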
Imho Python packaging tools and PyPI should not be used to distribute arbitrary libraries which have nothing to do with Python. I still think that this PR is completely valid and the simplest way to solve the issue. |
This PR will be merged if I cannot provide an alternative solution to this problem. However, the PR was opened only four days ago and we're only now having this discussion, so I think I deserve a bit more time to fix this issue. I will take care of everything except providing |
I'll be monkey-patching the subroutine for now, so you can take your time. |
@xoviat - I'd like to merge this - but we should certainly try and make better alternatives to gradually make it unnecessary ... |
Oh, sorry. @matthew-brett I need to update devel and then revert this PR. @native-api Please file against |
Use case: if there are multiple libs to build from source. Used in https://github.com/skvark/opencv-python (see opencv/opencv-python#58 (comment) ).
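The behaviour this PR is about, an optional override of the build image that falls back to a default, can be illustrated with the sketch below. multibuild itself is written in shell, so this is only a model of the logic. The environment variable name `DOCKER_IMAGE` and the default image name are assumptions, not taken from this thread.

```python
import os


def select_docker_image(plat, env=None):
    """Pick the build image: a user override if set, else a default.

    Assumptions: the override variable is called DOCKER_IMAGE and the
    default is a manylinux1 image named after the platform.
    """
    env = os.environ if env is None else env
    default = f"quay.io/pypa/manylinux1_{plat}"
    # An unset or empty override falls back to the default image.
    return env.get("DOCKER_IMAGE") or default
```

Passing `env` explicitly keeps the function easy to exercise without touching the real environment.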