Segfault when importing polars in an amd64 docker image running on Mac M1 through virtualization #5401
Comments
The issue does not occur if we recompile polars from source using only mimalloc instead of jemalloc. It would be great to have the option of installing the mimalloc version of the package directly from pip. Is that something we could possibly do? |
Which wheel do you mean specifically? |
Hi @ritchie46 :) I believe the wheel that is meant here is one that bundles polars compiled with Thank you for responding to this issue so quickly! |
I don't think so as we only compile mimalloc for windows. |
That is understandable. Would it be possible at all for the CI to publish a |
Sorry for being unclear. We have recompiled polars within the docker container running x86 in emulation via qemu on a Mac M1. When recompiling, we changed the contents of lib.rs so that mimalloc would be used instead of jemalloc, and did the same in Cargo.toml. This removed the issue. The issue is caused by jemalloc. Using the system jemalloc (installed via apt) through the JEMALLOC_OVERRIDE environment variable doesn't help, as the version of jemalloc installed by apt is not compiled with the required flag disable_initial_exec_tls. Building within the docker container with mimalloc instead of jemalloc solves the issue at the expense of performance (I suppose). |
Yes, I understand that the issue is And I wonder if it is a problem on our/jemalloc's side or your special case of virtualization. |
If possible, we'd be happy with a solution similar to the one for this issue: #2922 . As a backup we'd be okay with being able to select the memory allocator through a compile flag. |
I don’t think the “standard” version of polars should have mimalloc as jemalloc works just fine if there is no emulation involved. I was wondering if an ad-hoc whl with mimalloc could be built at every release and archived somewhere accessible while a known issues type of thing could also be added to the readme targeting docker m1 users. Could be a great solution for everyone. Maybe we could contribute a GitHub action to automate this?! |
This type of solution would be more than fine |
That would need another project as I don't think we can select the wheel via feature flags. However we have a more conservative version of polars here: https://pypi.org/project/polars-lts-cpu/ We could also let that one use the mimalloc allocator? |
If polars team is okay with that, I believe this solution would be great! |
Yeap, some help with the github actions would be appreciated. As I am pretty full. :) |
Did you try to run docker correctly with `docker run --platform linux/amd64 -it test-failure`? |
Hi Gert, thank you for jumping in. There are two ways to add the `--platform`
arg to docker: one with `docker build/run` and one inside the Dockerfile.
Source: https://docs.docker.com/build/building/multi-platform/
Both were tried and both fail in the same way.
As far as I can tell `docker run --platform` does not need to be specified as Docker for Mac will pick up which platform is appropriate for the build/run.
On Thu, Nov 3, 2022, 23:12 Gert Hulselmans wrote:
Did you try to run docker correctly with? docker run --platform linux/amd64 -it test-failure
|
But did you use that flag also during the docker run command? |
#4848 might be slightly related. |
This problem has been replicated with every possible location of We've also attempted to preload the polars library but that unfortunately didn't work either:
We've been able to successfully compile and run polars on arm64 when providing the |
Can you try with the Polars build for older cpus to see if emulation of avx instructions is not the cause of the issue: https://pypi.org/project/polars-lts-cpu/ |
This has been attempted as well, to no avail. We don't get an invalid
instruction error but a segfault.
|
closed by: #5414 |
Hi all, I'm trying to run
to no avail. It doesn't seem to be fixed, unless I'm doing something stupid, which is possible.
|
Try |
Sorry for crashing into a closed issue: It would be nice if the segfault could be delayed until one actually does something with polars, and also have some sort of warning printed. As it is, we have an internal utility package that has a database module which handles data exchange with our database - optionally with polars or pandas. So We have a few situations where users are developing on an x86_64 image in docker on their Apple Silicon macs, and are running into the above segfault when importing the db package, despite not needing the polars functionality. We could get around this by using Adding We've also looked at finding a way of adding
I'm happy to help debug this, e.g. by figuring out which section of polars code is triggering the error, and finding a way to gracefully prevent polars from crashing. |
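One way to keep a segfault at import time from taking down the calling process is to probe the import in a child interpreter first. The following is only a minimal sketch of that idea using the standard library; `can_import` and `pick_backend` are hypothetical helper names, not part of polars, and the probe adds the cost of starting a second interpreter.

```python
import subprocess
import sys


def can_import(module_name, timeout=60.0):
    """Return True if importing `module_name` succeeds in a fresh interpreter."""
    try:
        result = subprocess.run(
            [
                sys.executable,
                "-c",
                "import importlib, sys; importlib.import_module(sys.argv[1])",
                module_name,
            ],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    # On POSIX a negative return code means the child was killed by a signal
    # (a segfault shows up as -11); any non-zero code is treated as a failure.
    return result.returncode == 0


def pick_backend(candidates):
    """Return the first module name in `candidates` that imports cleanly, else None."""
    for name in candidates:
        if can_import(name):
            return name
    return None
```

A package with optional backends could call something like `pick_backend(["polars", "pandas"])` once and only import the returned module in the main process, printing a warning when the preferred backend is skipped.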
This is out of our control. It runs an instruction that is not supported by the architecture's instruction set. There is no way for us to have a callback of some sort. And always installing |
I see. Can we check the architecture to know whether the instruction set is supported?
That works, but then we need to set that version as a dependency for our_package. I haven’t checked, but I assume it runs quite a bit slower then? I guess we could still do that, and users can optionally add polars to their requirements file if they want a faster version. |
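As a rough illustration of checking the CPU feature set from Python, the sketch below parses the `flags` line that x86 Linux exposes in `/proc/cpuinfo`. The set of required flags is an assumption made for the example (the thread only mentions avx in general), and the helper names are hypothetical. Note also that an earlier comment in this thread reports a segfault rather than an invalid instruction error, so a check like this is a diagnostic aid at best.

```python
from pathlib import Path

# Assumption for illustration only: the real list of CPU features a given
# polars wheel is compiled with is defined by its build script, not here.
ASSUMED_REQUIRED_FLAGS = frozenset({"avx", "avx2", "fma"})


def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from the first `flags` line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, required=ASSUMED_REQUIRED_FLAGS):
    """Return the required flags that the (possibly emulated) CPU does not report."""
    return set(required) - parse_cpu_flags(cpuinfo_text)


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read cpuinfo, returning an empty string where the file is unavailable."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""
```

Calling `missing_flags(read_cpuinfo())` inside the container would then list which of the assumed flags the virtualized CPU does not advertise; an empty result on a non-Linux or non-x86 system only means the file or the `flags` line was not found.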
Yes, these are the features we compile: polars/.github/deploy_manylinux.sh Line 12 in df53d8a
They should be in The issue is that the mac/docker virtualization virtualizes a pretty old CPU architecture. |
Polars version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of Polars.
Issue description
Importing polars in a docker image built for amd64 but run on a Mac M1 (which would then go through virtualization through qemu) results in a segfault.
Part of our investigation for this error led us to try and recompile polars in a virtualized amd64 environment, which yielded this failure:
ImportError: /lib/x86_64-linux-gnu/libjemalloc.so.2: cannot allocate memory in static TLS block
It seems jemalloc does not play nicely with qemu. Is there anything that could be done on your side to alleviate this issue? Maybe having some mechanism to allow more control on which memory allocator is being used?
UPDATE: Manually updating the code to replace jemalloc by mimalloc in the "linux" target OS fixes this issue. Would it be possible to make this choice easier for the user?
Reproducible example
This issue can be reproduced on any M1 Mac with the following Dockerfile:
And associated command:
docker build . -t test-failure && docker run -it test-failure
Expected behavior
The expected behaviour would be for the python command to run successfully. Instead, this error appears:
Installed versions