Segfault when importing polars in an amd64 docker image running on Mac M1 through virtualization #5401

CalOmnie · 2022-11-02T13:04:55Z

Polars version checks

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of Polars.

Issue description

Importing polars in a docker image built for amd64 but run on a Mac M1 (and therefore executed under emulation through qemu) results in a segfault.

Part of our investigation for this error led us to try and recompile polars in a virtualized amd64 environment, which yielded this failure: ImportError: /lib/x86_64-linux-gnu/libjemalloc.so.2: cannot allocate memory in static TLS block
It seems jemalloc does not play nicely with qemu. Is there anything that could be done on your side to alleviate this issue? Maybe having some mechanism to allow more control on which memory allocator is being used?

UPDATE: Manually updating the code to replace jemalloc by mimalloc in the "linux" target OS fixes this issue. Would it be possible to make this choice easier for the user?

Reproducible example

This issue can be reproduced on any M1 Mac with the following Dockerfile:

FROM --platform=linux/amd64 python:3.8-slim
RUN pip install polars
CMD python -c "import polars"

And associated command:
docker build . -t test-failure && docker run -it test-failure

Expected behavior

The expected behaviour would be for the python command to run successfully. Instead, this error appears:

WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault

Installed versions

0.14.24


faustomilletari · 2022-11-02T16:14:12Z

The issue does not occur if we recompile polars from sources without using jemalloc but only using mimalloc.

It would be great if there was the option to install the mimalloc version of the package directly from pip

is that something we could possibly do?

ritchie46 · 2022-11-02T16:28:47Z

It would be great if there was the option to install the mimalloc version of the package directly from pip

Which wheel do you mean specifically?

alexandervaneck · 2022-11-02T17:25:32Z

Hi @ritchie46 :)

I believe the wheel that is meant here is one that bundles polars compiled with mimalloc as opposed to jemalloc.

Thank you for responding to this issue so quickly!

ritchie46 · 2022-11-02T17:27:54Z

I don't think so as we only compile mimalloc for windows.

alexandervaneck · 2022-11-02T17:32:50Z

That is understandable. Would it be possible at all for the CI to publish a mimalloc version of polars so users may have an easier time installing it (instead of building it themselves) or is this something that polars doesn't want to maintain/support?

faustomilletari · 2022-11-02T17:38:06Z

Sorry for being unclear.

We have recompiled polars within the docker container running x86 in emulation via qemu on a Mac M1.

When recompiling we have changed the content of lib.rs so that mimalloc would be used instead of jemalloc. Same thing for cargo.toml.

This has removed the issue. The issue is caused by jemalloc.

Using jemalloc from the system (installed via apt) through the flag JEMALLOC_OVERRIDE (env variable) doesn’t bring anything good as the version of jemalloc installed by apt is not compiled with the required flag disable_initial_exec_tls.

Building within the docker with mimalloc instead of jemalloc solves the issue at the expense of performance (I suppose).

ritchie46 · 2022-11-02T18:31:39Z

Yes, I understand that the issue is jemalloc. But I am curious in which wheels you propose to replace jemalloc with mimalloc. If it is the default linux wheel, I am not really enthusiastic about this as we have much better performance with jemalloc.

And I wonder if it is a problem on our/jemalloc's side or your special case of virtualization.

CalOmnie · 2022-11-02T18:49:37Z

If possible, we'd be happy with a solution similar to the one for this issue: #2922 . As a backup we'd be okay with being able to select the memory allocator through a compile flag.

faustomilletari · 2022-11-02T18:50:36Z

I don’t think the “standard” version of polars should have mimalloc as jemalloc works just fine if there is no emulation involved.

I was wondering if an ad-hoc whl with mimalloc could be built at every release and archived somewhere accessible while a known issues type of thing could also be added to the readme targeting docker m1 users.

Could be a great solution for everyone. Maybe we could contribute a GitHub action to automate this?!

faustomilletari · 2022-11-02T18:55:56Z

If possible, we'd be happy with a solution similar to the one for this issue: #2922 . As a backup we'd be okay with being able to select the memory allocator through a compile flag.

This type of solution would be more than fine

ritchie46 · 2022-11-02T18:56:50Z

That would need another project as I don't think we can select the wheel via feature flags. However we have a more conservative version of polars here: https://pypi.org/project/polars-lts-cpu/

We could also let that one use the mimalloc allocator?

faustomilletari · 2022-11-02T18:57:46Z

That would need another project as I don't think we can select the wheel via feature flags. However we have a more conservative version of polars here: https://pypi.org/project/polars-lts-cpu/

We could also let that one use the mimalloc allocator?

If polars team is okay with that, I believe this solution would be great!

ritchie46 · 2022-11-02T18:59:50Z

Yeap, some help with the github actions would be appreciated. As I am pretty full. :)

ghuls · 2022-11-03T22:12:16Z

Did you try running docker with the platform flag? docker run --platform linux/amd64 -it test-failure

alexandervaneck · 2022-11-03T22:17:24Z

Hi Gert, thank you for jumping in. There are two ways to add the `--platform` arg to docker, one with `docker build/run` and one inside the Dockerfile. Source: https://docs.docker.com/build/building/multi-platform/ Both were tried and both fail in the same way. As far as I can tell `docker run --platform` does not need to be specified as Docker for Mac will pick up which platform is appropriate for the build/run.


ghuls · 2022-11-03T22:24:42Z

But did you use that flag also during the docker run command?

ghuls · 2022-11-03T22:57:12Z

#4848 might be slightly related.
Does preloading the polars library work? #4848 (comment)

CalOmnie · 2022-11-04T00:10:04Z

This problem has been replicated with every possible location of --platform linux/amd64: in the docker build and docker run commands as well as in the Dockerfile. Every possible combination has also been attempted, all leading to the same segfault.

We've also attempted to preload the polars library but that unfortunately didn't work either:

root@4ac8651fe6e0:/# LD_PRELOAD=/usr/local/lib/python3.8/site-packages/polars/polars.abi3.so python
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
Python 3.8.15 (default, Oct 25 2022, 06:04:13)
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import polars.polars
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault

We've been able to successfully compile and run polars on arm64 when providing the use_mimalloc conditional added in this merge request: #5414

ghuls · 2022-11-04T00:44:47Z

Can you try with the Polars build for older cpus to see if emulation of avx instructions is not the cause of the issue: https://pypi.org/project/polars-lts-cpu/

CalOmnie · 2022-11-04T00:55:02Z

This has been attempted as well, to no avail. We don't get an invalid instruction error but a segfault.


ritchie46 · 2022-11-04T11:59:24Z

closed by: #5414

loftusa · 2023-04-24T19:00:48Z

Hi all, I'm trying to run python -c 'import polars as pl' in a docker container, apple m2 chip, and I got this error. I tried

pip uninstall polars
pip install --upgrade --no-cache-dir polars

to no avail. It doesn't seem to be fixed, unless I'm doing something stupid, which is possible.
Full log. I am in a Docker container running on an M2 chip, 2022 macbook air. pip install polars-lts-cpu worked.

(base) jovyan@996de6e505ed:~/work/pipelines/Diadophis$ pip install --upgrade --no-cache-dir polars
Requirement already satisfied: polars in /opt/conda/lib/python3.10/site-packages (0.17.8)
Requirement already satisfied: typing_extensions>=4.0.1 in /opt/conda/lib/python3.10/site-packages (from polars) (4.3.0)
(base) jovyan@996de6e505ed:~/work/pipelines/Diadophis$ pip uninstall polars
Found existing installation: polars 0.17.8
Uninstalling polars-0.17.8:
  Would remove:
    /opt/conda/lib/python3.10/site-packages/polars-0.17.8.dist-info/*
    /opt/conda/lib/python3.10/site-packages/polars/*
Proceed (Y/n)? Y
  Successfully uninstalled polars-0.17.8
(base) jovyan@996de6e505ed:~/work/pipelines/Diadophis$ pip install --upgrade --no-cache-dir polars
Collecting polars
  Downloading polars-0.17.8-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.8/17.8 MB 34.2 MB/s eta 0:00:00
Requirement already satisfied: typing_extensions>=4.0.1 in /opt/conda/lib/python3.10/site-packages (from polars) (4.3.0)
Installing collected packages: polars
Successfully installed polars-0.17.8
(base) jovyan@996de6e505ed:~/work/pipelines/Diadophis$ python -c "import polars as pl"
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault
(base) jovyan@996de6e505ed:~/work/pipelines/Diadophis$

ritchie46 · 2023-04-24T19:10:47Z

Try polars-lts-cpu

thomasaarholt · 2023-06-05T07:15:27Z

Sorry for crashing into a closed issue: It would be nice if the segfault could be delayed until one actually does something with polars, and also have some sort of warning printed. As it is, we have an internal utility package that has a database module which handles data exchange with our database - optionally with polars or pandas. So polars is imported when we do import our_package.db.

We have a few situations where users are developing on an x86_64 image in docker on their Apple Silicon macs, and are running into the above segfault when importing the db package, despite not needing the polars functionality.

We could get around this by using polars-lts-cpu, but since polars is a dependency of our_package, and we install our packages using pip-compile / pip install -r requirements.txt, we have to manually edit our requirements.txt files, which is causing friction for our users who don't understand why their kernels are crashing without any logs.

Adding polars-lts-cpu to the requirements.in in addition to our-package (requirements.in gets "compiled" into a requirements.txt file with specific version requirements, like a lock-file) does install both polars and polars-lts-cpu, but the error still happens since I guess the main package takes precedence.
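One way to see what actually ended up on disk is to ask the installed distribution metadata which packages claim files under the `polars` import directory. This is only a sketch: it assumes that polars and polars-lts-cpu both ship a top-level `polars` package (which would explain why installing both does not help), and the helper name `providers` is made up for illustration.

```python
from importlib.metadata import distributions
from typing import List


def providers(import_name: str) -> List[str]:
    """Return the names of installed distributions that ship files under import_name/."""
    names = set()
    for dist in distributions():
        # dist.files is None when the distribution has no RECORD-style file list.
        for path in dist.files or []:
            if path.parts and path.parts[0] == import_name:
                name = dist.metadata["Name"]
                if name:
                    names.add(name)
                break
    return sorted(names)


if __name__ == "__main__":
    # More than one entry means several distributions write into the same
    # directory, and whichever was installed last decides which binary is loaded.
    print(providers("polars"))
```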

We've also looked at finding a way of adding polars-lts-cpu as an extra option like polars does with other things (like pip install 'polars[all]'), but we haven't found a way to satisfy both of the following:

Having polars as a main requirement
Installing polars-lts-cpu and not polars when lts is specified as an extra.

I'm happy to help debug this, e.g. by figuring out which section of polars code is triggering the error, and finding a way to gracefully prevent polars from crashing.
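As a stopgap on the package side, the import can be probed in a child interpreter first, so that a crash takes down only the child. This is a minimal sketch of that idea, not something polars provides: `can_import` and `HAS_POLARS` are hypothetical names, and the probe costs one extra interpreter start-up per import of the package.

```python
import subprocess
import sys


def can_import(module: str, timeout: float = 60.0) -> bool:
    """Try `import module` in a child interpreter; True only if it exits cleanly.

    A segfault kills only the child, which then exits with a non-zero
    return code, so the calling process survives and can react.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", f"import {module}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


# Hypothetical use inside a module such as our_package.db:
HAS_POLARS = can_import("polars")
if HAS_POLARS:
    import polars as pl  # the probe already survived this import
else:
    print("polars could not be imported safely; polars support is disabled",
          file=sys.stderr)
```

With this in place, users who never touch the polars code path get a warning instead of a dead kernel, and the pandas path keeps working.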

ritchie46 · 2023-06-05T09:40:25Z

It would be nice if the segfault could be delayed until one actually does something with polars, and also have some sort of warning printed.

This is out of our control. The binary runs an instruction that is not supported by the instruction set of the (emulated) CPU. There is no way for us to hook in a callback of some sort.

And always installing polars-lts-cpu only?

thomasaarholt · 2023-06-05T09:46:46Z

I see. Can we check the architecture to know whether the instruction set is supported?

And always installing polars-lts-cpu only?

That works, but then we need to set that version as a dependency for our_package. I haven’t checked, but I assume it runs quite a bit slower then? I guess we could still do that, and users can optionally add polars to their requirements file if they want a faster version.

ritchie46 · 2023-06-05T10:03:03Z

I see. Can we check the architecture to know whether the instruction set is supported?

Yes, these are the features we compile with:

polars/.github/deploy_manylinux.sh, line 12 in df53d8a:

export RUSTFLAGS='-C target-feature=+fxsr,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+popcnt,+avx,+fma'

They should all appear in the output of `cat /proc/cpuinfo | grep flags`.

The issue is that the mac/docker virtualization virtualizes a pretty old CPU architecture.
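The check described above can be scripted. The sketch below compares the features from the RUSTFLAGS line with the `flags` line of /proc/cpuinfo. The name mapping is an addition that is not from this thread: Linux reports SSE3 as `pni` and writes the dotted names with underscores (`sse4_1`, `sse4_2`). The function name `missing_features` is made up for illustration.

```python
from pathlib import Path
from typing import List

# Feature names from the RUSTFLAGS line, mapped to the spelling Linux uses
# in the "flags" line of /proc/cpuinfo (the mapping is an assumption added here).
REQUIRED_FLAGS = {
    "fxsr": "fxsr",
    "sse": "sse",
    "sse2": "sse2",
    "sse3": "pni",
    "ssse3": "ssse3",
    "sse4.1": "sse4_1",
    "sse4.2": "sse4_2",
    "popcnt": "popcnt",
    "avx": "avx",
    "fma": "fma",
}


def missing_features(cpuinfo_text: str) -> List[str]:
    """Return the required features absent from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            present = set(line.split(":", 1)[1].split())
            return [feat for feat, flag in REQUIRED_FLAGS.items()
                    if flag not in present]
    # No flags line at all (for example on a non-x86 kernel): report everything.
    return list(REQUIRED_FLAGS)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        missing = missing_features(cpuinfo.read_text())
        print("missing features:", ", ".join(missing) or "none")
    else:
        print("/proc/cpuinfo not found; this check only works on Linux")
```

Run inside the container, a non-empty result would suggest that the regular wheel cannot work on that virtual CPU and that polars-lts-cpu is the safer choice.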

CalOmnie added the labels bug (Something isn't working) and python (Related to Python Polars) on Nov 2, 2022

CalOmnie mentioned this issue Nov 3, 2022

chore: Allow for choice in memory allocator and use mimalloc for polars-lts-cpu #5414

Merged

ritchie46 closed this as completed Nov 4, 2022

CalOmnie mentioned this issue Nov 4, 2022

chore(python): Fix dependencies on memory allocator #5426

Merged

stinodego mentioned this issue Apr 26, 2024

build: use jemalloc in lts-cpu #15913

Merged
