Segfault when importing polars in an amd64 docker image running on Mac M1 through virtualization #5401

Closed
2 tasks done
CalOmnie opened this issue Nov 2, 2022 · 27 comments
Labels
bug Something isn't working python Related to Python Polars

Comments

@CalOmnie
Contributor

CalOmnie commented Nov 2, 2022

Polars version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Issue description

Importing polars in a Docker image built for amd64 but run on an M1 Mac (and therefore emulated through QEMU) results in a segfault.

As part of our investigation, we tried recompiling polars in a virtualized amd64 environment, which failed with: ImportError: /lib/x86_64-linux-gnu/libjemalloc.so.2: cannot allocate memory in static TLS block
It seems jemalloc does not play nicely with qemu. Is there anything that could be done on your side to alleviate this issue? Maybe having some mechanism to allow more control on which memory allocator is being used?

UPDATE: Manually editing the code to replace jemalloc with mimalloc for the "linux" target OS fixes this issue. Would it be possible to make this choice easier for the user?

Reproducible example

This issue can be reproduced on any M1 Mac with the following Dockerfile:

FROM --platform=linux/amd64 python:3.8-slim
RUN pip install polars
CMD python -c "import polars"

And associated command:
docker build . -t test-failure && docker run -it test-failure

Expected behavior

The expected behaviour would be for the python command to run successfully. Instead, this error appears:

WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault

Installed versions

0.14.24
@CalOmnie added the bug and python labels Nov 2, 2022
@faustomilletari

The issue does not occur if we recompile polars from sources without using jemalloc but only using mimalloc.

It would be great if there was the option to install the mimalloc version of the package directly from pip

Is that something we could possibly do?

@ritchie46
Member

It would be great if there was the option to install the mimalloc version of the package directly from pip

Which wheel do you mean specifically?

@alexandervaneck

Hi @ritchie46 :)

I believe the wheel that is meant here is one that bundles polars compiled with mimalloc as opposed to jemalloc.

Thank you for responding to this issue so quickly!

@ritchie46
Member

I don't think so as we only compile mimalloc for windows.

@alexandervaneck

That is understandable. Would it be possible at all for the CI to publish a mimalloc version of polars so users may have an easier time installing it (instead of building it themselves) or is this something that polars doesn't want to maintain/support?

@faustomilletari

Sorry for being unclear.

We recompiled polars inside the Docker container, which runs x86 under QEMU emulation on an M1 Mac.

When recompiling, we changed lib.rs so that mimalloc is used instead of jemalloc, and made the matching change in Cargo.toml.

This has removed the issue. The issue is caused by jemalloc.

Using the system jemalloc (installed via apt) through the JEMALLOC_OVERRIDE environment variable doesn't help, because the version of jemalloc installed by apt is not compiled with the required flag disable_initial_exec_tls.

Building within the docker with mimalloc instead of jemalloc solves the issue at the expense of performance (I suppose).

@ritchie46
Member

Yes, I understand that the issue is jemalloc. But I am curious in which wheels you propose to replace jemalloc with mimalloc. If it is the default linux wheel, I am not really enthusiastic about this as we have much better performance with jemalloc.

And I wonder if it is a problem on our/jemalloc's side or your special case of virtualization.

@CalOmnie
Contributor Author

CalOmnie commented Nov 2, 2022

If possible, we'd be happy with a solution similar to the one for this issue: #2922 . As a backup we'd be okay with being able to select the memory allocator through a compile flag.

@faustomilletari

faustomilletari commented Nov 2, 2022

I don’t think the “standard” version of polars should have mimalloc as jemalloc works just fine if there is no emulation involved.

I was wondering if an ad-hoc whl with mimalloc could be built at every release and archived somewhere accessible while a known issues type of thing could also be added to the readme targeting docker m1 users.

Could be a great solution for everyone. Maybe we could contribute a GitHub action to automate this?!

@faustomilletari

If possible, we'd be happy with a solution similar to the one for this issue: #2922 . As a backup we'd be okay with being able to select the memory allocator through a compile flag.

This type of solution would be more than fine.

@ritchie46
Member

That would need another project as I don't think we can select the wheel via feature flags. However we have a more conservative version of polars here: https://pypi.org/project/polars-lts-cpu/

We could also let that one use the mimalloc allocator?

@faustomilletari

That would need another project as I don't think we can select the wheel via feature flags. However we have a more conservative version of polars here: https://pypi.org/project/polars-lts-cpu/

We could also let that one use the mimalloc allocator?

If polars team is okay with that, I believe this solution would be great!

@ritchie46
Member

Yep, some help with the GitHub Actions would be appreciated, as my plate is pretty full. :)

@ghuls
Collaborator

ghuls commented Nov 3, 2022

Did you try running docker with the platform flag set explicitly? docker run --platform linux/amd64 -it test-failure

@alexandervaneck

alexandervaneck commented Nov 3, 2022 via email

@ghuls
Collaborator

ghuls commented Nov 3, 2022

But did you use that flag also during the docker run command?

@ghuls
Collaborator

ghuls commented Nov 3, 2022

#4848 might be slightly related.
Does preloading the polars library work? #4848 (comment)

@CalOmnie
Contributor Author

CalOmnie commented Nov 4, 2022

This problem has been replicated with every possible location of --platform linux/amd64: in the docker build and docker run commands as well as in the Dockerfile. Every possible combination has also been attempted, all leading to the same segfault.

We've also attempted to preload the polars library but that unfortunately didn't work either:

root@4ac8651fe6e0:/# LD_PRELOAD=/usr/local/lib/python3.8/site-packages/polars/polars.abi3.so python
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
Python 3.8.15 (default, Oct 25 2022, 06:04:13)
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import polars.polars
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault

We've been able to successfully compile and run polars on arm64 when providing the use_mimalloc conditional added in this merge request: #5414

@ghuls
Collaborator

ghuls commented Nov 4, 2022

Can you try with the Polars build for older cpus to see if emulation of avx instructions is not the cause of the issue: https://pypi.org/project/polars-lts-cpu/

@CalOmnie
Contributor Author

CalOmnie commented Nov 4, 2022 via email

@ritchie46
Member

closed by: #5414

@loftusa

loftusa commented Apr 24, 2023

Hi all, I'm trying to run python -c 'import polars as pl' in a Docker container on an Apple M2 chip, and I got this error. I tried

pip uninstall polars
pip install --upgrade --no-cache-dir polars

to no avail. It doesn't seem to be fixed, unless I'm doing something stupid, which is possible.
Full log below. I am in a Docker container running on a 2022 MacBook Air with an M2 chip. pip install polars-lts-cpu worked.

(base) jovyan@996de6e505ed:~/work/pipelines/Diadophis$ pip install --upgrade --no-cache-dir polars
Requirement already satisfied: polars in /opt/conda/lib/python3.10/site-packages (0.17.8)
Requirement already satisfied: typing_extensions>=4.0.1 in /opt/conda/lib/python3.10/site-packages (from polars) (4.3.0)
(base) jovyan@996de6e505ed:~/work/pipelines/Diadophis$ pip uninstall polars
Found existing installation: polars 0.17.8
Uninstalling polars-0.17.8:
  Would remove:
    /opt/conda/lib/python3.10/site-packages/polars-0.17.8.dist-info/*
    /opt/conda/lib/python3.10/site-packages/polars/*
Proceed (Y/n)? Y
  Successfully uninstalled polars-0.17.8
(base) jovyan@996de6e505ed:~/work/pipelines/Diadophis$ pip install --upgrade --no-cache-dir polars
Collecting polars
  Downloading polars-0.17.8-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.8/17.8 MB 34.2 MB/s eta 0:00:00
Requirement already satisfied: typing_extensions>=4.0.1 in /opt/conda/lib/python3.10/site-packages (from polars) (4.3.0)
Installing collected packages: polars
Successfully installed polars-0.17.8
(base) jovyan@996de6e505ed:~/work/pipelines/Diadophis$ python -c "import polars as pl"
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault
(base) jovyan@996de6e505ed:~/work/pipelines/Diadophis$ 

@ritchie46
Member

Try polars-lts-cpu

@thomasaarholt
Contributor

Sorry for crashing into a closed issue: It would be nice if the segfault could be delayed until one actually does something with polars, and also have some sort of warning printed. As it is, we have an internal utility package that has a database module which handles data exchange with our database - optionally with polars or pandas. So polars is imported when we do import our_package.db.

We have a few situations where users are developing on an x86_64 image in docker on their Apple Silicon macs, and are running into the above segfault when importing the db package, despite not needing the polars functionality.

We could get around this by using polars-lts-cpu, but since polars is a dependency of our_package, and we install our packages using pip-compile / pip install -r requirements.txt, we have to manually edit our requirements.txt files, which is causing friction for our users who don't understand why their kernels are crashing without any logs.

Adding polars-lts-cpu to the requirements.in in addition to our-package (requirements.in gets "compiled" into a requirements.txt file with specific version requirements, like a lock-file) does install both polars and polars-lts-cpu, but the error still happens since I guess the main package takes precedence.
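To confirm which variants actually ended up in an environment, something like the following could be run inside the container. This is only a sketch: the helper names `installed_polars_variants` and `has_conflicting_variants` are made up for illustration, and it assumes that both distributions provide the same importable `polars` package, so having both installed is treated as a conflict.

```python
from importlib import metadata

# Distribution names as they appear on PyPI. Assumption: both of them
# install the same top-level `polars` import package.
POLARS_VARIANTS = ("polars", "polars-lts-cpu")


def installed_polars_variants(version_of=metadata.version):
    """Return {distribution name: version} for every installed variant.

    `version_of` is injectable so the lookup can be faked in tests.
    """
    found = {}
    for name in POLARS_VARIANTS:
        try:
            found[name] = version_of(name)
        except metadata.PackageNotFoundError:
            # This variant is not installed in the current environment.
            continue
    return found


def has_conflicting_variants(found):
    """True when more than one variant is installed side by side."""
    return len(found) > 1
```

Calling `installed_polars_variants()` with no arguments inspects the current environment. If `has_conflicting_variants` returns True for its result, uninstalling both and reinstalling only the wanted variant may be the simplest way out.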

We've also looked at finding a way of adding polars-lts-cpu as an extra option like polars does with other things (like pip install 'polars[all]'), but we haven't found a way to satisfy

  1. Having polars as a main requirement
  2. Installing polars-lts-cpu and not polars when lts is specified as an extra.

I'm happy to help debug this, e.g. by figuring out which section of polars code is triggering the error, and finding a way to gracefully prevent polars from crashing.
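One workaround that can live entirely in a wrapper package such as our_package is to try the import in a throwaway child interpreter first, and only import in-process if the child exits cleanly. The sketch below uses hypothetical helper names (`import_survives`, `load_or_none`) and only the standard library. It does not prevent the crash, it just keeps it out of the parent process.

```python
import importlib
import subprocess
import sys
import warnings


def import_survives(module_name, timeout=60.0):
    """Return True if `import module_name` exits cleanly in a child process.

    `module_name` is interpolated into Python source, so it must come from
    trusted code, never from user input.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", f"import {module_name}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    # A segfault in the child shows up as a non-zero return code
    # (negative on POSIX when the child was killed by a signal).
    return result.returncode == 0


def load_or_none(module_name):
    """Import and return the module, or warn and return None if unsafe."""
    if not import_survives(module_name):
        warnings.warn(
            f"importing {module_name!r} failed in a child process; "
            "features that depend on it are disabled"
        )
        return None
    return importlib.import_module(module_name)
```

A db module could then do `pl = load_or_none("polars")` and raise a clear error only when a polars-backed function is actually called. The cost is one extra interpreter start-up and one extra import per process, so caching the result may be worthwhile.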

@ritchie46
Member

It would be nice if the segfault could be delayed until one actually does something with polars, and also have some sort of warning printed.

This is out of our control. The import runs an instruction that is not supported by the CPU's instruction set. There is no way for us to have a callback of some sort.

And always installing polars-lts-cpu only?

@thomasaarholt
Contributor

I see. Can we check the architecture to know whether the instruction set is supported?

And always installing polars-lts-cpu only?

That works, but then we need to set that version as a dependency for our_package. I haven’t checked, but I assume it runs quite a bit slower then? I guess we could still do that, and users can optionally add polars to their requirements file if they want a faster version.

@ritchie46
Member

I see. Can we check the architecture to know whether the instruction set is supported?

Yes, these are the features we compile with:

export RUSTFLAGS='-C target-feature=+fxsr,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+popcnt,+avx,+fma'

They should show up in the output of cat /proc/cpuinfo | grep flags.

The issue is that the mac/docker virtualization virtualizes a pretty old CPU architecture.
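The check described above could be scripted roughly as follows. This is a sketch rather than anything polars ships: `parse_cpu_flags` and `missing_features` are illustrative names, and the mapping reflects how Linux spells these features in /proc/cpuinfo (sse3 is listed as pni, and sse4.1 / sse4.2 as sse4_1 / sse4_2). Whether /proc/cpuinfo inside an emulated container describes the emulated CPU has not been verified here.

```python
# Rust target-feature name -> flag name used in /proc/cpuinfo on Linux.
REQUIRED_FLAGS = {
    "fxsr": "fxsr",
    "sse": "sse",
    "sse2": "sse2",
    "sse3": "pni",
    "ssse3": "ssse3",
    "sse4.1": "sse4_1",
    "sse4.2": "sse4_2",
    "popcnt": "popcnt",
    "avx": "avx",
    "fma": "fma",
}


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first `flags` line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text):
    """Return the sorted target features that the CPU does not advertise."""
    flags = parse_cpu_flags(cpuinfo_text)
    return sorted(
        feature
        for feature, cpuinfo_name in REQUIRED_FLAGS.items()
        if cpuinfo_name not in flags
    )
```

On Linux, the input would be the contents of /proc/cpuinfo, for example `missing_features(open("/proc/cpuinfo").read())`. A non-empty result suggests that polars-lts-cpu is the safer choice on that machine.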


7 participants