import pandas error for missing compression libraries #27575
Comments
I am surprised this issue keeps popping up for something in the stdlib though. Conda and the official distributions always come with lzma, right? |
My stab in the dark guess is that this is a pipenv issue (based on the 3 issues) and somehow https://github.com/python/cpython/blob/b9a0376b0dedf16a2f82fa43d851119d1f7a2707/setup.py#L1558 |
Yea so here's a related discussion on bpo: https://bugs.python.org/issue34895 So not a definitive response but I guess still implied that lzma is expected to be available as part of a standard Python distribution |
I feel like the closing of this issue was not appropriate. The other two issues linked also have the same problem -- that pandas 0.25 assumes you have things installed that may not actually have come with python by default. This should be made explicitly clear up front before installation completes, not as an import error after installation. |
I second @raybuhr's comment. Pyenv is a project with 16k stars. It's very widely used.
I feel this is an incorrect assumption then. I've been using Pyenv successfully and never run into an issue with |
That's not possible, at least not with binary distributions (wheels / conda packages). Unless you're saying that there should be an error when you're compiling Python, in which case I agree (right now it's just a warning). On the larger issue, I'm not sure what's best. Clearly this is affecting people. But at some point we need to be able to rely on importing modules from the standard library, right? |
And just to be clear, this isn't a pyenv issue. It's a problem on the user's machine not having the proper dependencies when Python is compiled. |
Yea this is certainly unfortunate but quoting what I think is the most definitive response from the Python mailing list:
https://mail.python.org/pipermail/python-ideas/2018-October/054089.html So since Python doesn't document this library as optional, it should be available, and if it is not, it is the distributor's responsibility to handle that expectation |
FWIW pyenv also documents this as the first step in their "Common Build Problems" page: https://github.com/pyenv/pyenv/wiki/Common-build-problems So perhaps we could help them improve that aspect of the documentation if it isn't immediately obvious |
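To tell whether a given interpreter was compiled without the compression libraries discussed above, something like the following could be run before installing pandas. This is only a sketch: the helper name `missing_compression_extensions` is made up for illustration, and the list of extension modules is an assumption based on the `_bz2` and `_lzma` errors reported in this thread.

```python
import importlib.util

# C extension modules behind the stdlib compression packages.
# Each one is only built if the matching system library headers
# were present when Python itself was compiled.
COMPRESSION_EXTENSIONS = ("zlib", "_bz2", "_lzma")


def missing_compression_extensions(names=COMPRESSION_EXTENSIONS):
    """Return the modules from `names` that this interpreter cannot find.

    Hypothetical helper for illustration; not part of pandas or pyenv.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_compression_extensions()
    if missing:
        print("Python was built without:", ", ".join(missing))
    else:
        print("All compression extension modules are available.")
```

If anything is reported missing, the usual remedy is to install the system development packages and rebuild Python, as the pyenv wiki page describes.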
I see the points made above about this probably being an issue with system level dependencies. I am in fact using pyenv to install, and fixing this for our team isn't particularly difficult. Because Python expects the compression libraries to be installed, as the modules are part of the standard library, this probably doesn't have to be an issue for the pandas team. That said, I still feel that making the compression libraries prerequisites for using pandas is unnecessary overhead. I think a more sympathetic response would be to try importing the compression modules and return a message that they aren't installed while still allowing pandas to be imported and used, just without support for compression. |
Pandas 0.25.0 is not usable with tools like kubeless, as Debian base images for Docker don't appear to contain the proper libs for _lzma any more. You'd need to build out custom images. Pandas 0.24.2 works fine. |
I suspect we would accept a PR that did the lzma import in a try / except ImportError block. When the module is not present, we would emit a Is anyone interested in submitting a PR? |
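For anyone wanting to try the try / except ImportError idea on a machine where lzma is actually present, the failure can be simulated. The sketch below is not pandas code: `import_optional` and `module_hidden` are hypothetical names, and it relies on the standard behaviour that a `None` entry in `sys.modules` makes the import of that name raise `ModuleNotFoundError` (a subclass of `ImportError`).

```python
import contextlib
import importlib
import sys
import warnings


def import_optional(name):
    """Import module `name`, or warn and return None if it is unavailable.

    Hypothetical helper for illustration only.
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        warnings.warn(
            f"Could not import the {name} module. "
            f"{name} compression will be unavailable."
        )
        return None


@contextlib.contextmanager
def module_hidden(name):
    """Temporarily make `import name` fail, as on an incomplete Python build."""
    sentinel = object()
    saved = sys.modules.get(name, sentinel)
    sys.modules[name] = None  # a None entry makes the import raise ImportError
    try:
        yield
    finally:
        # Restore the previous state so later imports behave normally.
        if saved is sentinel:
            del sys.modules[name]
        else:
            sys.modules[name] = saved


if __name__ == "__main__":
    with module_hidden("lzma"):
        print(import_optional("lzma"))  # warns, then prints None
```

The point of the design is that the import failure is downgraded to a warning at import time, so the rest of the package stays usable.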
FYI, we'll probably want to do the 0.25.1 release in 1-2 weeks. It'd be good to include this. |
Any takers to work on this? No obligation of course. If no one else is able to, I'll put something together later in the week. cc @islander, @selvathiruarul, @salompas, @tvanyo who reported this in other issues. |
If I remember correctly I was able to solve this issue by |
Thanks @salompas. Feel free to start something if you think you have a handle on what needs to be done. We're deciding a release date for 0.25.1 at our dev meeting on Wednesday. If necessary, one of us will step in and finish things off before we need to release. |
@TomAugspurger I am following your suggestions and will submit a PR soon (just need a bit of time to go through the "Contributing to pandas" page). |
@TomAugspurger I have been trying to modify the code, but ran into a problem. One of the first files to complain about a missing |
The issue with |
Fortunately in this case, you can just use regular Python in

diff --git a/pandas/_libs/parsers.pyx b/pandas/_libs/parsers.pyx
index cafc31dad..385349629 100644
--- a/pandas/_libs/parsers.pyx
+++ b/pandas/_libs/parsers.pyx
@@ -2,7 +2,6 @@
 # See LICENSE for the license
 import bz2
 import gzip
-import lzma
 import os
 import sys
 import time
@@ -59,9 +58,12 @@ from pandas.core.arrays import Categorical
 from pandas.core.dtypes.concat import union_categoricals
 import pandas.io.common as icom
+from pandas.compat import import_lzma
 from pandas.errors import (ParserError, DtypeWarning,
                            EmptyDataError, ParserWarning)
+lzma = import_lzma()
+
 # Import CParserError as alias of ParserError for backwards compatibility.
 # Ultimately, we want to remove this import. See gh-12665 and gh-14479.
 CParserError = ParserError
diff --git a/pandas/compat/__init__.py b/pandas/compat/__init__.py
index 5ecd641fc..04e8d44a3 100644
--- a/pandas/compat/__init__.py
+++ b/pandas/compat/__init__.py
@@ -65,3 +65,17 @@ def is_platform_mac():
 def is_platform_32bit():
     return struct.calcsize("P") * 8 < 64
+
+
+def import_lzma():
+    import warnings
+
+    try:
+        import lzma
+        return lzma
+    except ImportError:
+        msg = (
+            "Could not import the lzma module. Your installed Python is incomplete. "
+            "Attempting to use `lzma` compression will result in a RuntimeError."
+        )
+        warnings.warn(msg)
diff --git a/pandas/io/common.py b/pandas/io/common.py
index e01e47304..0a66c58b8 100644
--- a/pandas/io/common.py
+++ b/pandas/io/common.py
@@ -6,7 +6,6 @@ import csv
 import gzip
 from http.client import HTTPException  # noqa
 from io import BytesIO
-import lzma
 import mmap
 import os
 import pathlib
@@ -31,10 +30,12 @@ from pandas.errors import (  # noqa
     ParserWarning,
 )
+from pandas.compat import import_lzma
 from pandas.core.dtypes.common import is_file_like
 from pandas._typing import FilePathOrBuffer
+lzma = import_lzma()
 # gh-12665: Alias for now and remove later.
 CParserError = ParserError

Then going through and fixing up uses of lzma to check for
LMK if you want me to take over. We can always find other issues for you to work on 😄
This is getting to be a bit tricky (which is why it'd be nice to rely on lzma just being present!) |
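One way the "fixing up uses of lzma" step could look is a small guard at the point where lzma compression is actually requested. This is a sketch under assumptions: it takes `import_lzma()` from the diff above to return `None` when the import fails (the implicit return in the except branch), and `get_lzma_file` and its error text are hypothetical, not necessarily what ends up in pandas.

```python
import warnings


def import_lzma():
    """Same idea as the helper in the diff: return lzma, or None with a warning."""
    try:
        import lzma
        return lzma
    except ImportError:
        warnings.warn(
            "Could not import the lzma module. Your installed Python is incomplete. "
            "Attempting to use `lzma` compression will result in a RuntimeError."
        )
        return None  # made explicit here; implicit in the diff


def get_lzma_file(lzma_module):
    """Return the LZMAFile class, or raise RuntimeError if lzma is unavailable.

    Hypothetical helper: call sites would use this instead of touching
    `lzma.LZMAFile` directly, so the error only appears when xz
    compression is really needed.
    """
    if lzma_module is None:
        raise RuntimeError(
            "lzma module not available. Reinstalling Python with the "
            "required system libraries may be needed."
        )
    return lzma_module.LZMAFile


if __name__ == "__main__":
    lzma = import_lzma()
    try:
        print(get_lzma_file(lzma))
    except RuntimeError as err:
        print("xz compression unavailable:", err)
```

With a guard like this, importing the package never fails because of lzma; only code paths that ask for xz compression raise, which matches the RuntimeError promised in the warning message.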
@TomAugspurger cool idea! I am trying that right now, thanks for the hint! |
ModuleNotFoundError: No module named '_lzma': |
I already OS: BigSur |
I was getting this warning: I was finally able to get rid of it with this command: Just thought I'd drop this here for anyone with my same problem. OS: Big Sur |
Thank you @corbinday ! This saved me as well on M1 with python 3.9.0 + pyenv + BigSur v11.6 |
Code Sample
Problem description
After installing pandas 0.25.0, I can't import the library because of missing compression libraries. First it returned the error message ModuleNotFoundError: No module named '_bz2'. I installed with sudo apt-get install libbz2-dev and tried again to get the error message from the code sample above, ModuleNotFoundError: No module named '_lzma'.
This was not an issue with the previous version of pandas and I tested by downgrading to pandas 0.24.0 and was able to import without the error messages. I feel like pandas should not prevent usage just because some optional compression programs are not installed, like the default behavior of the last version.
Expected Output
Output of pd.show_versions()
Unable to run because can't import pandas.