ModuleNotFoundError: No module named 'numpy.core.multiarray' when Python 3 tries to load a pkl file generated by Python 2 pickle #3193

Open
zincers opened this issue Nov 14, 2024 · 2 comments
Labels: enhancement (An improvement rather than a bug), help wanted (Please help with this, we think you can)

zincers commented Nov 14, 2024

Note: This issue arose because I was using pickle under Python 3 to load a pkl file generated by pickle under Python 2.7.x. I don't think it needs much attention, as Python 2 reached end of life a while ago. I have resolved it; the steps to reproduce and the solutions are as follows.

  • Nuitka version, full Python version, flavor, OS, etc. as output by this exact command.

    python -m nuitka --version

  2.4.11
  Commercial: None
  Python: 3.12.7 | packaged by Anaconda, Inc. | (main, Oct  4 2024, 13:17:27) [MSC v.1929 64 bit (AMD64)]
  Flavor: Anaconda Python
  Executable: D:\Software\Miniconda3\envs\Test\python.exe
  OS: Windows
  Arch: x86_64
  WindowsRelease: 11
  Version C compiler: C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.41.34120\bin\Hostx64\x64\cl.exe (cl 14.3).
  • How did you install Nuitka and Python

    conda virtualenv, pip install nuitka

  • The specific PyPI names and versions

    python -m pip list -v

  > Package     Version Location                                           Installer
  > ----------- ------- -------------------------------------------------- ---------
  > Nuitka      2.4.11  D:\Software\Miniconda3\envs\Test\Lib\site-packages pip
  > numpy       2.1.3   D:\Software\Miniconda3\envs\Test\Lib\site-packages pip
  > ordered-set 4.1.0   D:\Software\Miniconda3\envs\Test\Lib\site-packages pip
  > pip         24.2    D:\Software\Miniconda3\envs\Test\Lib\site-packages conda
  > setuptools  75.1.0  D:\Software\Miniconda3\envs\Test\Lib\site-packages
  > wheel       0.44.0  D:\Software\Miniconda3\envs\Test\Lib\site-packages
  > zstandard   0.23.0  D:\Software\Miniconda3\envs\Test\Lib\site-packages pip
  • Also supply a Short, Self Contained, Correct, Example

generate_pkl_file.py

#-*- coding:utf-8 -*-

import numpy as np
import pickle
import zlib

original_data = {
    'slice': [0.0, 1.0], 
    'data': {
        "sec1": np.zeros(5), 
        "sec2": np.ones(5), 
    }
}

# The with block closes the file on exit, so no explicit close() is needed.
with open("demo_py2.pkl", 'wb') as fh:
    fh.write(zlib.compress(pickle.dumps(original_data, protocol=2)))

The original pkl file is too large to attach, so I wrote this script to generate small test files.
Run generate_pkl_file.py in both a Python 2 (2.7.18) and a Python 3 (3.12.7) environment.
This produces the two test files, demo_py2.pkl and demo_py3.pkl.
Adjust the output filename in the script before each run.

test.py

#-*- coding:utf-8 -*-

# import numpy.core.multiarray
import numpy as np
import pickle
import zlib

final_data = pickle.loads(zlib.decompress(open("demo_py3.pkl", 'rb').read()), encoding='latin1')
print("Python 3:\n", final_data)

final_data = pickle.loads(zlib.decompress(open("demo_py2.pkl", 'rb').read()), encoding='latin1')
print("Python 2:\n", final_data)

Compile test.py using Nuitka. Keep the import numpy.core.multiarray line commented out (see Solution 2).

After compiling, run the generated .\test.dist\test.exe file to reproduce the issue. The output:

Python 3:
 {'slice': [0.0, 1.0], 'data': {'sec1': array([0., 0., 0., 0., 0.]), 'sec2': array([1., 1., 1., 1., 1.])}}
Traceback (most recent call last):
  File "path\to\test.dist\test.py", line 12, in <module>
ModuleNotFoundError: No module named 'numpy.core.multiarray'

The file generated by Python 3 loads correctly, while the file generated by Python 2 does not.
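To see which modules a pickle will try to import before loading it, the opcode stream can be inspected. The following is a minimal sketch and not part of the original report: the helper name referenced_globals is hypothetical, it only handles the GLOBAL opcode used by protocols 0 to 3 (the test files here are written with protocol=2), and it uses fractions.Fraction as a stand-in so that it runs without numpy. Applied to the decompressed contents of demo_py2.pkl, it would presumably list numpy.core.multiarray, the module named in the traceback.

```python
import pickle
import pickletools
from fractions import Fraction


def referenced_globals(payload):
    """Return the set of (module, name) pairs named by GLOBAL opcodes.

    Hypothetical helper: protocol 4+ pickles use STACK_GLOBAL instead,
    which this sketch does not decode.
    """
    found = set()
    for opcode, arg, _pos in pickletools.genops(payload):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            module, name = arg.split(" ", 1)
            found.add((module, name))
    return found


# Stand-in payload; for the files in this issue, pass
# zlib.decompress(open("demo_py2.pkl", "rb").read()) instead.
payload = pickle.dumps(Fraction(1, 2), protocol=2)
globals_needed = referenced_globals(payload)
print(sorted(globals_needed))
```

Every module listed this way has to be importable in the compiled program at load time, whether or not the program's own source imports it.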

  • Provide in your issue the Nuitka options used

python -m nuitka --msvc=latest --standalone .\test.py

  • Note if this is a regression

numpy < 2.0 works

  • Solution 1:
    Add an implicit import to path\to\myenv\Lib\site-packages\nuitka\plugins\standard\stdlib3.nuitka-package.config.yml:
- module-name: 'pickle' # checksum: 167cb032
  implicit-imports:  
    - depends:
      - 'numpy.core.multiarray'
  anti-bloat:        
    - description: 'remove module ability to run as a binary'
      change_function:
        '_test': "'(lambda: None)'"
  • Solution 2:
    Adding the statement import numpy.core.multiarray before the pickle.loads call (uncomment that line in test.py) resolves the issue.
    This solution meets my requirements: in my project, pickle.loads is used in only one specific location, so the impact is minimal.
kayhayen (Member) commented

pickle can only load modules that were included in the compilation. Not every module is included unless some dependency pulls it in. In a typical program, the included modules and the ones a pickle needs will align, but as observed here they do not have to, and if pickles are used for IPC there is no reason at all to expect that they do.

What can be done here is to give a more informative error on ImportError, with an explanation, as a non-deployment mode hook. I don't see why every program that uses pickle would have to include numpy, though; that is just one specific case.

We need to figure out how to make this happen. For the pickle functions, we can probably catch the errors by wrapping them with code appended to the module, and that should be easy. For deployment mode, we might still give a better error message, but maybe without pointing out what to compile with.
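The point that unpickling imports modules by name at load time can be observed with a small sketch. This is illustrative only: RecordingUnpickler is a hypothetical class name, and fractions.Fraction stands in for the numpy array from the report so the snippet runs without numpy. It relies on pickle.Unpickler.find_class, which the standard library calls for each global a pickle refers to.

```python
import io
import pickle
from fractions import Fraction


class RecordingUnpickler(pickle.Unpickler):
    """Unpickler that records each (module, name) it is asked to resolve."""

    def __init__(self, file, **kwargs):
        super().__init__(file, **kwargs)
        self.requested = []

    def find_class(self, module, name):
        # The default implementation imports `module` here; in a compiled
        # program this is the step that fails if the module was not included.
        self.requested.append((module, name))
        return super().find_class(module, name)


payload = pickle.dumps(Fraction(3, 4), protocol=2)
unpickler = RecordingUnpickler(io.BytesIO(payload))
value = unpickler.load()
print(value, unpickler.requested)
```

Nothing in the loading code mentions the fractions module, yet it is imported during load(), which is the same mechanism that requests numpy.core.multiarray in the report.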

@kayhayen kayhayen self-assigned this Nov 14, 2024
@kayhayen kayhayen added enhancement An improvement rather than a bug help wanted Please help with this, we think you can labels Nov 14, 2024
@kayhayen kayhayen added this to the 2.6 milestone Nov 14, 2024
kayhayen (Member) commented

It would be nice if somebody were to tackle this.

Here is an example of how to monkey patch something like this:

- module-name: 'polars.series.utils' # checksum: 65b85a57
  anti-bloat:
    - description: 'workaround for compiled method support'
      append_plain: |
        _orig_is_empty_method = _is_empty_method
        def _is_empty_method(func):
          if hasattr(func, "__compiled_constant__"):
              return getattr(func, "__compiled_constant__") is None

          return _orig_is_empty_method(func)

The wrapper should basically catch ImportError and turn it into a RuntimeError with a usage hint.

We have used things like this:

sys.exit('''Nuitka: Need to use this as an option to compile with '--include-distribution-metadata=%s'.''' % pkg_name)

The name of the module that was not found will be in the exception object you caught, so you can inform the developer. In other cases, we raised RuntimeError like this: raise (RuntimeError('Nuitka: Needs to be elevated already, use --windows-uac-admin')), and that might be the better choice. Perhaps sys.exit could be used for deployment mode and raising for non-deployment mode, to make sure it gets through to the developer, but that is something I can add myself later once I harmonize these things.
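A rough sketch of such a wrapper, written as plain Python rather than as the final Nuitka configuration, might look like the following. The function name loads_with_hint and the wording of the message are assumptions made for illustration. The hint uses Nuitka's --include-module option, and the missing module name is read from the name attribute of the caught ImportError.

```python
import pickle

_orig_loads = pickle.loads


def loads_with_hint(*args, **kwargs):
    """Call pickle.loads and turn ImportError into a RuntimeError with a hint.

    Hypothetical wrapper; the message text is illustrative only.
    """
    try:
        return _orig_loads(*args, **kwargs)
    except ImportError as exc:
        # ImportError.name holds the module that could not be imported.
        missing = exc.name or "<unknown>"
        raise RuntimeError(
            "Nuitka: pickle needs module '%s' that was not included, "
            "compile with '--include-module=%s'." % (missing, missing)
        ) from exc


# Hand-written protocol 0 pickle that refers to a module that does not exist.
broken_payload = b"cno_such_module_xyz\nthing\n."

try:
    loads_with_hint(broken_payload)
except RuntimeError as exc:
    message = str(exc)
    print(message)
```

In an append_plain block like the polars example, the same function body would replace the module-level loads (and presumably load) rather than live under a new name.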
