matplotlib 3.8.0 is unable to create a basic plot #1276

Open
DavidHuber-NOAA opened this issue Aug 26, 2024 · 43 comments · May be fixed by JCSDA/spack#486

@DavidHuber-NOAA
Collaborator

DavidHuber-NOAA commented Aug 26, 2024

Describe the bug
The matplotlib Python library is unable to generate a basic plot, e.g. https://matplotlib.org/stable/gallery/lines_bars_and_markers/simple_plot.html. There may be an incompatibility between this version of matplotlib and another Python library. Another user reported the same error, though it was never resolved: matplotlib/matplotlib#23923.

To Reproduce

> module use /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/modulefiles/Core
> module load stack-intel stack-python py-matplotlib
> wget https://matplotlib.org/3.8.4/_downloads/841352d8ea6065fce570abdf6225ef02/simple_plot.py
> python simple_plot.py
Traceback (most recent call last):
  File "/home/David.Huber/testplot.py", line 15, in <module>
    fig.savefig("test.png")
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/figure.py", line 3390, in savefig
    self.canvas.print_figure(fname, **kwargs)
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/backend_bases.py", line 2187, in print_figure
    result = print_method(
             ^^^^^^^^^^^^^
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/backend_bases.py", line 2043, in <lambda>
    print_method = functools.wraps(meth)(lambda *args, **kwargs: meth(
                                                                 ^^^^^
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/backends/backend_agg.py", line 497, in print_png
    self._print_pil(filename_or_obj, "png", pil_kwargs, metadata)
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/backends/backend_agg.py", line 445, in _print_pil
    FigureCanvasAgg.draw(self)
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/backends/backend_agg.py", line 388, in draw
    self.figure.draw(self.renderer)
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/artist.py", line 95, in draw_wrapper
    result = draw(artist, renderer, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/artist.py", line 72, in draw_wrapper
    return draw(artist, renderer)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/figure.py", line 3154, in draw
    mimage._draw_list_compositing_images(
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/image.py", line 132, in _draw_list_compositing_images
    a.draw(renderer)
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/artist.py", line 72, in draw_wrapper
    return draw(artist, renderer)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/axes/_base.py", line 3034, in draw
    self._update_title_position(renderer)
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/axes/_base.py", line 2978, in _update_title_position
    ax.yaxis.get_tightbbox(renderer)  # update offsetText
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/axis.py", line 1334, in get_tightbbox
    ticks_to_draw = self._update_ticks()
                    ^^^^^^^^^^^^^^^^^^^^
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/axis.py", line 1275, in _update_ticks
    major_locs = self.get_majorticklocs()
                 ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/axis.py", line 1495, in get_majorticklocs
    return self.major.locator()
           ^^^^^^^^^^^^^^^^^^^^
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/ticker.py", line 2142, in __call__
    return self.tick_values(vmin, vmax)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/ticker.py", line 2150, in tick_values
    locs = self._raw_ticks(vmin, vmax)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/ticker.py", line 2088, in _raw_ticks
    nbins = np.clip(self.axis.get_tick_space(),
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/py-matplotlib-3.8.0-fcxyi47/lib/python3.11/site-packages/matplotlib/axis.py", line 2764, in get_tick_space
    return int(np.floor(length / size))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: cannot convert float NaN to integer
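
The last frame fails because a NaN reaches int(). A minimal sketch of just that step, using only the standard library; tick_space is a hypothetical stand-in for the matplotlib expression int(np.floor(length / size)), not matplotlib's own code:

```python
import math


def tick_space(length: float, size: float) -> int:
    """Hypothetical stand-in for the expression on the last traceback frame."""
    ratio = length / size
    # Like np.floor, leave NaN untouched so that int() is the call that raises.
    floored = ratio if math.isnan(ratio) else math.floor(ratio)
    return int(floored)


# A finite axis length converts without trouble.
print(tick_space(100.0, 12.0))

# A NaN axis length reproduces the error message from the traceback.
try:
    tick_space(float("nan"), 12.0)
except ValueError as exc:
    message = str(exc)
    print(message)
```

So the exception itself is only a symptom: something earlier in the draw path produced a NaN where a finite number was expected.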

Expected behavior
A basic sine-wave plot would be generated.

System:
At least Hera

Additional context
Found while testing Python 3.11.7 upgrade #1217.

@DavidHuber-NOAA DavidHuber-NOAA added the bug Something is not working label Aug 26, 2024
@DavidHuber-NOAA
Collaborator Author

FYI @malloryprow @climbfuji

@climbfuji
Collaborator

Thanks! I was just creating an issue, but you were faster ;-)

@climbfuji
Collaborator

I haven't looked into this at all, but I will note that there are many more matplotlib versions available in spack already. Maybe one of the slightly older ones still works with python@3.11.7 and does make plots?

$ spack info py-matplotlib
PythonPackage:   py-matplotlib

Description:
    Matplotlib is a comprehensive library for creating static, animated, and
    interactive visualizations in Python.

Homepage: https://matplotlib.org/

Preferred version:
    3.8.4    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.8.4.tar.gz

Safe versions:
    3.8.4    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.8.4.tar.gz
    3.8.3    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.8.3.tar.gz
    3.8.2    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.8.2.tar.gz
    3.8.1    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.8.1.tar.gz
    3.8.0    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.8.0.tar.gz
    3.7.4    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.7.4.tar.gz
    3.7.3    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.7.3.tar.gz
    3.7.2    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.7.2.tar.gz
    3.7.1    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.7.1.tar.gz
    3.7.0    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.7.0.tar.gz
    3.6.3    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.6.3.tar.gz
    3.6.2    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.6.2.tar.gz
    3.6.1    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.6.1.tar.gz
    3.6.0    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.6.0.tar.gz
    3.5.3    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.5.3.tar.gz
    3.5.2    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.5.2.tar.gz
    3.5.1    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.5.1.tar.gz
    3.5.0    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.5.0.tar.gz
    3.4.3    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.4.3.tar.gz
    3.4.2    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.4.2.tar.gz
    3.4.1    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.4.1.tar.gz
    3.4.0    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.4.0.tar.gz
    3.3.4    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.3.4.tar.gz
    3.3.3    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.3.3.tar.gz
    3.3.2    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.3.2.tar.gz
    3.3.1    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.3.1.tar.gz
    3.3.0    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.3.0.tar.gz
    3.2.2    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.2.2.tar.gz
    3.2.1    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.2.1.tar.gz
    3.2.0    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.2.0.tar.gz
    3.1.3    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.1.3.tar.gz
    3.1.2    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.1.2.tar.gz
    3.1.1    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.1.1.tar.gz
    3.1.0    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.1.0.tar.gz
    3.0.2    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.0.2.tar.gz
    3.0.0    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-3.0.0.tar.gz

Deprecated versions:
    2.2.5    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-2.2.5.tar.gz
    2.2.4    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-2.2.4.tar.gz
    2.2.3    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-2.2.3.tar.gz
    2.2.2    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-2.2.2.tar.gz
    2.0.2    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-2.0.2.tar.gz
    2.0.0    https://files.pythonhosted.org/packages/source/m/matplotlib/matplotlib-2.0.0.tar.gz

@DavidHuber-NOAA
Collaborator Author

I will try installing 3.8.4 via pip3 with the dependent libraries from the spack-stack build and see if that resolves the issue. If not, I will try bumping other versions (e.g. numpy) to see if that resolves it.

@DavidHuber-NOAA
Collaborator Author

If not, I'll see if I can reopen the matplotlib GitHub issue.

@DavidHuber-NOAA
Collaborator Author

The issue appears to be with numpy, and specifically when compiled with Intel. It's a little tricky to replicate as well:

module use /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/modulefiles/Core
module load stack-intel stack-python py-matplotlib
python simple_plot.py # returns an error
module unload py-numpy/1.25.2
module load py-pip
python -m venv venv
. venv/bin/activate
pip install numpy==1.25.2  # the PyPI package is named numpy; py-numpy is the spack name
python simple_plot.py # success!
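
When swapping one numpy for another like this, it helps to confirm which copy Python will actually import. A minimal sketch using only the standard library; the helper name locate is hypothetical:

```python
import importlib.util


def locate(package: str):
    """Return the file a top-level package would be imported from, or None if it is not found."""
    spec = importlib.util.find_spec(package)
    return None if spec is None else spec.origin


# Inside the venv, numpy should resolve to venv/lib/..., while matplotlib
# should still resolve to the spack-stack module path.
for name in ("numpy", "matplotlib"):
    print(f"{name}: {locate(name)}")
```

find_spec does not import a top-level package, so the check itself cannot trigger the failure.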

@DavidHuber-NOAA
Collaborator Author

I should mention that upgrading to matplotlib==3.8.4 did not resolve the issue.

@climbfuji
Collaborator

Thanks for looking into this. Just to confirm, this works fine if py-numpy is compiled with GNU? And this is on Hera, right? I'll try to build a GNU and Intel environment on another system at NRL to reproduce this.

@DavidHuber-NOAA
Collaborator Author

@climbfuji I'm not sure whether this works with GNU, but it probably does. All I have confirmed is that it works when pip installs the binaries, which are almost certainly GNU-compiled. numpy is built in spack-stack with openblas support, and I'm not sure whether the pip installation includes that, so it is at least theoretically another possibility.

@climbfuji
Collaborator

We always compiled numpy with openblas, with both GNU and Intel, for many releases of spack-stack. That hasn't changed on Hera as far as I know.

@climbfuji
Collaborator

@DavidHuber-NOAA Unfortunately I cannot reproduce this on Nautilus. The machine uses intel@2021.5.0 with intel-oneapi-mkl as the linalg provider (blas, lapack, fftw-api). This is different from Hera, which uses openblas (specifically requested by NOAA). But I am not sure if this is the reason or not.

$ module purge
$ module use /p/work2/heinzell/spack-stack-1.8.0-rc1/envs/ue-intel-2021.5.0/install/modulefiles/Core
$ module load stack-intel/2021.5.0
$ module load stack-openmpi/4.1.6
$ module load stack-python/3.11.7
$ module load py-matplotlib/3.8.0
$ python simple_plot.py
/p/home/heinzell/simple_plot.py:16: UserWarning: FigureCanvasAgg is non-interactive, and thus cannot be shown
  plt.show()
[heinzell@nautilus08 ~]$ ls -lart
...
-rw-------.    1 ***** *****    34492 Aug 26 21:50 test.png
...
$ module li
Currently Loaded Modulefiles:
 1) intel/tbb/latest           6) penguin/openmpi/4.1.6/intel-classic-2022.0.2  11) libxcrypt/4.4.35        16) stack-python/3.11.7        21) py-numpy/1.25.2      26) py-kiwisolver/1.4.5  31) py-six/1.16.0
 2) intel/compiler-rt/latest   7) slurm                                         12) zlib-ng/2.1.6           17) libpng/1.6.37              22) py-pybind11/2.11.0   27) py-packaging/23.1    32) py-python-dateutil/2.8.2
 3) intel/oclfpga/latest       8) stack-openmpi/4.1.6                           13) sqlite/3.43.2           18) intel-oneapi-mkl/2022.0.2  23) py-contourpy/1.0.7   28) libjpeg/2.1.0        33) qhull/2020.2
 4) intel/compiler/2022.0.2    9) gettext/0.19.8.1                              14) util-linux-uuid/2.38.1  19) python-venv/1.0            24) py-cycler/0.11.0     29) py-pillow/9.5.0      34) py-matplotlib/3.8.0
 5) stack-intel/2021.5.0      10) glibc/2.28                                    15) python/3.11.7           20) py-setuptools/63.4.3       25) py-fonttools/4.39.4  30) py-pyparsing/3.1.2


@DavidHuber-NOAA
Collaborator Author

I built a basic gcc environment on Hera with the following spack.yaml:

# spack-stack hash: ac43a4c
# spack hash: d8de455f35
spack:
  concretizer:
    unify: when_possible

  view: false
  include:
  - site
  - common

  definitions:
  - compilers: ['%gcc']
  - packages:
    - python@3.11.7
    - py-matplotlib@3.8.0
  specs:
  - matrix:
    - [$packages]
    - [$compilers]
  packages:
    all:
      prefer: ['%gcc']

Numpy built with openblas and I was able to generate the plot.

I then built this same environment with Intel and was unable to make the plot, receiving the same "cannot convert float NaN to integer" error as before. Hera is using the same version of Intel (2021.5.0) as Nautilus, so I'm not sure what exactly is causing this issue. Here is the module list that is loaded with Intel on Hera:

> module use /scratch1/NCEPDEV/global/David.Huber/SPACK/ss_matplotlib/envs/matplotlib_intel/install/modulefiles/Core
> module load stack-intel stack-python py-matplotlib
> module list

Currently Loaded Modules:
  1) intel/2022.1.2         6) libxcrypt/4.4.35        11) stack-python/3.11.7   16) py-numpy/1.25.2      21) py-kiwisolver/1.4.5  26) py-six/1.16.0
  2) stack-intel/2021.5.0   7) zlib-ng/2.1.6           12) libpng/1.6.37         17) py-pybind11/2.11.0   22) py-packaging/23.1    27) py-python-dateutil/2.8.2
  3) glibc/2.28             8) sqlite/3.43.2           13) openblas/0.3.24       18) py-contourpy/1.0.7   23) libjpeg/2.1.0        28) qhull/2020.2
  4) tar/1.26               9) util-linux-uuid/2.38.1  14) python-venv/1.0       19) py-cycler/0.11.0     24) py-pillow/9.5.0      29) py-matplotlib/3.8.0
  5) gettext/0.21.1        10) python/3.11.7           15) py-setuptools/63.4.3  20) py-fonttools/4.39.4  25) py-pyparsing/3.1.2

@DavidHuber-NOAA
Collaborator Author

Besides the openblas/mkl difference between Nautilus and Hera, the only other difference I see is gettext: Nautilus is running version 0.19.8.1 and Hera 0.21.1. Also, even though the ifort and icc versions are both 2021.5.0, the overall Intel packages differ by a minor version (Hera at 2022.1.2, Nautilus at 2022.0.2).
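
Comparing two long module listings by eye is error-prone, so a small diff helps. A minimal sketch, assuming the listings are reduced to whitespace-separated name/version entries; parse_modules and diff_modules are hypothetical helpers, and only a subset of each listing from this thread is included:

```python
def parse_modules(listing: str) -> dict[str, str]:
    """Map module name -> version for whitespace-separated 'name/version' entries."""
    modules = {}
    for entry in listing.split():
        # Split on the last slash so names such as intel/tbb keep their prefix.
        name, _, version = entry.rpartition("/")
        modules[name] = version
    return modules


def diff_modules(a: dict[str, str], b: dict[str, str]) -> dict[str, tuple]:
    """Return {name: (version_in_a, version_in_b)} for differing entries; None marks a missing module."""
    names = sorted(a.keys() | b.keys())
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Subsets of the Hera and Nautilus listings shown earlier in the thread.
hera = parse_modules("gettext/0.21.1 openblas/0.3.24 py-numpy/1.25.2 py-matplotlib/3.8.0")
nautilus = parse_modules(
    "gettext/0.19.8.1 intel-oneapi-mkl/2022.0.2 py-numpy/1.25.2 py-matplotlib/3.8.0"
)

differences = diff_modules(hera, nautilus)
for name, (on_hera, on_nautilus) in differences.items():
    print(f"{name}: Hera={on_hera} Nautilus={on_nautilus}")
```

The index numbers printed by module list would need to be stripped before feeding a real listing to parse_modules.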

@DavidHuber-NOAA
Collaborator Author

@climbfuji I tried rebuilding the environment with gettext@0.19.8.1 to match Nautilus, but gettext fails to build with the error

==> Installing gettext-0.19.8.1-eqf5yfao2qymautajaznt5mqtivff7dc [28/59]
==> No binary for gettext-0.19.8.1-eqf5yfao2qymautajaznt5mqtivff7dc found: installing from source
==> Fetching file:///scratch1/NCEPDEV/nems/role.epic/spack-stack/source-cache/_source-cache/archive/10/105556dbc5c3fbbc2aa0edb46d22d055748b6f5c7cd7a8d99f8e7eb84e938be4.tar.xz
==> Applied patch /scratch1/NCEPDEV/global/David.Huber/SPACK/ss_matplotlib/spack/var/spack/repos/builtin/packages/gettext/test-verify-parallel-make-check.patch
==> Error: FileNotFoundError: [Errno 2] No such file or directory: 'libtextstyle/configure'

/scratch1/NCEPDEV/global/David.Huber/SPACK/ss_matplotlib/spack/var/spack/repos/builtin/packages/gettext/package.py:85, in patch:
         82            "gl_cv_libxml_force_included=yes",
         83            "gl_cv_libxml_force_included=no",
         84            "libtextstyle/configure",
  >>     85            string=True,
         86        )

I have two more ideas: 1) try building again with gettext@0.21.1 and Intel-MKL, and 2) try working forward from gettext@0.19.8.1 until I find a minimum version that builds on Hera, then see if the issue is still present.

@climbfuji
Collaborator

external gettext?

@DavidHuber-NOAA
Collaborator Author

Building the environment with intel-MKL instead of openblas did not fix the issue.

@climbfuji Good suggestion. I'll try an external gettext package.

@DavidHuber-NOAA
Collaborator Author

I'm also going to open a matplotlib issue.

@DavidHuber-NOAA
Collaborator Author

@climbfuji I'm unable to concretize with the system gettext. Here is what I added to site/packages.yaml:

    gettext:
      buildable: false
      externals:
      - spec: gettext@0.19.8.1
        prefix: /usr

And here is the output of spack concretize:

==> Error: concretization failed for the following reasons:

   1. cannot satisfy a requirement for package 'py-setuptools'.

Any suggestions?

@climbfuji
Collaborator

Not really, but what I would try is to unpin py-setuptools or pin it to a different version if you get duplicates

@DavidHuber-NOAA
Collaborator Author

@climbfuji I wasn't able to unpin py-setuptools, as I got duplicates, and attempting to pin it to something newer caused conflicts with py-numpy@1.25.2, so I updated py-numpy to 1.26.4 instead. This did not solve the issue. I then used the system gettext (0.19.8.1), but that also did not solve the issue. I accidentally installed this last round with openblas instead of Intel-MKL, but I doubt that is the issue given that the problem occurred previously with Intel-MKL/py-numpy@1.25.2.

I'm not sure where to go from here, but would gladly take suggestions from you and/or @AlexanderRichert-NOAA.

@DavidHuber-NOAA
Collaborator Author

I'm going to try rebuilding the environment but pointing at the gcc-compiled version of py-numpy.

@climbfuji
Collaborator

@DavidHuber-NOAA I installed the spack-stack-1.8.0 release candidate on S4 for testing, and I was able to create simple plots with matplotlib. S4 uses intel@2021.5.0.

@DavidHuber-NOAA
Collaborator Author

Building with the gcc-compiled py-numpy was successful.

I will wait for an official release candidate installation on Hera then try the test again.

@climbfuji
Collaborator

@DavidHuber-NOAA I need to correct my previous statement. I was able to plot a different, simple plot on S4. Your simple_plot.py fails in the same way on S4 as it does on Hera.

@DavidHuber-NOAA
Collaborator Author

OK, good to know. Thanks for the correction. I'll keep working with my installation on Hera, then. The matplotlib developers may have an idea on how to fix the issue 🤞

@climbfuji
Collaborator

climbfuji commented Aug 30, 2024

That would be great. I am stepping py-numpy back one minor version at a time, in the hope that it will work with a somewhat more recent version than what we had in spack-stack-1.7.0. I also checked with the oneapi compilers, and there are no problems with py-numpy@1.25.2 and py-matplotlib@3.8.0.

@climbfuji
Collaborator

@DavidHuber-NOAA I got it to work with py-numpy@1.23.5 and py-matplotlib@3.7.4. Since py-matplotlib@3.8: depends on py-numpy@1.25:, we could work around this by pinning py-numpy to 1.23.5 and py-matplotlib to 3.7.4 for %intel only (i.e. in configs/common/packages_intel.yaml).

After all, Intel classic will be going away soon ... and there is no problem with the LLVM-based compilers with 1.25.2 / 3.8.0.

@DavidHuber-NOAA
Collaborator Author

Fantastic! Alright, that sounds like a plan to me.

@DavidHuber-NOAA
Collaborator Author

I'll keep working with the matplotlib folks and let them know of this workaround.

@climbfuji
Collaborator

We have a workaround in place for spack-stack-1.8.0: use py-numpy@1.23.5 and py-matplotlib@3.7.4 with newer versions of the Intel classic compilers (intel@2021.x.y). On Acorn with intel@19, we use py-numpy@1.24.4 and py-matplotlib@3.7.4.

We need to keep this issue open until it is fixed upstream (matplotlib/matplotlib#28762), or until we switch to the Intel LLVM compilers for C/C++. Whichever comes first, hopefully it is in time for spack-stack-1.9.0.

@DavidHuber-NOAA
Collaborator Author

The issue is with the numpy ndarray min method (and possibly others). A simple reproducible example is:

import numpy as np

rows = np.asarray([0, 0])
val = np.asarray([0.11])
print(val[rows].min())  # Returns 0.11 with GNU and LLVM, NaN with Intel Classic

Opened Numpy issue numpy/numpy#27840.

@DavidHuber-NOAA
Collaborator Author

@climbfuji It seems that the issue can be mitigated by disabling some of the AVX512 CPU features for numpy at runtime, with either NPY_DISABLE_CPU_FEATURES=AVX512F or NPY_DISABLE_CPU_FEATURES=AVX512CD. This may indicate a bug within the Intel compilers. I'm not sure there's much chance of a bug fix from the numpy developers on this. Is this something we can add to the module file?
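
For a quick local test before touching any module file, the variable can be set for a single child process. A minimal sketch, assuming (as far as I know) that numpy reads NPY_DISABLE_CPU_FEATURES when it is imported, so it has to be in the environment before Python starts importing numpy; env_with_disabled_features is a hypothetical helper:

```python
import os
import subprocess
import sys


def env_with_disabled_features(feature: str) -> dict[str, str]:
    """Copy the current environment and ask numpy to skip one CPU feature."""
    env = dict(os.environ)
    env["NPY_DISABLE_CPU_FEATURES"] = feature
    return env


env = env_with_disabled_features("AVX512F")

# The child only echoes the variable here; in practice the command would be
# [sys.executable, "simple_plot.py"] so the plot runs with the feature disabled.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['NPY_DISABLE_CPU_FEATURES'])"],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
seen_by_child = child.stdout.strip()
print(seen_by_child)
```

Setting the variable from inside a script after numpy has already been imported would presumably have no effect, which is why the sketch launches a fresh interpreter.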

@climbfuji
Collaborator

@DavidHuber-NOAA Can you try this locally first and then open a PR? There shouldn't be any issues turning off these features.

@DavidHuber-NOAA
Collaborator Author

Sure, I'd be happy to. I'm just not sure where to make changes to the module files. Could you point me there?

@AlexanderRichert-NOAA
Collaborator

We could either change it in configs/common/modules_{lmod,tcl}.yaml (in the main section with the rest of the packages):

      'py-numpy%intel target=x86_64_v4:':
        environment:
          set:
            NPY_DISABLE_CPU_FEATURES: AVX512F

or add it to the py-numpy recipe, assuming we are pretty sure that this is always a problem for numpy+intel classic+avx512 and could convince the Spack devs to incorporate it:

    def setup_run_environment(self, env):
        if self.spec.satisfies("target=x86_64_v4: %intel"):
            env.set("NPY_DISABLE_CPU_FEATURES", "AVX512F")

@climbfuji
Collaborator

I would prefer the latter. Is x86_64_v4 sufficient to cover all the optimization choices that enable AVX512?

@DavidHuber-NOAA
Collaborator Author

@climbfuji I agree. And no, it may not be sufficient for all systems. I have been working on Hera, but Hercules will likely have newer AVX512 flags. I will build a simple numpy installation over there and see what flags would be required.

@AlexanderRichert-NOAA
Collaborator

AlexanderRichert-NOAA commented Nov 25, 2024

I believe it is sufficient, yes. The key there is the colon; see for example the openblas recipe, where we disable AVX512 features for targets that do not satisfy target=x86_64_v4:. In the spack repo, see lib/spack/external/archspec/json/cpu/microarchitectures.json and follow the "from" tags for each architecture; every architecture with avx512 instructions, apart from x86_64_v4 itself, ultimately depends on x86_64_v4 (for instance zen5 is 'from' zen4, which is in turn 'from' x86_64_v4).

@climbfuji
Collaborator

There are a few exotic CPU types that aren't covered, I think. Essentially those that have avx512 flags but inherit from skylake or broadwell (cannonlake, mic_knl). My guess is that the spack developers will want those covered, too. But I might be wrong.

@AlexanderRichert-NOAA
Collaborator

Ah dang, yep. Well, how about

    def setup_run_environment(self, env):
        archs = ("x86_64_v4:", "cannonlake:", "mic_knl")
        if any(self.spec.satisfies(f"target={arch} %intel") for arch in archs):
            env.set("NPY_DISABLE_CPU_FEATURES", "AVX512F")

which covers cannonlake->icelake->sapphirerapids 🙃
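
To see why the trailing colon matters, here is a minimal sketch of ancestry-based range matching. It is not Spack's or archspec's implementation: PARENTS is a hand-written, simplified table holding only the relations mentioned in this thread, the assignment of cannonlake to skylake and mic_knl to broadwell is my assumption about which parent is which, and descends_from and needs_workaround are hypothetical helpers.

```python
# Simplified parent table, not the full archspec data. Treating skylake and
# broadwell as roots is an assumption made to keep the sketch short.
PARENTS = {
    "x86_64_v4": [],
    "zen4": ["x86_64_v4"],
    "zen5": ["zen4"],
    "skylake": [],
    "broadwell": [],
    "cannonlake": ["skylake"],
    "icelake": ["cannonlake"],
    "sapphirerapids": ["icelake"],
    "mic_knl": ["broadwell"],
}


def descends_from(target: str, base: str) -> bool:
    """True if target is base or has base among its ancestors (the 'base:' range)."""
    if target == base:
        return True
    return any(descends_from(parent, base) for parent in PARENTS.get(target, []))


def needs_workaround(target: str) -> bool:
    """Mirror archs = ("x86_64_v4:", "cannonlake:", "mic_knl") from the recipe sketch."""
    return (
        descends_from(target, "x86_64_v4")
        or descends_from(target, "cannonlake")
        or target == "mic_knl"  # no colon: exact match only
    )


for name in ("zen5", "sapphirerapids", "mic_knl", "skylake"):
    print(name, needs_workaround(name))
```

With the colon, a range picks up every descendant of the named target; without it, only the named target itself matches.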

@DavidHuber-NOAA
Collaborator Author

Thanks @AlexanderRichert-NOAA, the second option worked on Hercules. One question, though. Should archs be defined as

archs = ("x86_64_v4:", "cannonlake:", "mic_knl:")  # Colon after mic_knl

I built a minimal stack to build py-matplotlib@3.8.4 and py-numpy@1.26.4 and was able to generate these plots: simple plot and 3D surface.

I'll go ahead and make these changes in spack. Should I also open a spack-stack PR to increment matplotlib and numpy?

@climbfuji
Collaborator

I think Intel's MIC architecture is a dead horse and we won't be seeing anything that builds on it. Therefore, we don't need the : after mic_knl.
