Series of arrays produces ValueError later in operations #253

Closed
ikondov opened this issue Nov 4, 2024 · 1 comment

ikondov commented Nov 4, 2024

I want to create a Series whose elements are arrays. Creating the Series works, but some later operations on it, for example printing it, raise the error below. In the example, I create a Series with a single element that is a numpy.ndarray:

In [1]: import pandas

In [2]: import pint

In [3]: import pint_pandas

In [6]: s_arr = pandas.Series([[1, 2]], name='length', dtype=pint_pandas.PintType('meter'))

In [7]: type(s_arr[0].magnitude)
Out[7]: numpy.ndarray

In [8]: print(s_arr)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[7], line 1
----> 1 print(s_arr)

File /mnt/data/ubuntu/work/python-3.10.12_new/lib/python3.10/site-packages/pandas/core/series.py:1784, in Series.__repr__(self)
   1782 # pylint: disable=invalid-repr-returned
   1783 repr_params = fmt.get_series_repr_params()
-> 1784 return self.to_string(**repr_params)

File /mnt/data/ubuntu/work/python-3.10.12_new/lib/python3.10/site-packages/pandas/core/series.py:1883, in Series.to_string(self, buf, na_rep, float_format, header, index, length, dtype, name, max_rows, min_rows)
   1831 """
   1832 Render a string representation of the Series.
   1833
   (...)
   1869 '0    1\\n1    2\\n2    3'
   1870 """
   1871 formatter = fmt.SeriesFormatter(
   1872     self,
   1873     name=name,
   (...)
   1881     max_rows=max_rows,
   1882 )
-> 1883 result = formatter.to_string()
   1885 # catch contract violations
   1886 if not isinstance(result, str):

File /mnt/data/ubuntu/work/python-3.10.12_new/lib/python3.10/site-packages/pandas/io/formats/format.py:320, in SeriesFormatter.to_string(self)
    318 else:
    319     fmt_index = index._format_flat(include_name=True)
--> 320 fmt_values = self._get_formatted_values()
    322 if self.is_truncated_vertically:
    323     n_header_rows = 0

File /mnt/data/ubuntu/work/python-3.10.12_new/lib/python3.10/site-packages/pandas/io/formats/format.py:297, in SeriesFormatter._get_formatted_values(self)
    296 def _get_formatted_values(self) -> list[str]:
--> 297     return format_array(
    298         self.tr_series._values,
    299         None,
    300         float_format=self.float_format,
    301         na_rep=self.na_rep,
    302         leading_space=self.index,
    303     )

File /mnt/data/ubuntu/work/python-3.10.12_new/lib/python3.10/site-packages/pandas/io/formats/format.py:1161, in format_array(values, formatter, float_format, na_rep, digits, space, justify, decimal, leading_space, quoting, fallback_formatter)
   1145     digits = get_option("display.precision")
   1147 fmt_obj = fmt_klass(
   1148     values,
   1149     digits=digits,
   (...)
   1158     fallback_formatter=fallback_formatter,
   1159 )
-> 1161 return fmt_obj.get_result()

File /mnt/data/ubuntu/work/python-3.10.12_new/lib/python3.10/site-packages/pandas/io/formats/format.py:1194, in _GenericArrayFormatter.get_result(self)
   1193 def get_result(self) -> list[str]:
-> 1194     fmt_values = self._format_strings()
   1195     return _make_fixed_width(fmt_values, self.justify)

File /mnt/data/ubuntu/work/python-3.10.12_new/lib/python3.10/site-packages/pandas/io/formats/format.py:1528, in _ExtensionArrayFormatter._format_strings(self)
   1526     array = values._internal_get_values()
   1527 else:
-> 1528     array = np.asarray(values, dtype=object)
   1530 fmt_values = format_array(
   1531     array,
   1532     formatter,
   (...)
   1541     fallback_formatter=fallback_formatter,
   1542 )
   1543 return fmt_values

File /mnt/data/ubuntu/work/python-3.10.12_new/lib/python3.10/site-packages/pint_pandas/pint_array.py:869, in PintArray.__array__(self, dtype, copy)
    867 def __array__(self, dtype=None, copy=False):
    868     if dtype is None or is_object_dtype(dtype):
--> 869         return self._to_array_of_quantity(copy=copy)
    870     if is_string_dtype(dtype):
    871         return np.array([str(x) for x in self.quantity], dtype=str)

File /mnt/data/ubuntu/work/python-3.10.12_new/lib/python3.10/site-packages/pint_pandas/pint_array.py:875, in PintArray._to_array_of_quantity(self, copy)
    874 def _to_array_of_quantity(self, copy=False):
--> 875     qtys = [
    876         self._Q(item, self._dtype.units)
    877         if not pd.isna(item)
    878         else self.dtype.na_value
    879         for item in self._data
    880     ]
    881     with warnings.catch_warnings(record=True):
    882         return np.array(qtys, dtype="object")

File /mnt/data/ubuntu/work/python-3.10.12_new/lib/python3.10/site-packages/pint_pandas/pint_array.py:876, in <listcomp>(.0)
    874 def _to_array_of_quantity(self, copy=False):
    875     qtys = [
--> 876         self._Q(item, self._dtype.units)
    877         if not pd.isna(item)
    878         else self.dtype.na_value
    879         for item in self._data
    880     ]
    881     with warnings.catch_warnings(record=True):
    882         return np.array(qtys, dtype="object")

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
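
Reading the traceback, the failure seems to come from the `if not pd.isna(item)` test inside `_to_array_of_quantity`: when `item` is itself an array, `pd.isna` returns an element-wise boolean array, and `not` cannot reduce that to a single truth value. The snippet below is a minimal sketch that reproduces the same error with only numpy and pandas. The `is_missing` helper is hypothetical, shown only to illustrate one possible scalar-only guard; it is not part of pint-pandas.

```python
import numpy as np
import pandas as pd

# An array-valued element, like the one stored in the Series above.
item = np.array([1, 2])

# pd.isna works element-wise on arrays, so this is a boolean array,
# not a single bool.
mask = pd.isna(item)

message = None
try:
    # Same pattern as the conditional in the traceback: `not` calls
    # bool() on the mask, which is ambiguous for more than one element.
    ok = not mask
except ValueError as exc:
    message = str(exc)
    print(message)


def is_missing(value):
    """Hypothetical guard (an assumption, not pint-pandas code):
    only treat 0-dimensional values as candidates for "missing"."""
    return bool(np.ndim(value) == 0 and pd.isna(value))


print(is_missing(float("nan")))  # scalar NaN counts as missing
print(is_missing(item))          # an array element is never "missing" here
```

A guard like this would only address this one check; other code paths may make the same scalar assumption.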

I would expect the same behavior as in plain pandas:

In [15]: import numpy

In [16]: s_arr = pandas.Series([numpy.array([1, 2])], name='length')

In [17]: print(s_arr)
0    [1, 2]
Name: length, dtype: object

In [18]: type(s_arr[0])
Out[18]: numpy.ndarray

The versions used:

pandas: 2.2.3
pint-pandas: 0.6.2
pint: 0.24.3
python: 3.10.12
numpy: both 1.26.4 and 2.1.3

andrewgsavage (Collaborator) commented

This will be difficult to support: the library was written with single (scalar) values as elements in mind, so many functions would need changing for this to work properly.
