pd.DataFrame.replace doesn't work after converting to new dtypes in 1.0.0 #31517

scottboston · 2020-01-31T19:43:47Z

#### Code Sample, a copy-pastable example if possible

import pandas as pd
pd.show_versions()
df = pd.DataFrame({'grp': [1, 2, 3, 4, 5]})
df = df.convert_dtypes()  # 'grp' becomes the nullable Int64 dtype
df.replace(1, 10)  # raises IndexError on pandas 1.0.0

#### Problem description

pd.DataFrame.replace is not working after using pd.DataFrame.convert_dtypes.

Stack trace:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-9-fb2a3a449007> in <module>
      3 df = pd.DataFrame({'grp':[1,2,3,4,5]})
      4 df = df.convert_dtypes()
----> 5 df.replace(1,10)

~/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in replace(self, to_replace, value, inplace, limit, regex, method)
   4167             limit=limit,
   4168             regex=regex,
-> 4169             method=method,
   4170         )
   4171 

~/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py in replace(self, to_replace, value, inplace, limit, regex, method)
   6735                 elif not is_list_like(value):  # NA -> 0
   6736                     new_data = self._data.replace(
-> 6737                         to_replace=to_replace, value=value, inplace=inplace, regex=regex
   6738                     )
   6739                 else:

~/anaconda3/lib/python3.6/site-packages/pandas/core/internals/managers.py in replace(self, value, **kwargs)
    587     def replace(self, value, **kwargs):
    588         assert np.ndim(value) == 0, value
--> 589         return self.apply("replace", value=value, **kwargs)
    590 
    591     def replace_list(self, src_list, dest_list, inplace=False, regex=False):

~/anaconda3/lib/python3.6/site-packages/pandas/core/internals/managers.py in apply(self, f, filter, **kwargs)
    440                 applied = b.apply(f, **kwargs)
    441             else:
--> 442                 applied = getattr(b, f)(**kwargs)
    443             result_blocks = _extend_blocks(applied, result_blocks)
    444 

~/anaconda3/lib/python3.6/site-packages/pandas/core/internals/blocks.py in replace(self, to_replace, value, inplace, filter, regex, convert)
    769 
    770         try:
--> 771             blocks = self.putmask(mask, value, inplace=inplace)
    772             # Note: it is _not_ the case that self._can_hold_element(value)
    773             #  is always true at this point.  In particular, that can fail

~/anaconda3/lib/python3.6/site-packages/pandas/core/internals/blocks.py in putmask(self, mask, new, align, inplace, axis, transpose)
   1671         mask = _safe_reshape(mask, new_values.shape)
   1672 
-> 1673         new_values[mask] = new
   1674         return [self.make_block(values=new_values)]
   1675 

~/anaconda3/lib/python3.6/site-packages/pandas/core/arrays/integer.py in __setitem__(self, key, value)
    415             mask = mask[0]
    416 
--> 417         self._data[key] = value
    418         self._mask[key] = mask
    419 

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

#### Expected Output

pd.DataFrame.replace should work after pd.DataFrame.convert_dtypes just as it does before the conversion.

import pandas as pd
# pd.show_versions()
df = pd.DataFrame({'grp': [1, 2, 3, 4, 5]})
# df = df.convert_dtypes()  # conversion skipped, so 'grp' stays int64
df.replace(1, 10)

Output:

	grp
0	10
1	2
2	3
3	4
4	5
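
The expected behaviour can be written as a small check: the replacement should give the same values with or without the conversion. This is a minimal sketch, assuming a pandas version that includes the fix (the issue is milestoned for 1.0.1); on 1.0.0 the second replace call raises the IndexError from the stack trace. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"grp": [1, 2, 3, 4, 5]})

# Baseline: replace on the default int64 column.
expected = df.replace(1, 10)

# Same replacement after switching to the nullable dtypes.
converted = df.convert_dtypes()
result = converted.replace(1, 10)

# Compare plain Python values so the dtype difference
# (int64 vs. nullable Int64) does not affect the comparison.
same_values = result["grp"].tolist() == expected["grp"].tolist()
print(same_values)
```

Comparing the values as lists sidesteps the dtype difference between the two frames, which is expected after convert_dtypes and is not part of the bug.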

#### Output of ``pd.show_versions()``

<details>

INSTALLED VERSIONS
------------------
commit           : None
python           : 3.6.9.final.0
python-bits      : 64
OS               : Linux
OS-release       : 4.15.0-76-generic
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.0.0
numpy            : 1.16.4
pytz             : 2019.2
dateutil         : 2.8.0
pip              : 20.0.1
setuptools       : 45.1.0
Cython           : 0.29
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.4.1
html5lib         : 1.0.1
pymysql          : None
psycopg2         : None
jinja2           : 2.10.1
IPython          : 7.8.0
pandas_datareader: None
bs4              : 4.8.1
bottleneck       : None
fastparquet      : None
gcsfs            : None
lxml.etree       : 4.4.1
matplotlib       : 2.2.3
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pytables         : None
pytest           : None
pyxlsb           : None
s3fs             : None
scipy            : 1.2.1
sqlalchemy       : 1.3.9
tables           : None
tabulate         : 0.8.6
xarray           : None
xlrd             : None
xlwt             : None
xlsxwriter       : None
numba            : None


</details>


charlesdong1991 · 2020-01-31T19:48:38Z

Thanks for posting your issue, @scottboston.
This is indeed a regression. It seems to be fixed by #31484.

michelkluger · 2021-05-06T09:12:47Z

I am experiencing a similar error with the line df["name"].replace(str(np.nan), np.nan, inplace=True)

michelkluger · 2021-05-06T09:13:19Z

Items: RangeIndex(start=0, stop=109277, step=1)
ObjectBlock: 109277 dtype: object, to_replace = 'nan'
value = <module 'numpy' from 'p:\miniconda3_64bit\ibp2\lib\site-packages\numpy\__init__.py'>, inplace = True, regex = False

def replace(self, to_replace, value, inplace: bool, regex: bool) -> "BlockManager":

  assert np.ndim(value) == 0, value

E AssertionError: <module 'numpy' from 'p:\miniconda3_64bit\ibp2\lib\site-packages\numpy\__init__.py'>

p:\miniconda3_64bit\ibp2\lib\site-packages\pandas\core\internals\managers.py:649: AssertionError
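
In the traceback above, value is bound to the numpy module itself rather than to np.nan, which would explain why the np.ndim(value) == 0 assertion fails. That looks like a different problem from the one in the original report. Below is a minimal sketch of the intended replacement, assuming the goal is to turn the string 'nan' into a missing value; the sample data is made up.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with a literal "nan" string.
df = pd.DataFrame({"name": ["alice", "nan", "bob"]})

# str(np.nan) is the string "nan"; the replacement value must be np.nan,
# not the numpy module (np) itself.
df["name"] = df["name"].replace(str(np.nan), np.nan)

# Flag which entries are now missing.
missing = df["name"].isna().tolist()
print(missing)
```

Assigning the result back to the column, instead of calling replace with inplace=True on df["name"], avoids relying on in-place modification of a selected column.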

charlesdong1991 added the "Regression" label (functionality that used to work in a prior pandas version) on Jan 31, 2020

charlesdong1991 mentioned this issue on Feb 1, 2020 in "BUG&TST: df.replace fail after converting to new dtype" #31545 (merged)

jorisvandenbossche added the "Bug" and "NA - MaskedArrays" (related to pd.NA and nullable extension arrays) labels and removed the "Regression" label on Feb 1, 2020

jorisvandenbossche added this to the 1.0.1 milestone on Feb 1, 2020

jreback closed this as completed in #31545 on Feb 1, 2020
