BUG: Dataframe fill_na with series #27197

VibhuJawa · 2019-07-02T23:13:56Z

Code Sample, a copy-pastable example if possible

import pandas as pd

pdf = pd.DataFrame({'a': [1, 2, None], 'b': [None, None, 5]})

fill_value_ser = pd.Series([3,4,5])
filled_df = pdf.fillna(fill_value_ser)
print(filled_df)

print("\n")
filled_ser = pdf['a'].fillna(fill_value_ser)
print(filled_ser)

Output:

     a    b
0  1.0  NaN
1  2.0  NaN
2  NaN  5.0

0    1.0
1    2.0
2    5.0
Name: a, dtype: float64

Problem description

Currently, filling a DataFrame with a pandas Series does not seem to work: no values are filled.

Filling a single series seems to be working though.

Expected Output

     a    b
0  1.0  3.0
1  2.0  4.0
2  5.0  5.0

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.7.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-50-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None
LOCALE: en_US.UTF-8

pandas: 0.24.2
pytest: 5.0.0
pip: 19.1.1
setuptools: 41.0.1
Cython: 0.29.11
numpy: 1.16.4
scipy: None
pyarrow: 0.12.1
xarray: None
IPython: 7.6.0
sphinx: 2.1.2
patsy: None
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10.1
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

TomAugspurger · 2019-07-03T02:05:06Z

This may be due to a Series being dict-like:

In [4]: pdf.fillna(pd.Series([0, 1], index=['a', 'c']))
Out[4]:
     a    b
0  1.0  NaN
1  2.0  NaN
2  0.0  5.0

So the keys are the Series index (matched against the column names) and the fill values are the Series values. The cell at row 2, column 'a' is filled there since series['a'] is 0.
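As a minimal sketch of the dict-like behavior described above, the snippet below compares filling with a Series against filling with a dict that has the same keys. The names `by_series` and `by_dict` are only illustrative and do not come from the thread.

```python
import pandas as pd

pdf = pd.DataFrame({'a': [1, 2, None], 'b': [None, None, 5]})

# The Series index labels are matched against column names, like dict keys.
by_series = pdf.fillna(pd.Series([0, 1], index=['a', 'c']))
by_dict = pdf.fillna({'a': 0, 'c': 1})

# Column 'a' has its NaN filled with 0. Column 'b' has no matching label,
# so its NaNs remain. The label 'c' matches no column and is ignored.
print(by_series)
print(by_dict)
```

Under this reading, a Series with the default integer index (0, 1, 2) fills nothing in the original example because none of those labels is a column name.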

VibhuJawa · 2019-07-03T02:42:32Z

Aah, got it.

Thanks for replying.

I think that, to be consistent with the documentation as well as with Series.fillna, the DataFrame.fillna function should
support a Series where the replacement happens at each index for all columns.

From the docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.fillna.html

value : scalar, dict, Series, or DataFrame

Value to use to fill holes (e.g. 0), alternately a dict/Series/DataFrame of values specifying which value to
use for each index (for a Series) or column (for a DataFrame). (values not in the dict/Series/DataFrame
will not be filled). This value cannot be a list.
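To illustrate the "for each index (for a Series)" part of that docstring, here is a small sketch of Series.fillna with a fill Series that covers only some labels. The variable names `s`, `fill`, and `filled` are made up for this example.

```python
import pandas as pd

s = pd.Series([1.0, None, None], index=[0, 1, 2])

# The fill Series only provides a value for index label 1.
fill = pd.Series([10.0], index=[1])
filled = s.fillna(fill)

# Label 1 is filled from `fill`; label 2 has no entry in `fill`
# and therefore stays NaN.
print(filled)
```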

TomAugspurger · 2019-07-03T03:17:19Z

I think that, to be consistent with the documentation as well as with Series.fillna, the DataFrame.fillna function should
support a Series where the replacement happens at each index for all columns.

Can you give an example?

VibhuJawa · 2019-07-03T03:34:03Z

I think to be consistent with the documentation as well as with series.fillna , fillna function should
support Series where the replacement happens at each index for all columns.

Can you give an example?

I would like the behavior to be as follows:

import pandas as pd

df = pd.DataFrame({'a': [1, 2, None], 'b': [None, None, 6]})

fill_value_ser = pd.Series([3,4,5])
filled_df = df.fillna(fill_value_ser)

print(filled_df)

     a    b
0  1.0  3.0
1  2.0  4.0
2  5.0  6.0

Essentially equal to the below logic:

filled_df = pd.DataFrame()
for col in df:
    filled_df[col] = df[col].fillna(fill_value_ser)
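The same per-column logic can probably be written more compactly with `DataFrame.apply`. This is only a sketch of a workaround under that assumption, not behavior that `DataFrame.fillna` provides directly:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, None], 'b': [None, None, 6]})
fill_value_ser = pd.Series([3, 4, 5])

# Call Series.fillna on each column in turn, so that fill_value_ser
# is aligned on the row index rather than on the column names.
filled_df = df.apply(lambda col: col.fillna(fill_value_ser))
print(filled_df)
```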

TomAugspurger · 2019-07-03T03:39:18Z

I suppose that's `df.fillna(series, axis=1)`, which currently isn't implemented?

VibhuJawa · 2019-07-03T03:52:21Z

Aah, got it. Sorry for the confusion.

Yeah, the logic with df.fillna(series, axis=1) is what I need.
I still feel the documentation is misleading, though.

From the docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.fillna.html

value : scalar, dict, Series, or DataFrame

Value to use to fill holes (e.g. 0), alternately a dict/Series/DataFrame of values specifying which value to
use for each index (for a Series) or column (for a DataFrame). (values not in the dict/Series/DataFrame
will not be filled). This value cannot be a list.

TomAugspurger · 2019-07-08T16:20:13Z

I'm going to close this as a duplicate of #4514.

We always welcome improvements to the docs if you want to make a PR for that.

TomAugspurger closed this as completed Jul 8, 2019
