column-wise fillna with Series/dict NotImplemented #4514

Open
hayd opened this issue Aug 8, 2013 · 11 comments
Labels
Enhancement Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate

Comments

@hayd
Contributor

hayd commented Aug 8, 2013

As per the discussion on this SO question, filling with a Series/dict along axis=1 raises NotImplementedError.

Solution/workaround is to do transpose? This is used elsewhere in the DataFrame.fillna method. Just raise if inplace?

cc @cpcloud

In [9]: df = pd.DataFrame([[np.nan, np.nan], [np.nan, 4], [5, 6]], columns=list('AB'))

In [10]: df
Out[10]:
    A   B
0 NaN NaN
1 NaN   4
2   5   6

In [11]: df.mean(0)
Out[11]:
A    5
B    5
dtype: float64

In [12]: df.fillna(df.mean())
Out[12]:
   A  B
0  5  5
1  5  4
2  5  6

In [13]: df.mean(1)
Out[13]:
0    NaN
1    4.0
2    5.5
dtype: float64

In [14]: df.fillna(df.mean(1), axis=1)
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-14-aecc493431e2> in <module>()
----> 1 df.fillna(df.mean(1), axis=1)

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/frame.pyc in fillna(self, value, method, axis, inplace, limit, downcast)
   3452             if isinstance(value, (dict, Series)):
   3453                 if axis == 1:
-> 3454                     raise NotImplementedError('Currently only can fill '
   3455                                               'with dict/Series column '
   3456                                               'by column')

NotImplementedError: Currently only can fill with dict/Series column by column
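
Until fillna accepts a Series with axis=1, one possible way to get the row-wise fill is DataFrame.where with axis=0, which aligns the replacement Series on the row index. This is a minimal sketch on the frame above, not something proposed in this thread; the names row_mean and filled are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[np.nan, np.nan], [np.nan, 4], [5, 6]], columns=list("AB"))

# Per-row means; row 0 is all-NaN, so its mean is NaN as well.
row_mean = df.mean(axis=1)

# Keep existing values and take the row mean wherever a value is missing.
# axis=0 aligns row_mean with df's index (one replacement value per row).
filled = df.where(df.notna(), row_mean, axis=0)
print(filled)
```

Row 0 stays NaN in this sketch because its row mean is itself NaN; only row 1, column A is actually filled.
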
@jreback
Contributor

jreback commented Aug 8, 2013

This is pretty straightforward after merging in the change that makes Series subclass NDFrame. It has to deal with possible dtype changes when filling column-wise (which I think is not implemented).

@jreback jreback modified the milestones: 0.15.0, 0.14.0 Mar 19, 2014
@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 3, 2015
@jorisvandenbossche
Member

Came up again at SO

@aileronajay
Contributor

@jreback @jorisvandenbossche is this implemented yet? I tried result.fillna(result.mean(axis=1), axis=1) and got the same exception.

@jreback
Contributor

jreback commented Apr 23, 2017

issue is open

@aspiringguru

my simple/crude workaround below. curious why this issue is still open in 2018.

colnames = list(df)

for colname in colnames:
    df[colname].fillna(method='bfill', inplace=True)

@jorisvandenbossche
Member

curious why this issue is still open in 2018.

Because nobody made the effort to implement it. But you are welcome to do so.

@malbahrani

Faced the same issue. It is worth mentioning @hayd's solution, posted on Stack Overflow, here.

Thanks for the workaround Andy!

@aadarshsingh191198

aadarshsingh191198 commented May 17, 2023

This isn't implemented even in the latest version of pandas (v2.0.1). Can we reopen this issue, please?

@aftersought

To confirm, in my usage ffill(axis=1, inplace=True) works as long as all columns have the same dtype; I get a NotImplementedError if they have mixed dtypes.
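
For reference, the uniform-dtype case can be sketched as below. The frame and the name filled are made up for illustration, and the sketch uses the non-inplace form; it does not try to reproduce the mixed-dtype error, whose behavior may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# All columns are float64, i.e. the same-dtype case described above.
df = pd.DataFrame({"A": [1.0, np.nan, 5.0], "B": [np.nan, 4.0, np.nan]})

# Propagate the last valid value left-to-right within each row.
filled = df.ffill(axis=1)
print(filled)
```

A missing value in the first column has nothing to its left, so it stays NaN.
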

@ShivnarenSrinivasan
Contributor

I see this is still an open issue
Not sure if I will be able to implement this, but assigning to myself to build on top of previous PR attempts.

take

@kdebrab
Contributor

kdebrab commented Sep 4, 2024

Solution/workaround is to do transpose?

We actually used transpose (i.e., df.T.fillna(df.mean(1)).T), but it proved to be really horrendous with regard to performance. This performance issue happens especially when the index of df is (much) longer than the number of columns. We solved it by using apply instead:

df_mean = df.mean(1)
df.apply(lambda col: col.fillna(df_mean))
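
As a quick check, the two workarounds can be compared on the small frame from the original report. This is only a sketch: the names via_transpose and via_apply are illustrative, and no timing is attempted here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[np.nan, np.nan], [np.nan, 4], [5, 6]], columns=list("AB"))
df_mean = df.mean(axis=1)

# Transpose-based workaround: rows become columns, so the default
# column-by-column fill applies; then transpose back.
via_transpose = df.T.fillna(df_mean).T

# apply-based workaround: each column is a Series sharing df's index,
# so Series.fillna aligns df_mean on the row labels.
via_apply = df.apply(lambda col: col.fillna(df_mean))

# Raises AssertionError if the two results differ.
pd.testing.assert_frame_equal(via_transpose, via_apply)
print(via_apply)
```

The apply version works because DataFrame.apply passes one column at a time by default, and every column shares the row index that df_mean is keyed on.
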
