Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: DataFrame.clip sets all values to the lower_bound #2747

Closed
dhirschfeld opened this issue Jan 24, 2013 · 7 comments
Closed

BUG: DataFrame.clip sets all values to the lower_bound #2747

dhirschfeld opened this issue Jan 24, 2013 · 7 comments
Labels
Milestone

Comments

@dhirschfeld
Copy link
Contributor

The following unit-test fails:

def test_dataframe_clip(lb=-1, ub=1):
    df = pd.DataFrame(np.random.randn(1000,2))
    lb_mask = df.values <= lb
    ub_mask = df.values >= ub
    mask = ~lb_mask & ~ub_mask
    clipped_df = df.clip(lb, ub)
    assert (clipped_df.values[lb_mask] == lb).all()
    assert (clipped_df.values[ub_mask] == ub).all()
    assert (clipped_df.valus[mask] == df.values[mask]).all()
#
@jreback
Copy link
Contributor

jreback commented Jan 24, 2013

    def clip(self, upper=None, lower=None):
        """
        Trim values at input threshold(s)

        Parameters
        ----------
        lower : float, default None
        upper : float, default None

        Returns
        -------
        clipped : DataFrame

try this df.clip(upper=ub,lower=lb)
the arguments are 'backwards'

@dhirschfeld
Copy link
Contributor Author

Wow, that's confusing!

Even if there was a good argument for going against the numpy convention, it's not even consistent within pandas as Series inherits it's method from numpy.

Any chance this can be changed? I know it would break code already out there, but I'm betting there's incorrect code out there already where someone has assumed the numpy convention or tried to write generic Series/DataFrame code using clip.

np.clip(a, a_min, a_max, out=None)
pd.Series.clip(self, lower=None, upper=None, out=None)
pd.DataFrame.clip(self, upper=None, lower=None)

@jreback
Copy link
Contributor

jreback commented Jan 24, 2013

agreed...should prob be changed...

you can also accomplish what you are after like this (but prob more confusing)...though same syntax works on series

df.where(df>-1,-1).where(df<1,1)

@jreback
Copy link
Contributor

jreback commented Jan 24, 2013

closed in #2750

jreback added a commit to jreback/pandas that referenced this issue Jan 31, 2013
@jreback jreback closed this as completed Feb 11, 2013
@l736x
Copy link
Contributor

l736x commented Sep 2, 2013

Why not the same treatment for Series? (It just happened that I had a bug due to this)

@jreback
Copy link
Contributor

jreback commented Sep 2, 2013

what version?

@l736x
Copy link
Contributor

l736x commented Sep 3, 2013

Sorry, I missed that issue. I'm on 0.12, I just checked master and there is no sign of:

lower, upper = min(lower, upper), max(lower, upper)

in series.clip
If it's already fixed by 4324, forget it. Sorry for the confusion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants