API: GH4633, bool(obj) behavior, raise on __nonzero__ always #4657

Merged (2 commits) on Aug 31, 2013

jreback
Contributor

@jreback jreback commented Aug 23, 2013

closes #4633

this is a revert to #1073/#1069

now a call to __nonzero__ raises ValueError ALWAYS

The following is the behavior

In [17]: s = Series(randn(4))

In [18]: df = DataFrame(randn(10,2))

In [19]: s_empty = Series()

In [20]: df_empty = DataFrame()

In [5]: bool(s)
Out[5]: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In [6]: bool(s_empty)
Out[6]:  ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In [7]: bool(df)
Out[7]:  ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In [8]: bool(df_empty)
Out[8]:  ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

And prevents these fun ones (same for Series/Panel)

In [4]: df1 = DataFrame(np.ones((4,4)))

In [5]: df2 = DataFrame(np.zeros((4,4)))

In [6]: df1 and df2
 ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In [7]: df1 or df2
 ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In [8]: not df1
 ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In [9]: def f():
   ...:     if df1:
   ...:         print("this is cool")
   ...:         

In [10]: f()
 ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
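
All of the cases above fail for the same reason: `bool()`, `and`, `or`, `not` and `if` each ask the object for a single truth value, which goes through `__nonzero__` (named `__bool__` on Python 3). A minimal sketch of that mechanism, using a hypothetical `AmbiguousTruth` stand-in class rather than pandas itself:

```python
class AmbiguousTruth:
    """Hypothetical stand-in for an NDFrame; this is not pandas code."""

    def __init__(self, values):
        self.values = list(values)

    def __bool__(self):  # Python 3 name for the truth-value hook
        raise ValueError(
            "The truth value of an array with more than one element "
            "is ambiguous. Use a.empty, a.any() or a.all()"
        )

    __nonzero__ = __bool__  # Python 2 name, the one used in this PR


def raises_value_error(func):
    """Return True if calling func() raises ValueError."""
    try:
        func()
    except ValueError:
        return True
    return False


a = AmbiguousTruth([1, 1])
b = AmbiguousTruth([0, 0])

# Every boolean context ends up calling the same hook.
results = {
    "bool(a)": raises_value_error(lambda: bool(a)),
    "a and b": raises_value_error(lambda: a and b),
    "a or b": raises_value_error(lambda: a or b),
    "not a": raises_value_error(lambda: not a),
    "if a": raises_value_error(lambda: 1 if a else 0),
}
print(results)
```

Because the hook raises unconditionally, it makes no difference whether the object is empty or what values it holds.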


@jtratner
Contributor

The only change I might make is to include the note you had previously about using 'empty', so maybe:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.empty, a.any() or a.all().

@jreback
Contributor Author

jreback commented Aug 23, 2013

will do

@jtratner
Contributor

@jreback sorry, missed your earlier comment - anyways looks good though!

@jtratner
Contributor

The one nice thing to do would be to add something to the docs that specifically quotes this exception so that someone using pandas for the first time can search for it. Either with this PR or as a todo for later on.

@jreback
Contributor Author

jreback commented Aug 23, 2013

Sure. Where do you think it should go? Boolean indexing (maybe with a note about NOT using `and` and using `&` instead)?
Where else?
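
The `and` versus `&` distinction mentioned here comes down to which hook Python calls: `&` dispatches to `__and__`, which is free to return an element-wise result, while `and` first asks for a single truth value. A rough sketch with a hypothetical `BoolVector` class (an illustration only, not how pandas implements it):

```python
class BoolVector:
    """Hypothetical stand-in for a boolean Series; this is not pandas code."""

    def __init__(self, values):
        self.values = [bool(v) for v in values]

    def __and__(self, other):
        # `x & y` lands here and can return an element-wise result.
        return BoolVector(a and b for a, b in zip(self.values, other.values))

    def __or__(self, other):
        # `x | y` lands here.
        return BoolVector(a or b for a, b in zip(self.values, other.values))

    def __bool__(self):
        # `x and y` needs one truth value for x, which is ambiguous.
        raise ValueError("truth value is ambiguous; use & or | instead")

    __nonzero__ = __bool__  # Python 2 spelling


left = BoolVector([True, True, False])
right = BoolVector([True, False, False])

both = left & right
either = left | right
print(both.values)    # [True, False, False]
print(either.values)  # [True, True, False]

try:
    left and right
    and_raised = False
except ValueError:
    and_raised = True
print(and_raised)     # True
```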

@jtratner
Contributor

I think that's a good place - caveats and gotchas probably is another good place - http://pandas.pydata.org/pandas-docs/dev/gotchas.html . I'm thinking something like:

Using If/Truth Statements with Pandas
=====================================

Pandas follows the numpy convention of raising an error when you try to convert something to a `bool` (which is what happens in an `if` statement or when using `and` or `or`). It's not clear what the result of

.. code-block:: python

    if Series([False, True, False]):
        ...

should be. Should it be `True` because it's not zero-length? False because there are False values? It's unclear, so instead, pandas raises a ValueError:

.. code-block:: python

    >>> if pd.Series([False, True, False]): print("I was true")
    Traceback
        ...
    ValueError: The truth value of an array with more than one element is ambiguous. Use a.empty, a.any() or a.all().


If you see that, you need to explicitly choose what you want to do with it (e.g., use `any()`, `all()` or `empty`). Often what you actually want is to check whether the object is `None`, in which case compare with `is None` or `is not None`.

Note that comparison operators like `==` and `!=` will also return arrays (which is almost always what you want anyways):

.. code-block:: python

   >>> s = pd.Series(range(5))
   >>> s == 4
   0    False
   1    False
   2    False
   3    False
   4     True
   dtype: bool

And maybe link to docs on any, all, empty?
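
To make the ambiguity concrete, the three questions the error message points at can be asked of a plain Python list. This is only an analogy under the assumption that `len(...) == 0`, `any(...)` and `all(...)` stand in for `.empty`, `.any()` and `.all()`; it uses no pandas:

```python
values = [False, True, False]

is_empty = len(values) == 0  # analogue of `.empty`
any_true = any(values)       # analogue of `.any()`
all_true = all(values)       # analogue of `.all()`

# Three reasonable questions, three different answers.
print(is_empty, any_true, all_true)  # False True False

# When the real question is "was anything passed at all?",
# compare against None instead of relying on truthiness.
maybe_values = None
has_data = maybe_values is not None
print(has_data)  # False
```

Since the answers disagree for the same data, picking one of them silently would hide bugs, which is the argument for raising.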

@jreback
Contributor Author

jreback commented Aug 24, 2013

@jtratner
that was a nice explanation. It went in (in gotchas), with an expanded comparison section (for all/any/empty) in basics and links around.

@@ -531,7 +531,8 @@ def empty(self):
         return not all(len(self._get_axis(a)) > 0 for a in self._AXIS_ORDERS)

     def __nonzero__(self):
-        return not self.empty
+        raise ValueError("The truth value of an array with more than one element is ambiguous. Use a.empty, a.any() or a.all()")
Contributor

might be nice to split this on to multiple lines. your call.

Contributor Author

sure

Contributor Author

In [1]: s = Series(1,index=[1,2])

In [2]:  s and s
ValueError: The truth value of an array with more than one element is ambiguous.
            Use a.empty, a.any() or a.all()

@jreback
Contributor Author

jreback commented Aug 24, 2013

@wesm ?

@hayd
Contributor

hayd commented Aug 24, 2013

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

@jreback
Contributor Author

jreback commented Aug 26, 2013

@wesm this is going back to a more natural, less error-prone API (the one we had in 0.11)

print("I was true")
Traceback
...
ValueError: The truth value of an array. Use a.empty, a.any() or a.all().
Contributor

(missing) is ambiguous

Contributor Author

yep, thanks

BUG: GH4633, revert to GH1073, whereby __nonzero__ always raises for all NDFrame objects
@jreback
Contributor Author

jreback commented Aug 31, 2013

any more comments on this before merging?

@jtratner
Contributor

Looks good.

jreback added a commit that referenced this pull request Aug 31, 2013
API: GH4633, bool(obj) behavior, raise on __nonzero__ always
@jreback jreback merged commit 5148e90 into pandas-dev:master Aug 31, 2013
@hayd
Contributor

hayd commented Sep 10, 2013

Development

Successfully merging this pull request may close these issues.

Now boolean operators work with NDFrames?
4 participants