DOC: Distinguish between different types of boolean indexing #10492 #36869

Merged 9 commits on Oct 8, 2020

junjunjunk (Contributor)

@arw2019 (Member) left a comment

Thanks @junjunjunk for the PR!

Some stylistic comments.

I also think it might be good to place this in a warning box following the discussion of boolean indexing with iloc

@@ -933,6 +933,20 @@ and :ref:`Advanced Indexing <advanced>` you may select along more than one axis

df2.loc[criterion & (df2['b'] == 'x'), 'b':'c']

``iloc`` distinguishes between different types of boolean indexing. If the type of boolean indexing is ``Series``,
Member

iloc supports two kinds of boolean indexing.

Member

If the indexer is a boolean Series, ...

Contributor Author

Done.

@@ -933,6 +933,20 @@ and :ref:`Advanced Indexing <advanced>` you may select along more than one axis

df2.loc[criterion & (df2['b'] == 'x'), 'b':'c']

``iloc`` distinguishes between different types of boolean indexing. If the type of boolean indexing is ``Series``,
an error will be raised. For instance, in the following example, ``df.iloc[sr.values, 1]`` is ok.
Member

what type of error?

Member

I see below it's a ValueError, just say that here

Member

Also you need two backticks around ValueError

Contributor Author

Done.


.. ipython:: python

   df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], list('abc'), ['column_1', 'column_2'])
Member

make the column names 'A', 'B' (to make this look a bit cleaner)

Contributor Author

Done.

@junjunjunk (Contributor Author)

Thanks @arw2019.
This was a very helpful review. I made the changes accordingly.

@rhshadrach added the Docs and Index (Related to the Index class or subclasses) labels on Oct 5, 2020

.. ipython:: python

   df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], list('abc'), ['A', 'B'])
Member

Can you specify index= and columns= here? I think it will be easier to read.

Contributor Author

Thank you for the review! That seems reasonable. Done.
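
For readers following the thread, here is a minimal sketch of what this suggestion amounts to, assuming a pandas version where ``DataFrame`` takes ``data``, ``index`` and ``columns`` in that positional order. The names ``df_positional``, ``df_keyword`` and ``same_frame`` are illustrative only and do not appear in the PR.

```python
import pandas as pd

data = [[1, 2], [3, 4], [5, 6]]

# Positional form, as in the snippet under review.
df_positional = pd.DataFrame(data, list("abc"), ["A", "B"])

# Keyword form suggested in the review: the same frame, but the role of
# each argument is visible at the call site.
df_keyword = pd.DataFrame(data, index=list("abc"), columns=["A", "B"])

# Both calls are expected to build identical frames.
same_frame = df_positional.equals(df_keyword)
print(same_frame)
```

The keyword form changes nothing about the result; it only makes the documentation example easier to read.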

.. ipython:: python

   df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], list('abc'), ['A', 'B'])
   sr = (df['A'] > 2)
Member

Looks like most of this doc uses s for Series. Recommend sticking with that.

Contributor Author

Done.

sr = (df['A'] > 2)
sr

df.loc[sr, df.columns[1]]
Member

I think 'B' instead of df.columns[1] would be better here.

Contributor Author

Done.
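
As a rough illustration of why the literal label reads better here, the sketch below compares the two spellings on the frame from the diff (with the ``index=``/``columns=`` keywords and the ``s`` name requested elsewhere in this review). The names ``by_lookup`` and ``by_label`` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], index=list("abc"), columns=["A", "B"])
s = df["A"] > 2  # boolean Series: a=False, b=True, c=True

# Looking the label up by position, as in the snippet under review.
by_lookup = df.loc[s, df.columns[1]]

# Naming the label directly, as suggested in the review.
by_label = df.loc[s, "B"]

print(by_label)
```

Both expressions select the same rows and the same column, so the change is purely about readability.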

@rhshadrach (Member) left a comment

Looking good, two more comments.

.. warning::

   ``iloc`` supports two kinds of boolean indexing. If the indexer is a boolean ``Series``,
   an error will be raised. For instance, in the following example, ``df.iloc[sr.values, 1]`` is ok.
Member

sr -> s

Contributor Author

Thanks for the quick review! I forgot to fix the doc. Done.


``iloc`` supports two kinds of boolean indexing. If the indexer is a boolean ``Series``,
an error will be raised. For instance, in the following example, ``df.iloc[sr.values, 1]`` is ok.
This type of boolean indexing is array. But ``df.iloc[sr, 1]`` would raise ``ValueError``.
Member

"is array" -> "is using an array". Alternatively, "The boolean indexer is an array."
sr -> s

Contributor Author

Done.
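
The behaviour the warning describes can be sketched as follows. This is a minimal example under stated assumptions: it uses the frame from the diff with a string index, takes the ``ValueError`` from the PR text rather than from a guarantee across pandas versions, and the names ``from_array``, ``from_loc`` and ``series_mask_rejected`` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], index=list("abc"), columns=["A", "B"])
s = df["A"] > 2  # boolean Series: a=False, b=True, c=True

# A plain boolean array is accepted by iloc as a positional mask.
from_array = df.iloc[s.values, 1]

# Per the PR text, passing the boolean Series itself raises ValueError.
try:
    df.iloc[s, 1]
    series_mask_rejected = False
except ValueError:
    series_mask_rejected = True

# loc, by contrast, accepts the boolean Series directly.
from_loc = df.loc[s, "B"]

print(from_array)
print(series_mask_rejected)
```

In short, ``iloc`` is purely positional and wants an array-like mask, while ``loc`` can use the labelled ``Series`` as is.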

@junjunjunk (Contributor Author)

would love to get a hacktoberfest tag if possible. #36837

@jreback jreback merged commit 5782dc0 into pandas-dev:master Oct 8, 2020
@jreback (Contributor) commented Oct 8, 2020

thanks @junjunjunk

@junjunjunk junjunjunk deleted the is10492 branch October 15, 2020 06:09
Labels: Docs, Index (Related to the Index class or subclasses)

Successfully merging this pull request may close these issues.

DOC: Distinguish between different types of boolean indexing
4 participants