Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: Deprecate the convert parameter completely #17831

Merged

Conversation

gfyoung
Copy link
Member

@gfyoung gfyoung commented Oct 10, 2017

Previously, we weren't issuing a warning if the user happened to pass in the original default of True,
which would cause downstream code to break.

Closes #17828

@gfyoung gfyoung added the Deprecate Functionality to remove in pandas label Oct 10, 2017
@gfyoung gfyoung added this to the 0.21.0 milestone Oct 10, 2017
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Left a few comments, in general IMO it is not needed to explain this for the user, for them the default is still True

@@ -2135,13 +2135,18 @@ def _take(self, indices, axis=0, convert=True, is_copy=True):
axis : int, default 0
The axis on which to select elements. "0" means that we are
selecting rows, "1" means that we are selecting columns, etc.
convert : bool, default True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can leave this as True

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will fix.


Whether to convert negative indices into positive ones.
For example, ``-1`` would map to the ``len(axis) - 1``.
The conversions are similar to the behavior of indexing a
regular Python list.

If ``None`` is passed in, negative indices will be converted
to positive indices.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not needed IMO

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will fix.

@@ -856,6 +856,10 @@ def test_take(self):
result = df.take(order, convert=False, axis=0)
assert_frame_equal(result, expected)

with tm.assert_produces_warning(FutureWarning):
result = df.take(order, convert=False, axis=0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be True?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it should. Oops 😄

sp.take([1, 5], convert=False)

def test_numpy_take(self):
sp = SparseSeries([1.0, 2.0, 3.0])
indices = [1, 2]

# gh-17352: older versions of numpy don't properly
# pass in arguments to downstream .take() implementations.
warning = FutureWarning if _np_version_under1p12 else None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this is not the case anymore?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I need to put that back. However, I different default will be needed, as convert=None breaks downstream compatibility with numpy:

https://travis-ci.org/pandas-dev/pandas/jobs/285909832

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the problem with None for numpy? You linked to a sparse failure, but in the sparse code you left the default as None.

If it is needed, another solution might be to push it into the kwargs? (so the -1 is not visible)

BTW, can you push new commits instead of squashing into the previous one? That makes it easier to say what you changed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry, missed your message before I pushed.

The problem is that numpy's take implementation for slightly older versions passes in None for one of its parameters by position. This parameter corresponds to the convert parameter for our version of take. It's messy compatibility stuff that I had to wade through at some point that I patched on both sides.

@gfyoung gfyoung force-pushed the fully-deprecate-convert branch 2 times, most recently from 2f2db4f to c5d0f08 Compare October 10, 2017 08:07
@TomAugspurger TomAugspurger mentioned this pull request Oct 10, 2017
64 tasks
@gfyoung
Copy link
Member Author

gfyoung commented Oct 10, 2017

The new kwarg gets past the original failure but need a more robust comparator.

nv.validate_take(tuple(), kwargs)

if not convert:
def take(self, indices, axis=0, convert=-1, is_copy=True, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why can’t u just use None

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback : breaks compatibility with older numpy versions

Copy link
Contributor

@jreback jreback Oct 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure how, you are stripping the kwarg before passing it on

convert = nv.validate_take_with_convert(convert, args, kwargs)

if not convert:
def take(self, indices, axis=0, convert=-1, *args, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same this is very non intuitive to use a placeholder value that is not valid

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, none is not a valid parameter for a boolean input either, and the compatibility issue still stands

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure it is, you need to set the default before you check the inputs. That's the point.

if convert is not None:
     warnings.warn(.....)
      validate(convert)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback : I do set the default beforehand, and everything is green now.

@codecov
Copy link

codecov bot commented Oct 11, 2017

Codecov Report

Merging #17831 into master will increase coverage by <.01%.
The diff coverage is 88.88%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #17831      +/-   ##
==========================================
+ Coverage   91.22%   91.22%   +<.01%     
==========================================
  Files         163      163              
  Lines       50014    50016       +2     
==========================================
+ Hits        45627    45629       +2     
  Misses       4387     4387
Flag Coverage Δ
#multiple 89.03% <88.88%> (+0.01%) ⬆️
#single 40.24% <55.55%> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/core/generic.py 92.09% <100%> (ø) ⬆️
pandas/core/sparse/series.py 95.28% <80%> (+0.01%) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/indexing.py 92.82% <0%> (-0.19%) ⬇️
pandas/core/frame.py 97.77% <0%> (-0.1%) ⬇️
pandas/core/indexes/datetimes.py 95.48% <0%> (-0.1%) ⬇️
pandas/plotting/_converter.py 65.2% <0%> (+1.81%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 727ea20...762fd6e. Read the comment docs.

@codecov
Copy link

codecov bot commented Oct 11, 2017

Codecov Report

Merging #17831 into master will decrease coverage by 0.04%.
The diff coverage is 80%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #17831      +/-   ##
==========================================
- Coverage   91.25%    91.2%   -0.05%     
==========================================
  Files         163      163              
  Lines       50038    50040       +2     
==========================================
- Hits        45661    45638      -23     
- Misses       4377     4402      +25
Flag Coverage Δ
#multiple 89.01% <80%> (-0.04%) ⬇️
#single 40.26% <20%> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/core/generic.py 92.2% <ø> (ø) ⬆️
pandas/core/sparse/series.py 95.28% <80%> (+0.01%) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/plotting/_converter.py 63.38% <0%> (-1.82%) ⬇️
pandas/compat/numpy/function.py 92.12% <0%> (-1.22%) ⬇️
pandas/core/indexing.py 92.82% <0%> (-0.19%) ⬇️
pandas/core/frame.py 97.75% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7e159ae...8f946b7. Read the comment docs.

@gfyoung
Copy link
Member Author

gfyoung commented Oct 11, 2017

Everything is passing. PTAL.

@bkmrkr
Copy link

bkmrkr commented Oct 11, 2017 via email

@jorisvandenbossche
Copy link
Member

Can I please unsubscribe

@bkmrkr If you scroll above, there is a button 'Unwatch' (or watch), if you click on the dropdown, you can choose which issue to follow.

@TomAugspurger
Copy link
Contributor

Thoughts @jreback? This is blocking the RC now.

@gfyoung could you explain in a bit more detail why making convert=None the default breaks compat w/ older NumPy? I would think that something like

def take(self, indices, axis=0, convert=None, is_copy=True, **kwargs):
    if convert is None:
        convert = True
    else:
        warnings.warn(...)

would work fine. Not a huge deal, just a little strange to use -1 instead of None I think.

@gfyoung
Copy link
Member Author

gfyoung commented Oct 11, 2017

@TomAugspurger : I've tried various logic (including that one), and older versions of numpy don't react well when you try to call np.take. The reason is that take=None is getting confused with older numpy's positional argument function calls internally. Trying to accommodate numpy results in me breaking compatibility with perfectly reasonable function calls to .take() in pandas.

If we were to drop support for numpy < 1.12, this wouldn't be an issue 😄

@jreback
Copy link
Contributor

jreback commented Oct 11, 2017

@gfyoung I think let's keep the signature with convert=None. If we break np.take(...) of a pandas expression, too bad. just fix the test (and you can put a note). putting in a -1 is really odd here.

@jreback
Copy link
Contributor

jreback commented Oct 11, 2017

If we were to drop support for numpy < 1.12, this wouldn't be an issue

not goning to happen :> (actually you can leave the test for >= 1.12 for this as it will work with convert=None

@gfyoung
Copy link
Member Author

gfyoung commented Oct 11, 2017

I think let's keep the signature with convert=None. If we break np.take(...) of a pandas expression, too bad.

Okay, will do.

not goning to happen :>

Oh, that's too bad. I thought there was a good chance we would do that 😄

@gfyoung gfyoung force-pushed the fully-deprecate-convert branch 3 times, most recently from 5ac5e38 to 29b6b64 Compare October 12, 2017 07:32
Previously, we weren't issuing a warning if the user
happened to pass in the original default of "True",
which would cause downstream code to break.

Closes pandas-devgh-17828.
@gfyoung
Copy link
Member Author

gfyoung commented Oct 12, 2017

Finally, all is green again. PTAL.

@@ -2172,6 +2172,7 @@ def _take(self, indices, axis=0, convert=True, is_copy=True):
selecting rows, "1" means that we are selecting columns, etc.
convert : bool, default True
.. deprecated:: 0.21.0
In the future, negative indices will always be converted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover comment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TomAugspurger prob just as easy to fix this on merge

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can fix that in #17858

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although, why do you say that's leftover? Wasn't that added deliberately @gfyoung?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually nvm thought was a different comment

@jreback
Copy link
Contributor

jreback commented Oct 12, 2017

lgtm module a small doc comment..

@TomAugspurger TomAugspurger merged commit 92db5c9 into pandas-dev:master Oct 12, 2017
@gfyoung gfyoung deleted the fully-deprecate-convert branch October 13, 2017 01:29
ghost pushed a commit to reef-technologies/pandas that referenced this pull request Oct 16, 2017
Previously, we weren't issuing a warning if the user
happened to pass in the original default of "True",
which would cause downstream code to break.

Closes pandas-devgh-17828.
ghost pushed a commit to reef-technologies/pandas that referenced this pull request Oct 16, 2017
* upstream/master: (76 commits)
  CategoricalDtype construction: actually use fastpath (pandas-dev#17891)
  DEPR: Deprecate tupleize_cols in to_csv (pandas-dev#17877)
  BUG: Fix wrong column selection in drop_duplicates when duplicate column names (pandas-dev#17879)
  DOC: Adding examples to update docstring (pandas-dev#16812) (pandas-dev#17859)
  TST: Skip if no openpyxl in test_excel (pandas-dev#17883)
  TST: Catch read_html slow test warning (pandas-dev#17874)
  flake8 cleanup (pandas-dev#17873)
  TST: remove moar warnings (pandas-dev#17872)
  ENH: tolerance now takes list-like argument for reindex and get_indexer. (pandas-dev#17367)
  ERR: Raise ValueError when week is passed in to_datetime format witho… (pandas-dev#17819)
  TST: remove some deprecation warnings (pandas-dev#17870)
  Refactor index-as-string groupby tests and fix spurious warning (Bug 17383) (pandas-dev#17843)
  BUG: merging with a boolean/int categorical column (pandas-dev#17841)
  DEPR: Deprecate read_csv arguments fully (pandas-dev#17865)
  BUG: to_json - prevent various segfault conditions (GH14256) (pandas-dev#17857)
  CLN: Use pandas.core.common for None checks (pandas-dev#17816)
  BUG: set tz on DTI from fixed format HDFStore (pandas-dev#17844)
  RLS: v0.21.0rc1
  Whatsnew cleanup (pandas-dev#17858)
  DEPR: Deprecate the convert parameter completely (pandas-dev#17831)
  ...
alanbato pushed a commit to alanbato/pandas that referenced this pull request Nov 10, 2017
Previously, we weren't issuing a warning if the user
happened to pass in the original default of "True",
which would cause downstream code to break.

Closes pandas-devgh-17828.
No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017
Previously, we weren't issuing a warning if the user
happened to pass in the original default of "True",
which would cause downstream code to break.

Closes pandas-devgh-17828.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants