Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate non-keyword arguments for methods with inplace #41485

Closed
30 of 31 tasks
MarcoGorelli opened this issue May 15, 2021 · 35 comments
Closed
30 of 31 tasks

Deprecate non-keyword arguments for methods with inplace #41485

MarcoGorelli opened this issue May 15, 2021 · 35 comments
Labels
Deprecate Functionality to remove in pandas good first issue
Milestone

Comments

@MarcoGorelli
Copy link
Member

MarcoGorelli commented May 15, 2021

Background

Having many parameters with defaults makes static typing a lot harder when one of them changes the return type, as they require many* overloads.

There's still no broad agreement about whether or not to deprecate the inplace argument, so in the meanwhile, if we force default-arguments to be keyword-only, then we can at least set the correct return types for these methods without requiring all these overloads (as many as 38 in one case!).

What to do

  1. Choose a method from the following list:
  • pandas/core/computation/eval.py:164:0, eval (skip for now)
  • pandas/core/frame.py:4718:4, set_axis
  • pandas/core/frame.py:4743:4, drop
  • pandas/core/frame.py:5134:4, fillna
  • pandas/core/frame.py:5306:4, set_index
  • pandas/core/frame.py:5572:4, reset_index
  • pandas/core/frame.py:5809:4, dropna
  • pandas/core/frame.py:5958:4, drop_duplicates
  • pandas/core/frame.py:6196:4, sort_values
  • pandas/core/frame.py:6267:4, sort_index
  • pandas/core/generic.py:6432:4, ffill
  • pandas/core/generic.py:6495:4, bfill
  • pandas/core/generic.py:6700:4, interpolate
  • pandas/core/generic.py:7473:4, clip
  • pandas/core/generic.py:9080:4, where
  • pandas/core/generic.py:9234:4, mask
  • pandas/core/indexes/base.py:1530:4, set_names
  • pandas/core/indexes/multi.py:810:4, set_levels
  • pandas/core/indexes/multi.py:997:4, set_codes
  • pandas/core/resample.py:831:4, interpolate
  • pandas/core/series.py:1279:4, reset_index
  • pandas/core/series.py:2028:4, drop_duplicates
  • pandas/core/series.py:3228:4, sort_values
  • pandas/core/series.py:3438:4, sort_index
  • pandas/core/series.py:4475:4, set_axis
  • pandas/core/series.py:4490:4, drop
  • pandas/core/series.py:4713:4, fillna
  • pandas/core/series.py:5064:4, dropna
  • read_csv
  • read_table
  • concat
  1. Use pandas.util._decorators.deprecate_nonkeyword_arguments to issue a FutureWarning that the method's default arguments will be keyword-only in the next release.

  2. Fixup any tests / internal code which uses such default arguments positionally. See this gist for a helper-script which can help identify cases which need fixing, just replace "drop" with whatever method you're working on and then run it as python visitor.py.

  3. Add a new test, checking that a FutureWarning is raised if this method's default arguments are used positionally.

  4. Add a note to doc/source/whatsnew/v1.3.0.rst

See #41511 for an example of how to do this.

In most of these methods, the first argument is probably fine to be kept as is, it's the ones after that (such as axis) which should become keyword-only.


*(2^{number of default types preceding the one to overload} +2)


It's really common to call replace with both value and to_replace as positional, so I think we can make an exception for that one. It'll only require 6 overloads, which I'd argue isn't too many, especially if no other method requires more than 3 or 4


TODO: unify the whatsnew entries

@MarcoGorelli MarcoGorelli added Deprecate Functionality to remove in pandas good first issue labels May 15, 2021
@MarcoGorelli MarcoGorelli changed the title WIP Deprecate non-keyword arguments for methods with inplace Deprecate non-keyword arguments for methods with inplace May 15, 2021
@jmholzer
Copy link
Contributor

I would like to work on this issue, it would be my first contribution to Pandas.

It could take me a week or two. If that is unacceptable, feel free to assign this to someone else.

@jmholzer

This comment has been minimized.

@MarcoGorelli
Copy link
Member Author

MarcoGorelli commented May 15, 2021

Awesome! @jmholzer please start by choosing one of the methods listed above, and comment here which one you've chosen

No need to comment 'take' here, many people will be working on this one together

@jmholzer
Copy link
Contributor

Sorry, didn't see the 'what to do' edit.

eval, as it's first ;)

@jmholzer
Copy link
Contributor

Adding the pandas.util._decorators.deprecate_nonkeyword_arguments decorator to eval seemed to cause it to break (could not run example in docs, tested before and after adding decorator, ensuring decorator was only change). I can open an issue if anyone else would like to confirm.

To sanity check my method, I went through the process above for drop_duplicates (in frame and series) without any problems, I have opened a pull request (#41500) for these changes.

@MarcoGorelli
Copy link
Member Author

MarcoGorelli commented May 16, 2021

Adding the pandas.util._decorators.deprecate_nonkeyword_arguments decorator to eval seemed to cause it to break

Thanks for looking into this - I just tried it, and confirm that it breaks for me too.

Let's skip eval for now then, the others seems to be working fine

@Anupam-USP
Copy link
Contributor

Is someone already in this or can I try?

@MarcoGorelli
Copy link
Member Author

Is someone already in this or can I try?

Hey @Anupam-USP - feel free to take any of the non-ticked methods

@Jiezheng2018
Copy link
Contributor

@MarcoGorelli I am not confident that I did the pull request correctly. I only completed the first two instructions, and I didn't use the gist you created.

@jackzyliu
Copy link
Contributor

jackzyliu commented May 17, 2021

Hi @MarcoGorelli I'd like to work on the two .fillna functions (series and frame) if they are still open.

@MarcoGorelli
Copy link
Member Author

Ok, I will take interpolate then. The last one!

Hi @Anupam-USP - are you still working on resample.interpolate?

@Anupam-USP
Copy link
Contributor

@Anupam-USP Thanks!

This wasn't part of the original issue, but read_csv also has keyword argument which changes the return type, if you wanted to make a PR for that one too that would be welcome!

No @MarcoGorelli, you suggested me to do read_csv. So should I go for it now?

@MarcoGorelli
Copy link
Member Author

I wrote

if you wanted to make a PR for that one too

so if you wanted to do this one too, that would be welcome

@Anupam-USP
Copy link
Contributor

Okay! Surely I will try it

@tegardp
Copy link
Contributor

tegardp commented May 29, 2021

may I try working on the read_table?

@MarcoGorelli
Copy link
Member Author

may I try working on the read_table?

go ahead, thanks! see what's been done for read_csv as an example

@simonjayhawkins
Copy link
Member

@MarcoGorelli can you confirm just #41699 and #41718 outstanding.

@MarcoGorelli
Copy link
Member Author

@simonjayhawkins that's right (and the clean of up these whatsnew entries, but I can do that later today)

@Anupam-USP
Copy link
Contributor

Only eval is left. Should I take it?

@MarcoGorelli
Copy link
Member Author

Only eval is left. Should I take it?

if you can figure out how to do it, then sure - it's a bit tricker though, I'm not sure myself why it was failing when I tried it earlier

@simonjayhawkins
Copy link
Member

@MarcoGorelli Thanks for coordinating this effort. i'll close this, maybe open an issue for pandas/core/computation/eval.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas good first issue
Projects
None yet
Development

No branches or pull requests

10 participants