API: make inplace= arguments take their default value from a global option #17968

Closed
rredburn opened this issue Oct 24, 2017 · 3 comments

@rredburn

Code Sample, a copy-pastable example if possible

I propose something like this:

pd.set_option('mode.inplace_operations_default', True)

Proposal description

My team routinely works with DataFrames of 50 GB and up, where forgetting inplace=True on mutating operations quickly becomes detrimental to shared resources. We would like to be able to default all operations with an inplace argument to True and be forced to explicitly allow copies when needed. The proposal is to have a global option determine the default inplace value, rather than hard-coding False in each function/method. It seems this should be transparent to the wider pandas community, which expects inplace=False semantics (still the default), while giving flexibility to "advanced" users. From what we can tell in the docs, all inplace arguments currently default to False, so there are no divergent cases that would prevent a single global default. Even if there were (perhaps in the future), function authors could override the global default with their own opposing local default (not great for consistency, though).

Expected Output

If set to True, all functions/methods that can operate in place will do so unless explicitly overridden with an inplace=False argument.

@toobaz
Member

toobaz commented Oct 28, 2017

The idea, including the possibility of overriding (although this would require changing signatures), is intriguing.
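To illustrate why overriding would require changing signatures: a method can only tell "the caller passed inplace=False" apart from "the caller passed nothing" if its default is a sentinel rather than False. The following is a minimal sketch of that pattern in plain Python. Everything in it is hypothetical (the `_OPTIONS` registry, `set_option`, `resolve_inplace`, and the toy `Frame` class); none of it is pandas API, and the option name is only the one suggested in the proposal.

```python
# Hypothetical sketch: none of these names exist in pandas.
_OPTIONS = {"mode.inplace_operations_default": False}  # option key taken from the proposal

_UNSET = object()  # sentinel meaning "the caller did not pass inplace="


def set_option(key, value):
    """Set a known global option; reject unknown keys."""
    if key not in _OPTIONS:
        raise KeyError(key)
    _OPTIONS[key] = value


def resolve_inplace(inplace=_UNSET):
    """Return the explicit argument if one was given, else the global default."""
    if inplace is _UNSET:
        return _OPTIONS["mode.inplace_operations_default"]
    return bool(inplace)


class Frame:
    """Toy stand-in for a DataFrame that just holds a list of values."""

    def __init__(self, values):
        self.values = list(values)

    def sort(self, inplace=_UNSET):
        # The signature default is the sentinel, not False, so an explicit
        # inplace=False can still override a global default of True.
        if resolve_inplace(inplace):
            self.values.sort()
            return None
        return Frame(sorted(self.values))
```

With the global option left at False, `Frame([3, 1, 2]).sort()` returns a new object; after `set_option("mode.inplace_operations_default", True)` the same call mutates the receiver, while `sort(inplace=False)` still returns a copy.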

Notice however that the meaning of inplace= does not strictly map to memory usage, but rather to programming style. Take for instance reset_index(): even with inplace=True it can copy data in memory - but the object will be the same.
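A small sketch of the "same object" point, using only object identity (the column and index values are made up for illustration). It shows what inplace=True guarantees at the Python level; it says nothing about whether data was copied internally, which is exactly the distinction being made here.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]}, index=["x", "y", "z"])
same_name = df  # a second reference to the same object

df.reset_index(inplace=True)
# The name still refers to the same Python object...
print(df is same_name)           # True
# ...so the change is visible through every reference to it.
print(list(same_name.columns))   # ['index', 'a']

# Without inplace=True a new object is returned and the original is untouched.
df2 = pd.DataFrame({"a": [1, 2, 3]}, index=["x", "y", "z"])
result = df2.reset_index()
print(result is df2)             # False
print(list(df2.columns))         # ['a']
```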

When available (that is, when it makes sense and it is not a nightmare to implement - see e.g. reindex()), copy= corresponds more to what you have in mind.

@jreback
Contributor

jreback commented Oct 28, 2017

So inplace= doesn't actually do anything except copy internally, perform the operation, and then replace the original object reference. There is NO performance or memory gain at all (except maybe in one instance). So this is a completely useless option anyhow; we should deprecate it: see #16529

so -1 on this idea.

@rredburn
Author

I see. Definitely misleading then, so +1 on deprecation. I imagine real mutate-in-place operations would be a major effort, so I won't bother suggesting them in a new issue.
