My team routinely works with DataFrames of 50GB and up, where forgetting inplace=True on mutating operations quickly becomes detrimental to shared resources. We would like to be able to default all operations that take an inplace argument to True and be forced to explicitly allow copies when needed. The proposal is to use a global option to determine the default inplace value, rather than hard-coding False in each function/method. This should be transparent to the overall Pandas community, which expects inplace=False semantics (still the default), while giving flexibility to "advanced" users. From what we can tell in the docs, all inplace arguments currently default to False, so there are no divergent cases that would prevent a single global default. Even if there were (perhaps in the future), function authors could override the global default with their own local default (not great for consistency, though).
Expected Output
If set to True, all functions/methods that can operate in place will do so unless explicitly overridden with an inplace=False argument.
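A minimal sketch of the requested behaviour, written in plain Python rather than against pandas itself. Every name here (`_UNSET`, `_options`, `set_default_inplace`, `resolve_inplace`, `Table`, `drop_where`) is hypothetical and exists only for illustration; pandas has no such option. The idea is that a method's `inplace` parameter defaults to a sentinel, which is resolved against a global setting only when the caller did not pass a value.

```python
# Hypothetical sketch: none of these names exist in pandas.
_UNSET = object()  # sentinel meaning "caller did not pass inplace"
_options = {"default_inplace": False}  # global default, False as today


def set_default_inplace(value):
    """Set the process-wide default used when inplace is not passed."""
    _options["default_inplace"] = bool(value)


def resolve_inplace(inplace=_UNSET):
    """Return the explicit argument if given, else the global default."""
    if inplace is _UNSET:
        return _options["default_inplace"]
    return bool(inplace)


class Table:
    """Toy stand-in for a DataFrame: just a list of row dicts."""

    def __init__(self, rows):
        self.rows = list(rows)

    def drop_where(self, predicate, inplace=_UNSET):
        """Drop rows matching predicate, in place or as a new Table."""
        kept = [row for row in self.rows if not predicate(row)]
        if resolve_inplace(inplace):
            self.rows = kept  # mutate this object, return None
            return None
        return Table(kept)  # leave this object alone, return a new one


t = Table([{"x": 1}, {"x": 2}, {"x": 3}])
set_default_inplace(True)

# No inplace argument: the global default (True) applies.
out = t.drop_where(lambda r: r["x"] == 2)

# An explicit inplace=False still wins over the global default.
copy = t.drop_where(lambda r: r["x"] == 1, inplace=False)
```

Note that this approach replaces the literal `inplace=False` default in each signature with a sentinel, so existing method signatures would have to change.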
The idea, including the possibility of overriding (although this would require changing signatures), is intriguing.
Notice however that the meaning of inplace= does not strictly map to memory usage, but rather to programming style. Take for instance reset_index(): even with inplace=True it can copy data in memory - but the object will be the same.
When available (that is, when it makes sense and it is not a nightmare to implement - see e.g. reindex()), copy= corresponds more to what you have in mind.
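To illustrate the distinction between object identity and memory usage, here is a small sketch using `reset_index()` and `set_index()`. It only shows which Python object ends up holding the result; it says nothing about whether pandas copied data internally along the way. The column names and values are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "b", "c"], "value": [1, 2, 3]}).set_index("key")
alias = df  # a second name for the same DataFrame object

# With inplace=True the call returns None and the existing object is updated,
# so every reference to it (including `alias`) sees the change.
result = df.reset_index(inplace=True)

# With the default inplace=False the original object is left untouched
# and a new DataFrame is returned.
copied = alias.set_index("key")

returned_none = result is None
alias_columns = list(alias.columns)
copied_is_new_object = copied is not df
copied_columns = list(copied.columns)
```

Here `alias` still refers to the same object as `df` after the in-place call, which is the "programming style" meaning of `inplace=`; any memory saving would have to be measured separately.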
So inplace= doesn't actually do anything except copy internally, perform the operation, and replace the original object reference. There is NO performance or memory gain at all (except maybe in one instance). So this is a completely useless option anyhow and should be deprecated: see #16529
I see. Definitely misleading then, so +1 on deprecation. I imagine real mutate-in-place operations would be a major effort, so I won't bother suggesting that in a new issue.
Code Sample, a copy-pastable example if possible
propose something like this: