-
-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Keep positional arguments for pivot #51359
Comments
Hi, thanks for your report. Static typing wasn't the intention here (not completely anyway). index was the second argument and had a default, columns was the third and also had a default, but could never be not specified because the value is non-optional. Making them keyword only allows us to swap them and make columns non-optional |
I think it makes sense that the order is index, columns, value, especially if you are going to specify all of them. And that is exactly the time when you don't want to have to specify them as kwargs. Swapping columns to be first just because it is required seems like it's missing the point of pivot. How often do people use pivot without specifying index? AFAIK, everywhere else in pandas index is specified before columns, or at least that's the most common pattern, so I don't see why their order should be swapped here (or that their order should be taken away). Yes, it is annoying that when you only need to specify columns that isn't the first argument, but changing to kwarg-only doesn't help here, it just gives the user an error, which is effectively the same thing that happens right now if you tried to run pivot with only one argument right now: the user gets an error. I do see how there is value in updating the signature such that it is clear what arguments are optional and which are not, but I don't think it more valuable than having a semantically clear order. Truth be told, I didn't even know it was possible to specify columns without an index. Whenever I've needed to use the existing index in a pivot, I've always dropped the index and then pivoted on index/columns (I'm not actually sure if I've ever needed to do that, but that's how I would have done it). Really, in terms of pivot logic, using the existing index is just a convenience. You always need to specify what the new index/columns will be --- it just happens there is a natural default for index whereas there isn't for columns. Instead of just discarding the ability to specify a pivot positionally, perhaps it just makes sense to update the error message when the user only specifies one positional argument? Currently TypeError: pivot() missing 1 required argument: 'columns' Which personally I think is good enough, but maybe something like this would alleviate any confusion this change is trying to resolve without removing useful (and used) functionality? def pivot(
data: DataFrame,
index: IndexLabel | None = None,
columns: IndexLabel | None = None,
values: IndexLabel | None = None,
) -> DataFrame:
if columns is None:
if index is None and values is None:
raise TypeError("pivot() given no arguments; missing 1 required argument: 'columns'")
elif index is None:
raise TypeError("pivot() given 'index'; missing 1 required argument: 'columns'")
elif values is None:
raise TypeError("pivot() given 'values'; missing 1 required argument: 'columns'") |
Keeping all this aside:
Using pivot like this requires everyone that tries to understand your code to know exactly in which order the arguments are specified. There is no way to know that the first argument is the index without looking up the function signature. Not specifying the argument-names here can be very confusing real quick. |
Sort of... That's true for all positional arguments. That justifies a style guide, not deprecation. AFAIK in pandas, the order is always index, columns in signatures that contain both. It's axis 0 vs axis 1. I'd wager a fair number of people could intuit which one is which (assuming they know what a pivot table is). As a real world example of something that is emitting the warning, I have: piv = counts.pivot(['pkg_version'], ['abi_tag', 'os', 'arch'], 'count') I'd prefer in this case to keep the line length shorter and not have to be forced into making a choice between spreading this across vertical space piv = counts.pivot(
index=['pkg_version'],
columns=['abi_tag', 'os', 'arch'],
values='count') or having an extra long horizontal space. piv = counts.pivot(index=['pkg_version'], columns=['abi_tag', 'os', 'arch'], values='count') I haven't seen any arguments claiming that positional arguments are blocking important changes or increasing maintenance burden. From what I understand the motivation is stylistic, and I don't think that justifies deprecating positional arguments of pivot. |
agree, it'd be just as unreadable as let's close then, it sounds like the motivation for keeping it is also stylistic, so let's favour readibility (future readers of any code you write using |
Strong disagree. Index, columns, values have a natural order in the context of a row-major arrays. The choice of ascending versus decending is different (and tangentally if the call looked like I'll accept that nobody else seems to be raising the issue, and if I'm the only one who would see value in this, then it's not worth it. However, I will end the thread by 1. noting that the modivation for removing is also stylistic and 2. voicing my opinion that I find this moderately annoying and I think this is a mistake. |
I want to voice that my intuition when using this method has always been to provide index, followed by column(s), followed by value(s) (the singular vs plural alone drives me crazy when trying to specify the keywords). Only recently have I started to omit values for convenience. As for the example of Finally, the documentation provides the ideal order as index, columns, values. I believe this would be faster to learn by users. Example: |
…labeling Without this fix the run_30_Placement.sh will fail. Related to pandas-dev/pandas#51359
For me this change away from positional arguments makes pandas a lot more cumbersome to use. I have scripts with ~40 calls to And having to add these keywords into every new call of pivot just adds a lot of unwanted friction. |
Feature Type
Problem Description
I'd like to request the
pivot
method keep its positional arguments. Currently when you use pivot it warns:I posted a question to the mailing list asking about the rational of deprecating the positional arguments in most methods. I received a reply pointing me to this issue: #41485 that seems to suggest the motivation for this was due to static analysis.
I think this is a good change for most methods, but in the case of
pivot
, you generally specify all 3, and forcing the user to specify them as keyword arguments is a bit cumbersome. At least I find it to be cumbersome.Because there are only 3 arguments to begin with in this case, I don't think removing flexibility provides enough additional simplification to justify forcing keyword only arguments here.
Feature Description
Simply remove the FutureWarning from
pivot
and keep the code as-is.Alternative Solutions
The alternative solution is to ignore me and remove them anyway. I'd be a little sad, but I'd eventually get over it.
Additional Context
No response
The text was updated successfully, but these errors were encountered: