allow function in allowduplicates in unstack #2998
Yeah I have to admit the name is a bit weird for passing a function. So you'd call the argument |
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
It is hard to say what is best. Let us first decide if we want the API the way I proposed (i.e. |
I have pushed the branch using
Probably things can be further optimized but I think it is already OK. The only decision is about the name of the argument. Do we deprecate |
This approach is fast when the number of duplicates is large, but it's hard to know whether that's the case in general. Sometimes you might have only a few duplicates. I guess there's no way to be efficient all the time, except by allowing users to choose the algorithm. Maybe not a big deal.
Yeah that's probably better. |
I would lean towards names that reference |
Most of the time it will be reducer, but if you e.g. pass |
That sounds similar to |
Yes - internally we call For now I have proposed to call the keyword argument |
how about |
The issue is that operation does not have to be aggregation. We allow any operation. But maybe indeed something like |
So you propose to use |
I would also put in competition |
I'm not sure, I was just thinking out loud. I'm trying to find a similar case in the existing API, but it turns out most of the time we don't use keyword arguments for functions. Maybe just |
This is also what I have checked. And for positional arguments If we feel |
I think |
I am ok with |
bump (as otherwise we will forget what we discussed). The question is if we accept the Thank you! |
I was going to say that combiner is OK but then I read the docstring again and I noticed we speak a lot about "combinations" when describing this argument (and others), and yet these "combinations" have nothing to do with the "combiner" (i.e. it doesn't combine values from different combinations). So maybe we should find another term to avoid the confusion. Maybe |
If no other comment is made on the best choice in a few days, I will switch the implementation to use |
Currently the argument for values is called |
@nalimilan - I think we need to close the discussion and make a decision (naming is always super hard unfortunately). I think |
The plural sounds indeed better given that the function will get passed all values for a given combination. Regarding the positional argument, it matters less, but note that we also use the singular for |
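To make the point about "all values for a given combination" concrete, here is a minimal sketch in plain Julia. It does not call DataFrames.jl at all: `unstack_sketch` is a hypothetical helper written only for illustration, and it returns a `Dict` rather than a reshaped table. It assumes the semantics discussed in this thread, namely that the user-supplied function receives the whole vector of values sharing one (row key, column key) combination.

```julia
# Hypothetical helper, not part of DataFrames.jl: collect the values of each
# (row key, column key) combination, then pass the whole vector to `f`.
function unstack_sketch(f, rowkeys, colkeys, vals)
    K = Tuple{eltype(rowkeys), eltype(colkeys)}
    groups = Dict{K, Vector{eltype(vals)}}()
    for (r, c, v) in zip(rowkeys, colkeys, vals)
        # get! creates an empty vector the first time a combination is seen
        push!(get!(groups, (r, c), eltype(vals)[]), v)
    end
    return Dict(k => f(v) for (k, v) in groups)
end

rows = [1, 1, 1, 2]
cols = ["a", "a", "b", "a"]
vals = [10, 20, 30, 40]

summed  = unstack_sketch(sum, rows, cols, vals)       # a reducer
lastone = unstack_sketch(last, rows, cols, vals)      # keeps the last duplicate
kept    = unstack_sketch(identity, rows, cols, vals)  # not a reducer at all

summed[(1, "a")]   # 30
lastone[(1, "a")]  # 20
kept[(1, "a")]     # [10, 20]
```

The `identity` case shows why a name implying aggregation may be too narrow: the function is free to return the full vector of duplicates unchanged.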
I am aware of |
The PR is updated. |
Thank you! |
That is true but those uses of "combinations" are all informal and not part of the API. I think it is more important to keep the formal usage of |
@adkabo - but what is your proposal for a name of this keyword argument then? Do you propose CC @nalimilan |
IIUC it is exactly a |
Well |
Yes - I think |
I think any of these proposals is better than |
I opened #3184 to keep track of it. |
Follow up to #2995
Replaces #1181
What I would like to discuss is whether allowduplicates is a good name for this keyword argument now. Maybe we should introduce a new keyword argument (a single one) and deprecate allowduplicates (in a long-term deprecation fashion, i.e. we do not need to remove it any time soon).