RFC: Added find_extrema, ind_extrema #7327

kmsquire · 2014-06-19T22:17:07Z

Just adding parity:

findmin/indmin <=> minimum
findmax/indmax <=> maximum
findextrema/indextrema <=> extrema

~~It's named find_extrema because findextrema (and particularly indextrema) seemed less readable to me.~~ (Names no longer contain underscores.)

Just checking first if this is wanted. If so, I'll add docs. If not, please close.

Edit: large bikeshedding spreadsheet here

ViralBShah · 2014-06-20T01:45:53Z

Should we avoid the underscores in the name?

kmsquire · 2014-06-20T04:13:39Z

I commented on this above. findextrema is probably fine, but I parse indextrema as index-trema, and it seemed strange to have one with an underscore and one without.

But I can remove the underscores altogether if that's preferred.

ViralBShah · 2014-06-20T06:27:41Z

I somehow missed the comment. Sorry about that. I think it is ok without the underscore. @StefanKarpinski ?

kmsquire · 2014-06-20T08:42:58Z

I removed the underscores and fixed the implementation to properly ignore NaNs (like findmin and findmax).

kmsquire · 2014-06-20T08:43:34Z

Still needs docs and tests. Shall I go ahead?

ViralBShah · 2014-06-20T09:01:31Z

lgtm

tknopp · 2014-06-20T10:29:59Z

Is the ind abbreviation really necessary? Having indextrema and indexin in base does not look so consistent.

kmsquire · 2014-06-20T17:34:53Z

So, the current naming scheme is something like this:

| value | (value, index) | index |
| --- | --- | --- |
| minimum | findmin | indmin |
| maximum | findmax | indmax |
| extrema | findextrema | indextrema |

The indexin/indextrema issue is why I originally added underscores.

JeffBezanson · 2014-06-20T17:36:24Z

The names are inconsistent. extrema should clearly be called extremamum.

StefanKarpinski · 2014-06-20T17:44:25Z

I think the indmin and indmax names are lousy in the first place. These should be generalized and renamed to argmin and argmax. Also, extrema(x) computes minimum(x), maximum(x) while minmax(x) computes min(x), max(x) so there's a bit of inconsistency here already.

JeffBezanson · 2014-06-20T17:50:08Z

minmax only accepts 2 arguments, so I think we're ok there. There is a full separation of comparison min/max and reduction min/max.

I'm also in favor of argmin and argmax. Those are totally standard names for those functions in mathematics.
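
The split between the two-argument comparison and the reduction can be checked at the REPL. This is a minimal sketch, assuming a current Julia 1.x release rather than the 0.3-era Base under discussion:

```julia
# Comparison form: takes exactly two values and returns them in order.
pair = minmax(3, 1)          # (1, 3)

# Reduction form: takes one collection and reduces over its elements.
xs = [3, 1, 2]
ext = extrema(xs)            # (1, 3)

# The reduction agrees with calling minimum and maximum separately.
same = ext == (minimum(xs), maximum(xs))
```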

quinnj · 2014-06-20T17:52:23Z

Numpy uses argmin and argmax as well.


johnmyleswhite · 2014-06-20T17:53:12Z

I'm ok with argmin/argmax, but those standard names don't mean what indmin,indmax mean. They return the element maximizing a function, not its index. They're effectively only applied to sets in mathematics.

kmsquire · 2014-06-20T17:58:09Z

@johnmyleswhite, if you twist your thinking a little, array f is a function mapping an index to a value, so argmin/argmax on f would give you the argument maximizing the "function".
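
The "array as a function of its index" reading can be made concrete. This sketch assumes Julia 1.7 or later, where `argmin` also accepts a function and a domain; that method post-dates this thread, and the names `f`, `g`, `i`, `x`, and `j` are illustrative only:

```julia
f = [4.0, 1.5, 3.0]

# Array reading: the "argument" that minimizes f is an index.
i = argmin(f)

# Function reading: the argument is an element of the domain.
g(x) = (x - 2)^2
x = argmin(g, [0, 1, 2, 3])

# Treating the array as a function over its indices ties the two together.
j = argmin(k -> f[k], eachindex(f))
```

Under this reading `i` and `j` coincide, which is the contortion referred to above.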

tknopp · 2014-06-20T17:59:53Z

I have to say that I had a hard time finding indmax when searching for argmax, which I was using in Python.

johnmyleswhite · 2014-06-20T18:00:29Z

Yeah, I understand the contortion that gets used here. And I'm ok with it. Just noting that we're actually breaking consistency with math, not syncing up with it. I think the better argument is consistency with other systems, which started this wacky tradition.

kmsquire · 2014-06-20T18:01:12Z

It would be really nice if extrema could be minmax, and the other functions could be findminmax and argminmax.

Is the minmax function on two values really that useful? I do understand the parallels with min and max, but I would argue for its removal.

(Of course, I was just getting used to extrema, which I've been using a lot recently...)

kmsquire · 2014-06-20T18:03:48Z

So for right now, how about the following:

| value | (value, index) | index(es) |
| --- | --- | --- |
| minimum | findmin | argmin |
| maximum | findmax | argmax |
| extrema | findminmax | argminmax |

(with cross references in the documentation among at least extrema, findminmax, and argminmax)

JeffBezanson · 2014-06-20T18:06:46Z

As a data point, minmax is not used anywhere in Base.

kmsquire · 2014-06-21T02:58:10Z

I've implemented and documented the changes above:

indmin/indmax -> argmin/argmax
For extrema indices, findminmax and argminmax are implemented

The main exception is that there is no findminmax along a dimension of an array. First it's unclear what the result should be (an array of tuples? a tuple of arrays?). Second, even if that were decided, findmin and findmax along a dimension are done with some of @timholy's and @lindahua's metaprogramming that I'm not familiar enough with to use.
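
For illustration only, a single-pass version over a whole array might look like the sketch below. `findextrema_sketch` is a hypothetical name, not the code in this PR and not a Base function; it assumes Julia 1.x, returns `((minval, minidx), (maxval, maxidx))`, and makes no attempt to reproduce Base's NaN handling or the along-a-dimension case.

```julia
# Hypothetical helper (not part of Base): find both extrema and their
# indices in one pass over the array.
function findextrema_sketch(a::AbstractArray)
    isempty(a) && throw(ArgumentError("array must be non-empty"))
    imin = imax = first(keys(a))
    vmin = vmax = a[imin]
    for (i, v) in pairs(a)
        if v < vmin            # strict comparison keeps the first minimum
            vmin, imin = v, i
        elseif v > vmax        # strict comparison keeps the first maximum
            vmax, imax = v, i
        end
    end
    return (vmin, imin), (vmax, imax)
end

x = [3, 1, 4, 1, 5]
result = findextrema_sketch(x)
```

For inputs without NaN this should match `(findmin(x), findmax(x))`.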

nalimilan · 2014-06-21T07:14:19Z

I agree that the extrema/minmax distinction is not immediately clear (you have to look at the docs). But it's consistent with the fact that minimum and maximum are reductions, while min and max operate element-wise (extrema being the short form of minimummaximum). So if you want to change that, you'd need to rename these functions, or it would get really confusing: findminmax is clearly not an element-wise function.

Another thing that could be confusing is that find, findin, findfirst and friends all return a single index. It's a bit inconsistent that findmin and findmax return a (minval, index) tuple. Can't we find another name?

As a data point, argmax is called which.max in R (but I'm fine with argmax).

kmsquire · 2014-06-21T14:43:32Z

> Can't we find another name?

Suggestions welcome! :-)

It's somewhat challenging to come up with a useful, consistent set of names here. This doesn't mean we shouldn't try, of course.

One challenge is that we're naming separate concepts that most other languages don't concern themselves with, so we mostly can't just borrow names.

At least with the names I proposed, we have consistency within and between arg{min,max,minmax} and find{min,max,minmax}.

But I agree that these are not consistent with min/max/minmax and minimum/maximum/extrema, plus the various find* and index* functions.

I'll post a fuller list in a little while, so we can get the big picture and have a more meaningful discussion.

kmsquire · 2014-06-21T22:38:59Z

Okay, the list of current min/max-related functions is here, along with functions for searching for or finding items in an array, since searching for a minimum or maximum element is a special case of this.

I've filled in similar functions in matlab, python/numpy, and have a column for R. Could someone who knows R fill in this column?

Some general concepts are below. I think it would be nice to reform the search/min/max API around these.

- Iterable/array A (to be searched) may be sorted or unsorted
- A can be searched for
  1. a specific value x
  2. multiple values xs
  3. item(s) matching a conditional (function)
  4. context specific values (e.g., minimum or maximum)
- Search could be global, or could stop when the first item is found
- There may be a single return item or multiple return items
- For found items, the search may return
  1. item value(s)
  2. item index(es) in A
  3. item value(s) and index(es) in A
  4. item index(es) in xs
- For n-D arrays (n > 1), when indexes are returned, they may be linear indexes, or tuples corresponding to different coordinates (e.g., (I,J), where I contains dimension-1 coordinates, and J contains corresponding dim-2 coordinates)
- Returning indexes only applies to arrays or other data structures which can be linearly indexed.

Some specific notes:

- Currently, search seems to focus on the array (search array A for something), whereas find seems to focus on the elements being looked for (find all xs in A, find all nonzero values, etc.)
- Numpy uses almost opposite meanings to Julia for min/minimum and company. In Numpy, minimum does element-wise comparison in arrays, and min finds the minimum in the array. If Julia switched to this convention, it would make consistency in this PR much easier!
- findfirst and findnext could probably be reduced to find with a default start index of 1

BobPortmann · 2014-06-22T02:26:35Z

I think the solution to this is to only define min, max etc on iterables and get rid of minimum, maximum, etc. Thus if you want to find min(x,y) where x and y are scalars you will be forced to use min((x,y)). Not a big deal in my opinion, since no one would use this approach for huge lists of variables. This is the exact situation for sum where you cannot do sum(x,y) and are forced to do sum((x,y)) if x and y are scalars.

kmsquire · 2014-06-22T02:41:33Z

For previous discussions, I should be linking to #4235, #5257, #5275. @BobPortmann, please see those discussions for why getting rid of minimum and maximum probably wouldn't work.

BobPortmann · 2014-06-22T02:43:03Z

Of course, it would be nice if there was an operator version of min and max although I'm not sure what symbols are a good choice (%<?).

BobPortmann · 2014-06-22T02:45:29Z

@kmsquire Yes, I've seen those. But if min and max only worked for iterables, those issues would go away.

lindahua · 2014-06-22T03:11:24Z

The convention of writing min(x, y) to pick the smaller one is entrenched, and forcing min(x, y) to min((x, y)) would cause widespread breakage. I don't think it is worth it.

I don't see a big problem with using minimum to get the minimum element and indmin to get its index. There have not been many complaints about this minor naming inconsistency.

JeffBezanson · 2014-06-22T04:32:06Z

The problem is this: when min((x, y)) does a reduction over its iterable argument to find the minimum, what function is it reducing with? That function is scalar min, so the function exists anyway and ought to have a name.
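
The point can be shown in a few lines. This is a sketch assuming current Julia 1.x semantics; the variable names are illustrative:

```julia
xs = (3, 1, 2)

# The reduction over an iterable...
r = minimum(xs)

# ...is a fold with the two-argument scalar function, which therefore
# has to exist under some name regardless of what the reducer is called.
folded = reduce(min, xs)

# The scalar function on its own.
s = min(3, 1)
```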

timholy · 2014-07-07T15:10:45Z

Just FYI:

help?> findmax
INFO: Loading help data...
Base.findmax(itr) -> (x, index)

   Returns the maximum element and its index.

Base.findmax(A, dims) -> (maxval, index)

   For an array input, returns the value and index of the maximum over
   the given dimensions.

I guess it was called dims rather than region. Sorry about the confusion.

Regarding how findmin should work across dimensions, cartesian indexes would be very desirable. But it is more memory-consuming, so here I went with linear indexes. Not sure that's the best choice, however---in general I think it's a good idea to try to move away from linear indexing wherever possible.

Jutho · 2014-07-07T15:21:45Z

My apologies. I was not confused by regions versus dims. I must have looked in the old version of the manual, even though I have bookmarked the latest version.


Jutho · 2014-07-07T22:24:47Z

I seem to have been confused by more than just this today. Trying to do two things at a time ...
I hope I am not polluting this post (too much) with my noise; I certainly have gained a better understanding and now see the complexity of minimum etc, which perform a reduction but at the same time define a predicate that might need to be found. The other functions are either about finding a solution, or performing a reduction, not both.


kmsquire · 2014-08-12T04:50:58Z

Bump.

Is anyone willing to take on min/max/search/find bikeshedding in v0.4? (I'm not volunteering right now.) If so, I suggest starting from the google doc here and opening another issue.
Is this PR useful? If so, please merge. If not, please either close, or suggest how to change it.

Cheers!

dhoegh · 2014-11-10T12:05:05Z

Bump.
Is this going to be merged?
Today I was in a situation where I used indmax and wanted the indices in the array instead of the linear index.

julia> a = [1 2; 0 1];

julia> ind2sub(size(a), indmax(a))
(1,2)

Could we make a method of argmax that returns the indices in the array instead of the linear index? If not, then at least mention ind2sub in the argmax documentation to point people who need indices in the array in the right direction; it took me some time to find ind2sub.
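
For comparison, here is how the same lookup reads in later Julia. This is a sketch assuming Julia 1.0 or newer, where `indmax` and `ind2sub` no longer exist and `argmax` on a matrix returns a `CartesianIndex`; none of this was available when the request above was written.

```julia
a = [1 2; 0 1]

ci  = argmax(a)                 # Cartesian index of the maximum
sub = Tuple(ci)                 # plain (row, column) tuple
lin = LinearIndices(a)[ci]      # the old-style linear index, if needed

# Along a dimension, findmax returns the maxima and their Cartesian indices.
vals, idxs = findmax(a; dims=1)
```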

goretkin · 2014-12-04T06:27:41Z

I'd say not to assume that argminimum or whatever it ends up being called is only to be used on integer-indexable collections. One use of these functions, which returns a non-index extremizer (and makes names like "indmax" awkward) is the ApproxFun package. (The package currently does use [] for function evaluation, but only because () cannot be overridden in 0.3)

@johnmyleswhite I don't really see why argmin is inconsistent with math. I see \argmin_x f(x) often, like you said, and probably \argmin_k v[k] pretty often too. I don't know what it means to apply the argmin to a set.

mturok · 2015-03-11T17:53:00Z

I haven't read through this entire thread, but what do people think about matlab's semantics for max?

http://www.mathworks.com/help/matlab/ref/max.html

Michael

mbauman · 2015-03-11T18:15:17Z

See #4235 for a history of the design. A more recent discussion is at #9439 (comment). I (as well as most folks here, I'd imagine) see Matlab's use of max(A,[],dim) as a signal to do something entirely different as a substantial wart in the design. I'm not sure that our current division between max and maximum is optimal, but I like it a whole lot better.

StefanKarpinski · 2015-03-11T18:53:11Z

The basic issue is that Matlab's max function is both a binary operator like + and a reducer like sum; we ended up separating these two aspects of max into different functions: the binary operator max and the reducer maximum. I frequently type max when I want maximum but I do think the clean separation is a good thing.
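
As a rough illustration of that separation, the two roles that Matlab's `max(A,[],dim)` signature overloads are spelled differently in Julia. This sketch assumes Julia 1.x syntax (dot broadcasting and the `dims` keyword), which is newer than this comment:

```julia
a = [1 5; 7 2]
b = [4 4; 4 4]

# Binary operator, applied element-wise via broadcasting.
elementwise = max.(a, b)

# Reducer over the whole array, or along one dimension.
overall = maximum(a)
per_column = maximum(a; dims=1)
```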

vtjnash · 2019-06-26T21:31:01Z

It looks like Viral approved this PR already, and the dust has (mostly?) settled on the naming surrounding min/max/minimum/extrema/minmax (e.g. eliminating indmin). So this just needs a rebase.

kmsquire · 2019-06-27T04:27:25Z

Based on the original conversation, should I change findminmax and argminmax back to findextrema and argextrema, respectively?

StefanKarpinski · 2019-07-01T22:07:35Z

Yes, that seems consistent.

This is a simple extension of extant `findmin` and `findmax` methods. Depending on context (cost of `f`; whether reduction is over dims; size of array) the speedup increase is somewhere between 1.0-1.6 (no regressions). Interestingly, I noticed but could not locate a `findextrema`; there is some [mention](JuliaLang#7327) of it, but nothing in Base. If it was deemed unworthy, please excuse this errant PR.

kmsquire changed the title ~~RFC: Added find_extrema, for parity with findmin, findmax~~ RFC: Added find_extrema, ind_extrema Jun 19, 2014

Added findminmax, indminmax; min/max bikeshedding

ff29bcc

* For parity with findmin/indmin, findmax/indmax
* indmin/indmax -> argmin, argmax

kmsquire mentioned this pull request Aug 26, 2014

Sorted Containers JuliaCollections/DataStructures.jl#52

Closed

jiahao force-pushed the master branch 3 times, most recently from 6c7c7e3 to 1a4c02f Compare October 11, 2014 22:06

jiahao force-pushed the master branch from cdde4df to 7fdc860 Compare October 28, 2014 04:20

MikeInnes force-pushed the master branch from 5c60996 to b1c3df3 Compare November 14, 2014 17:07

nalimilan mentioned this pull request Mar 20, 2015

Unifying search & find functions #10593

Closed

tkelman mentioned this pull request Aug 2, 2015

Doc hard, and with a vengeance #11943

Merged

kmsquire mentioned this pull request May 26, 2017

indextrema, findextrema ? #22072

Closed

nalimilan mentioned this pull request Nov 30, 2017

Rename findmin and findmax? #24865

Closed

vtjnash mentioned this pull request Jun 3, 2021

Unintuitive findmin and findmax #39203

Closed

JeffBezanson closed this Jun 3, 2021

DilumAluthge deleted the kms/find_extrema branch August 24, 2021 05:21

andrewjradcliffe mentioned this pull request Jun 22, 2022

findextrema: compute findmin and findmax in single pass #45783

Open
