
cumsum inconvenient for RowVector #20041

Open
dpsanders opened this issue Jan 15, 2017 · 48 comments
Labels
arrays, needs decision

Comments

@dpsanders
Contributor

It feels like a RowVector should act one-dimensionally in the following:

julia> w = [1, 2, 3]'
1×3 RowVector{Int64,Array{Int64,1}}:
 1  2  3

julia> cumsum(w)
1×3 RowVector{Int64,Array{Int64,1}}:
 1  2  3

julia> accumulate(+, w)
1×3 RowVector{Int64,Array{Int64,1}}:
 1  2  3

cf. #4774 (comment)

@Shade5
Contributor

Shade5 commented Jan 15, 2017

So I just add a special case for a row vector, overriding the default?

@ararslan
Member

ararslan commented Jan 15, 2017

Note that cumsum, accumulate, and accumulate! actually take a dimension argument, and when set to 2 (1 is the default), all work as expected on a RowVector. I think the current behavior makes sense, given that the dimension of accumulation is in all cases being provided, and a row vector by definition has a sense of multidimensionality. (Edit: Just saw Stefan's comment about how 1 should not be the default--seems reasonable.)

@ararslan
Member

If we did decide to treat it as one-dimensional for these functions, we'd probably want to do the same for a larger group of functions to ensure consistency.

@ararslan ararslan added the linear algebra label Jan 15, 2017
@dpsanders
Contributor Author

Isn't the point that it should be treated as one-dimensional when possible? As reflected in its name, [Row]Vector.

@johnmyleswhite
Member

johnmyleswhite commented Jan 15, 2017

That wasn't my understanding given the inheritance structure:

julia> RowVector <: AbstractVector
false

julia> RowVector <: AbstractMatrix
true

@kmsquire
Member

kmsquire commented Jan 15, 2017 via email

@dpsanders
Contributor Author

dpsanders commented Jan 16, 2017

Which methods are actually inherited through this "inheritance structure"?
As far as I understand, the fact that RowVector <: AbstractMatrix is more a technical implementation detail than a philosophical position.

@Shade5
Contributor

Shade5 commented Jan 16, 2017

So treat a 1-D row vector as a column vector for cumsum? Which other functions can be modified like this?

@tkelman
Contributor

tkelman commented Jan 16, 2017

Given

julia> size([1, 2, 3]')
(1,3)

I'd say these are not acting 1-dimensional at all. There's a row orientation behavior here.

@alanedelman
Contributor

In any number of discussions around MIT, we felt cumsum of a row vector should be a row vector.
The philosophy is that a row vector might act like a matrix for broadcast and hcat/vcat, but otherwise it's one-dimensional. Not sure what large set of functions has to be thought about. Certain scan-like functions, though.

At one point I wanted to argue against row vectors acting two dimensionally even for broadcast, hcat and vcat, but after listening to Steven Johnson, I couldn't find a strong argument.

So far my list of the triple status of a Julia row vector is something like this:

  • A row vector does the linear algebra thing for * and transpose
  • A row vector does the 2-d container thing for hcat, vcat, and broadcast
  • A row vector does the 1-d thing for scan-like functions

My opinions (this week):
I don't love size returning (1,3), by the way.
And I'd really love for A[1,:] to return a row vector only when A is 2-d (APL indexing otherwise),
but I can imagine that's perhaps not in the cards.

@tkelman
Contributor

tkelman commented Jan 16, 2017

If we want to change this, we should be thorough and do it on all dimension-reducing functions that take a dimension argument. Has anyone collated a list of all of those? Having to special-case all of them rather than dispatching to the AbstractArray behavior would increase the maintenance burden and number of methods we need to write for RowVector, and every package that implements dimension-reducing argument functions of its own would also have to write additional methods for RowVector if the generic fallbacks are not consistent.

Collecting such a list might be a good opportunity for some rationalization and to decide whether we want to continue the Matlab-ism of having dimensions as a positional argument in many places (which leads to a number of ambiguities and inconsistent APIs), or make reduction calculation along specific dimensions more explicit via a keyword argument - assuming the performance would be comparable.

@andreasnoack
Member

Notice that accumulate is not dimension-reducing. For mapreducedim-like functions, I think the only option is to treat RowVector as a matrix. I agree that we should try to define clearly which functions should be 1-d-like on RowVectors. Shape-preserving (independently of value) functions might be a criterion. The number of such functions should hopefully not grow too large; otherwise RowVector <: AbstractMatrix might be more problematic than expected. I don't expect that to be the case, though.

@andreasnoack andreasnoack added the needs decision label and removed the linear algebra label Jan 16, 2017
@andyferris
Member

In preparing the PR, I left all the behavior of vec.' as it was, except for what * and .' do afterwards (as well as the copy-vs-view semantics). So for almost all functions it is just a matrix, and that was intentional.

That said, I'm not surprised by the behavior reported in the OP.

This doesn't mean we shouldn't aim for improvement in the future. But given that the "soft" feature freeze was 18 days ago, and the currently implemented behavior is the minimal breaking change needed to get the desired linear algebra properties, my assumption was that further breaking changes had to wait until after v0.6. If that is not the consensus, we should decide ASAP. But if we rush, we risk a somewhat incoherent result.

@StefanKarpinski
Member

The primary problem here is that scan functions default to dim=1, which I would argue is wrong and inconsistent with sum, which sums the entire array when called without a dimension argument. We departed from Matlab's default dimension behavior for sum et al. for a reason.

Instead we could define scan functions to do something that's useful for row matrices or column matrices and consistent for both. I think a useful definition which fits the bill is to e.g. have cumsum compute this for matrices:

julia> [sum(A[1:i,1:j]) for i=1:size(A,1), j=1:size(A,2)]
3×4 Array{Float64,2}:
 0.470239  0.960547  1.77296  2.7498
 1.03861   2.18006   3.8101   5.68694
 1.16206   2.33594   4.02084  6.01088

With this behavior cumsum(x) for a vector, column matrix, row matrix or row vector are all useful and consistent and do what you want.

@andyferris
Member

Yes, I was thinking the same thing, actually!

@StefanKarpinski
Member

Note that this computation is independent of memory layout modulo floating-point associativity. I.e. it would be independent if computations were exact, but we'll want to do the computation in memory order, scanning along the first column and then scanning along the second column, adding the corresponding element of the first column, etc. Regardless, I think that's good enough.
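A minimal sketch of this memory-order scan (the helper name cumsum_all and the loop structure are illustrative, not an actual Base implementation):

```julia
# Illustrative sketch: cumsum over all leading rectangles A[1:i, 1:j],
# computed in memory order (down each column, then across columns),
# in O(length(A)) time. Up to floating-point rounding this equals
# [sum(A[1:i, 1:j]) for i in 1:size(A, 1), j in 1:size(A, 2)].
function cumsum_all(A::AbstractMatrix)
    B = copy(A)
    for j in 1:size(B, 2), i in 2:size(B, 1)   # scan down each column
        B[i, j] += B[i-1, j]
    end
    for i in 1:size(B, 1), j in 2:size(B, 2)   # then accumulate across columns
        B[i, j] += B[i, j-1]
    end
    return B
end
```

For example, cumsum_all([1 2; 3 4]) gives [1 3; 4 10], matching the comprehension above.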

@GunnarFarneback
Contributor

Also known as Summed Area Table and Integral Image.

@martinholters
Member

Also, if we want to generalize this to accumulate (which sounds plausible?), the order would need to be specified in case someone passes a non-associative function.

@andyferris
Member

Right, if the operation is non-associative (or non-commutative), does the formula on the wiki page Gunnar posted still hold?

[image: the summed-area table formula from the linked Wikipedia article]

@martinholters
Member

Continuing the thought of generalizing this idea to accumulate(op, A), I'd suggest it should mimic [reduce(op, A[1:i,1:j]) for i=1:size(A,1), j=1:size(A,2)] as per #20041 (comment) (and likewise for N≠2 dimensions), noting that reduce makes no guarantees regarding associativity. However, the above formula obviously could not be applied, as it would need the inverse of op. But unless I'm mistaken, it should still be possible to evaluate the result in O(length(A)) time with O(1) extra memory in addition to the output, right?
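One way to see why no inverse is needed (a sketch assuming op is associative and commutative, so that the per-dimension scan order matches the rectangle reduce; the name accumulate_all is hypothetical):

```julia
# Illustrative sketch: accumulate `op` over all leading rectangles
# A[1:i, 1:j] by scanning one dimension at a time: O(length(A)) time,
# O(1) memory beyond the output, and no inverse of `op` required.
# For a non-associative or non-commutative `op` this pins down one
# particular evaluation order, as discussed above.
function accumulate_all(op, A::AbstractMatrix)
    B = copy(A)
    for j in 1:size(B, 2), i in 2:size(B, 1)   # scan down each column
        B[i, j] = op(B[i-1, j], B[i, j])
    end
    for i in 1:size(B, 1), j in 2:size(B, 2)   # then across columns
        B[i, j] = op(B[i, j-1], B[i, j])
    end
    return B
end
```

With op = +, accumulate_all(+, A) reproduces the summed-area table; with a non-associative op, it is one fixed evaluation order, not a unique answer.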

@StefanKarpinski
Member

See also #19451, #23542.

@ViralBShah
Member

Resolving #23018 first would make this better perhaps?

@JeffBezanson JeffBezanson added the arrays label Nov 20, 2017
@dpsanders
Contributor Author

dpsanders commented Dec 26, 2017

I believe this can now be closed? Current behaviour is as follows:

julia> w = [1, 2, 3]'
1×3 Adjoint{Int64,Array{Int64,1}}:
 1  2  3

julia> cumsum(w)
┌ Warning: `cumsum(A::AbstractArray)` is deprecated, use `cumsum(A, 1)` instead.
│   caller = top-level scope
└ @ Core :0
1×3 Adjoint{Int64,Array{Int64,1}}:
 1  2  3

julia> cumsum(w, 1)
1×3 Adjoint{Int64,Array{Int64,1}}:
 1  2  3

julia> cumsum(w, 2)
1×3 Adjoint{Int64,Array{Int64,1}}:
 1  3  6

@StefanKarpinski
Member

Just to be sure: is cumsum([1,2,3]') going to be an error or equal to [1,3,6]'?

@Sacha0
Member

Sacha0 commented Dec 27, 2017

I imagine I misunderstand your question, so let's iterate :).

My understanding: If we remove but do not replace the deprecated single-argument accumulate-like methods in 1.0, then cumsum([1,2,3]') should yield a MethodError. Alternatively, between 0.7 and 1.0 we can introduce an accumulate specialization for Adjoint/Transpose vectors that behaves as described in #19451 (comment) 's second list item, in which case cumsum([1,2,3]') should yield [1,3,6]'. I do not know what the working plan is. Thoughts? :)

@StefanKarpinski
Member

I'm fine with deprecating and adding full functionality later – just checking 👍

@JeffBezanson
Member

This is deprecated, so moving off the milestone. In 1.x we can implement the summed area table interpretation if we want.

@JeffBezanson JeffBezanson modified the milestones: 1.0, 1.x Jan 3, 2018
@Sacha0
Member

Sacha0 commented Jan 3, 2018

This thread's concerns might still apply to Adjoint/Transpose; I haven't thought it through. Perhaps label triage to make certain?

@StefanKarpinski
Member

They all seem to be deprecated so I think we're fine for now:

julia> cumsum(rand(3)')
┌ Warning: `cumsum(A::AbstractArray)` is deprecated, use `cumsum(A, 1)` instead.
│   caller = top-level scope
└ @ Core :0
1×3 Adjoint{Float64,Array{Float64,1}}:
 0.635384  0.89566  0.875083

julia> cumsum(Adjoint(rand(3)))
┌ Warning: `cumsum(A::AbstractArray)` is deprecated, use `cumsum(A, 1)` instead.
│   caller = top-level scope
└ @ Core :0
1×3 Adjoint{Float64,Array{Float64,1}}:
 0.58616  0.56026  0.552915

julia> cumsum(Transpose(rand(3)))
┌ Warning: `cumsum(A::AbstractArray)` is deprecated, use `cumsum(A, 1)` instead.
│   caller = top-level scope
└ @ Core :0
1×3 Transpose{Float64,Array{Float64,1}}:
 0.380938  0.0475339  0.624499

@mbauman
Member

mbauman commented Jan 4, 2018

Are there any other functions like cumsum that we should consider here? diff?

@StefanKarpinski
Member

Yes, that's a good point: diff, cumprod (seems already ok), accumulate (already an error for everything but vectors). So I think that diff is the only one still left with the undesirable behavior:

julia> diff([1,2,3]')
0×3 Array{Int64,2}

whereas it should behave more like this:

julia> diff([1,2,3]', 2)
1×2 Array{Int64,2}:
 1  1

however, I think that should probably produce a row vector when the input is a row vector.
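A sketch of what such a specialization might look like in the post-0.7 Adjoint/Transpose world (the rowdiff methods below are illustrative, not what Base actually defines):

```julia
using LinearAlgebra

# Illustrative sketch: make diff of a transposed/adjoint vector return
# a row vector again, by differencing the parent vector and re-wrapping.
rowdiff(v::Adjoint{<:Real, <:AbstractVector}) = diff(parent(v))'
rowdiff(v::Transpose{<:Number, <:AbstractVector}) = transpose(diff(parent(v)))
```

So rowdiff([1, 2, 3]') would yield the 1×2 row vector [1 1] rather than a 0×3 matrix.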

@Sacha0
Member

Sacha0 commented Jan 4, 2018

I think that should probably produce a row vector when the input is a row vector.

Adding to the list :). Though IIRC deprecating diff was on the table recently?

@StefanKarpinski
Member

Deprecating it entirely? But it seems like such a basic, useful function. Note that n-d diff can compute the inverse operation of the partial sums table proposed for cumsum which would have the pleasant property that diff(cumsum(A)) == A[1:end-1,...,1:end-1] would always be true. Of course, it might be nicer to have a design where you don't have to slice off the last of each dimension in order for that identity to hold.

@Sacha0
Member

Sacha0 commented Jan 4, 2018

IIRC both deprecating diff in favor of a more precise name and deprecating diff entirely were discussed. IIRC the primary motivating concern was the name diff's generality combined with parties' desire to use that name for other purposes outside base. Best!

@mbauman
Member

mbauman commented Jan 4, 2018

I'm rather surprised to find that diff is only used as the Base.diff() function once in all of base. It's used as a different binding 107 times. I'd also be sad to see it go — but I've often seen it abused as a derivative, in which case off-by-one (or off-by-half) mistakes and dimension mismatches are extremely common. E.g., the property is that diff(cumsum(A)) == A[2:end,…,2:end], not 1:end-1.

All that said, my preference would be to fix this behavior and let it be.
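The off-by-one identity is easy to check in 1-D:

```julia
# diff inverts cumsum up to dropping the *first* element, not the last:
v = [4, 7, 1, 9]
@assert cumsum(v) == [4, 11, 12, 21]
@assert diff(cumsum(v)) == v[2:end]   # [7, 1, 9], i.e. A[2:end], not A[1:end-1]
```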

@Sacha0
Member

Sacha0 commented Jan 6, 2018

I'm rather surprised to find that diff is only used as the Base.diff() function once in all of base. It's used as a different binding 107 times. I'd also be sad to see it go — but I've often seen it abused as a derivative, in which case off-by-one (or off-by-half) mistakes and dimension mismatches are extremely common. E.g., the property is that diff(cumsum(A)) == A[2:end,…,2:end], not 1:end-1.

These observations seem like a strong argument for at least a more appropriate / specific name?

@StefanKarpinski
Member

decumulate(-, A)?

@Sacha0
Member

Sacha0 commented Jan 7, 2018

As an instance of decumulate[!](op, [y,] x::AbstractVector) and decumulate[!](op, [B,] A, dim)? Intriguing :).
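For vectors, such a decumulate is a one-liner (a sketch; this function does not exist in Base):

```julia
# Hypothetical generalization of diff to an arbitrary binary op,
# so that decumulate(-, x) == diff(x):
decumulate(op, x::AbstractVector) = [op(x[i+1], x[i]) for i in 1:length(x)-1]
```

For example, decumulate(-, [1, 3, 6]) gives [2, 3], the same as diff([1, 3, 6]).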

@StefanKarpinski
Member

Of course, one might just want to call decumulate by the name diff and allow passing an operator as a generalization :)

@Sacha0
Member

Sacha0 commented Jan 8, 2018

One might, but that 107:1 ratio... 😄

@nalimilan
Member

diff exists in R, Numpy and MATLAB, so I'd rather keep that name.

mbauman added a commit that referenced this issue Jan 8, 2018
Akin to `cumsum`, `diff` now requires a `dim` argument in cases where the array has more than one dimension. Final piece of #20041 for the 1.0 milestone.
@Sacha0
Member

Sacha0 commented Jan 8, 2018

diff exists in R, Numpy and MATLAB, so I'd rather keep that name.

The evidence strongly suggests that this is something Julia can do better :).

fredrikekre pushed a commit that referenced this issue Jan 9, 2018
* Deprecate diff(::AbstractMatrix), require a dim argument

Akin to `cumsum`, `diff` now requires a `dim` argument in cases where the array has more than one dimension. Final piece of #20041 for the 1.0 milestone.

* Add reference to this PR in NEWS.md
@StefanKarpinski
Member

The evidence strongly suggests that this is something Julia can do better :)

We're going to start calling you "Sacha, breaker of eggs" soon... 😝

@Sacha0
Member

Sacha0 commented Jan 9, 2018

We're going to start calling you "Sacha, breaker of eggs" soon... 😝

Regrettably I am ignorant of the reference; enlighten me? :)

@Sacha0
Member

Sacha0 commented Jan 9, 2018

Is Julia an omelette? 😄
