(Row)Vector equality with Matrices #21998

Closed
staticfloat opened this issue May 20, 2017 · 15 comments
Labels: arrays

@staticfloat
Member

Right now, RowVectors are implicitly embedded within Matrices, but not so for Vectors. This leads to funky behaviors such as:

julia> a = Vector(1:2)
2-element Array{Int64,1}:
 1
 2

julia> am = Matrix(2, 1); am[:] = 1:2; am
2×1 Array{Any,2}:
 1
 2

julia> a == am
false

julia> a' == am'
true

There are many ways we could try to fix this. To take two extremal approaches:

  • On the conservative side, we could just define ==(Matrix, Vector) and be done with it

  • Personally I'd rather push for automatic "do what makes sense" equality checking for types that have natural isomorphisms. E.g. if I have an Array{T,N}, I think it makes sense to define equality with an Array{T,N+1} that has a trailing singleton dimension, as the former is naturally embedded within the space of the second.
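
A minimal sketch of what the second option could look like, written against current Julia (1.x) rather than the 0.6 used in this thread. `embedded_equal` is a hypothetical name chosen for illustration, not a Base function, and the sketch assumes ordinary 1-based arrays:

```julia
# Hypothetical helper illustrating "equality up to trailing singleton dimensions".
# Not part of Base; assumes ordinary 1-based arrays.
function embedded_equal(a::AbstractArray, b::AbstractArray)
    n = max(ndims(a), ndims(b))
    # size(x, d) returns 1 for any d > ndims(x), so missing trailing
    # dimensions compare as singletons.
    all(size(a, d) == size(b, d) for d in 1:n) || return false
    # Shapes agree up to trailing singletons, so column-major iteration
    # visits corresponding elements in the same order.
    return all(x == y for (x, y) in zip(a, b))
end

v = [1, 2]
m = reshape([1, 2], 2, 1)   # 2×1 column matrix

embedded_equal(v, m)        # true
embedded_equal(v, [1 2])    # false: [1 2] is 1×2, so the singleton is not trailing
v == m                      # false: built-in == compares shapes strictly
```

The conservative option would amount to only the vector-versus-matrix case of this helper.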

I'd be happy to submit a PR implementing whatever we reach consensus on. Hopefully this is a small change to just equality and doesn't get sidetracked into anything more fundamental.

Pinging a random subset of the LinAlg posse @stevengj @jiahao @mbauman

@andyferris
Member

Yeah - a bit more consistency on whether arrays with singleton indices are the "same" or not would be awesome.

RowVector complicates things because it has a non-trailing singleton dimension. This also gets interesting (to me, at least) when you combine this with non-one based indexing.

I for one hated that MATLAB made all these things equivalent and enjoyed moving to Julia where vectors and matrices were different. Broadcasting seems to cover the cases (that I need) where you want to be relaxed about this, but I see that sometimes you probably want (at least some variant of) == to be relaxed about this also, which is more efficient than all(a .== b).

@StefanKarpinski
Member

RowVectors are naturally embedded into Matrices as well, so you'd consider [1,2,3]' == [1 2 3] by the same criteria. I'm not too concerned about this making vectors and column matrices "the same" since they're still quite easily distinguishable, e.g. via dispatch.

@staticfloat
Member Author

Seeing as RowVector already passes the equality check, I wasn't thinking about how to include that within my proposal. :)

@staticfloat
Member Author

staticfloat commented May 21, 2017

@StefanKarpinski coming back to this, I disagree with what you're saying about [1,2,3]' == [1,2,3]. If that is true, then we either break transitivity of == or we have some truly funky business going on with matrices where a == a' for non-square matrices:

If [1, 2, 3] == Matrix([1,2,3]), and [1,2,3]' == Matrix([1,2,3])', then we would have to have that Matrix([1,2,3]) == Matrix([1,2,3])', which seems like it should fail, since we shouldn't have Matrices equal to each other with different shapes.

A RowVector and a Vector, when viewed from the 2-dimensional viewpoint should be qualitatively different objects. It's only when we view them from the 1-dimensional viewpoint that they collapse down into the same thing, IMO.

EDIT: I misread Stefan's post and this is all wrong

@staticfloat
Member Author

staticfloat commented May 21, 2017

EDIT: After writing that last sentence, I see that we have an ambiguity as to whether [1,2,3] == [1,2,3]' is viewing the objects from a 1-dimensional or 2-dimensional viewpoint. Dang. My gut reaction is to say "no", but I don't have a good reason behind why yet.

Also wrong

@StefanKarpinski
Member

StefanKarpinski commented May 22, 2017

I did NOT say that [1,2,3]' == [1,2,3] – that's crazy talk. I wrote [1,2,3]' == [1 2 3] – note the spaces on the right, not commas. I.e. a row vector is equal to a row matrix with equal contents.
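
For readers on current Julia (1.x), where `RowVector` has since been replaced by the `Adjoint`/`Transpose` wrappers, the distinction can be checked directly. This is only an illustration in a newer version than the one discussed here:

```julia
r = [1, 2, 3]'        # 1×3 adjoint wrapper around a Vector

r == [1 2 3]          # true:  same shape (1×3), same contents
r == [1, 2, 3]        # false: shapes (1, 3) and (3,) differ
size(r)               # (1, 3)
```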

@ararslan added the arrays label May 22, 2017
@staticfloat
Member Author

Oh wow, mental parsing failure. Yes, I agree with everything you said then.

@mbauman
Member

mbauman commented May 22, 2017

I'm in favor of this change.

I know you're hoping to avoid a fundamental design discussion, but I think this decision is intrinsically coupled with the decision at #14770. Allowing vectors to equal column matrices currently matches their behavior since we allow them to be indexed with trailing singletons. Equality matches behavior.

But if we disallow indexing with trailing singletons, then that means to me that we're deciding that vectors should not behave like matrices, so neither should they be equal.

Last I checked, my position on #14770 was the minority.

@StefanKarpinski
Member

StefanKarpinski commented May 22, 2017

Let me clarify my position:

  • There should be conceptual embeddings T^0 --> T^1 --> T^2 --> T^3 --> ... from n-dimensional arrays to (n+1)-dimensional arrays for all n, into the leading dimensions of the higher space.

  • There should be a conceptual embedding of T* --> T^2 of row vectors into matrices as the second dimension.

  • This embedding guides how indexing should behave and how equality should behave – i.e. we should make the behavior match as much as possible.

  • In particular, this issue: vectors should be == to column matrices with the same contents, and row vectors should be == to row matrices with the same contents.

  • Also, indexing (see #14770, "deprecate (then remove) generalized linear indexing"):

    • I do not have a problem with omitting an index into a trailing singleton dimension
    • I do not have a problem with indexing past the last dimension of an array with a 1
    • I want to get rid of "generalized linear indexing" in the sense of indexing into an array with fewer indices than it has dimensions with the last index used as a linear index into the remaining dimensions – unless the omitted dimensions all have size 1 (in which case it is ok)
  • If you ask for the size of an array past its number of dimensions you get 1 (as we have forever).
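
The indexing rules in this list can be sketched as follows. The example assumes current Julia (1.x), which, as far as I can tell, behaves this way; the array names are arbitrary:

```julia
v = [10, 20, 30]
A = reshape(1:6, 3, 2, 1)     # trailing singleton dimension
B = reshape(1:24, 2, 3, 4)    # no singleton dimensions

size(v, 2)     # 1:  size past the last dimension is 1
v[2, 1]        # 20: indexing past the last dimension with a 1
A[2, 2]        # 5:  omitting an index into a trailing singleton dimension

# Omitting the third index of B is rejected, because that dimension has
# size 4 rather than 1 (no "generalized linear indexing").
threw = try
    B[2, 3]
    false
catch err
    err isa BoundsError
end
```

Plain linear indexing with a single index, such as `B[7]`, is a separate mechanism and is not affected by these rules.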

I believe this is a pretty coherent view in which the embedding of lower dimensions into higher ones is maintained but code unexpectedly "working" for the wrong dimension of argument is avoided. Yes, you can potentially pass an (n+1)-array into a loosely typed routine that expects an n-array and it might not error – but only if the last dimension is singleton, and then it will actually do exactly what you expect. What you cannot do under these rules, is pass an (n+1)-array with a non-singleton last dimension and have that second dimension effectively ignored, leading to unexpected results and hard-to-find bugs instead of errors. I believe there was quite a bit of support for this view in #14770.

[Edited to generalize the example at the end since non-column matrices can be linearly indexed, so it was a bad example. Eliminating linear indexing altogether is whole other can of worms.]

@mbauman
Member

mbauman commented May 22, 2017

Great, we're in complete agreement. It should be a relatively small patch to get indexing behaviors the rest of the way there.

@andyferris
Member

andyferris commented May 22, 2017

Yes, I think what Stefan said makes the most sense - it's the right balance between scruffy and neat, seems to be what people expect, and is easy to use while catching the majority of "conceptual" size problems (not all, e.g. interaction with linear indexing when n == 1 in Stefan's post).

The thing which gets me is the special role of the index (not size) of 1 as being equivalent to singleton. Maybe I misunderstood, but wasn't there a push to not have assumptions about 1-based indexing in AbstractArray? Should we support singleton dimensions in zero-based arrays, for instance, for indexing and broadcasting behaviors? Or do we say that these are niche enough that singleton dimensions in these cases are corner cases that we won't officially support?

@mbauman
Member

mbauman commented May 22, 2017

I don't think this is hard to generalize: it all depends on what the array returns for indices(A, d) where d > N. So long as your index value matches that, we can support it.
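
A small illustration of that idea, assuming current Julia (1.x), where `indices` has been renamed `axes`:

```julia
v = [10, 20, 30]

axes(v, 1)                 # Base.OneTo(3)
axes(v, 2)                 # Base.OneTo(1): an absent dimension ranges over 1:1
v[2, first(axes(v, 2))]    # 20: the extra index matches the only valid value
```

An array type with unconventional indices could in principle report a different range for its own singleton dimensions; this sketch only covers a plain `Vector`.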

@StefanKarpinski
Member

I think that we have to assume something about absent dimensions, which seems like it has to be that they range from 1:1. On the flip side, if you don't index into a dimension, as long as there's only one possible choice of index value – be it 1, 0, 76234, or -7 – then we can safely assume that's the index you meant.

@mbauman
Member

mbauman commented Sep 15, 2017

@StefanKarpinski, @JeffBezanson and I have been slowly mulling this one over for the past few weeks. While we often allow vectors to behave like 1-column matrices, that's not always the case. A salient example is APL indexing: A[ones(5), ones(5)] is very different from A[ones(5,1), ones(5,1)]. Another example is how we allow appending elements to vectors but not column matrices. The data structures are different, even if they sometimes behave similarly.
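
Both differences can be observed in a short session. This is a sketch assuming current Julia (1.x), using integer index arrays instead of `ones(5)`; the variable names are arbitrary:

```julia
A = reshape(1:9, 3, 3)

# The result shape is the concatenation of the index shapes.
size(A[[1, 2], [1, 2]])          # (2, 2)
col = reshape([1, 2], 2, 1)      # 2×1 index matrix
size(A[col, col])                # (2, 1, 2, 1)

# Vectors can grow; column matrices cannot.
v = [1, 2]
push!(v, 3)                      # v is now [1, 2, 3]

m = zeros(Int, 2, 1)
push_failed = try
    push!(m, 3)                  # no push! method applies to a Matrix
    false
catch err
    err isa MethodError
end
```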

I believe the general rule here is that we allow vectors to participate in linear algebra as 1-column matrices, but in other respects their dimensionality leads to behaviors that are different, observable, and distinct.

Even more compelling is the fact that I really don't want to generalize a notion of equality that ignores trailing singleton dimensions. That would lead directly to a complaint that three dimensional "matrices" cannot participate in linear algebra — even though ones(3,2,1) would be equal to ones(3,2).

On the other hand the primary purpose of a RowVector is to participate in the linear algebra of matrices. In my view, the status quo is correct.

@mbauman closed this as completed Sep 22, 2017
@mbauman
Member

mbauman commented Sep 22, 2017

Resolved: we aren't going to allow for equality between different dimension arrays.

@StefanKarpinski added this to the 1.0 milestone Sep 22, 2017

5 participants