Make var and std work for Vector{Vector{T}} #23897

Merged
merged 2 commits into from
Oct 2, 2017

Conversation

andreasnoack
Member

by removing the Number restriction from some signatures, as well as by using broadcasting in std.

I'd also like to do something similar for cov but hit the issue that broadcast is not doing the right thing for Vector{Vector{T}}. E.g.

julia> x = [[2,4,6],[4,6,8]];

julia> x .- [1,1,1]
ERROR: DimensionMismatch("arrays could not be broadcast to a common size")
Stacktrace:
 [1] _bcs1 at ./broadcast.jl:69 [inlined]
 [2] _bcs at ./broadcast.jl:63 [inlined]
 [3] broadcast_shape at ./broadcast.jl:57 [inlined]
 [4] broadcast_indices at ./broadcast.jl:49 [inlined]
 [5] broadcast_c at ./broadcast.jl:310 [inlined]
 [6] broadcast(::Function, ::Array{Array{Int64,1},1}, ::Array{Int64,1}) at ./broadcast.jl:434

Do we have an issue for that?

Fixes #23884
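As an aside, the intended elementwise result can be obtained by mapping over the outer vector; this is a workaround sketch, not part of the PR:

```julia
# Workaround: subtract a vector from every inner vector of a
# Vector{Vector} by mapping over the outer container, since
# broadcast tries to align the inner vectors with [1, 1, 1].
x = [[2, 4, 6], [4, 6, 8]]
y = map(v -> v .- [1, 1, 1], x)  # [[1, 3, 5], [3, 5, 7]]
```

On later Julia versions one can also write x .- Ref([1, 1, 1]) to shield the right-hand side from participating in broadcasting.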

@DNF2

DNF2 commented Sep 27, 2017

Does this not go in the opposite direction to the strategy of not auto-vectorizing functions?

Why not just use std.(x) and cov.(x)? It seems odd that std should support vectors of vectors, since v^2 and sqrt(v) make no sense for vectors.

@andreasnoack
Member Author

Why not just use: std.(x) and cov.(x)?

It is not the same thing.

julia> x = [[2,4,6],[4,6,8]];

julia> var(x)
3-element Array{Float64,1}:
 2.0
 2.0
 2.0

julia> var.(x)
2-element Array{Float64,1}:
 4.0
 4.0

You can compute the variance of a vector. In Julia, we have decided that it is just the diagonal of the covariance matrix. The standard deviation is a bit tricky, though. It could either be the diagonal of the matrix square root of the covariance matrix or the elementwise square root of the diagonal of the covariance matrix. This PR uses the latter definition.
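Concretely, the componentwise definition can be sketched as follows (vecvar is a hypothetical helper name; on later Julia versions mean lives in the Statistics stdlib, while it was in Base at the time of this PR):

```julia
using Statistics  # for mean (in Base at the time of this PR)

# Componentwise variance of a collection of equal-length vectors,
# i.e. the diagonal of the covariance matrix.
vecvar(x) = sum(v -> abs2.(v .- mean(x)), x) ./ (length(x) - 1)

vecvar([[2, 4, 6], [4, 6, 8]])  # [2.0, 2.0, 2.0], matching var(x) above
```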

@mschauer
Contributor

@DNF2 The empirical covariance is a function that maps a collection of equal-length vectors to a matrix (this involves the outer product v*v', not v^2). In order to work on arrays, it interprets the array as a collection of vectors (hence the vardim argument, as there are two possible interpretations of a matrix as a vector of vectors). std.( ) does something entirely different (it computes many standard deviations).
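The outer-product construction described above can be sketched like this (veccov is a hypothetical helper; the Statistics and LinearAlgebra imports are for later Julia versions):

```julia
using Statistics, LinearAlgebra  # mean, adjoint, diag

# Empirical covariance of a collection of equal-length vectors,
# accumulated from outer products (v - m) * (v - m)'.
function veccov(x)
    m = mean(x)
    sum(v -> (v - m) * (v - m)', x) / (length(x) - 1)
end

C = veccov([[2, 4, 6], [4, 6, 8]])  # a 3×3 matrix
diag(C)                             # [2.0, 2.0, 2.0], the componentwise var
```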

@DNF2

DNF2 commented Sep 27, 2017

Hm. In #23884 I guess I misread the answer to

What do you expect to get?

probably because the example vectors were so similar. I thought the request was to auto-broadcast.

Thanks for the explanation.

@mschauer
Contributor

broadcast is not doing the right thing

Cases like this could be caught in the signature, right? E.g.

julia> Vector{Vector} <: S where {S<:AbstractVector{T}} where {T<:AbstractVector}
true

but maybe it is better to fix that at the root.

@TotalVerb
Contributor

@andreasnoack We do have an issue for it: #18379

@andyferris
Member

I have to admit, I find this a bit surprising. I'm trying to put my finger on it, but I think it has something to do with the fact that the properties of Vectors as objects obey certain change-of-basis relationships. The examples above seem to be some kind of automatic elementwise thing (no, not var.(...), but something else) and are definitely basis-dependent.

If you were to ask me what I thought the "variance" of a set of vectors was, I would answer the covariance, its determinant, or the RMS L2 distance from the mean, not some elementwise calculation. The covariance works generically with numbers as sum(v*v' for v in vecs) and the last one as sum(v'*v for v in vecs) (ignoring the mean).

@mschauer
Contributor

Returning the "diagonal of the covariance matrix" is basically componentwise application of the var function. So this is a case of automatic broadcasting that is not easily expressed with dot operations for Vector{Vector}s (and not easily expressed in other implicit forms either; e.g. var.(collect(zip(vectors...))) is not so nice).
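For reference, the zip spelling does produce the componentwise result, even if it is not pretty:

```julia
using Statistics  # for var (in Base at the time of this PR)

x = [[2, 4, 6], [4, 6, 8]]
# zip groups the i-th components of every inner vector into a
# tuple; var of each tuple is the componentwise variance.
var.(collect(zip(x...)))  # [2.0, 2.0, 2.0]
```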

@andreasnoack
Member Author

andreasnoack commented Sep 29, 2017

I'd also prefer the covariance matrix, but we have kind of already made that decision, since

julia> X = randn(10,2);

julia> var(X, 1)
1×2 Array{Float64,2}:
 0.69424  0.894716

whereas

julia> cov(X, 1)
2×2 Array{Float64,2}:
  0.69424    -0.0762704
 -0.0762704   0.894716
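The relationship can be checked directly (on later Julia versions the dimension argument became a dims keyword; the identity itself is unchanged):

```julia
using Statistics, LinearAlgebra

# var along dims=1 is the diagonal of the covariance matrix
# computed along the same dimension.
X = randn(10, 2)
var(X, dims=1) ≈ diag(cov(X, dims=1))'  # true
```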

@bramtayl
Contributor

This seems like a job for a zip-like iterator.

@kshyatt kshyatt added the maths Mathematical functions label Sep 29, 2017
@andreasnoack
Member Author

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@andreasnoack andreasnoack merged commit b10833b into master Oct 2, 2017
@andreasnoack andreasnoack deleted the anj/var branch October 2, 2017 14:59
@mbauman
Member

mbauman commented Feb 22, 2018

I just discovered this — it seems quite strange to me. How are we defining var? Is it not just sum(abs2, v - mean(v)) / (length(v) - 1)? Why are we (effectively) defining abs2(v::Vector) = abs2.(v)?
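For an ordinary numeric vector the two definitions agree, which is why the question only bites for vectors of vectors (v .- mean(v) is the broadcast spelling of the quoted formula):

```julia
using Statistics

v = [2.0, 4.0, 6.0]
# Textbook sample variance: sum of squared deviations with the
# n - 1 correction, identical to var(v) for numeric vectors.
sum(abs2, v .- mean(v)) / (length(v) - 1)  # 4.0, same as var(v)
```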

@mbauman
Member

mbauman commented Feb 23, 2018

Ok, after discussing this more with Andreas, I'm on board with the behaviors here… but we might want to look into updating the docstring… and possibly moving away from broadcasting. Basically, cov([[1,2,3],[4,5,6]]) is a well-defined operation based upon multiplication of the transpose — that works just fine for vectors of vectors. And the variance can be defined as the diagonal of the covariance matrix.

Moreover, this is an operation that is good to support because we implicitly do it with var(X, 1) — that's effectively var([X[1,:], X[2,:], …]), since with vardim 1 the rows are the observations.

Using broadcasting here still tastes a little funny to me, because it is effectively imposing diag(X*X') == abs2.(X) for all types. If a collection were to support broadcasting and addition/subtraction (but not multiplication), should we still use this same definition?
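The identity behind that discomfort is easy to state for a single vector: the diagonal of the outer product is the elementwise abs2.

```julia
using LinearAlgebra  # adjoint, diag

v = [2, 4, 6]
# Broadcasting abs2 reproduces the diagonal of the outer product;
# the broadcasting-based implementation assumes this holds for
# the element type.
diag(v * v') == abs2.(v)  # true
```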
