
implement Ac_mul_B(Vector, Matrix) to avoid temporary #360

Closed
goszlanyi opened this issue Aug 16, 2016 · 35 comments
Labels
help wanted (Extra attention is needed)

Comments

@goszlanyi

goszlanyi commented Aug 16, 2016

Let mx be a 3x3 matrix, and vv be an nx3 matrix.

Up to now, v=vv[row,:] was a row vector, i.e. a 1x3 matrix,
so the matrix multiplication v*mx worked and returned a 1x3 row vector.

In Julia 0.5.0 this has changed:
the row slice v=vv[row,:] is now a real 1-dimensional vector.

This is a welcome change.
But as a side effect, it has become more obvious and painful
that v*mx does not work for real 1-dimensional vectors.

I know that transpose(mx)*v is a workaround,
but I think requiring this code rewrite is unnecessary.

Considering the original definition of matrix multiplication,
summing over the inner index should naturally work in this case.
A good example is Fortran's matmul(v,mx): it just works.

I would greatly appreciate your opinion,
and any pointers to your future plans.
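
To illustrate the change described above, a minimal sketch (assuming Julia 0.5 semantics, with small example sizes):

mx = rand(3,3)      # 3x3 matrix
vv = rand(5,3)      # nx3 matrix (n = 5 here)

v = vv[1,:]         # now a 3-element Vector, no longer a 1x3 Matrix
v * mx              # errors: v is not treated as a row vector
transpose(mx) * v   # the workaround: returns a 3-element Vector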

@andreasnoack
Member

What about either doing

v'mx

or

v=vv[row:row,:]

?
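
For illustration (a sketch, with mx and vv as defined above), both forms keep the multiplication working, though each returns a 1x3 matrix rather than a 1-dimensional vector:

v = vv[1,:]
v' * mx           # 1x3 Array{Float64,2}
vv[1:1,:] * mx    # 1x3 Array{Float64,2}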

@goszlanyi
Author

goszlanyi commented Aug 16, 2016

Yes, these are all working tricks,
but conceptually they are not clean.

I think Julia is still struggling with some Matlab-like behaviour.
Having true 1-dimensional vectors is very important,
and with 0.5.0 Julia made a big step in this direction.
It would be nice to reach a really consistent state of its own.

@andreasnoack
Member

How does Fortran handle complex conjugation in matmul(Vector,Matrix)? Is x*A the same as A'x or conj(A'x)?

@ararslan
Member

Personally I don't see v'mx as a trick; it seems like the sensible way to go about this. If you think of two vectors in some space, the proper way to multiply them would be to multiply one by the transpose of the other, as vectors from a purely mathematical standpoint don't necessarily have an "orientation" (i.e. row or column).

I think Julia is still struggling with some Matlab-like behaviour.

Actually, what we had prior to APL-style slicing was more similar to Matlab, which makes a distinction between column and row vectors.

@goszlanyi
Author

goszlanyi commented Aug 16, 2016

@andreasnoack

As far as I see, there is no implicit complex conjugation involved.

This is what the Metcalf-book says:
matmul(matrix_a,matrix_b) performs matrix multiplication.
For numeric arguments three cases are possible:

i) matrix_a has shape (n,m) and matrix_b has shape (m,k).
The result has shape (n,k) and element (i,j) has the value
sum( matrix_a(i,:) * matrix_b(:,j) )

ii) matrix_a has shape (m) and matrix_b has shape (m,k).
The result has shape (k) and element (j) has the value
sum( matrix_a * matrix_b(:,j) )

iii) matrix_a has shape (n,m) and matrix_b has shape (m).
The result has shape (n) and element (i) has the value
sum( matrix_a(i,:) * matrix_b )
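
For case (ii), a rough Julia analogue of that definition (illustrative only; the function name is not from the thread):

# result has shape (k); element (j) is sum(matrix_a * matrix_b(:,j))
matmul_vec_mat(v, mx) = [sum(v .* mx[:, j]) for j in 1:size(mx, 2)]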

@andreasnoack
Member

Hm. Thanks. That was a surprising choice. In any case, I think this is very much #42 territory.

@goszlanyi
Author

Also note that in Fortran, transpose is defined only for arrays of rank two.
In Fortran-speak, creating a one-column matrix from a 1-dimensional vector is not a transpose.

@goszlanyi
Author

@ararslan
I agree that from the current options v'mx seems to be the best.
But is there a performance penalty?

@ararslan
Member

Performance penalty compared to what?

@goszlanyi
Author

goszlanyi commented Aug 16, 2016

Compared to storing predefined mxT=mx' and using mxT*v,
that also returns a nice 1-dimensional vector.

@ararslan
Member

I wouldn't think so offhand (though I'm certainly no expert), but you could check with @time. :)

@KristofferC
Member

No performance penalty, no.

@goszlanyi
Author

goszlanyi commented Aug 16, 2016

I have just tested.
There is a significant performance penalty.

n = Int(10e6)
vv = rand(n,3)
mx = rand(3,3)
mxT = mx'

f1(vv,mx) = (@time for i=1:n; y=vv[i,:]'*mx; end)
f2(vv,mxT) = (@time for i=1:n; y=mxT*vv[i,:]; end)

f1(vv,mx)
  4.843726 seconds (120.00 M allocations: 5.215 GB, 8.36% gc time)
  4.836846 seconds (120.00 M allocations: 5.215 GB, 8.33% gc time)

f2(vv,mxT)
  2.714492 seconds (80.00 M allocations: 3.278 GB, 9.10% gc time)
  2.709794 seconds (80.00 M allocations: 3.278 GB, 9.07% gc time)

@andreasnoack
Member

andreasnoack commented Aug 16, 2016

Just checked and v'A actually calls the fallback in operators.jl so it allocates an unnecessary temporary. That could easily be avoided though.
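
A minimal sketch of such a specialization, restricted to Float64 and not the actual Base implementation: dispatch v'*A straight to GEMV instead of building the row-matrix temporary and calling GEMM.

import Base: Ac_mul_B
# hypothetical sketch (real case only): compute A'v with GEMV and reshape
# the result into the 1xn matrix that v'*A currently returns
Ac_mul_B(v::Vector{Float64}, A::Matrix{Float64}) = reshape(BLAS.gemv('T', A, v), 1, size(A, 2))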

@goszlanyi
Author

I have made a mistake.
Of course n should also be passed as a function argument.

f1(n,vv,mx) = (@time for i=1:n; y=vv[i,:]'*mx; end)
f2(n,vv,mxT) = (@time for i=1:n; y=mxT*vv[i,:]; end)

It brings a speedup to both, but the relative performance penalty becomes even worse:

f1(n,vv,mx)
  3.529541 seconds (100.00 M allocations: 4.768 GB, 11.86% gc time)
  3.434869 seconds (100.00 M allocations: 4.768 GB, 8.80% gc time)

f2(n,vv,mxT)
  1.490498 seconds (60.00 M allocations: 2.831 GB, 11.13% gc time)
  1.507337 seconds (60.00 M allocations: 2.831 GB, 10.67% gc time)

@ararslan
Member

Dunno if it'll make a difference but you should try moving @time outside of the functions and using @inbounds on the loops.
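
Something along these lines (a sketch of that suggestion; the timings then go at the call site):

f1(n, vv, mx)  = (@inbounds for i = 1:n; y = vv[i,:]' * mx;  end)
f2(n, vv, mxT) = (@inbounds for i = 1:n; y = mxT * vv[i,:];  end)

@time f1(n, vv, mx)
@time f2(n, vv, mxT)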

@goszlanyi
Author

@ararslan
I have done it. The results are basically the same.

@goszlanyi
Author

@andreasnoack
Does it mean that you can cure the performance penalty while keeping the current syntax?

@andreasnoack
Member

If by y=vv[i,:]'*mx you mean the current syntax, then it should be a yes.

@goszlanyi
Author

Yes.
At present your suggestion seems to be the best,
and if there is no performance penalty involved, then I would happily use it.

@andreasnoack
Member

Hm. Maybe I spoke too soon. It seems like OpenBLAS' GEMM is slower than GEMV:

julia> A = randn(3,3);

julia> x = randn(3);

julia> xt = x';

julia> let A = A, x = x
       @time for i in 1:10^6; BLAS.gemv('T', 1.0, A, x); end
       end
  0.113836 seconds (2.00 M allocations: 122.070 MB, 8.78% gc time)

julia> let A = A, xt = xt
       @time for i in 1:10^6; BLAS.gemm('N', 'N', 1.0, xt, A); end
       end
  0.188009 seconds (2.00 M allocations: 137.329 MB, 7.56% gc time)

which I don't think is reasonable.

@goszlanyi
Author

Even so, your test code is instructive for me.

@andreasnoack changed the title from "vector-matrix product à la Fortran's matmul" to "Implement Ac_mul_B(Vector, Matrix) to avoid temporary ~~vector-matrix product à la Fortran's matmul~~" Aug 17, 2016
@andreasnoack changed the title to "Implement Ac_mul_B(Vector, Matrix) to avoid temporary (was "vector-matrix product à la Fortran's matmul")" Aug 17, 2016
@andreasnoack
Member

I've updated the title. I think the remaining aspects of this discussion are fully covered by #42.

@StefanKarpinski changed the title from "Implement Ac_mul_B(Vector, Matrix) to avoid temporary (was "vector-matrix product à la Fortran's matmul")" to "implement Ac_mul_B(Vector, Matrix) to avoid temporary" Aug 17, 2016
@goszlanyi
Author

goszlanyi commented Aug 17, 2016

@andreasnoack @StefanKarpinski
My original issue (and title) was about making v*mx work
when v is a real 1-dimensional vector and not a 1xn matrix,
and returning the result as a real 1-dimensional vector.

Is this your present target?

Or is it just making v'mx work efficiently (without creating a temporary)
but still returning the result as a 1xn matrix?

@goszlanyi
Author

goszlanyi commented Aug 17, 2016

It seems that this simple code does what I miss.
It is fast and it returns a real 1-dimensional vector.

import Base.*
*(v::Vector{Float64},mx::Matrix{Float64}) = BLAS.gemv('T',mx,v)

v = rand(3)
mx = rand(3,3)

v*mx
3-element Array{Float64,1}:
 0.786774
 0.821454
 0.444747

Update:
Unfortunately, this is not the general solution.
Integer vectors and matrices are not handled by BLAS.gemv.
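
A generic fallback could cover the element types BLAS does not, at the cost of a plain Julia loop (a sketch only; the function name is illustrative, not a proposed Base method):

function vecmat(v::AbstractVector, mx::AbstractMatrix)
    length(v) == size(mx, 1) || throw(DimensionMismatch("inner dimensions must match"))
    R = promote_type(eltype(v), eltype(mx))
    out = zeros(R, size(mx, 2))
    for j = 1:size(mx, 2), i = 1:size(mx, 1)
        out[j] += v[i] * mx[i, j]
    end
    return out
end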

@andreasnoack
Member

Or is it just making v'mx work efficiently (without creating a temporary) but still returning the result as a 1xn matrix?

Yes. The idea was that this issue tracks the problem with the temporary. For your other suggestion, please refer to #42.

@Jutho
Contributor

Jutho commented Aug 21, 2016

I don't understand this issue. v*mx is not standard mathematical notation, right? Why not just write it as you suggest to implement it: BLAS.gemv('T',mx,v) is obtained by writing mx.'*v, or just mx'*v if you're working over the reals.

@goszlanyi
Author

I respectfully disagree.

v*mx is standard mathematical notation that contracts
the last (and only) index of v and the first index of mx.

Otherwise we should also leave mx*v undefined
in the case when v is a true 1-d vector.

@Jutho
Contributor

Jutho commented Aug 21, 2016

I don't think there is one standard mathematical definition for *; it's highly context dependent. But you are right to suggest other definitions than the one currently adopted in Julia, which I agree is indeed a bit hard to capture in one "mathematical" operation.

Can I assume that your suggestion generalises to a multilinear algebra context as a kind of default tensor contraction operator: contract the last index of the first object with the first index of the second, irrespective of the rank of the two tensors involved?

@goszlanyi
Author

goszlanyi commented Aug 21, 2016

I did not dare to go as far as multilinear algebra.
I have just suggested clearly separating the three types of products in linear algebra.
See #42.

@Jutho
Contributor

Jutho commented Aug 22, 2016

What are the three types of product in linear algebra? I still don't see how vector times matrix is a standard mathematical product in any given context.

Either one works in MATLAB style, where everything is matrices and * always denotes matrix multiplication without discussion, or you work in some more abstract setting and you indeed need to define what * means. In abstract linear algebra, I know of inner and outer products, but most of the time * in code is neither of those two.

In an abstract setting, matrices correspond to linear mappings from vectors to vectors. So in mx*v I read * as function application. Composing such linear mappings gives rise to matrix-matrix multiplication. So in m1*m2 I read * as function composition. Given that these two operations are rather close, I think it is ok to use the same symbol for both.

There is also the concept of linear forms / covectors, which map vectors to numbers. These can be represented as one-dimensional objects and are isomorphic to vectors via the inner product: to any linear form f, there corresponds a vector w such that f(v) = dot(w,v) for any v. You can also compose such a linear form with a linear mapping (represented by a matrix), which is the operation I think you are referring to. But I don't see why this would be a standard use of * in mathematics. Such linear forms are isomorphic to vectors, but not identical, and thus Julia currently forces you to act with the transpose on a vector in order to get the corresponding covector (well, not exactly, but that is indeed the point of issue #42).

I am not defending this choice as ideal or optimal, but I also don't see how v*mx would in any way be more standard. If anything, I would argue that you actually need the dot product for this operation, so that it should be written as dot(v,mx). Maybe that's actually a possibility which is open, since currently dot is not defined for a combination of a vector and a matrix argument.
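
For the real case, that last suggestion could look roughly like this (purely illustrative, not an existing Base method):

import Base: dot
# hypothetical: treat dot(v, mx) as the covector-matrix contraction,
# returning a plain vector; real element types only
dot(v::Vector{Float64}, mx::Matrix{Float64}) = mx' * v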

@Jutho closed this as completed Aug 22, 2016
@Jutho reopened this Aug 22, 2016
@Jutho
Contributor

Jutho commented Aug 22, 2016

My apologies, wrong button.

@goszlanyi
Author

goszlanyi commented Aug 22, 2016

I find all that you wrote instructive.
Maybe it should be copied to #42, keeping the present issue to track
just the problem with the temporary, as suggested by @andreasnoack.

@Jutho
Contributor

Jutho commented Aug 22, 2016

Yes indeed. My apologies for leading the original discussion astray. I'll make a similar post there and agree that, for now, v'mx should work efficiently and produce a 1xn matrix.

@StefanKarpinski added and removed the help wanted (Extra attention is needed) label Oct 27, 2016
@andreasnoack
Member

Fixed by JuliaLang/julia#19670

@KristofferC transferred this issue from JuliaLang/julia Nov 26, 2024