Use eigen elementwise methods for addition and subtraction #427
Conversation
Also it would be rad to have this in the launch because I expect it would make some models a good deal faster (brms ones in particular, if Paul changes the loop he currently uses to do elementwise multiplication). Though we've hit the bug-fix period, so if that's a no-no it's totally reasonable.
if is_matrix (second es) && is_matrix (first es)
then pp_scalar_binary ppf "(%a@ *@ %a)" "(%a.array()@ *@ %a.array())" es
else pp_scalar_binary ppf "(%a@ *@ %a)" "elt_multiply(@,%a,@ %a)" es
Is there a smarter way to write this?
Hmm, okay, this will be a bit trickier than I thought. I need it to look like this:
assign(mu, nil_index_list(),
stan::model::deep_copy(mu).array() +
rvalue(r_1_1, cons_list(index_multi(J_1), nil_index_list()),
"r_1_1").cwiseProduct(Z_1_1).array() +
rvalue(r_1_2, cons_list(index_multi(J_1), nil_index_list()),
"r_1_2").cwiseProduct(Z_1_2).array() +
rvalue(r_1_3, cons_list(index_multi(J_1), nil_index_list()),
"r_1_3").cwiseProduct(Z_1_3).array() +
rvalue(r_2_1, cons_list(index_multi(J_2), nil_index_list()),
"r_2_1").cwiseProduct(Z_2_1).array() +
rvalue(r_2_2, cons_list(index_multi(J_2), nil_index_list()),
"r_2_2").cwiseProduct(Z_2_2).array() +
rvalue(r_2_3, cons_list(index_multi(J_2), nil_index_list()),
"r_2_3").cwiseProduct(Z_2_3).array() +
rvalue(r_3_1, cons_list(index_multi(J_3), nil_index_list()),
"r_3_1").cwiseProduct(Z_3_1).array() +
rvalue(r_3_2, cons_list(index_multi(J_3), nil_index_list()),
"r_3_2").cwiseProduct(Z_3_2).array() +
rvalue(r_3_3, cons_list(index_multi(J_3), nil_index_list()),
"r_3_3").cwiseProduct(Z_3_3).array(),
       "assigning variable mu");
But! Fiddling around with a large example on my computer, I can confirm this is a pretty dang nice speedup!
Think I got it! Though it's a little hacky. Is there a one-liner I can run for the cmdstan perf tests, or do I just need to run this the old-fashioned way? I'm not sure how many of the perf tests use elementwise functions, so not sure if an effect would show up there.
Ack, it looks like I need to fix some stuff in math before we can use this generally. Gonna close for now and see how much work that would be.
On Jan 13, 2020, at 6:46 PM, Steve Bronder ***@***.***> wrote:
Also it would be rad to have this in the launch because I expect it would make some models a good deal faster (brms ones in particular if Paul changes the loop he currently has to do elewise multiplication). Though we hit the bug fix period so if that's a no-no it's totally reasonable
My understanding is that we've hit code freeze, so no new features for this coming release. The point of having regular releases is so that it's never too long a wait for the next release. Back when we tried to keep adding just one more thing, the release schedule got out of hand.
I understand the code you're generating, but not how it'll get generated in general. Will addition always return the expression template for +? Does it mean we should get rid of the functions add() and elt_multiply() for matrices? Is every function, etc., OK taking in these expression templates or might something break elsewhere from this?
Yep makes sense! I'm fine with waiting
Yes, + at the top level would return a CwiseBinaryOp. That feels good to me because it will be a vectorized op that can be combined with other expressions.
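The fusion that a CwiseBinaryOp buys can be sketched without Eigen. Everything below is an illustrative toy, not Eigen's actual implementation: operator+ returns a lightweight expression object instead of a computed vector, so a chain like a + b + c is evaluated in one loop with no intermediate storage.

```cpp
#include <array>
#include <cstddef>

// Toy model (NOT Eigen itself) of an expression template for +.
template <std::size_t N>
struct Vec {
  using is_expr = void;  // marker so operator+ below only matches our types
  std::array<double, N> data{};
  double operator[](std::size_t i) const { return data[i]; }
  // Assigning from any expression evaluates it element by element,
  // in a single fused loop.
  template <typename Expr>
  Vec& operator=(const Expr& e) {
    for (std::size_t i = 0; i < N; ++i) data[i] = e[i];
    return *this;
  }
};

// Analogue of Eigen's CwiseBinaryOp for +: records operands,
// computes nothing until indexed.
template <typename L, typename R>
struct AddExpr {
  using is_expr = void;
  const L& lhs;
  const R& rhs;
  double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
};

// Constrained so it only fires for our Vec/AddExpr types.
template <typename L, typename R, typename = typename L::is_expr,
          typename = typename R::is_expr>
AddExpr<L, R> operator+(const L& l, const R& r) {
  return {l, r};
}
```

Eigen's real CwiseBinaryOp additionally carries functors, scalar-type promotion, and SIMD vectorization, but the fusion idea is the same: a + b + c builds a tree of two AddExpr nodes and the assignment walks it once per element.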
Our base add and elt_multiply just call + and cwiseProduct under the hood and immediately cast the result to an Eigen matrix. I think they would still get called for arrays, so we should keep them.
Is every function, etc., OK taking in these expression templates or might something break elsewhere from this?
Some break, which is why I closed this for now. Stan math functions would have to be able to take in Eigen expressions. Most can, but I think something funky is happening with transpose that won't compile.
Errors are like
https://jenkins.mc-stan.org/job/stanc3/view/change-requests/job/PR-427/4/consoleFull
So the lower-level math ops need to be able to handle Eigen expressions better.
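The failure mode can be modeled without Eigen. In this toy sketch (all names hypothetical, not Stan math), a function whose parameter is the concrete vector type cannot accept an expression template, while one written generically — analogous to taking Eigen::MatrixBase<Derived> — accepts both the concrete type and unevaluated expressions:

```cpp
#include <array>
#include <cstddef>

// Concrete container, standing in for Eigen::Matrix<>.
struct Vec {
  std::array<double, 3> data{};
  double operator[](std::size_t i) const { return data[i]; }
};

// Stands in for an unevaluated expression such as m.transpose() or a + b.
struct Doubled {
  const Vec& v;
  double operator[](std::size_t i) const { return 2.0 * v[i]; }
};

// Pinned to the concrete type: sum_concrete(Doubled{v}) fails to compile,
// which is the shape of error functions hit when handed an expression.
double sum_concrete(const Vec& v) { return v[0] + v[1] + v[2]; }

// Generic over anything indexable, analogous to MatrixBase<Derived>:
// accepts Vec and Doubled alike.
template <typename Expr>
double sum_generic(const Expr& e) {
  return e[0] + e[1] + e[2];
}
```

The design question this raises is the one in the thread: either every math function becomes generic over expressions, or expressions get materialized (losing the fusion win) before crossing into non-generic code.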
Right now vectorized code using .* and other elementwise methods is usually slower than just using a raw for loop (see #825 in brms). Elementwise Stan code where J_1 is a vector of ints currently makes C++ that calls add and elt_multiply. This is not good because add returns back an Eigen::Matrix<> and doesn't allow the operations to build up into an expression tree. But in Stan math we've added var and fvar as custom scalar types in Eigen, so we should be able to build up an expression tree here of Eigen operations with something like:
assign(mu, nil_index_list(),
       stan::model::deep_copy(mu)
           + rvalue(r_1_1, cons_list(index_multi(J_1), nil_index_list()),
                    "r_1_1").array() * Z_1_1.array()
           + rvalue(r_1_2, cons_list(index_multi(J_1), nil_index_list()),
                    "r_1_2").array() * Z_1_2.array(),
       "assigning variable mu");
This PR adds this for addition of two matrices and for elementwise multiplication. If people think this is cool and good, I can add it for the other elementwise ops for Eigen matrices.
Tests
I added elementwise add and multiply to the mother model because they weren't there before.
Licensed under BSD3.