Tuple type member matrix vs. simple matrix performance difference #13816
Comments
Try hoisting the field access outside of the loop:
tvb = tv.b
for i in 1:10^6
    tvb+tvb*2.0;
end
@tkelman that does fix it, thanks! But how should one know to do that, and why did it work? As an end user, I am afraid there might be many such small tricks that are not in the docs, which makes me doubt the performance of my code. What I mean is, I expect the compiler to do such low-level optimizations.
Ref #9755 - I would think with an immutable this should be working already? It's a known issue that will hopefully be fixed with future compiler optimizations (#3440), but there may have been a regression here, or maybe it only happens automatically if you use |
This is due to the slowness of accessing global variables. If you put the two loops inside a function, there is no difference:
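A minimal sketch of what that could look like, not the original snippet from this comment. The struct name TV, the 3×3 matrix size, and the function names loop_field and loop_plain are assumptions made for illustration; `struct` is the current spelling of what Julia 0.4 called `type` / `immutable`.

```julia
# Hypothetical stand-in for the type in the question (fields a and b as described there).
struct TV
    a::Int
    b::Matrix{Float64}
end

# Loop that reads the matrix through the struct field.
# tv is a typed function argument here, not a non-constant global.
function loop_field(tv::TV, n::Int)
    acc = 0.0
    for i in 1:n
        acc += sum(tv.b + tv.b * 2.0)
    end
    return acc
end

# Same loop on a plain matrix.
function loop_plain(tc::Matrix{Float64}, n::Int)
    acc = 0.0
    for i in 1:n
        acc += sum(tc + tc * 2.0)
    end
    return acc
end

tc = rand(3, 3)      # matrix size is an assumption
tv = TV(1, tc)

# Call each function once first so compilation time is not measured.
loop_field(tv, 1)
loop_plain(tc, 1)

@time loop_field(tv, 10^6)
@time loop_plain(tc, 10^6)
```

Inside a function the compiler knows the concrete types of tv and tc, so the field access no longer has to be resolved dynamically on every iteration the way it is for a non-constant global.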
Closing as a milder-than-usual duplicate of #8870
Let's say I define a tuple type with two members a, b
Next, I also define a simple matrix
such that typeof(tc) and typeof(tv.b) give the same result.
Now, benchmarking them, I get the following results:
The tuple type version is 15%-20% slower. Is this due to the overhead of using tuples, and is that overhead constant or does it grow with problem size? I don't understand how tuple types work internally, but I read somewhere that there is no overhead if you are using the same concrete types in the computation. Does that statement apply here as well? Pardon me if this question sounds naive; I am trying to understand the reason for the performance difference.
Assuming that the overhead of using tuple types is constant, then if my computation within the loop is complex, the relative performance difference should decrease, which would be nice...
I am on Julia 0.4
thanks,
Nitin