SharedArray performance overhead #21957
Comments
I see the same timing difference on master. The slowdown disappears if we pass the mapped array directly, i.e.,
Any ideas as to what is causing the overhead? |
A similar case was previously reported and corrected in #17060. In that case the underlying array was not being initialized at construction time, hence the slowdown. |
Perhaps #9755? Manually hoisting the |
I think I found the cause in the context of this particular issue. Defining additional indexing methods helps. With
defined, timings are similar for regular arrays and shared arrays. |
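The indexing methods from this comment were not preserved here. As a generic illustration only, the sketch below shows what "defining additional indexing methods" can look like for an array wrapper. The type `Wrapped` and its field `data` are hypothetical names and are not SharedArray internals. The idea is that an explicit two-index `getindex` forwards directly to the backing `Array` instead of going through the generic `AbstractArray` fallback.

```julia
# Hypothetical wrapper type for illustration; not the SharedArray implementation.
struct Wrapped{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
end

Base.size(W::Wrapped) = size(W.data)
Base.IndexStyle(::Type{<:Wrapped}) = IndexLinear()

# Linear indexing: the minimum an IndexLinear array type has to provide.
Base.@propagate_inbounds Base.getindex(W::Wrapped, i::Int) = W.data[i]
Base.@propagate_inbounds Base.setindex!(W::Wrapped, x, i::Int) = (W.data[i] = x; W)

# Additional explicit method: forward two-index access straight to the
# backing array rather than relying on the generic fallback.
Base.@propagate_inbounds Base.getindex(W::Wrapped{T,2}, i::Int, j::Int) where {T} =
    W.data[i, j]

W = Wrapped(reshape(collect(1.0:6.0), 2, 3))
println(W[2, 3], " ", W[4], " ", sum(W))
```

Whether such forwarding methods change timings depends on the Julia version and on how well the fallback path is inlined, so the effect should be measured rather than assumed.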
If those really solve it, they are essentially a bandaid; I think @KristofferC has probably identified the root cause. Is there any reason to keep SharedArrays mutable? For example, is |
In general maybe, however I don't think it is the root cause for this issue. Mutability/Immutability does not seem to make a difference with the below test code.
One other thing I observed - Without the additional indexing methods |
It's because you need to define What happens if you use |
With Shared Arrays without additional indexing methods defined:
Shared Arrays with additional indexing methods defined:
Regular arrays:
|
Here is the explicit code that does everything. It took me a while to put all the pieces together, so I might as well post it :)
|
So, prior to porting some code to SharedArrays, I wanted to see what the overhead of using this data structure was for the parts of the codebase that would remain single threaded. It turned out to be vastly higher than I expected. Consider the following (rather contrived) code example:
Using 1000x1000 random matrices for A and B and a 1000-element random vector for v, @benchmark yields 4.525 ms on my machine (median from 1071 samples). Copying the same data to SharedArrays and calling the same function yields 7.361 ms, so an overhead of 1.63x.
This gets weirder still. The original code I tested (probably too large to paste here) ran for ~90 ms using normal arrays (also 1000x1000). Transferring the same data to SharedArrays produced an overhead of ~3.3x. I would have expected the overhead to be smaller for a longer running calculation, not considerably larger.
Is this a known issue? Is there a way to decrease the overhead or are SharedArrays this slow for a good reason?
EDIT: in case it matters, this was all tested on v0.5.1, on Windows 10 x64.