Distributing data
Brandon Holt edited this page Nov 27, 2012 · 3 revisions
This page collects situations where having no control over the distribution of data raises performance concerns, and lists possible solutions, either built on top of Grappa or within Grappa's memory allocation.
- A forall_local approach preserves the ordering of elements but does not guarantee that consecutive elements reside on the same node.
- Could cache adjacent elements during the computation: they are not guaranteed to be local, but in most cases they would be. This is similar to the planned behavior for forall_local, where items will be cached so that an object spanning more than one block still works.
- e.g., prefix sum
- When a computation touches two arrays, the arrays may be distributed differently, which rules out a straightforward forall_local approach.
- Within-Grappa solutions
  - Provide a construct to allocate an array of the same type T with the same start node as another array. To keep the allocator simple, this might be done in a way that wastes space.
- Atop-Grappa solutions
  - Allocate the second array large enough that its effective start pointer can be placed on the same node as the start of the first array.
  - Apply a struct-of-arrays to array-of-structs transformation, to force the two arrays to be distributed alike.
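
To make the over-allocation idea and the "adjacent elements are usually local" observation concrete, here is a minimal sketch of a toy block-cyclic layout. It is not Grappa's allocator or API: the 64-byte block size, the round-robin placement of blocks, and the names `node_of`, `padding_to_match`, and `neighbor_is_local` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Toy model only (hypothetical, not Grappa code): the global heap is cut into
// fixed-size blocks that are dealt round-robin across the nodes.
const std::size_t kBlockBytes = 64;  // assumed block size

// Node that owns the byte at global offset `addr`.
inline std::size_t node_of(std::size_t addr, std::size_t num_nodes) {
    assert(num_nodes > 0);
    return (addr / kBlockBytes) % num_nodes;
}

// Bytes to skip at the front of a second allocation starting at `b_base` so
// that its effective start lines up with `a_base` (same node, same offset
// within the block). The layout repeats every kBlockBytes * num_nodes bytes,
// so at most one period minus one byte is wasted.
inline std::size_t padding_to_match(std::size_t a_base, std::size_t b_base,
                                    std::size_t num_nodes) {
    assert(num_nodes > 0);
    const std::size_t period = kBlockBytes * num_nodes;
    return (a_base % period + period - b_base % period) % period;
}

// True when elements i and i+1 of an array starting at `base` live on the
// same node; false when the pair straddles a boundary between nodes.
inline bool neighbor_is_local(std::size_t base, std::size_t i,
                              std::size_t elem_bytes, std::size_t num_nodes) {
    return node_of(base + i * elem_bytes, num_nodes) ==
           node_of(base + (i + 1) * elem_bytes, num_nodes);
}
```

Under these assumptions, once the second array's effective start is `b_base + padding_to_match(a_base, b_base, num_nodes)`, element `i` of both arrays lands on the same node as long as the element sizes match. With 8-byte elements and a block-aligned base, `neighbor_is_local` is false for only one in every eight adjacent pairs, which is the case a cache of adjacent elements would need to cover.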