-
Notifications
You must be signed in to change notification settings - Fork 168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve offsets-generation and uniformity offset count for the N-1 case. #219
Improve offsets-generation and uniformity offset count for the N-1 case. #219
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for cleaning this up Petros. The changes to GetOffsetArraySequential
look like a big improvement in hygiene; thanks for contributing that. One minor typo in variable name, but otherwise it all looks good to me.
I'm having a harder time understanding the changes to GetOffsetArrayRandom
to divide it into two parts. I would feel better if someone else with more late-night brainpower could take a closer look to really understand the second phase. @JulianKunkel or @johnbent ?
If nobody is available I'll see if someone at NERSC can look at it.
offsetArray[k] += | ||
(i * test->numTasks * test->blockSize) | ||
+ (pretendRank * test->blockSize); | ||
IOR_offset_t next_offset, incrament, trips; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
incrament
should be increment
no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. Not sure what I was thinking. Must have been a bad substitute. Thanks for catching it.
offsetArray[i] = next_offset; | ||
next_offset += incrament; | ||
} | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a '/* single shared file */' comment after '} else {'.
offsetArray[i] = next_offset; | ||
next_offset += incrament; | ||
} | ||
} else { /* usual case is segmentCount=1 */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need an explicit test for segmentCount==1? What happens here if it isn't?
/* reorder array */ | ||
for (i = 0; i < offsets; i++) { | ||
value = rand() % offsets; | ||
if (offsets < 2*first) first = offsets; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is special about the number 2 here?
} | ||
} | ||
/* start with same array of offsets as in sequential case */ | ||
offsetArray = GetOffsetArraySequential(test, pretendRank); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we are in the random branch? But we start with sequential? Does this make it less random than it used to be?
value = start[j]; | ||
tmp = offsetArray[value+k]; | ||
offsetArray[value+k] = offsetArray[i+seq]; | ||
offsetArray[i+seq] = tmp; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the three previous lines would look a bit cleaner with a helper function:
swapArrayElements(offsetArray,value+k,i+seq);
I think this would look better and be slightly easier to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite a bunch of changes.
I believe we should have a unit test for the function to allow proper testing of these changes.
That would help to verify in the long run that the code remains stable.
I would be grateful if you could add such a test.
@JulianKunkel Is there a document or format that I can reference for a unit test? |
We've struggled to decide on a proper testing framework, and at present, it is limited to what's in I am not enough of a computer scientist to know a reasonable way to test randomness in the offset generation subroutine. Hopefully someone else can help. Worst case, |
I had started the testing under src/test when the API was created. I have simplified adding a test now and added one for the "sequential". Should hopefully be easier to add more tests now. |
To collect some more evidence, I have done some benchmarking on my laptop regarding the current strategy in order to reconsider. At the moment, all the offsets are precomputed (even for sequential access), this increases the memory footprint and the runtime. When using stonewalling with very high numbers it may cause performance issues. IMHO it would be best to integrate the calculation of the offsets into the processing instead of pre-computing them. That would avoid the memory consumption and the overall need. |
I have now implemented the new behavior in the PR listed above and done quite some testing.
|
I'm closing this in favor of: Fix offset integration #270 |
Patch to improve the efficiency of the offsets-generation and uniformity of offset count for the N-1 case.
This patch addresses the following:
IOR generates an array for the sequence of offsets. The GetOffsetArraySequential() and GetOffsetArrayRandom() function calls are included in the timed test. For the Random offsets, this can potentially be a significant overhead with very small offsets to very large files and impact the reported performance. Both of the GetOffsetArray functions are rewritten to improve efficiency.
For the random offset array generated for the single-shared-file (N-1) case, this includes a random number of offsets for processes such that the load balance is not uniform. With the change to GetOffsetArrayRandom(), the single-shared-file case now has a uniform number of offsets per rank.