Bugfix MDTest calculation of multiple iterations was incorrect. #281

JulianKunkel · 2020-11-26T12:49:05Z

Fixes the bug reported by Rick and makes the index calculation clearer. Thanks!

The previous offset calculation when using multiple iterations was:
    for (i = start; i < stop; i++)   // i = table position == test number
        for (k = 0; k < size; k++)
            for (j = 0; j < iterations; j++)
                value = all[(k * tableSize * iterations) + (j * tableSize) + i];

Note that the mean and min/max were then computed over these values.
However, the values are stored in memory in the order iteration, rank, table,
so the correct term is: value = all[j * tableSize * size + k * tableSize + i];

Assuming iterations = 2 and size = 3, the value for test i = 0 was computed from (with tbl = tableSize):
all[0 * 2 *tbl + 0 * tbl] = 0tbl
all[0 * 2 *tbl + 1 * tbl] = 1tbl
all[1 * 2 *tbl + 0 * tbl] = 2tbl
all[1 * 2 *tbl + 1 * tbl] = 3tbl
all[2 * 2 *tbl + 0 * tbl] = 4tbl
all[2 * 2 *tbl + 1 * tbl] = 5tbl

A clearer traversal would have been:
all[0 * 3 *tbl + 0 * tbl] = 0tbl
all[0 * 3 *tbl + 1 * tbl] = 1tbl
all[0 * 3 *tbl + 2 * tbl] = 2tbl
all[1 * 3 *tbl + 0 * tbl] = 3tbl
all[1 * 3 *tbl + 1 * tbl] = 4tbl
all[1 * 3 *tbl + 2 * tbl] = 5tbl

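In that sense, it wasn't a functional bug: in the example above, both traversals visit the same six entries, so statistics computed over all of them are unchanged. It did decrease readability, though, and now that we want to print the performance of the individual ranks, it is useful to fix this.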
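To illustrate why the index matters once per-rank numbers are printed, here is a minimal sketch, not the mdtest code itself. The function names and the explicit parameters are hypothetical, and the buffer is assumed to be laid out in the order iteration, rank, table as described above. Reading one rank's entries with the old term picks up values that belong to other ranks, even though the sum over all ranks stays the same.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Mean over all iterations for one rank and one test, using the corrected
 * term j * tableSize * size + k * tableSize + i.
 * Assumes the buffer is laid out as [iteration][rank][table].
 */
static double rank_mean_fixed(const double *all, size_t rank, size_t test,
                              size_t size, size_t iterations,
                              size_t table_size)
{
    double sum = 0.0;

    assert(iterations > 0 && rank < size && test < table_size);
    for (size_t j = 0; j < iterations; j++)
        sum += all[j * table_size * size + rank * table_size + test];
    return sum / (double)iterations;
}

/*
 * Same loop with the previous term k * tableSize * iterations
 * + j * tableSize + i. The offsets stay inside the buffer, but they do not
 * all belong to the requested rank.
 */
static double rank_mean_old(const double *all, size_t rank, size_t test,
                            size_t size, size_t iterations,
                            size_t table_size)
{
    double sum = 0.0;

    assert(iterations > 0 && rank < size && test < table_size);
    for (size_t j = 0; j < iterations; j++)
        sum += all[rank * table_size * iterations + j * table_size + test];
    return sum / (double)iterations;
}
```

With iterations = 2, size = 3, tableSize = 1 and a buffer in which every rank reports the same value in both iterations, the corrected term returns that rank's value, while the old term averages the values of two different ranks.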

adilger · 2020-11-28T00:42:59Z

src/mdtest.c

-                for (j = 0; j < iterations; j++) {
-                    curr = all[(k*tableSize*iterations)
-                               + (j*tableSize) + i];
+            for (j = 0; j < iterations; j++) {


Using better variable names than "i", "j" and "k", like "iter" and "op", might avoid this kind of bug in the future.

Also, putting the "j*tableSize*size + k*tableSize + i" calculation into a small helper function like:

int calc_allreduce_index(int index, int iter, int op)

(or whatever) and using it in the places where all[] is accessed would also avoid similar usage bugs.
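One possible shape for such a helper is sketched below. It is an assumption, not the code that was merged: the struct, its field names, and the extra rank parameter are hypothetical, and mdtest may well keep the sizes in globals instead of passing them in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of the gathered result buffer. */
struct result_layout {
    size_t iterations;  /* number of test iterations */
    size_t size;        /* number of ranks */
    size_t table_size;  /* entries (tests) per rank per iteration */
};

/*
 * Offset of one entry in a buffer stored in the order
 * iteration, rank, table:
 *   iter * table_size * size + rank * table_size + op
 * The assertions catch swapped or out-of-range arguments in debug builds.
 */
static size_t calc_allreduce_index(const struct result_layout *layout,
                                   size_t iter, size_t rank, size_t op)
{
    assert(iter < layout->iterations);
    assert(rank < layout->size);
    assert(op < layout->table_size);
    return (iter * layout->size + rank) * layout->table_size + op;
}
```

Every access would then read all[calc_allreduce_index(&layout, iter, rank, op)], so the storage order is written down in exactly one place.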

adilger · 2020-11-28T00:44:57Z

The previous offset calculation when using multiple iterations was:

This kind of information would be very useful in the commit message of the patch itself, rather than just the PR, since it will be easily available with the code in the future.

Bugfix MDTest calculation of multiple iterations was incorrect.

4377aeb

adilger reviewed Nov 28, 2020

JulianKunkel added 2 commits November 28, 2020 10:40

Integrate review feedback.

11c784c

Merge branch 'master' into fix-mdtest-iter

ae06908

JulianKunkel merged commit 4a3e480 into master Nov 30, 2020

JulianKunkel deleted the fix-mdtest-iter branch November 30, 2020 14:17

glennklockwood mentioned this pull request Nov 30, 2020

release 3.3.0 #285

Closed

2 tasks

JulianKunkel added a commit that referenced this pull request Dec 2, 2020

Backmerged #281 to fix iteration number

7a490c8

JulianKunkel added a commit that referenced this pull request Dec 3, 2020

Merge pull request #291 from hpc/3.2-281

6efd7e6

Backmerged #281 to fix iteration number

glennklockwood mentioned this pull request Dec 18, 2020

backport #281 into 3.3 branch #298

Closed

glennklockwood added a commit to glennklockwood/ior that referenced this pull request Dec 18, 2020

backport hpc#281 to 3.3 release branch; resolves hpc#298

e8d8c2f

glennklockwood mentioned this pull request Dec 19, 2020

fix #281 ("MDTest calculation of multiple iterations was incorrect") for 3.3 release branch #299

Merged

glennklockwood added a commit that referenced this pull request Dec 22, 2020

fix #281 ("MDTest calculation of multiple iterations was incorrect") for 3.3 release branch (#299)

e92a0cd
