mdtest: -I and -n behavior #106
Comments
Please check the bugfix branch which seems to address the bug. Regarding the documentation, this will be done. |
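Thanks for the quick fix! The issues with -I and the floating point exception were fixed. You are right, it should create 4 files per process. What it does when specifying e.g.:
./src/mdtest -a POSIX -z 1 -b 3 -i 1 -d /tmp/test -n 40 -I 4 -F
It will create batches of subdirectories containing 4 items (-I 4) until 40 (-n 40) is satisfied.
That makes sense and it looks like mdtest is doing exactly that. So actually only the computation for the throughput is incorrect. It only relies on items, which is set by -N. It doesn't take num_dirs_in_tree into account albeit having the correct value (same for stating the number of files in the output). See e.g., https://github.com/hpc/ior/blob/f4afa63ebffa7611044a307690223d4b23170a24/src/mdtest.c#L1265 for the throughput calculation.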
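To make the throughput discrepancy concrete, here is a minimal sketch, not mdtest's actual code. It assumes the rate is simply an item count divided by elapsed time; the function name `ops_per_second` and the 2-second runtime are hypothetical, and the 16-versus-64 file counts are taken from the original report in this issue.

```python
def ops_per_second(item_count, elapsed_s):
    """Rate for one phase (create, stat, or remove): items divided by seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return item_count / elapsed_s


# Counts from the original report: 64 files were actually processed,
# but the output accounted for only 16.
executed_items = 64
counted_items = 16
elapsed_s = 2.0  # hypothetical runtime, for illustration only

reported = ops_per_second(counted_items, elapsed_s)
actual = ops_per_second(executed_items, elapsed_s)
print(f"reported: {reported} ops/s, actual: {actual} ops/s")
```

Whenever the counted items differ from the executed items, the reported rate is off by the same factor, here 4.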
|
Thx for checking.
Just uploaded a temporary fix. The code needs a major redesign as there are
several new features and some option combinations might be weird.
On Wed, Oct 17, 2018 at 15:18, Marc-André Vef <
notifications@github.com> wrote:
… Thanks for the quick fix! The issues with -I and the floating point
exception were fixed. You are right, it should create 4 files per process.
What it does when specifying e.g.:
./src/mdtest -a POSIX -z 1 -b 3 -i 1 -d /tmp/test -n 40 -I 4 -F
It will create batches of subdirectories containing 4 items (-I 4) until
40 (-n 40) is satisfied.
That makes sense and it looks like mdtest is doing exactly that. So
actually only the computation for the throughput is incorrect. It only
relies on items which is set by -N. It doesn't take num_dirs_in_tree into
account albeit having the correct value (same for stating the number of
files in the output). See e.g.,
https://github.com/hpc/ior/blob/f4afa63ebffa7611044a307690223d4b23170a24/src/mdtest.c#L1265
for throughput calculation.
--
Dr. Julian Kunkel
Lecturer, Department of Computer Science
+44 (0) 118 378 8218
http://www.cs.reading.ac.uk/
https://hps.vi4io.org/
|
Great, thanks! In terms of correctness, code and behavior look good to me. |
Thanks for fixing this so quickly @JulianKunkel. Can you merge the fix branch into master? I'll cherry-pick it up into RC for the 3.2.0 release from there. |
I've been testing the current master for mdtest and I ran into some issues with the -I and -n parameters which seem to have fundamentally changed with commit 0870ad7 causing wrong output and floating point exceptions. Below I compare the current behavior with prior behavior.
-I argument
For instance, a concurrent file creation benchmark in a single directory could be run like this:
mpiexec -n 4 src/mdtest -a POSIX -z 0 -b 1 -i 1 -d /tmp/test -I 10 -F
with a depth of 0, resulting in 10 files per process being created in the same directory. The resulting mdtest output is the following:
In fact, the 40-file workload is correctly run by mdtest, but because items_per_dir is set by the -I parameter and is no longer assigned to the items variable (used to calculate the throughput), mdtest shows 0 in all tests. The reason is that -I is now dependent on -u (see https://github.com/hpc/ior/blob/master/src/mdtest.c#L2340-L2342 ), and I am not sure why this has been added. However, -u creates a completely different workload (each process operating on its own directory instead of all in the same directory). I understand that I could just use -n 10 instead of -I 10. Nevertheless, -I without -u still reports results that differ from what was actually executed.
-n argument
Floating point exception
When I execute
mpiexec -n 4 src/mdtest -a POSIX -z 1 -b 3 -i 1 -d /tmp/test -n 4 -F
, mdtest should create 4 directories (1 root, 3 leaves), and the workload is then distributed across all directories, i.e., each process creates, stats, and removes 1 file in each directory. This is how it worked in the past. At the moment, mdtest exits with a floating point exception (divide-by-zero) because num_dirs_in_tree is 0 (it is used to calculate the number of files each process should handle in each directory). Previously, this variable was set correctly and the test ran as expected.
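As an illustration of the failure mode (a sketch in Python, not the mdtest source), the per-directory item count comes from dividing the -n value by the number of directories in the tree. The helper name `items_per_directory` is hypothetical. In C, an integer division by zero typically raises SIGFPE, which the shell reports as a floating point exception. The sketch checks the divisor first and fails with a clearer message.

```python
def items_per_directory(items, num_dirs_in_tree):
    """Split the per-process item count (-n) evenly across the directory tree."""
    if num_dirs_in_tree <= 0:
        # Without this check the division below fails; in C this would be
        # an integer divide-by-zero.
        raise ValueError("num_dirs_in_tree must be positive, got %d" % num_dirs_in_tree)
    return items // num_dirs_in_tree


# -n 4 with 4 directories (1 root + 3 leaves): 1 file per directory per process.
print(items_per_directory(4, 4))

# The state described above: num_dirs_in_tree was left at 0.
try:
    items_per_directory(4, 0)
except ValueError as err:
    print("rejected:", err)
```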
Representation of results
When I execute
mpiexec -n 4 src/mdtest -a POSIX -z 1 -b 3 -i 1 -d /tmp/test -n 4 -I 4 -F
, I am actually not quite sure what I am telling mdtest to do. I am a bit confused about the interaction of -n and -I in general now. Previously, both parameters couldn't be used at the same time. Perhaps the documentation could make this clearer? With the above command, mdtest creates, stats, and removes 4 files per process in each directory, i.e., 16 files per process and 64 files in total. However, mdtest tells me that the workload is 16 files instead of 64 files:
In this context, I am not sure whether the calculation of the throughput is even correct. In general, I think it would be a good idea to explicitly output the number of files/directories in total, per process, and per directory to avoid confusion.
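The arithmetic behind the 16 and 64 figures can be sketched as follows. This models the behavior observed above rather than mdtest's implementation: it assumes a full tree with depth -z and branching factor -b, and that -I items are placed in every directory by every process. The function names are hypothetical.

```python
def num_dirs_in_tree(depth, branch):
    """Directories in a full tree: the root plus branch**level at each level."""
    return sum(branch ** level for level in range(depth + 1))


def workload(depth, branch, items_per_dir, ntasks):
    """Summarize the file counts per directory, per process, and in total."""
    dirs = num_dirs_in_tree(depth, branch)
    per_process = items_per_dir * dirs
    return {
        "dirs": dirs,
        "per_dir_per_process": items_per_dir,
        "per_process": per_process,
        "total": per_process * ntasks,
    }


# mpiexec -n 4 ... -z 1 -b 3 -I 4
summary = workload(depth=1, branch=3, items_per_dir=4, ntasks=4)
print(summary)
```

Printing all of these values together is the kind of explicit summary suggested above, since it makes a mismatch between 16 and 64 visible immediately.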
Sorry for the long write-up. I've been using mdtest for some time now and I am a bit confused by this new behavior, as it possibly breaks the scripts of many users who rely on these parameters.
Thanks!