
fio takes a long time to start processes #963

Open
jcv20 opened this issue Apr 16, 2020 · 15 comments
Labels
triaged Issue cause is understood but a patch is needed to fix it

Comments

@jcv20

jcv20 commented Apr 16, 2020

Hi,

I'm trying to run a test with >500 jobs, and it takes more than 20 minutes for fio to start doing IO. I've attached a screenshot showing where the time goes. Is this normal when launching a large number of processes, or can anything be done to improve it? Thanks.

[screenshot: fio_524procs]

uname -r

4.18.0-147.el8.ppc64le

cat /etc/redhat-release

Red Hat Enterprise Linux release 8.1 (Ootpa)

fio -v

fio-3.19

cat rand.fio

[global]
name=randwrite
ioengine=libaio
iodepth=32
rw=randwrite
randrepeat=0
bs=2Mi
direct=1
ramp_time=0
runtime=600
time_based
group_reporting

[job 1]
filename=/dev/sdaee
[job 2]
filename=/dev/sdaes
[job 3]
filename=/dev/sdadf
....
....
....
[job 523]
filename=/dev/sdhx
[job 524]
filename=/dev/sdid

@axboe
Owner

axboe commented Apr 16, 2020

Try and add norandommap to the global section.
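A quick way to try this without editing the job file is to pass the option on the command line (just a sketch, assuming the option-before-jobfile form applies it to every job defined in the file; the path is illustrative):

# fio --norandommap rand.fio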

@jcv20
Author

jcv20 commented Apr 16, 2020

Tried norandommap; fio is still taking >20 min to start doing IO.

# time fio /tmp/rand.fio
.....
real 22m32.359s
user 4m52.282s
sys 21m12.710s

fio runtime is only 60s.

# cat /tmp/rand.fio
[global]
name=randwrite
ioengine=libaio
iodepth=32
rw=randwrite
randrepeat=0
bs=2Mi
direct=1
ramp_time=0
runtime=60
time_based
group_reporting
norandommap

[job 1]
filename=/dev/sdaee
[job 2]
filename=/dev/sdaes
[job 3]
filename=/dev/sdadf
....
....
....
[job 523]
filename=/dev/sdhx
[job 524]
filename=/dev/sdid

@axboe
Owner

axboe commented Apr 16, 2020

You can try and do a:

# perf record -ag -- sleep 5

while it's starting up, and then do:

# perf report -g --no-children

and see what is going on in the system. If it's just the one busy fio thread, which it looks like, I'd fire up top and find the busy pid, then do:

# perf record -g -p <pid from above>

and then run the same perf report on that. How big are the sdXXX devices?
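Putting those steps together, a rough sketch of the whole sequence (the pid placeholder and the sleep durations are illustrative):

# perf record -ag -- sleep 5
# perf report -g --no-children

then find the busy fio pid with top (or pgrep fio) and profile just that process:

# perf record -g -p <pid> -- sleep 10
# perf report -g --no-children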

@jcv20
Author

jcv20 commented Apr 16, 2020

Each device is 10T.

During startup: # perf record -ag -- sleep 5

[screenshot: perf_report-2]

top shows one busy fio thread (pid 61705)

[screenshot: top]

Gathered a 10 sec trace for that pid. Looks like it's waiting to write to memory, and to acquire/release a mutex?
# perf record -g -p 61705

[screenshot: perf_report-pid]

@axboe
Owner

axboe commented Apr 16, 2020

For that last one, click the memset and mutex lock/unlock to get an expanded call trace, that'll give us a better idea of where it's happening.
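If clicking through the interactive TUI is awkward, e.g. over a remote console, the same call chains can also be dumped non-interactively (just an alternative sketch):

# perf report -g --no-children --stdio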

@jcv20
Author

jcv20 commented Apr 16, 2020

[screenshot: perf_report-pid_deatil]

@axboe
Owner

axboe commented Apr 16, 2020

OK, that makes sense, it's around the iostats setup. I'll try and take a look at this.

@sitsofe sitsofe added the triaged Issue cause is understood but a patch is needed to fix it label Apr 26, 2020
@sitsofe
Collaborator

sitsofe commented Aug 1, 2020

(Pinging @axboe on this one)

@sitsofe
Collaborator

sitsofe commented Jan 18, 2021

(@axboe ping)

@sekar-wdc

Any update on this? With 32 namespaces per drive on an NVMeOF setup and 8 drives, it takes about 4 minutes for fio to start even with numjobs=1. I'm running fio-3.28.

@sekar-wdc

(@axboe ping)

@itayalroy

itayalroy commented Jun 8, 2023

Same for me: a 250-job test takes >2 min to start traffic with fio 3.28.

@sekar-wdc

OK, that makes sense, it's around the iostats setup. I'll try and take a look at this.

@axboe Any update on this?

@vincentkfu
Collaborator

OK, that makes sense, it's around the iostats setup. I'll try and take a look at this.

@axboe Any update on this ?

As a temporary workaround, try running with --disk_util=0 if you don't need the disk utilization statistics.
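For example (a sketch; the job file path is taken from the earlier comments, and disk_util=0 can equivalently be set in the [global] section of the job file):

# time fio --disk_util=0 /tmp/rand.fio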

@sekar-wdc


OK, that makes sense, it's around the iostats setup. I'll try and take a look at this.

@axboe Any update on this ?

As a temporary work around, try running with --disk_util=0 if you don't need the disk utilization statistics.

Thanks for this @vincentkfu! This appears to work much more quickly even for large I/O.
