
Slow ipfs add #1216

Closed
pr0d1r2 opened this issue May 9, 2015 · 20 comments
Labels
topic/perf Performance

Comments

@pr0d1r2

pr0d1r2 commented May 9, 2015

Recently I started adding all of rubygems to IPFS (see: https://twitter.com/Pr0d1r2/status/596668070489264129/photo/1 & https://twitter.com/Pr0d1r2/status/596739849652203522/photo/1)

ipfs version 0.3.3

The whole process is supposed to take 36h, which is a lot of time for 155GB of data.

This is strange because the machine is a dedicated server (older version of https://www.hetzner.de/hosting/produkte_rootserver/ex40):
8 core: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
32GB of RAM (non ECC)
RAID1: /dev/md3 on /home type ext4 (rw,relatime,user_xattr,acl,barrier=1,data=ordered)

hdparm -tT /dev/md3

/dev/md3:
 Timing cached reads:   25722 MB in  2.00 seconds = 12876.63 MB/sec
 Timing buffered disk reads: 240 MB in  3.02 seconds =  79.59 MB/sec

CPU usage by the ipfs daemon and ipfs add -r . processes is 5-15% (out of 800%):
[screenshot from 2015-05-09 05:47:54 showing CPU and RAM usage of the ipfs processes]

Using a recent version of Go (go version go1.4.2 linux/amd64), downloaded from https://storage.googleapis.com/golang/go1.4.2.linux-amd64.tar.gz

@whyrusleeping
Member

I would expect a slightly higher CPU utilization than that for adding... Is it actually making progress?

@whyrusleeping
Copy link
Member

Also, that RAM usage is VERY concerning...

@pr0d1r2
Author

pr0d1r2 commented May 9, 2015

@whyrusleeping yes, it is making steady progress (the 36h estimated at the start looks accurate).

@whyrusleeping
Member

Interesting. To be honest, I've never added anything larger than 8GB, and that was only a couple of files in a single directory. Nothing on the scale that you're testing. Could you give me a few more statistics on the data you're adding? Estimated total number of files, average file size, maximum/average directory depth, and any other statistics you feel are relevant.

Thank you for doing this! This is very good information for us.

@pr0d1r2
Author

pr0d1r2 commented May 9, 2015

My idea is to have a complete, permanent internet backup of useful things, like archive.org does, but decentralised. A nice testing repository would be: ftp://ftp.ncbi.nlm.nih.gov ;)

@jbenet
Member

jbenet commented May 9, 2015

@pr0d1r2

My idea is to have a complete, permanent internet backup of useful things, like archive.org does, but decentralised.

great!

those added stats will really help.

@whyrusleeping also, I think all of rubygems is a huge forest with mostly shallow trees. Lots of small files.

@jbenet
Member

jbenet commented May 9, 2015

@whyrusleeping this seems blocked on network io -- providing? AFAICR, dagservice no longer blocks on network, because the blockservice's worker has a 0 buffer [0] -- but perhaps it is somehow blocking there [1], [2], [3]

[0] 0 buffer https://github.com/ipfs/go-ipfs/blob/master/blockservice/blockservice.go#L31
[1] blockservice blocking https://github.com/ipfs/go-ipfs/blob/master/blockservice/blockservice.go#L73
[2] https://github.com/ipfs/go-ipfs/blob/master/blockservice/worker/worker.go#L67
[3] bitswap blocking https://github.com/ipfs/go-ipfs/blob/master/exchange/bitswap/bitswap.go#L233

@whyrusleeping
Member

I would like to get some real stats on the perf here; perhaps someone can take a closer look at profiling large adds in the next sprint (cpuprofiles, blockprofiles, etc.).
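For anyone who wants to collect those profiles from a running node, something like the following should work, assuming the daemon's API (port 5001 by default) exposes Go's standard net/http/pprof handlers; the output file names here are just placeholders:

curl -o ipfs.cpuprof 'http://127.0.0.1:5001/debug/pprof/profile?seconds=30'   # 30s CPU profile
curl -o ipfs.memprof 'http://127.0.0.1:5001/debug/pprof/heap'                 # heap snapshot
go tool pprof $(which ipfs) ipfs.cpuprof                                      # then 'top' or 'web' at the prompt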

@pr0d1r2
Author

pr0d1r2 commented May 11, 2015

@whyrusleeping I can provide some stats if you want. Which tools' stats are you interested in?

  • sysstat
  • collectl
  • dstat
  • strace
  • tcpdump
  • iotop
  • perf
  • systemtap
  • traceroute
  • nethogs
  • mtr
  • ktap
  • mpstat
  • nicstat
  • pidstat
  • sar
  • blktrace
  • slabtop
  • dtrace - dtrace4linux
  • tcpretransmit.d
  • ntop
  • ss
  • lsof
  • oprofile
  • gprof
  • kcachegrind
  • valgrind
  • google profiler
  • nfsiostat
  • cifsiostat
  • latencytop
  • powertop
  • LTTng

@whyrusleeping
Member

@pr0d1r2 you're giving me too many options; this is like being a kid in a candy store. Gonna have to think about this...

@whyrusleeping
Member

Info from iotop (to see how much we're writing to disk) would be nice; outgoing/incoming bandwidth from nethogs might also be useful to see. I would also like to see: find $DIR | wc -l, where DIR is the directory you're adding (to get an idea of the total number of files involved).
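A minimal way to collect those numbers could look like this (a sketch; the interface name eth0 and the log file names are assumptions, adjust to the actual setup):

find "$DIR" -type f | wc -l               # total number of files being added
sudo iotop -b -o -d 5 -n 12 > iotop.log   # per-process disk I/O, sampled every 5s for ~1 minute
sudo nethogs -t eth0 > nethogs.log        # per-process network bandwidth (trace mode)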

whyrusleeping mentioned this issue May 11, 2015
@vitzli
Contributor

vitzli commented May 16, 2015

Debian Stretch inside VirtualBox, 2 GB RAM, 1 CPU core of a Core 2 Quad Q9550; the OS image (/) and data partition (/srv) are on separate physical HDDs. Ext4 on /srv with the journal disabled; the /srv dir is on a SATA2 HDD.
ipfs version 0.3.4 from the GitHub repo, go version go1.4.2 linux/amd64
On /srv/test1:

user@deb-stretch-ipfs:/srv/test1$ time dd if=/dev/urandom of=random.dat bs=1M count=2048
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 192.864 s, 11.1 MB/s

real  3m13.038s
user  0m0.008s
sys   3m7.528s

Writing a file full of 1s:

user@deb-stretch-ipfs:/srv/test1$ tr '\0' '\377' < /dev/zero | dd bs=1M count=2048 of=random2.dat iflag=fullblock
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 20.6104 s, 104 MB/s

Using sha256sum as a baseline:

user@deb-stretch-ipfs:/srv/test1$ time sha256sum random.dat 
2dc9095156bf916fc1cd07d98bf9e0d330cfbefedf9f037a65e378cbc9e70ab1  random.dat

real    0m19.886s
user    0m2.872s
sys 0m15.868s

user@deb-stretch-ipfs:/srv/test1$ time sha256sum random2.dat 
6f300f29ee99e1ea432f72e7637a3c15b6304a0c8af839ef8eb925b516fa55fb  random2.dat

real    0m19.032s
user    0m3.208s
sys 0m14.344s

sha256 does about 105 MB/s.

Adding file random.dat (2GB of random data) from the local HDD to ipfs (no ipfs daemon):

real    5m56.194s
user    0m32.688s
sys 0m42.068s

gives performance of about 5.61 MB/s.
Adding file random2.dat (2GB of 1s), same conditions:

real    0m33.813s
user    0m9.460s
sys 0m20.384s

has performance of about 59.17 MB/s.
With 2 CPU cores:
adding random.dat (urandom data): 2m20.049s, 14.2 MB/s
adding random2.dat (0xFF data): 24.2 s, 82.6 MB/s

Adding the pool dir from the deb-multimedia repository, 2 CPU cores, 2 GB RAM, 2.06GB of data, over a 1 Gbps-ish network, mounted as a cifs share, 1946 files, directory depth approx. 4:

user@deb-stretch-ipfs:/mnt/smb/deb-multimedia$ time ipfs add -r pool/
real    4m50.657s

gives about 7.10 MB/s.

Adding the non-free part of the Debian repository, same VM config, 10.39GB of data, 1497 files, same depth:

user@deb-stretch-ipfs:/mnt/smb/debian/pool$ time ipfs add -r non-free/
real    39m6.316s
user    14m8.600s
sys 5m59.212s

gives about 4.42 MB/s.
iotop shows 6-8 MB/s DISK WRITE for ipfs add -r.
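The MB/s figures above appear to be the data size in MB divided by the wall-clock (real) time; for example, for the single-core runs:

echo "scale=2; 2000/356.194" | bc   # random.dat: ~2000 MB over 5m56.194s ≈ 5.61 MB/s
echo "scale=2; 2000/33.813" | bc    # random2.dat: ~2000 MB over 33.813s ≈ 59.1 MB/s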

@whyrusleeping
Member

@vitzli thank you for running these! this is very helpful!

@whyrusleeping
Member

@vitzli could you try those same tests again using the code from #1225?

@vitzli
Contributor

vitzli commented May 17, 2015

I think I've made a mistake and got inconsistent results for the test, so I added sync and sleep 10 commands and reran the tests for the 1- and 2-core configs 10 times (here is the script):
1 core:

TEST 1  2m3.316s
TEST 2  2m12.859s
TEST 3  2m17.953s
TEST 4  2m16.594s
TEST 5  2m24.002s
TEST 6  2m23.246s
TEST 7  2m26.740s
TEST 8  2m28.487s
TEST 9  2m21.002s
TEST 10 2m24.794s

Average time 2m19.8s, 14.3 MB/s
2 cores:

TEST 1  2m15.375s
TEST 2  2m16.610s
TEST 3  2m21.425s
TEST 4  2m25.180s
TEST 5  2m28.545s
TEST 6  2m25.405s
TEST 7  2m24.538s
TEST 8  2m25.867s
TEST 9  2m25.004s
TEST 10 2m27.947s

Average time 2m23.58s, 13.9 MB/s
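For reference, a minimal sketch of what such a rerun loop could look like (a hypothetical reconstruction, not necessarily the actual script linked above; repo growth between runs is not handled here):

#!/bin/bash
# Time ten consecutive adds of the same 2 GB file, syncing and pausing before each run.
for i in $(seq 1 10); do
    sync
    sleep 10
    echo "TEST $i"
    time ipfs add /srv/test1/random.dat > /dev/null
done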

@pr0d1r2
Author

pr0d1r2 commented May 18, 2015

I have killed the machine's disk by doing that :>

Some stats from iotop (now adding the directory again, with some files already added; it looks pretty fast, 1h30m estimated time to finish):

Total DISK READ:      20.93 M/s | Total DISK WRITE:     589.75 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
21439 be/4 pr0d1r2   139.07 K/s   34.27 K/s  0.00 % 46.20 % ipfs daemon
21442 be/4 pr0d1r2   135.48 K/s   10.76 K/s  0.00 % 41.17 % ipfs daemon
21443 be/4 pr0d1r2    78.10 K/s    2.39 K/s  0.00 % 23.72 % ipfs daemon
22656 be/4 pr0d1r2     6.34 M/s    0.00 B/s  0.00 % 15.37 % ipfs add -r .
22657 be/4 pr0d1r2     4.92 M/s    0.00 B/s  0.00 % 12.39 % ipfs add -r .
21444 be/4 pr0d1r2    77.30 K/s  118.75 K/s  0.00 % 10.29 % ipfs daemon
22154 be/4 pr0d1r2    82.48 K/s   40.64 K/s  0.00 %  9.07 % ipfs daemon
22664 be/4 pr0d1r2    25.90 K/s   96.43 K/s  0.00 %  7.05 % ipfs daemon
22662 be/4 pr0d1r2     3.37 M/s    0.00 B/s  0.00 %  7.01 % ipfs add -r .
21446 be/4 pr0d1r2    58.97 K/s   23.11 K/s  0.00 %  5.29 % ipfs daemon
22651 be/4 pr0d1r2     3.10 M/s    0.00 B/s  0.00 %  4.86 % ipfs add -r .
22654 be/4 pr0d1r2     2.63 M/s    0.00 B/s  0.00 %  3.92 % ipfs add -r .

Nethogs shows: 13.740 KB/sec

jbenet mentioned this issue May 19, 2015
@whyrusleeping
Member

General add performance should be much better now. I'll try adding a huge (>100GB) directory later today and report back.

@whyrusleeping
Member

Hrm, estimated ~2hrs for 100GB. Better, but still not where I'd like it.

@daviddias
Member

@whyrusleeping made a lot of performance improvements on dev0.4.0; could you run the tests there? Thank you!

daviddias added the topic/perf Performance label Jan 2, 2016
daviddias changed the title from "slow add of 155GB of data" to "Slow ipfs add" Jan 2, 2016
@whyrusleeping
Member

I think this has been resolved; ipfs add has gotten drastically faster since this issue was reported.
