This repository has been archived by the owner on Sep 16, 2020. It is now read-only.

Add benchmark result to README.md #17

Closed
AkihiroSuda opened this issue Jul 29, 2017 · 0 comments

FILEgrain benchmark

FILEgrain version: 969524a

amount of blobs

openjdk:8@sha256:5da842d59f76009fa27ffde06888ebd560c7ad17607d7cd1e52fc0757c9a45fb

$ ../du.sh
Pure blobs (excludes continuity): 704288157 [671.6615266799927MiB]
Tarred blobs (excludes continuity): 718243840 [684.970703125MiB]
Tarred + Gzipped blobs (excludes continuity): 273990124 [261.2973442077637MiB]
FILEgrain (gzipped): 273344520 [260.68164825439453MiB]
  • Mount: 2 blobs, 5.416MiB
  • then sh: 8 blobs, 7.31MiB
  • then java -version: 30 blobs, 88.18MiB
  • then javac HelloWorld.java: 50 blobs, 137.3MiB

kdeneon/all@sha256:e3e7f216a5f8f1fdcff4eab8807d7afcd291c050099ab3e8a8355b7b28a19247

$ ../du.sh
Pure blobs (excludes continuity): 5123620038 [4.771743005141616GiB]
Tarred blobs (excludes continuity): 5228851200 [4.869747161865234GiB]
Tarred + Gzipped blobs (excludes continuity): 2236129577 [2.082557954825461GiB]
FILEgrain (gzipped): 2235640216 [2.0821022018790245GiB]
  • Mount: 2 blobs, 34.49MiB
  • then sh: 8 blobs, 36.73MiB
  • then DISPLAY=:1 startkde, with host-side Xephyr -screen 1024x768 :1: 4267 blobs, 742.7MiB
  • then start Firefox via the KDE start-menu: 4506 blobs, 866.6MiB

kaggle/python@sha256:335103c998aea22a5608c2eeca7dcf109e0828ed233b75f5098182c5b058fe98

$ ../du.sh
Pure blobs (excludes continuity): 8937194028 [8.323410551995039GiB]
Tarred blobs (excludes continuity): 9025382400 [8.405542373657227GiB]
Tarred + Gzipped blobs (excludes continuity): 3818209353 [3.5559845650568604GiB]
FILEgrain (gzipped): 3822555054 [3.5600318145006895GiB]
  • Mount: 2 blobs, 38.18MiB
  • then sh: 8 blobs, 40.14MiB
  • then ipython -c 'print("hello")': 1033 blobs, 75.4MiB
  • then ipython -c 'import nltk': 2779 blobs, 352MiB

deduplication benchmark

$ (cd kdeneon-all-sha256-e3e7f216a5f8f1fdcff4eab8807d7afcd291c050099ab3e8a8355b7b28a19247/blobs/sha256; find .) > /tmp/a
$ (cd kaggle-python-sha256-335103c998aea22a5608c2eeca7dcf109e0828ed233b75f5098182c5b058fe98/blobs/sha256; find .) > /tmp/b
$ wc -l /tmp/a /tmp/b
  156916 /tmp/a
  131552 /tmp/b
  288468 total
$ cat /tmp/a /tmp/b | sort | uniq | wc -l
279749
$ cat /tmp/a /tmp/b | sort | uniq -D | uniq | wc -l
8719
$ echo $((156916 + 131552 - 8719))
279749
$ sum=0; for f in $(cat /tmp/a /tmp/b | sort | uniq -D | uniq);do let s=$(stat -c %s kdeneon-all-sha256-e3e7f216a5f8f1fdcff4eab8807d7afcd291c050099ab3e8a8355b7b28a19247/blobs/sha256/$f); sum=$(($sum + $s)); done; echo $sum
79064496 [75.40177917480469MiB]

The two images are otherwise unrelated, yet they share about 75MiB of blobs, which are common Debian files.
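The shell pipeline above can also be sketched in Python. This is only an illustration, not part of FILEgrain: the function names `blob_names` and `shared_blobs` are hypothetical, and the sketch assumes both arguments are `blobs/sha256` directories in which each file is named after the digest of its content, so that equal names imply equal content.

```python
import os


def blob_names(blob_dir):
    """Return the names of the regular files directly under blob_dir."""
    return {entry.name for entry in os.scandir(blob_dir) if entry.is_file()}


def shared_blobs(dir_a, dir_b):
    """Return (count, total size in bytes) of blobs present in both directories.

    Sizes are read from dir_a, as in the shell loop above.
    """
    common = blob_names(dir_a) & blob_names(dir_b)
    total = sum(os.path.getsize(os.path.join(dir_a, name)) for name in common)
    return len(common), total
```

Unlike `find .`, this does not list the directory entry `.` itself, so its counts may differ by one from the shell output above.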

FUSE

(on Fedora 26, 2 vCPUs, 2GB RAM, VMware Fusion on MacBookPro)

Elapsed wall-clock time in seconds for each of 10 runs of the following command on openjdk:8:

export TIMEFORMAT=%R; for f in $(seq 1 10); do bash -c "cd /; time tar cf - usr | tar tvf - > /dev/null"; done

docker run -it --rm:

9.238
9.950
10.098
10.446
6.487
7.425
3.004
0.846
0.775
0.714

FILEgrain without FOPEN_KEEP_CACHE (old commit: b33bc29):

35.777 [pull & cache blobs]
20.870
13.877
19.071
18.319
18.053
19.357
14.154
22.630
17.400

FILEgrain with FOPEN_KEEP_CACHE (not so effective?):

28.318 [pull & cache blobs]
15.833
15.014
16.962
18.809
17.566
15.545
17.971
18.071
15.742
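To put a number on the "not so effective?" remark, the two FILEgrain series can be compared by their mean. The sketch below is only an illustration: it copies the timings listed above and drops the first run of each series, since that run includes pulling and caching blobs. Using the mean of the remaining nine runs is an assumption of this sketch, not a method stated in the issue.

```python
from statistics import mean

# Seconds per run, copied from the two FILEgrain series above.
without_keep_cache = [35.777, 20.870, 13.877, 19.071, 18.319,
                      18.053, 19.357, 14.154, 22.630, 17.400]
with_keep_cache = [28.318, 15.833, 15.014, 16.962, 18.809,
                   17.566, 15.545, 17.971, 18.071, 15.742]


def warm_mean(samples):
    """Mean of all runs except the first, which includes pull and cache time."""
    return mean(samples[1:])


mean_without = warm_mean(without_keep_cache)
mean_with = warm_mean(with_keep_cache)
# Relative reduction in mean run time with FOPEN_KEEP_CACHE, in percent.
speedup_pct = (mean_without - mean_with) / mean_without * 100

print(f"without: {mean_without:.2f}s  with: {mean_with:.2f}s  "
      f"reduction: {speedup_pct:.1f}%")
```

With these figures the warm runs average about 18.2 s without FOPEN_KEEP_CACHE and about 16.8 s with it, a reduction of roughly 7%.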

Docker Registry I/O (TODO)

N/A, because FILEgrain does not support the Docker Registry API yet.
TODO: integrate FILEgrain into containerd and run a real benchmark

Appendix

du.sh

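#!/bin/sh
# Print blob sizes in bytes for the image layout in the current directory.
set -e

# Raw size of ./blobs, skipping the manifest, config, and layer blobs
# that the helper script emits as --exclude options.
printf '%s' "Pure blobs (excludes continuity): "
du -bs $(../print-du-exclude-extra-blobs.py) ./blobs | awk '{print $1}'

# Size of the same blobs packed into a single tar stream.
printf '%s' "Tarred blobs (excludes continuity): "
tar cf - $(../print-du-exclude-extra-blobs.py) ./blobs | wc -c

# Size of that tar stream after gzip -9.
printf '%s' "Tarred + Gzipped blobs (excludes continuity): "
tar cf - $(../print-du-exclude-extra-blobs.py) ./blobs | gzip -9 | wc -c

# Sum of the sizes of every blob gzipped individually.
sum=0
for f in $(find ./blobs -type f); do
    sum=$((sum + $(gzip -9c "$f" | wc -c)))
done
echo "FILEgrain (gzipped): $sum"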
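The last figure printed by du.sh, "FILEgrain (gzipped)", is the sum of each blob compressed on its own rather than the size of one compressed archive. A minimal Python sketch of the same idea follows. The function name `per_file_gzip_total` is hypothetical, and the totals will not match `gzip -9c` byte for byte, because the command-line tool stores the original file name in each gzip header while `gzip.compress` does not.

```python
import gzip
import os


def per_file_gzip_total(root):
    """Sum the gzip -9 sizes of every regular file under root, one file at a time."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # find -type f also skips symlinks
            with open(path, "rb") as fh:
                # Reads the whole blob into memory; fine for a sketch.
                total += len(gzip.compress(fh.read(), compresslevel=9))
    return total
```

Calling `per_file_gzip_total("./blobs")` from an image directory would give a figure comparable to the shell loop above.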

print-du-exclude-extra-blobs.py

#!/usr/bin/python3
# Usage: du -bs $(this.py) ./blobs
# Prints an --exclude option for every blob referenced from index.json:
# each manifest, its config, and its layers.
import json


def dig2blobpath(s):
    # "sha256:abcd..." -> "blobs/sha256/abcd..."
    algo, heks = s.split(':', 1)
    return 'blobs/' + algo + '/' + heks


excludes = []
with open('index.json') as f:
    index = json.load(f)
for m_entry in index['manifests']:
    m_blob = dig2blobpath(m_entry['digest'])
    excludes.append(m_blob)
    with open(m_blob) as f:
        m = json.load(f)
    excludes.append(dig2blobpath(m['config']['digest']))
    for l in m['layers']:
        excludes.append(dig2blobpath(l['digest']))

for f in excludes:
    print('--exclude ' + f)
AkihiroSuda added a commit that referenced this issue Aug 9, 2017
Fix #17

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
This issue was closed.