[WIP] Add an option to test Blobs. #32

jamadden merged 7 commits into master from issue29-blobs on Apr 14, 2017

jamadden

Testing blobs is quite slow, and the change adds some overhead to the non-blob case as well; the cases that were uber-fast before (hot and steamin') are impacted a lot.

Before merging, I'd like to see if we can do better without hurting the code clarity too much.

non-blobs now

** concurrency=2 **
"Transaction",               fs
"Add 200 Objects",            25024
"Update 200 Objects",         20041
"Read 200 Warm Objects",      70337
"Read 200 Cold Objects",      44731
"Read 200 Hot Objects",       36080
"Read 200 Steamin' Objects", 919433

non-blobs on revision fad60907dabd74eeacf9e8b9935695f05845981f

** concurrency=2 **
"Transaction",               fs
"Add 200 Objects",             26391
"Update 200 Objects",          20666
"Read 200 Warm Objects",      161081
"Read 200 Cold Objects",       46013
"Read 200 Hot Objects",        36792
"Read 200 Steamin' Objects", 1090373

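ratio (revision fad6090 / now)

"Add 200 Objects",           1.05
"Update 200 Objects",        1.03
"Read 200 Warm Objects",     2.29
"Read 200 Cold Objects",     1.02
"Read 200 Hot Objects",      1.02
"Read 200 Steamin' Objects", 1.19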
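
The ratios above appear to be the older revision's figures divided by the current ones, row by row. A minimal sketch of that arithmetic, with the numbers copied from the two non-blob runs; the names `NOW`, `OLD_REVISION`, and `ratios` are made up for illustration and are not part of zodbshootout:

```python
# Figures copied from the two non-blob runs above.
NOW = {
    "Add 200 Objects": 25024,
    "Update 200 Objects": 20041,
    "Read 200 Warm Objects": 70337,
    "Read 200 Cold Objects": 44731,
    "Read 200 Hot Objects": 36080,
    "Read 200 Steamin' Objects": 919433,
}
OLD_REVISION = {
    "Add 200 Objects": 26391,
    "Update 200 Objects": 20666,
    "Read 200 Warm Objects": 161081,
    "Read 200 Cold Objects": 46013,
    "Read 200 Hot Objects": 36792,
    "Read 200 Steamin' Objects": 1090373,
}


def ratios(numerator, denominator):
    """Return numerator/denominator for each benchmark row, rounded to 2 places."""
    return {
        name: round(numerator[name] / denominator[name], 2)
        for name in numerator
    }


# A ratio above 1.0 means the old revision posted the larger figure.
for name, ratio in ratios(OLD_REVISION, NOW).items():
    print(f"{name:<28}{ratio:.2f}")
```

Rounding to two places reproduces the listed values except for the cold-read row, which comes out as 1.03 here; the listed 1.02 looks like truncation rather than rounding.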

blobs

** concurrency=2 **
"Transaction",               fs
"Add 100 Objects",            1618
"Update 100 Objects",         1831
"Read 100 Warm Objects",      4822
"Read 100 Cold Objects",      4413
"Read 100 Hot Objects",       4092
"Read 100 Steamin' Objects", 12704

Fixes #29

coveralls commented Apr 13, 2017

Coverage decreased (-0.9%) to 91.471% when pulling 66fa359 on issue29-blobs into fad6090 on master.

coveralls commented Apr 13, 2017

Coverage decreased (-1.7%) to 90.62% when pulling c5c9073 on issue29-blobs into fad6090 on master.

Numbers are now basically indistinguishable from master, within the
margins of noise.

Non-blobs now:

** concurrency=2 **
"Transaction",               fs
"Add 200 Objects",            25597
"Update 200 Objects",         19976
"Read 200 Warm Objects",     126218
"Read 200 Cold Objects",      51588
"Read 200 Hot Objects",       45032
"Read 200 Steamin' Objects", 1071992

Here's master:

** concurrency=2 **
"Transaction",               fs
"Add 200 Objects",             25117
"Update 200 Objects",          20788
"Read 200 Warm Objects",      118992
"Read 200 Cold Objects",       50992
"Read 200 Hot Objects",        40026
"Read 200 Steamin' Objects", 1184664

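Ratio (master / now):

"Add 200 Objects",           0.98
"Update 200 Objects",        1.04
"Read 200 Warm Objects",     0.94
"Read 200 Cold Objects",     0.99
"Read 200 Hot Objects",      0.88
"Read 200 Steamin' Objects", 1.10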

Blob numbers are also a little improved:

** concurrency=2 **
"Transaction",               fs
"Add 100 Objects",            1692
"Update 100 Objects",         2065
"Read 100 Warm Objects",      6711
"Read 100 Cold Objects",      4834
"Read 100 Hot Objects",       4728
"Read 100 Steamin' Objects", 16064

coveralls commented Apr 13, 2017

Coverage decreased (-1.7%) to 90.62% when pulling ad2c98e on issue29-blobs into fad6090 on master.

jamadden added a commit to zopefoundation/ZODB that referenced this pull request Apr 14, 2017
Profiling (zodb/zodbshootout#32) showed that
this method (oid_to_path in blob.py, going by the profiles below) was
the only blob-related method that showed up in a test of creating
blobs, other than those that actually performed IO.

With this change its total and cumulative time drop from 0.003/0.004
to 0.001/0.002 in a small benchmark. Blobs created per second show a
small but consistent improvement.

Before:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      100    0.005    0.000    0.005    0.000 {built-in method rename}
      100    0.004    0.000    0.004    0.000 {function BlobFile.close at 0x1080d3a60}
      200    0.003    0.000    0.004    0.000 blob.py:576(oid_to_path)
      101    0.003    0.000    0.003    0.000 {built-in method mkdir}
      100    0.002    0.000    0.002    0.000 blob.py:333(__init__)
      402    0.002    0.000    0.005    0.000 {method 'dump' of '_pickle.Pickler' objects}
        1    0.002    0.002    0.034    0.034 Connection.py:553(_store_objects)
      201    0.002    0.000    0.002    0.000 {built-in method stat}
     5633    0.001    0.000    0.002    0.000 {built-in method isinstance}

After:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      100    0.005    0.000    0.005    0.000 {built-in method rename}
      101    0.005    0.000    0.005    0.000 {built-in method mkdir}
      100    0.004    0.000    0.004    0.000 {function BlobFile.close at 0x10636aa60}
      402    0.002    0.000    0.005    0.000 {method 'dump' of '_pickle.Pickler' objects}
      100    0.002    0.000    0.002    0.000 blob.py:333(__init__)
        1    0.002    0.002    0.035    0.035 Connection.py:553(_store_objects)
      201    0.002    0.000    0.002    0.000 {built-in method stat}
     4033    0.001    0.000    0.001    0.000 {built-in method isinstance}
   ....
      200    0.001    0.000    0.002    0.000 blob.py:576(oid_to_path)
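
Tables in this format are what Python's `cProfile` and `pstats` modules print. A minimal sketch of collecting one, sorted by `tottime`; `profile_call` and `workload` are hypothetical stand-ins, not the code that was actually profiled for this change:

```python
import cProfile
import io
import pstats


def profile_call(func, *args, limit=10, **kwargs):
    """Run func under cProfile; return (result, report sorted by tottime)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # "tottime" is time spent in the function itself, excluding callees.
    stats.sort_stats("tottime").print_stats(limit)
    return result, out.getvalue()


def workload():
    # Hypothetical stand-in for the benchmark body being measured.
    return sum(isinstance(i, int) for i in range(1000))


result, report = profile_call(workload)
print(report)
```

Sorting by `tottime` rather than `cumtime` keeps the list focused on where time is spent directly, which is how a single helper method can stand out next to the IO calls.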

Instead pre-cache all the random data we'll need.

Turns out that this was a major performance drag for larger random
sizes, as shown by the blob tests. It was in the top part of the
profile, now it's not.

This also ensures that each test-rep within a run uses consistent
random data so compression/hashing/whatever effects should be more
consistent.
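
The idea of pre-caching can be sketched roughly as follows: generate the random payloads once, outside the timed section, and hand the same payloads to every repetition. This is an illustration only; `RandomDataPool` and `run_rep` are hypothetical names, and the actual zodbshootout implementation may differ:

```python
import os


class RandomDataPool:
    """Hypothetical helper: build random payloads once, reuse them per rep."""

    def __init__(self, count, size):
        # All of the random-data cost is paid here, before any timing starts.
        self._payloads = [os.urandom(size) for _ in range(count)]

    def payloads(self):
        # Every caller sees the same bytes, so reps are comparable.
        return self._payloads


def run_rep(pool):
    # Stand-in for one timed repetition: only consumes pre-built data.
    return sum(len(data) for data in pool.payloads())


pool = RandomDataPool(count=100, size=1024)
totals = [run_rep(pool) for _ in range(3)]
print(totals)
```

Because each repetition reads the same bytes, anything downstream that depends on content (compression ratio, hashing) sees identical input across reps.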

coveralls commented Apr 14, 2017

Coverage decreased (-1.6%) to 90.728% when pulling 2c72f05 on issue29-blobs into fad6090 on master.

jamadden merged commit 7b1a5a8 into master Apr 14, 2017
jamadden deleted the issue29-blobs branch April 14, 2017 15:03
jimfulton pushed a commit to zopefoundation/ZODB that referenced this pull request Apr 14, 2017