ARROW-2076: [Python] Display slowest test durations #1541

pitrou · 2018-01-31T21:56:56Z

No description provided.

wesm · 2018-01-31T22:48:34Z

Thanks @pitrou! Well, it's pretty clear-cut:

Python 3.6:

236.05s call     pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_serialization.py::test_custom_serialization
41.78s call     pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_subscribe_deletions
38.78s call     pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_serialization.py::test_primitive_serialization
28.87s call     pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_subscribe
9.01s call     pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_serialization.py::test_complex_serialization
8.90s call     pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_serialization.py::test_serialize_to_buffer
8.30s call     pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_use_one_memory_mapped_file
6.90s call     pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_store_full
5.32s call     pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_many_hashes
4.88s call     pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_create_with_metadata
4.59s setup    pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_use_one_memory_mapped_file
3.34s call     pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_get
2.92s call     pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_create_existing
2.82s teardown pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_use_one_memory_mapped_file
2.54s teardown pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_store_full

and Python 2.7:

285.93s call     pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_serialization.py::test_custom_serialization
44.93s call     pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_serialization.py::test_primitive_serialization
42.07s call     pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_subscribe_deletions
29.15s call     pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_subscribe
12.06s call     pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_serialization.py::test_serialize_to_buffer
12.06s call     pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_serialization.py::test_complex_serialization
8.30s call     pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_use_one_memory_mapped_file
6.74s call     pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_store_full
5.35s call     pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_many_hashes
4.60s setup    pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_use_one_memory_mapped_file
4.52s call     pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_create_with_metadata
3.33s call     pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_get
2.80s teardown pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_use_one_memory_mapped_file
2.68s call     pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_create_existing
2.55s teardown pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_store_full

Not sure exactly what's happening (swapping?) but it looks like we ought to be able to trim 6-7 minutes off by doing something about these tests. cc @robertnishihara @pcmoritz
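Reports of this shape are what pytest prints for its `--durations=N` option; the exact invocation added by this PR is not shown in the conversation. To check the "6-7 minutes" estimate, the lines can be summed per test file. The following is only a sketch: the helper name `total_by_file` and the regular expression are assumptions made for illustration, and the sample is the first four lines of the Python 3.6 report above.

```python
import re
from collections import defaultdict

# Matches lines such as "236.05s call     path/to/test_x.py::test_name".
DURATION_LINE = re.compile(r"^(\d+\.\d+)s\s+(call|setup|teardown)\s+(\S+)$")


def total_by_file(report):
    """Sum the reported seconds per test file (file name only, no directory)."""
    totals = defaultdict(float)
    for line in report.splitlines():
        match = DURATION_LINE.match(line.strip())
        if match is None:
            continue  # skip blank lines and anything that is not a duration row
        seconds, _phase, node_id = match.groups()
        path = node_id.split("::", 1)[0]
        totals[path.rsplit("/", 1)[-1]] += float(seconds)
    return dict(totals)


# First four rows of the Python 3.6 report quoted above.
sample = """\
236.05s call     pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_serialization.py::test_custom_serialization
41.78s call     pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_subscribe_deletions
38.78s call     pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_serialization.py::test_primitive_serialization
28.87s call     pyarrow-test-3.6/lib/python3.6/site-packages/pyarrow/tests/test_plasma.py::TestPlasmaClient::test_subscribe
"""

totals = total_by_file(sample)
for name, seconds in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{name}: {seconds:.2f}s")
```

Feeding in the full 15-row report for Python 3.6 gives roughly 405 seconds in total, which is consistent with the 6-7 minute figure.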

robertnishihara · 2018-01-31T22:53:15Z

Thanks, we'll look into it.

xhochy

+1, merging as we want to have this output continuously in Travis.

robertnishihara · 2018-02-02T06:38:54Z

Any ideas about speeding this up would be appreciated. Note that the tests run very quickly locally (<1s for test_serialization.py and 10s for test_plasma.py), and they also run quickly on the macOS Travis build.

I tried compiling with -DCMAKE_BUILD_TYPE=Debug locally instead of Release, but wasn't able to reproduce the slowness (locally).

Also tried making large arrays in test_serialization.py smaller, but that didn't change anything.

pcmoritz · 2018-02-02T06:48:31Z

I think the problem is that in test_serialization.py, Bar contains a copy of PRIMITIVE_OBJECTS + COMPLEX_OBJECTS and then Qux contains a bunch of copies of Bar, so we are serializing PRIMITIVE_OBJECTS + COMPLEX_OBJECTS a lot of times. If this is slower on Travis (due to swapping, VM overhead or anything else), the whole test is slowed down a lot.

So let's slim down the objects that Bar contains!
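To make the multiplication concrete, here is a minimal sketch. Everything in it is a hypothetical stand-in: the real `PRIMITIVE_OBJECTS`, `COMPLEX_OBJECTS`, `Bar` and `Qux` in test_serialization.py are not shown in this thread, and `num_bars` and `count_leaves` are invented for illustration. It only counts how many scalar values a naive recursive serializer would have to visit.

```python
# Hypothetical stand-ins; the real lists in test_serialization.py differ.
PRIMITIVE_OBJECTS = [0, 1.5, "text", b"bytes", None, True]
COMPLEX_OBJECTS = [[1, 2, 3], {"key": "value"}, (1, (2, 3))]


class Bar:
    def __init__(self):
        # Each Bar holds its own copy of every base object.
        self.objects = list(PRIMITIVE_OBJECTS + COMPLEX_OBJECTS)


class Qux:
    def __init__(self, num_bars):
        # Each Qux holds several Bars, so the base objects repeat num_bars times.
        self.bars = [Bar() for _ in range(num_bars)]


def count_leaves(obj):
    """Count the scalar values a naive recursive serializer would visit."""
    if isinstance(obj, (Bar, Qux)):
        return count_leaves(vars(obj))
    if isinstance(obj, dict):
        return sum(count_leaves(value) for value in obj.values())
    if isinstance(obj, (list, tuple)):
        return sum(count_leaves(item) for item in obj)
    return 1


base = count_leaves(PRIMITIVE_OBJECTS + COMPLEX_OBJECTS)
print("base objects:", base)
print("one Bar:", count_leaves(Bar()))
print("one Qux with 10 Bars:", count_leaves(Qux(10)))
```

The work grows linearly with the number of Bar copies per Qux, so any fixed per-object slowdown on the CI machine is multiplied by that factor.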

robertnishihara · 2018-02-02T07:04:12Z

That doesn't explain why test_primitive_serialization is slow. We may need to just remove objects/code until it gets fast and see which change mattered.
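One way to narrow this down without deleting code by hand is to time each test object separately and rank the results. The sketch below is an assumption-laden illustration: `time_each` is a hypothetical helper, and `pickle.dumps` stands in for whatever serializer the tests actually exercise.

```python
import pickle
import time


def time_each(objects, serialize=pickle.dumps, repeats=3):
    """Return (seconds, index) pairs, slowest first, for serializing each object.

    The best of `repeats` runs is kept to reduce noise from a busy machine.
    """
    timings = []
    for index, obj in enumerate(objects):
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            serialize(obj)  # stand-in for the serializer under test
            best = min(best, time.perf_counter() - start)
        timings.append((best, index))
    return sorted(timings, reverse=True)


# Illustrative inputs only; substitute the objects used by the slow test.
candidates = [1, "a string", list(range(1000)), {"nested": [1.0] * 100}]
for seconds, index in time_each(candidates):
    print(f"object {index}: {seconds:.6f}s")
```

Running this on the CI machine rather than locally matters here, since the slowdown was reported as not reproducible locally.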

pitrou force-pushed the slowest-test-durations branch from d3199bc to 6d5146e Compare February 1, 2018 07:27

[Python] Display slowest test durations

cf5e9c8

pitrou force-pushed the slowest-test-durations branch from 6d5146e to cf5e9c8 Compare February 1, 2018 08:01

xhochy changed the title ~~[Python] Display slowest test durations~~ ARROW-2076: [Python] Display slowest test durations Feb 1, 2018

xhochy approved these changes Feb 1, 2018


xhochy closed this in c1d77a1 Feb 1, 2018

pitrou deleted the slowest-test-durations branch September 19, 2018 18:06
