
Perf check on upcoming v0.11.0 vs 0.10.1 #3326

Closed
ghost opened this issue Apr 12, 2013 · 36 comments
ghost commented Apr 12, 2013

Here are all the vbenches that differ by more than 15 percent,
best of 5.

I hope to have this automated and realtime by the time the
next release comes around, along with bisection.

1st run

λ cat r1/report.txt 
Worse
getattr_dataframe_index              2.000000
frame_multi_and_no_ne                1.319728
series_constructor_ndarray           1.329545
ctor_index_array_string              1.651376
frame_wide_repr                     48.154345
groupby_sum_booleans                 1.152285
indexing_dataframe_boolean_rows      1.168337
series_getitem_scalar                1.862069
dataframe_getitem_scalar             1.214286
datamatrix_getitem_scalar            1.190476
concat_small_frames                  1.159975
series_align_left_monotonic          1.289482
reindex_daterange_backfill           1.190402
reindex_daterange_pad                1.185000
timeseries_large_lookup_value      235.037234
dtype: float64

Better
frame_multi_and_st               0.580615
frame_multi_and                  0.597748
frame_fancy_lookup               0.792336
frame_get_dtype_counts           0.000404
frame_fancy_lookup_all           0.797565
series_string_vector_slice       0.801726
frame_reindex_upcast             0.523595
frame_reindex_axis0              0.509034
groupby_first_float32            0.043276
groupby_last_float32             0.044075
groupby_transform                0.413400
indexing_dataframe_boolean_st    0.094886
indexing_dataframe_boolean       0.095269
frame_to_csv                     0.737868
frame_to_csv2                    0.121081
frame_to_csv_mixed               0.381196
write_csv_standard               0.198206
append_frame_single_mixed        0.805745
reindex_frame_level_align        0.790248
reindex_frame_level_reindex      0.787495

2nd run

Worse
frame_multi_and_no_ne                1.322615
series_constructor_ndarray           1.284091
ctor_index_array_string              1.557522
frame_wide_repr                     47.085119
indexing_dataframe_boolean_rows      1.158785
series_getitem_scalar                1.896552
dataframe_getitem_scalar             1.214286
datamatrix_getitem_scalar            1.190476
series_align_left_monotonic          1.293073
reindex_daterange_backfill           1.192585
reindex_daterange_pad                1.176850
frame_reindex_columns                1.231402
timeseries_large_lookup_value      370.207254
dtype: float64

Better
frame_multi_and_st               0.584512
frame_multi_and                  0.592406
frame_fancy_lookup               0.794964
frame_get_dtype_counts           0.000393
series_string_vector_slice       0.793987
frame_reindex_upcast             0.562601
frame_reindex_axis0              0.512411
groupby_first_float32            0.042893
groupby_last_float32             0.043322
groupby_transform                0.412138
indexing_dataframe_boolean_st    0.093816
indexing_dataframe_boolean       0.094429
frame_to_csv                     0.721259
frame_to_csv2                    0.120910
frame_to_csv_mixed               0.399276
write_csv_standard               0.197096
append_frame_single_mixed        0.795502
reindex_frame_level_align        0.787276
reindex_frame_level_reindex      0.786307
dtype: float64


Until test_perf gets validated in its compare mode (re: the instability), here's the script I used:

#!/bin/bash

# profile current HEAD, against the commit
# specified on the command line

# assume you're running in a venv, and
# that upstream pandas is a git remote named
# "upstream"

PREV_VER=$1
THRESH=0.15
NITER=5
UPSTREAM=upstream/master

git reset --hard $UPSTREAM
H1=$(git log --format="%h" -1)
python setup.py develop
./test_perf.sh -H -N $NITER -c 1 -d $PWD/HEAD-$H1.pickle "$2"

git reset --hard $PREV_VER
H2=$(git log --format="%h" -1)
git checkout upstream/master vb_suite setup.py # bring back the updated suite and test_perf, and build cache
python setup.py develop
./test_perf.sh -H -N $NITER -c 1 -d $PWD/PREV-$H2.pickle "$2"

# back to master
git reset --hard $UPSTREAM
python setup.py develop

SCR=$(cat <<EOF
import pandas as pd
H=(pd.load("HEAD-$H1.pickle").min(1)/pd.load("PREV-$H2.pickle").min(1))
print "Worse"
print H[(H-1)>$THRESH]
print "\nBetter"
print H[(1-H)>$THRESH]
EOF
)
python -c "$SCR"
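For what it's worth, the comparison step at the end of the script (best-of-N per benchmark, then a ratio filtered by a threshold) can be sketched with in-memory frames and current pandas, where `pd.load` no longer exists; the benchmark names and numbers below are made up, purely to show the shape:

```python
import pandas as pd

# Stand-ins for the timing tables the script pickles:
# rows = benchmarks, columns = runs (made-up numbers).
head = pd.DataFrame({"run1": [1.0, 5.0], "run2": [1.1, 4.8]},
                    index=["bench_a", "bench_b"])
prev = pd.DataFrame({"run1": [1.0, 2.0], "run2": [0.9, 2.1]},
                    index=["bench_a", "bench_b"])

THRESH = 0.15
# best-of-N for each benchmark, then HEAD/PREV ratio
ratio = head.min(axis=1) / prev.min(axis=1)

worse = ratio[(ratio - 1) > THRESH]   # bench_b: 4.8 / 2.0 = 2.4
better = ratio[(1 - ratio) > THRESH]  # empty with these numbers
print(worse)
print(better)
```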

@ghost
Author

ghost commented Apr 12, 2013

Regression TODO (GH concurrent editing sucks, post a comment, I'll update): update: sorry GH, we still <3 u.

@jreback
Contributor

jreback commented Apr 12, 2013

can you show the start/end commits that you used?

@ghost
Author

ghost commented Apr 12, 2013

λ ll /tmp/pandas/r1/
total 44
drwxr-xr-x 2 user1 user1  4096 Apr 12 15:46 ./
drwxr-xr-x 4 user1 user1  4096 Apr 12 15:46 ../
-rw-r--r-- 1 user1 user1 12511 Apr 12 15:46 HEAD-4618675.pickle
-rw-r--r-- 1 user1 user1 12511 Apr 12 15:46 PREV-31ecaa9.pickle
-rw-r--r-- 1 user1 user1  1560 Apr 12 15:46 report.txt

@ghost
Author

ghost commented Apr 12, 2013

Would be good if you could indep. confirm the results on your own machine with the script above.

@jreback
Contributor

jreback commented Apr 12, 2013

did you check series_ndarray_constructor?

@lodagro
Contributor

lodagro commented Apr 12, 2013

Is there an easy way to run vbench only for one benchmark, like frame_wide_repr?

@ghost
Author

ghost commented Apr 12, 2013

In bash, from the repo root dir:

git reset --hard <commit hash> 
git checkout upstream/master vb_suite setup.py; 
python setup.py clean; python setup.py develop; # git clean -f if you're going back far enough, careful with that
./test_perf.sh -H  -N 5 -u 1  -c 5 -n 1 -r frame_wide_repr

@ghost
Author

ghost commented Apr 12, 2013

@jreback , just bisected down to the offending commit, not the cause, go for it.

@jreback
Contributor

jreback commented Apr 12, 2013

ahh... that's what the check is for!
yah... that one's going to be tough... I will look though...

@jreback
Contributor

jreback commented Apr 12, 2013

#3329 should fix ctor_index_array_string (and helps on some others actually)
my comments are getting wiped on the editing....

@ghost
Author

ghost commented Apr 12, 2013

You're right, I unchecked it.
Post a comment, don't edit.

@lodagro
Contributor

lodagro commented Apr 12, 2013

The reason frame_wide_repr takes longer compared to 0.10.1 is that now, outside interactive mode, there is never an info/wrapped repr, always the full view. Probably best to run the benchmark in the context of simulated interactive mode; I'll see if I can change it.

@ghost
Author

ghost commented Apr 12, 2013

@lodagro , that doesn't make sense, #2241 was merged for 0.9.1, and the bisected commit
8ad9598 is new to this release. The branch name in that commit points to a dtype issue.

@lodagro
Contributor

lodagro commented Apr 12, 2013

@y-p anyway, I added the test so that a wide frame repr is tested both with and without interactive mode.
It is strange that on base (v0.10.1) there is no difference between the two, given #2241.

Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
frame_wide_repr_interactive                  |   1.1680 | 276.7550 |   0.0042 |
frame_wide_repr                              | 13561.5706 | 273.1387 |  49.6509 |
-------------------------------------------------------------------------------

@jreback
Contributor

jreback commented Apr 12, 2013

@lodagro maybe %prun this and see where it's spending the time?

@jreback
Contributor

jreback commented Apr 12, 2013

repr(df) displays the concise format; how do I force it to display all rows? Which I think is what this is testing (frame_wide_repr).

@jreback
Contributor

jreback commented Apr 12, 2013

@y-p merged in a change for series_constructor_ndarray #3333 (lucky num!)

@ghost
Author

ghost commented Apr 12, 2013

Try pd.options.display.max_rows= infinity
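There is no literal infinity; in current pandas, passing `None` for `display.max_rows` is what lifts the row limit. A minimal sketch (the 100-row frame is just an example):

```python
import pandas as pd

df = pd.DataFrame({"a": range(100)})

# The default repr truncates long frames with an ellipsis row;
# max_rows=None removes the limit so every row is rendered.
with pd.option_context("display.max_rows", None):
    full = repr(df)

print("..." not in full)  # → True: no truncation marker
```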

@lodagro
Contributor

lodagro commented Apr 12, 2013

@jreback I also think that on 0.10.1 repr(df) results in a concise format (given that there is no performance difference between with/without interactive mode on v0.10.1 -- see above); on HEAD this is no longer the case, since concise formats are only used in interactive mode.

@ghost
Author

ghost commented Apr 12, 2013

@lodagro , my mistake, I bisected to the wrong changepoint (I used a collection going much further back).
The regression in wide_repr seems to be 7db1af4, which was just merged yesterday.
Can you check it out?

@jreback
Contributor

jreback commented Apr 12, 2013

One of the big timesinks in this is finding the max len of an array of strings. I have a Cython function that does this, lib.max_len_string_array (I use it in HDFStore), but it's not good for unicode. Maybe not for 0.11, but I suggest replacing the _strlen business and just doing it in Cython.
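For reference, what that helper computes — the length of the longest string in an object array — looks like this in plain Python (`max_len` below is an illustrative stand-in, not the actual Cython `lib.max_len_string_array`; note plain `len()` handles unicode for free):

```python
import numpy as np

def max_len(arr):
    """Length of the longest string in an object array.

    Pure-Python stand-in for a Cython helper; non-string
    elements are skipped rather than raising.
    """
    longest = 0
    for s in arr:
        if isinstance(s, str) and len(s) > longest:
            longest = len(s)
    return longest

values = np.array(["a", "abc", "ab"], dtype=object)
print(max_len(values))  # → 3
```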

@lodagro
Contributor

lodagro commented Apr 12, 2013

When forcing full view outside interactive mode on v0.10.1, the ratio for frame_wide_repr drops from 50 to 1.2 (still a 20% increase, though).

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
frame_wide_repr_interactive                  |   1.2130 | 259.1007 |   0.0047 |
frame_wide_repr                              | 13969.2047 | 11723.8823 |   1.1915 |
-------------------------------------------------------------------------------

So yes, 7db1af4 definitely has an impact, since I made sure that concise views are only used in interactive mode (before, this was not always the case; I mentioned this in #2881 as well, though it was not discussed further).

@ghost
Author

ghost commented Apr 12, 2013

ok, 7db1af4 made long frames print in full when in non-interactive mode,
whereas prior to that you would just get the summary, and that's the reason for the
perf difference. Was that change intentional? What's the rationale?

@ghost
Author

ghost commented Apr 12, 2013

I recall complaints about un-unix-like behaviour, in that we previously considered terminal
width in non-interactive mode; that was fixed in a prior release. Is there a mailing-list discussion
about the problem with summary views in non-interactive mode? I've seen no GH issues.

Specifically, prior to 7db1af4 the user could opt in to full display by setting display.max_rows,
but now they can't opt out, and they may be hit with a 13-second (read: noticeable) delay to boot.

@ghost
Author

ghost commented Apr 14, 2013

@lodagro , it seems the new repr code is ignoring the max_rows/max_cols options
and taking forever on large dataframes (#3337).

@jreback
Contributor

jreback commented Apr 15, 2013

@y-p can you post a rerun of what you did above when you have a chance? to just confirm the fixes?

@ghost
Author

ghost commented Apr 15, 2013

I reran test_perf with your fixes and confirmed that the mean time for the vbench, over N runs, backed down
when compared with its previous commit. Some of the tests are noisy, so you need
a respectable N and some gut feeling.

with a fresh checkout of test_perf.py

test_perf.sh -q -S -H -N 100 -r "vbench_name"

will do 100 runs of the named vbench and will print out a describe() line with
summary stats for the 100 data points.

Use the build cache (now in scripts/use_build_cache.py) to jump around revisions quickly.
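The summary line test_perf prints is essentially a pandas describe() over the per-run timings. A sketch with made-up numbers, showing why best-of-N is the more robust statistic for noisy benchmarks (a single outlier drags the mean while min stays put):

```python
import pandas as pd

# Hypothetical per-run timings (ms) for one vbench over 6 runs,
# including one noisy outlier.
timings = pd.Series([12.1, 11.9, 12.4, 30.0, 12.0, 12.2])

stats = timings.describe()  # count/mean/std/min/quartiles/max
print(stats)
print("min:", timings.min(), "mean:", round(timings.mean(), 2))
```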

@lodagro
Contributor

lodagro commented Apr 15, 2013

@y-p yes, the change was intentional. I had in mind the same 'un-unix-like' behavior discussion (#1610), and I disabled concise formats entirely in non-interactive mode. By setting pd.options.mode.sim_interactive = True, interactive mode can be enforced.

@ghost
Author

ghost commented Apr 15, 2013

We need to decide where this discussion is taking place.
The sim_interactive = True idea makes no sense at all; it's a testing feature.

@ghost
Author

ghost commented Apr 15, 2013

#1610 was about console width and made sense (I implemented the fix); I disagree that the same
argument applies on the row side of things.

@ghost
Author

ghost commented Apr 15, 2013

In fact, if you read #1610, it basically states the exact opposite of what you're suggesting.

@lodagro
Contributor

lodagro commented Apr 15, 2013

#1610 requests that the default in non-interactive mode be no limit on width/max_columns; this is still the case, and I extended it to rows as well. The user can also change it.

For me it's no problem to keep only a vertical limit, so that in non-interactive mode there can be a limit on height/max_rows but not on width/max_columns; this would fix the script mentioned in #3337.

@ghost
Author

ghost commented Apr 15, 2013

The problem was that we were relying on terminal dimensions in non-interactive mode.
That's a different matter than ignoring display options such as max_rows; I think
those should be respected at all times. Users are free to set them according to their needs,
and I'm sure some users who want long-form output have already done so in their scripts.

From #1610
"""
In non-interactive usage the stdout should depend only on the print options currently in effect.
"""

I agree.

@lodagro
Contributor

lodagro commented Apr 15, 2013

So both max_rows and max_columns should always be used, and the options display.width, display.height, and display.expand_frame_repr only in interactive mode.

@ghost
Author

ghost commented Apr 15, 2013

max_rows and max_columns should always be respected, just like in v0.10.1.
expand_frame_repr too, I think, since users might like to have output
folded across the page even in scripts, and they can opt out.

height and width are new options, so we're free to choose their meaning.
Since they're disabled by default, I think they should also always be respected:
if a user explicitly sets them in a script, why not?

Terminal height/width detection should play no role in non-interactive mode.

That's the breakdown as I see it.

@ghost
Author

ghost commented Apr 20, 2013

remaining issue to be discussed/closed in #3395

@ghost ghost closed this as completed Apr 20, 2013