
Perf check on upcoming v0.11.0 vs 0.10.1 #3326

Closed
ghost opened this issue Apr 12, 2013 · 36 comments
ghost commented Apr 12, 2013

Here are all the vbenches that differ by more than 15 percent,
best of 5.

I hope to have this automated and realtime by the time the
next release comes around, along with bisection.

1st run

λ cat r1/report.txt 
Worse
getattr_dataframe_index              2.000000
frame_multi_and_no_ne                1.319728
series_constructor_ndarray           1.329545
ctor_index_array_string              1.651376
frame_wide_repr                     48.154345
groupby_sum_booleans                 1.152285
indexing_dataframe_boolean_rows      1.168337
series_getitem_scalar                1.862069
dataframe_getitem_scalar             1.214286
datamatrix_getitem_scalar            1.190476
concat_small_frames                  1.159975
series_align_left_monotonic          1.289482
reindex_daterange_backfill           1.190402
reindex_daterange_pad                1.185000
timeseries_large_lookup_value      235.037234
dtype: float64

Better
frame_multi_and_st               0.580615
frame_multi_and                  0.597748
frame_fancy_lookup               0.792336
frame_get_dtype_counts           0.000404
frame_fancy_lookup_all           0.797565
series_string_vector_slice       0.801726
frame_reindex_upcast             0.523595
frame_reindex_axis0              0.509034
groupby_first_float32            0.043276
groupby_last_float32             0.044075
groupby_transform                0.413400
indexing_dataframe_boolean_st    0.094886
indexing_dataframe_boolean       0.095269
frame_to_csv                     0.737868
frame_to_csv2                    0.121081
frame_to_csv_mixed               0.381196
write_csv_standard               0.198206
append_frame_single_mixed        0.805745
reindex_frame_level_align        0.790248
reindex_frame_level_reindex      0.787495

2nd run

Worse
frame_multi_and_no_ne                1.322615
series_constructor_ndarray           1.284091
ctor_index_array_string              1.557522
frame_wide_repr                     47.085119
indexing_dataframe_boolean_rows      1.158785
series_getitem_scalar                1.896552
dataframe_getitem_scalar             1.214286
datamatrix_getitem_scalar            1.190476
series_align_left_monotonic          1.293073
reindex_daterange_backfill           1.192585
reindex_daterange_pad                1.176850
frame_reindex_columns                1.231402
timeseries_large_lookup_value      370.207254
dtype: float64

Better
frame_multi_and_st               0.584512
frame_multi_and                  0.592406
frame_fancy_lookup               0.794964
frame_get_dtype_counts           0.000393
series_string_vector_slice       0.793987
frame_reindex_upcast             0.562601
frame_reindex_axis0              0.512411
groupby_first_float32            0.042893
groupby_last_float32             0.043322
groupby_transform                0.412138
indexing_dataframe_boolean_st    0.093816
indexing_dataframe_boolean       0.094429
frame_to_csv                     0.721259
frame_to_csv2                    0.120910
frame_to_csv_mixed               0.399276
write_csv_standard               0.197096
append_frame_single_mixed        0.795502
reindex_frame_level_align        0.787276
reindex_frame_level_reindex      0.786307
dtype: float64


Until test_perf gets validated in its compare mode (re: the instability), here's the script I used:

#!/bin/bash

# profile current HEAD, against the commit
# specified on the command line

# assume you're running in a venv, and
# that upstream pandas is a git remote named
# "upstream"

PREV_VER=$1
THRESH=0.15
NITER=5
UPSTREAM=upstream/master

git reset --hard $UPSTREAM
H1=$(git log --format="%h" -1)
python setup.py develop
./test_perf.sh -H -N $NITER -c 1 -d $PWD/HEAD-$H1.pickle "$2"

git reset --hard $PREV_VER
H2=$(git log --format="%h" -1)
git checkout upstream/master vb_suite setup.py # bring back the updated suite and test_perf, and build cache
python setup.py develop
./test_perf.sh -H -N $NITER -c 1 -d $PWD/PREV-$H2.pickle "$2"

# back to master
git reset --hard $UPSTREAM
python setup.py develop

SCR=$(cat <<EOF
import pandas as pd
H=(pd.load("HEAD-$H1.pickle").min(1)/pd.load("PREV-$H2.pickle").min(1))
print "Worse"
print H[(H-1)>$THRESH]
print "\nBetter"
print H[(1-H)>$THRESH]
EOF
)
python -c "$SCR"
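For what it's worth, the comparison step at the end of the script (best-of-N per benchmark, then a ratio filtered by a threshold) can be sketched with in-memory frames and current pandas, where `pd.load` no longer exists; the benchmark names and numbers below are made up, purely to show the shape:

```python
import pandas as pd

# Stand-ins for the timing tables the script pickles:
# rows = benchmarks, columns = runs (made-up numbers).
head = pd.DataFrame({"run1": [1.0, 5.0], "run2": [1.1, 4.8]},
                    index=["bench_a", "bench_b"])
prev = pd.DataFrame({"run1": [1.0, 2.0], "run2": [0.9, 2.1]},
                    index=["bench_a", "bench_b"])

THRESH = 0.15
# best-of-N for each benchmark, then HEAD/PREV ratio
ratio = head.min(axis=1) / prev.min(axis=1)

worse = ratio[(ratio - 1) > THRESH]   # bench_b: 4.8 / 2.0 = 2.4
better = ratio[(1 - ratio) > THRESH]  # empty with these numbers
print(worse)
print(better)
```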

@ghost
Author

ghost commented Apr 12, 2013

Regression TODO (GH concurrent editing sucks, post a comment, I'll update): update: sorry GH, we still <3 u.

@jreback
Contributor

jreback commented Apr 12, 2013

can you show the start/end commits that you used?

@ghost
Author

ghost commented Apr 12, 2013

λ ll /tmp/pandas/r1/
total 44
drwxr-xr-x 2 user1 user1  4096 Apr 12 15:46 ./
drwxr-xr-x 4 user1 user1  4096 Apr 12 15:46 ../
-rw-r--r-- 1 user1 user1 12511 Apr 12 15:46 HEAD-4618675.pickle
-rw-r--r-- 1 user1 user1 12511 Apr 12 15:46 PREV-31ecaa9.pickle
-rw-r--r-- 1 user1 user1  1560 Apr 12 15:46 report.txt

@ghost
Author

ghost commented Apr 12, 2013

Would be good if you could indep. confirm the results on your own machine with the script above.

@jreback
Contributor

jreback commented Apr 12, 2013

did you check series_ndarray_constructor?

@lodagro
Contributor

lodagro commented Apr 12, 2013

Is there an easy way to run vbench only for one benchmark, like frame_wide_repr?

@ghost
Author

ghost commented Apr 12, 2013

In bash, from the repo root dir:

git reset --hard <commit hash> 
git checkout upstream/master vb_suite setup.py; 
python setup.py clean; python setup.py develop; # git clean -f if you're going back far enough, careful with that
./test_perf.sh -H  -N 5 -u 1  -c 5 -n 1 -r frame_wide_repr

@ghost
Author

ghost commented Apr 12, 2013

@jreback , just bisected down to the offending commit, not the cause, go for it.

@jreback
Contributor

jreback commented Apr 12, 2013

ahh... that's what the check is for!
yah... that one's going to be tough... I will look though...

@jreback
Contributor

jreback commented Apr 12, 2013

#3329 should fix ctor_index_array_string (and helps on some others actually)
my comments are getting wiped on the editing....

@ghost
Author

ghost commented Apr 12, 2013

You're right, I unchecked it.
Post a comment, don't edit.

@lodagro
Contributor

lodagro commented Apr 12, 2013

The reason frame_wide_repr takes longer compared to 0.10.1 is that now, outside interactive mode, there is never an info/wrapped repr, always the full view. Probably best to run the benchmark in the context of simulated interactive mode; I'll see if I can change it.

@ghost
Author

ghost commented Apr 12, 2013

@lodagro , that doesn't make sense, #2241 was merged for 0.9.1, and the bisected commit
8ad9598 is new to this release. The branch name in that commit points to a dtype issue.

@lodagro
Contributor

lodagro commented Apr 12, 2013

@y-p anyway, I added the test so that a wide frame repr is tested both with and without interactive mode.
It is strange that on base (v0.10.1) there is no difference between the two, given #2241.

Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
frame_wide_repr_interactive                  |   1.1680 | 276.7550 |   0.0042 |
frame_wide_repr                              | 13561.5706 | 273.1387 |  49.6509 |
-------------------------------------------------------------------------------

@jreback
Contributor

jreback commented Apr 12, 2013

@lodagro maybe %prun this and see where it's spending the time?

@jreback
Contributor

jreback commented Apr 12, 2013

repr(df) displays the concise format; how do I force it to display all rows? Which I think is what this is testing (frame_wide_repr).

@jreback
Contributor

jreback commented Apr 12, 2013

@y-p merged in a change for series_constructor_ndarray #3333 (lucky num!)

@ghost
Author

ghost commented Apr 12, 2013

Try pd.options.display.max_rows= infinity
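There is no literal infinity; in current pandas, passing `None` for `display.max_rows` is what lifts the row limit. A minimal sketch (the 100-row frame is just an example):

```python
import pandas as pd

df = pd.DataFrame({"a": range(100)})

# The default repr truncates long frames with an ellipsis row;
# max_rows=None removes the limit so every row is rendered.
with pd.option_context("display.max_rows", None):
    full = repr(df)

print("..." not in full)  # → True: no truncation marker
```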

@lodagro
Contributor

lodagro commented Apr 12, 2013

@jreback I also think that on 0.10.1 repr(df) results in a concise format (given that there is no performance difference between with/without interactive mode on v0.10.1 -- see above); on HEAD this is no longer the case, since concise formats are only used in interactive mode.

@ghost
Author

ghost commented Apr 12, 2013

@lodagro , my mistake, I bisected to the wrong changepoint (I used a collection going much further back).
The regression in wide_repr seems to be 7db1af4, which was just merged yesterday.
Can you check it out?

@jreback
Contributor

jreback commented Apr 12, 2013

One of the big timesinks in this is finding the max len of an array of strings. I have a Cython function that does this, lib.max_len_string_array (I use it in HDFStore), but it's not good for unicode. Maybe not for 0.11, but I suggest replacing the _strlen business and just doing it in Cython.
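For reference, what that helper computes — the length of the longest string in an object array — looks like this in plain Python (`max_len` below is an illustrative stand-in, not the actual Cython `lib.max_len_string_array`; note plain `len()` handles unicode for free):

```python
import numpy as np

def max_len(arr):
    """Length of the longest string in an object array.

    Pure-Python stand-in for a Cython helper; non-string
    elements are skipped rather than raising.
    """
    longest = 0
    for s in arr:
        if isinstance(s, str) and len(s) > longest:
            longest = len(s)
    return longest

values = np.array(["a", "abc", "ab"], dtype=object)
print(max_len(values))  # → 3
```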

@lodagro
Contributor

lodagro commented Apr 12, 2013

When forcing full view outside interactive mode on v0.10.1, the ratio for frame_wide_repr drops from 50 to 1.2 (still a 20% increase, though).

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
frame_wide_repr_interactive                  |   1.2130 | 259.1007 |   0.0047 |
frame_wide_repr                              | 13969.2047 | 11723.8823 |   1.1915 |
-------------------------------------------------------------------------------

So yes, 7db1af4 definitely has an impact, since I made sure that concise views are only used in interactive mode (before, this was not always the case; I mentioned this in #2881 as well, though it was not discussed further).

@ghost
Author

ghost commented Apr 12, 2013

ok, 7db1af4 made long frames print in full when in non-interactive mode,
whereas prior to that you would just get the summary, and that's the reason for the
perf difference. Was that change intentional? What's the rationale?

@ghost
Author

ghost commented Apr 12, 2013

I recall complaints about un-unix-like behaviour, in that we previously considered terminal
width in non-interactive mode; that was fixed in a prior release. Is there a mailing-list discussion
about the problem with summary views in non-interactive mode? I've seen no GH issues.

Specifically, prior to 7db1af4 the user could opt in to full display by setting display.max_rows,
but now they can't opt out, and they may be hit with a 13-second (read: noticeable) delay to boot.

@ghost
Author

ghost commented Apr 14, 2013

@lodagro , it seems the new repr code is ignoring the max_rows/max_cols options
and taking forever on large dataframes (#3337).

@jreback
Contributor

jreback commented Apr 15, 2013

@y-p can you post a rerun of what you did above when you have a chance? to just confirm the fixes?

@ghost
Author

ghost commented Apr 15, 2013

I reran test_perf with your fixes and confirmed that the mean time for the vbench, over N runs, backed down
when compared with its previous commit. Some of the tests are noisy, so you need
a respectable N and some gut feeling.

with a fresh checkout of test_perf.py

test_perf.sh -q -S -H -N 100 -r "vbench_name"

will do 100 runs of the named vbench and will print out a describe() line with
summary stats for the 100 data points.

Use the build cache (now in scripts/use_build_cache.py) to jump around revisions quickly.
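The summary line test_perf prints is essentially a pandas describe() over the per-run timings. A sketch with made-up numbers, showing why best-of-N is the more robust statistic for noisy benchmarks (a single outlier drags the mean while min stays put):

```python
import pandas as pd

# Hypothetical per-run timings (ms) for one vbench over 6 runs,
# including one noisy outlier.
timings = pd.Series([12.1, 11.9, 12.4, 30.0, 12.0, 12.2])

stats = timings.describe()  # count/mean/std/min/quartiles/max
print(stats)
print("min:", timings.min(), "mean:", round(timings.mean(), 2))
```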

@lodagro
Contributor

lodagro commented Apr 15, 2013

@y-p yes, the change was intentional. I had in mind the same 'un-unix-like' behavior discussion (#1610), and I disabled concise formats entirely in non-interactive mode. By setting pd.options.mode.sim_interactive = True, interactive mode can be enforced.

@ghost
Author

ghost commented Apr 15, 2013

We need to decide where this discussion is taking place.
The sim_interactive = True idea makes no sense at all; it's a testing feature.

@ghost
Author

ghost commented Apr 15, 2013

#1610 was about console width and made sense (I implemented the fix); I disagree that the same
argument applies on the row side of things.

@ghost
Author

ghost commented Apr 15, 2013

In fact, if you read #1610, it basically states the exact opposite of what you're suggesting.

@lodagro
Contributor

lodagro commented Apr 15, 2013

#1610 requests that the default in non-interactive mode be no limit on width/max_columns; this is still the case, and I extended it to rows as well. The user can also change it.

For me it's no problem to keep only a vertical limit, so that in non-interactive mode there can be a limit on height/max_rows but not on width/max_columns; this would fix the script mentioned in #3337.

@ghost
Author

ghost commented Apr 15, 2013

The problem was that we were relying on terminal dimensions in non-interactive mode.
That's a different matter than ignoring display options such as max_rows; I think
those should be respected at all times. Users are free to set them according to their needs,
and I'm sure some users who want long-form output have already done so in their scripts.

From #1610
"""
In non-interactive usage the stdout should depend only on the print options currently in effect.
"""

I agree.

@lodagro
Contributor

lodagro commented Apr 15, 2013

So both max_rows and max_columns should always be used, and the options display.width, display.height, and display.expand_frame_repr only in interactive mode.

@ghost
Author

ghost commented Apr 15, 2013

max_rows and max_columns should always be respected, just like in v0.10.1.
expand_frame_repr too, I think, since users might like to have output
folded across the page even in scripts, and they can opt out.

height and width are new options, so we're free to choose their meaning.
Since they're disabled by default, I think they should also always be respected:
if a user explicitly sets them in a script, why not?

Terminal height/width detection should play no role in non-interactive mode.

That's the breakdown as I see it.

@ghost
Author

ghost commented Apr 20, 2013

remaining issue to be discussed/closed in #3395

@ghost ghost closed this as completed Apr 20, 2013