Add value_counts implementation for Series and as free function #1535

Merged


YarShev
Collaborator

@YarShev YarShev commented Jun 4, 2020

Signed-off-by: Yaroslav Igoshev <yaroslav.igoshev@intel.com>

What do these changes do?

@YarShev YarShev added this to the 0.7.4 milestone Jun 4, 2020
@YarShev YarShev self-assigned this Jun 4, 2020
@YarShev YarShev changed the title Add value_counts implementation for both Series and as free function Add value_counts implementation for Series and as free function Jun 4, 2020
@YarShev YarShev linked an issue Jun 4, 2020 that may be closed by this pull request
@codecov

codecov bot commented Jun 4, 2020

Codecov Report

Merging #1535 into master will increase coverage by 0.07%.
The diff coverage is 92.64%.


@@            Coverage Diff             @@
##           master    #1535      +/-   ##
==========================================
+ Coverage   88.12%   88.20%   +0.07%     
==========================================
  Files          71       71              
  Lines        7023     7104      +81     
==========================================
+ Hits         6189     6266      +77     
- Misses        834      838       +4     
Impacted Files                                          Coverage Δ
modin/pandas/__init__.py                                88.00% <ø> (ø)
modin/engines/base/frame/data.py                        94.08% <77.77%> (-0.31%) ⬇️
modin/pandas/series.py                                  94.20% <85.71%> (+0.01%) ⬆️
modin/backends/pandas/query_compiler.py                 95.71% <95.83%> (+0.05%) ⬆️
modin/backends/base/query_compiler.py                   100.00% <100.00%> (ø)
...din/data_management/functions/mapreducefunction.py   100.00% <100.00%> (ø)
modin/pandas/general.py                                 96.00% <100.00%> (+0.16%) ⬆️
... and 2 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b0ad9da...28daea3.

@YarShev YarShev force-pushed the dev/yigoshev-value_counts branch 5 times, most recently from e6fd2f2 to 3452dc1 Compare June 4, 2020 12:37
Signed-off-by: Yaroslav Igoshev <yaroslav.igoshev@intel.com>
Collaborator

@devin-petersohn devin-petersohn left a comment


I think value_counts can be implemented with groupby, what do you think?

self.groupby(self, as_index=True).count()

@YarShev
Collaborator Author

YarShev commented Jun 4, 2020

@devin-petersohn, if we implement value_counts using groupby, how would we support the input parameters of value_counts? I don't see how yet. In addition, count counts only non-NA cells, whereas value_counts can include NA cells. So I think we can keep the current approach.
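
For reference, the difference described above can be reproduced with plain pandas. This is a minimal sketch, not code from this PR; the sample Series and the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data with two missing values (not taken from the PR).
s = pd.Series(["a", "b", "a", np.nan, "b", "a", np.nan])

# groupby-based counting: rows whose key is NaN are dropped by default,
# so there is no bucket for missing values.
by_groupby = s.groupby(s).count()

# value_counts can keep a bucket for missing values via dropna=False.
with_na = s.value_counts(dropna=False)

# value_counts also accepts parameters that a bare count() does not,
# for example normalize (relative frequencies over non-NA values here).
normalized = s.value_counts(normalize=True)

print(by_groupby)
print(with_na)
print(normalized)
```

The groupby result has two entries (`a` and `b`), while `value_counts(dropna=False)` has three, including the NaN bucket.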

@YarShev YarShev force-pushed the dev/yigoshev-value_counts branch 2 times, most recently from 5789e52 to 4689672 Compare June 5, 2020 07:48
@YarShev YarShev force-pushed the dev/yigoshev-value_counts branch from 4689672 to 8b2c4aa Compare June 5, 2020 07:48
@YarShev
Collaborator Author

YarShev commented Jun 5, 2020

It looks like the containerized tests failed because of a syntax error: /localdisk/tc_agent/temp/agentTmp/custom_script12880017450402304134: line 16: syntax error: unexpected end of file

@gshimansky
Collaborator

> looks like the containerized tests are failed because of syntax error in /localdisk/tc_agent/temp/agentTmp/custom_script12880017450402304134: line 16: syntax error: unexpected end of file

Yes. The containerized tests didn't post logs to the PR comments because the log file was created outside of the container. I tried to fix that but broke all test runs. I am working on it.

@YarShev
Collaborator Author

YarShev commented Jun 5, 2020

@gshimansky, OK, thanks.

Collaborator

@devin-petersohn devin-petersohn left a comment


I see. Is it possible to do this without _apply_full_axis? It is quite expensive and should only be used when absolutely required.

Something like this in the query compiler:

value_counts = MapReduceFunction(lambda x: x.squeeze().value_counts(**kwargs), lambda x: x.groupby(x.index).sum())
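
The map/reduce idea in the snippet above can be sketched with plain pandas, outside of Modin. This is only an illustration under stated assumptions: the helper names `map_value_counts` and `reduce_value_counts` and the manual two-way split are hypothetical, and this is not Modin's `MapReduceFunction` API or the code merged in this PR.

```python
import pandas as pd


def map_value_counts(partition, **kwargs):
    # Map step: count values inside one partition independently.
    return partition.value_counts(**kwargs)


def reduce_value_counts(partial_counts):
    # Reduce step: stack the partial counts, then sum the counts
    # that share the same label.
    combined = pd.concat(partial_counts)
    totals = combined.groupby(combined.index).sum()
    return totals.sort_values(ascending=False)


s = pd.Series([1, 2, 2, 3, 3, 3, 1, 3])

# Hypothetical row partitions standing in for a distributed frame.
partitions = [s.iloc[:4], s.iloc[4:]]

result = reduce_value_counts([map_value_counts(p) for p in partitions])
print(result)
```

In this sketch, each partition is counted on its own, so no step needs the whole column at once. Two caveats apply to a reduce step written this way: groupby drops NaN labels by default, so a NaN bucket produced with `dropna=False` would be lost, and summing per-partition results from `normalize=True` does not give overall frequencies.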

@modin-bot

modin-bot commented Jun 5, 2020

TeamCity Python test results bot

Tests PASSed

Tests Logs
============================= test session starts ==============================
platform linux -- Python 3.7.7, pytest-5.4.3, py-1.8.1, pluggy-0.13.1
rootdir: /modin, inifile: setup.cfg
plugins: openfiles-0.5.0, remotedata-0.3.2, cov-2.10.0, custom-exit-code-0.3.0, forked-1.1.3, testmon-1.0.2, xdist-1.32.0
collected 90 items

modin/pandas/test/test_io.py .................s.........s............... [ 47%]
s..............s..s.X.....s.................ss.                          [100%]

----------- coverage: platform linux, python 3.7.7-final-0 -----------
Name                                                               Stmts   Miss  Cover
--------------------------------------------------------------------------------------
modin/__init__.py                                                     68     31    54%
modin/_version.py                                                    272    172    37%
modin/apply_license_header.py                                         19     19     0%
modin/backends/__init__.py                                             0      0   100%
modin/backends/base/__init__.py                                        0      0   100%
modin/backends/base/query_compiler.py                                128      1    99%
modin/backends/pandas/__init__.py                                      0      0   100%
modin/backends/pandas/parsers.py                                     118     91    23%
modin/backends/pandas/query_compiler.py                              624    356    43%
modin/data_management/__init__.py                                      0      0   100%
modin/data_management/dispatcher.py                                   78     16    79%
modin/data_management/factories.py                                    84     26    69%
modin/data_management/functions/__init__.py                            7      0   100%
modin/data_management/functions/binary_function.py                    21     14    33%
modin/data_management/functions/foldfunction.py                        6      1    83%
modin/data_management/functions/function.py                            6      1    83%
modin/data_management/functions/groupby_function.py                   55     49    11%
modin/data_management/functions/mapfunction.py                         6      1    83%
modin/data_management/functions/mapreducefunction.py                   9      2    78%
modin/data_management/functions/reductionfunction.py                   6      1    83%
modin/data_management/utils.py                                        32     12    62%
modin/engines/__init__.py                                              0      0   100%
modin/engines/base/__init__.py                                         0      0   100%
modin/engines/base/frame/__init__.py                                   0      0   100%
modin/engines/base/frame/axis_partition.py                            45     20    56%
modin/engines/base/frame/data.py                                     417    261    37%
modin/engines/base/frame/partition.py                                  1      0   100%
modin/engines/base/frame/partition_manager.py                        159     88    45%
modin/engines/base/io/__init__.py                                     11      0   100%
modin/engines/base/io/column_stores/__init__.py                        0      0   100%
modin/engines/base/io/column_stores/column_store_reader.py            40     29    28%
modin/engines/base/io/column_stores/feather_reader.py                  9      5    44%
modin/engines/base/io/column_stores/hdf_reader.py                      3      0   100%
modin/engines/base/io/column_stores/parquet_reader.py                 34     29    15%
modin/engines/base/io/file_reader.py                                  85     66    22%
modin/engines/base/io/io.py                                          108      6    94%
modin/engines/base/io/sql/__init__.py                                  0      0   100%
modin/engines/base/io/sql/sql_reader.py                               39     31    21%
modin/engines/base/io/text/__init__.py                                 0      0   100%
modin/engines/base/io/text/csv_reader.py                             112    105     6%
modin/engines/base/io/text/fwf_reader.py                             115    108     6%
modin/engines/base/io/text/json_reader.py                             50     43    14%
modin/engines/base/io/text/text_file_reader.py                        26     18    31%
modin/engines/base/series/__init__.py                                  0      0   100%
modin/engines/dask/__init__.py                                         0      0   100%
modin/engines/dask/pandas_on_dask/__init__.py                          0      0   100%
modin/engines/dask/pandas_on_dask/frame/__init__.py                    0      0   100%
modin/engines/dask/pandas_on_dask/frame/axis_partition.py             27     27     0%
modin/engines/dask/pandas_on_dask/frame/data.py                       15     15     0%
modin/engines/dask/pandas_on_dask/frame/partition.py                  73     73     0%
modin/engines/dask/pandas_on_dask/frame/partition_manager.py          44     44     0%
modin/engines/dask/pandas_on_dask/io.py                               16     16     0%
modin/engines/dask/pandas_on_dask/series/__init__.py                   0      0   100%
modin/engines/dask/task_wrapper.py                                     9      9     0%
modin/engines/python/__init__.py                                       0      0   100%
modin/engines/python/pandas_on_python/__init__.py                      0      0   100%
modin/engines/python/pandas_on_python/frame/__init__.py                0      0   100%
modin/engines/python/pandas_on_python/frame/axis_partition.py         14      0   100%
modin/engines/python/pandas_on_python/frame/data.py                    4      0   100%
modin/engines/python/pandas_on_python/frame/partition.py              63      5    92%
modin/engines/python/pandas_on_python/frame/partition_manager.py       7      0   100%
modin/engines/python/pandas_on_python/io.py                            6      0   100%
modin/engines/python/pandas_on_python/series/__init__.py               0      0   100%
modin/engines/ray/__init__.py                                          0      0   100%
modin/engines/ray/generic/__init__.py                                  0      0   100%
modin/engines/ray/generic/frame/__init__.py                            0      0   100%
modin/engines/ray/generic/frame/partition_manager.py                  10     10     0%
modin/engines/ray/generic/io.py                                       14     14     0%
modin/engines/ray/generic/series/__init__.py                           0      0   100%
modin/engines/ray/pandas_on_ray/__init__.py                            0      0   100%
modin/engines/ray/pandas_on_ray/frame/__init__.py                      0      0   100%
modin/engines/ray/pandas_on_ray/frame/axis_partition.py               22     22     0%
modin/engines/ray/pandas_on_ray/frame/data.py                         11     11     0%
modin/engines/ray/pandas_on_ray/frame/partition.py                    84     84     0%
modin/engines/ray/pandas_on_ray/frame/partition_manager.py            43     43     0%
modin/engines/ray/pandas_on_ray/io.py                                 17     17     0%
modin/engines/ray/pandas_on_ray/series/__init__.py                     0      0   100%
modin/engines/ray/task_wrapper.py                                      7      7     0%
modin/engines/ray/utils.py                                            10     10     0%
modin/error_message.py                                                22      5    77%
modin/experimental/__init__.py                                         0      0   100%
modin/experimental/engines/__init__.py                                 0      0   100%
modin/experimental/engines/pandas_on_ray/__init__.py                   0      0   100%
modin/experimental/engines/pandas_on_ray/io_exp.py                    38     38     0%
modin/experimental/engines/pandas_on_ray/sql.py                       66     66     0%
modin/experimental/pandas/__init__.py                                  6      6     0%
modin/experimental/pandas/io_exp.py                                    7      7     0%
modin/pandas/__init__.py                                              75     42    44%
modin/pandas/base.py                                                1048    808    23%
modin/pandas/concat.py                                                42     36    14%
modin/pandas/dataframe.py                                            842    652    23%
modin/pandas/datetimes.py                                              7      3    57%
modin/pandas/general.py                                               50     32    36%
modin/pandas/groupby.py                                              287    197    31%
modin/pandas/indexing.py                                             185    185     0%
modin/pandas/io.py                                                   147     11    93%
modin/pandas/iterator.py                                              17     11    35%
modin/pandas/reshape.py                                               30     20    33%
modin/pandas/series.py                                               896    578    35%
modin/pandas/utils.py                                                 28      3    89%
--------------------------------------------------------------------------------------
TOTAL                                                               7082   4709    34%


=========== 81 passed, 8 skipped, 1 xpassed, 137 warnings in 44.77s ============
Closing remaining open files:test_write_modin.hdf...donetest_write_pandas.hdf...done
============================= test session starts ==============================
platform linux -- Python 3.7.7, pytest-5.4.3, py-1.8.1, pluggy-0.13.1
rootdir: /modin, inifile: setup.cfg
plugins: openfiles-0.5.0, remotedata-0.3.2, cov-2.10.0, custom-exit-code-0.3.0, forked-1.1.3, testmon-1.0.2, xdist-1.32.0
gw0 I / gw1 I / gw2 I / gw3 I / gw4 I / gw5 I / gw6 I / gw7 I / gw8 I / gw9 I / gw10 I / gw11 I / gw12 I / gw13 I / gw14 I / gw15 I / gw16 I / gw17 I / gw18 I / gw19 I / gw20 I / gw21 I / gw22 I / gw23 I / gw24 I / gw25 I / gw26 I / gw27 I / gw28 I / gw29 I / gw30 I / gw31 I / gw32 I / gw33 I / gw34 I / gw35 I / gw36 I / gw37 I / gw38 I / gw39 I / gw40 I / gw41 I / gw42 I / gw43 I / gw44 I / gw45 I / gw46 I / gw47 I
gw0 [18512] / gw1 [18512] / gw2 [18512] / gw3 [18512] / gw4 [18512] / gw5 [18512] / gw6 [18512] / gw7 [18512] / gw8 [18512] / gw9 [18512] / gw10 [18512] / gw11 [18512] / gw12 [18512] / gw13 [18512] / gw14 [18512] / gw15 [18512] / gw16 [18512] / gw17 [18512] / gw18 [18512] / gw19 [18512] / gw20 [18512] / gw21 [18512] / gw22 [18512] / gw23 [18512] / gw24 [18512] / gw25 [18512] / gw26 [18512] / gw27 [18512] / gw28 [18512] / gw29 [18512] / gw30 [18512] / gw31 [18512] / gw32 [18512] / gw33 [18512] / gw34 [18512] / gw35 [18512] / gw36 [18512] / gw37 [18512] / gw38 [18512] / gw39 [18512] / gw40 [18512] / gw41 [18512] / gw42 [18512] / gw43 [18512] / gw44 [18512] / gw45 [18512] / gw46 [18512] / gw47 [18512]

........................................................................ [  0%]
........................................................................ [  0%]
........................................................................ [  1%]
........................................................................ [  1%]
........................................................................ [  1%]
........................................................................ [  2%]
........................................................................ [  2%]
........................................................................ [  3%]
..............x..............................x.......................... [  3%]
....X..........................................X........................ [  3%]
............X...........................X..................X.X.....X.... [  4%]
.....X....................X.X..............X...XX....................... [  4%]
......X......Xs........X..................X............X...........x.... [  5%]
XX...............X.X.............................X...X..X....XX...X...X. [  5%]
Xx.x..........XX.........X..X..X...X.X.....X.X....X..X....X...XX.X...X.. [  5%]
XX...X..XXX.X......X..XX.....X.XX....X.X.....XX.XXX..XX....X.X.XXX..Xs.. [  6%]
XX..XXXX.......X.XX.X...X.X....X..X......X.XX.XX..X...X.....X...Xx.X.XX. [  6%]
X.....X.X...X....x..X.x...X.XXX.X...X...X........X..X..xXX.........X..X. [  7%]
.........X.X........s.....X......X....X.....X.................X........X [  7%]
.......XxX...X.......X.........xx.........x......X....x.X....Xx.....XX.. [  7%]
....x.X.x...x..xx..XX....XX..XX...X.x.X.x...XX....xXX...XXXx.X.XX..X.... [  8%]
...X....x.XX...x.XX....XX.xxX....X....X..X..XXX.X.XX....Xxx.X.....X..XX. [  8%]
.XXX..XX.X....X.X.XX...XX.X...X.....X..XXX..X.XX...XX....X...X..XX.X.X.X [  8%]
XX........X..XXX..X..XXX.X...X.XXX....X...x.X..X..XXX..X.XX.....X...XXXX [  9%]
X.....XX..Xx....XXX.XXX.X.X...XXX.X........X..X..X..X...XXXXXXX...X...XX [  9%]
.....X..XX.XX.XX....XX...X..X.XX.XXX.......X....X...X..X.XXxX.X....XX... [ 10%]
X...XX....X.X..X.XXx.XXXXX.X....x.....X...X........XX.X.....X.X.XXX....X [ 10%]
.X..X.....XX....X....XX......XX....XX....................XX....X........ [ 10%]
..X...X......X........X..........X.X......x..x......X......x..X.......X. [ 11%]
X..X.X........XX.xX..X.....X..................X.....XXXX......X......... [ 11%]
X.X.X....................X............X......X.XX........X..........X..X [ 12%]
x...............XXXX.....X..Xx....x.......X.......xXX.X.....X.X..X.....X [ 12%]
....xX.....X...Xxx...x....X...x.X...x..X...x..X.X..x.........x...XXx.... [ 12%]
x.XXX.....X..XX.......xX..xX.....x....X..X....X..XX...X.XX.x..X......... [ 13%]
....X...XX.X.X......X..XX.....XXX....Xs....X.X.X..X.....X..XX.x.....X... [ 13%]
X........X.X..Xx.......XX....X....X....XX......X........................ [ 14%]
.X........X......X..........X....................X..x........X.......... [ 14%]
.............X.......................................................... [ 14%]
...............x....................x.................xX................ [ 15%]
...............................X........................x............... [ 15%]
.X...............................x..X.........................X......... [ 15%]
........................................................................ [ 16%]
..........................................................x............. [ 16%]
........................................................................ [ 17%]
....................................x.....................x............. [ 17%]
......................x................................................. [ 17%]
.......x.............x.x......................x....................s.... [ 18%]
................x...................................X....x.............. [ 18%]
.....................................X.................................. [ 19%]
...x...X...............................................x...........X.... [ 19%]
................x..............................X..........x............. [ 19%]
........................................................................ [ 20%]
....................................................x................... [ 20%]
........................................................................ [ 21%]
........................................................................ [ 21%]
........................................................................ [ 21%]
........................................................................ [ 22%]
.....x.................................................................. [ 22%]
........................................................................ [ 22%]
........................................................................ [ 23%]
........................................................................ [ 23%]
........................................................................ [ 24%]
.....................................s.................................. [ 24%]
........................................................................ [ 24%]
........................................................................ [ 25%]
........................................................................ [ 25%]
........................................................................ [ 26%]
........................................................................ [ 26%]
........................................................................ [ 26%]
........................................................................ [ 27%]
........................................................................ [ 27%]
........................................................................ [ 28%]
........................................................................ [ 28%]
........................................................................ [ 28%]
.............................x....................X..................... [ 29%]
..X...............X.................X..............X.................... [ 29%]
.......xx................X...X............x..X......X.X................X [ 29%]
XX..........X......X.X.......X.....XX...............X....x........X..... [ 30%]
X...........X...............X...............x....X...X............X..... [ 30%]
.....X..............X......X...........................................x [ 31%]
....x....X....X.x.....X.X.X....X..........X........X...XX..x...X...X..XX [ 31%]
.X...X.........................................X........................ [ 31%]
.....................X......X.....X..................................... [ 32%]
........................................................................ [ 32%]
........................................................................ [ 33%]
........................................................................ [ 33%]
........................................................................ [ 33%]
........................................................................ [ 34%]
......................................................................... [ 34%]
........................................................................ [ 35%]
......................................................................... [ 35%]
........................................................................ [ 35%]
........................................................................ [ 36%]
.......................................................................... [ 36%]
........................................................................ [ 36%]
..................................................................X...... [ 37%]
......................................................................... [ 37%]
........................................................................ [ 38%]
.......................................................................... [ 38%]
........................................................................ [ 38%]
......................................................................... [ 39%]
........................................................................ [ 39%]
.......................................................................... [ 40%]
...................ssss..ssss.ssss.ssss.s.s............................. [ 40%]
........................................................................ [ 40%]
......................................................................... [ 41%]
......................................................................... [ 41%]
........................................................................ [ 42%]
......................................................................... [ 42%]
...X..........X........X.....X............X.......X..................... [ 42%]
........................................................................ [ 43%]
........................................................................ [ 43%]
......................................................................... [ 44%]
......................................................................... [ 44%]
........................................................................ [ 44%]
......................................................................... [ 45%]
........................................................................ [ 45%]
......................................................................... [ 45%]
......................................................................... [ 46%]
........................................................................ [ 46%]
........................................................................ [ 47%]
........................................................................ [ 47%]
........................................................................ [ 47%]
........................................................................ [ 48%]
........................................................................ [ 48%]
........................................................................ [ 49%]
........................................................................ [ 49%]
......................................................................... [ 49%]
......................................................................... [ 50%]
........................................................................ [ 50%]
........................................................................ [ 51%]
......................................................................... [ 51%]
........................................................................ [ 51%]
......................................................................... [ 52%]
........................................................................ [ 52%]
........................................................................ [ 53%]
........................................................................ [ 53%]
........................................................................ [ 53%]
........................................................................ [ 54%]
........................................................................ [ 54%]
........................................................................ [ 54%]
........................................................................ [ 55%]
........................................................................ [ 55%]
........................................................................ [ 56%]
........................................................................ [ 56%]
........................................................................ [ 56%]
........................................................................ [ 57%]
......................................................................... [ 57%]
........................................................................ [ 58%]
........................................................................ [ 58%]
......................................................................... [ 58%]
........................................................................ [ 59%]
.......................................................................... [ 59%]
........................................................................ [ 60%]
......................................................................... [ 60%]
........................................................................ [ 60%]
......................................................................... [ 61%]
........................................................................ [ 61%]
........................................................................ [ 61%]
........................................................................ [ 62%]
......................................................................... [ 62%]
........................................................................ [ 63%]
........................................................................ [ 63%]
........................................................................ [ 63%]
........................................................................ [ 64%]
........................................................................ [ 64%]
........................................................................ [ 65%]
......................................................................... [ 65%]
........................................................................ [ 65%]
......................................................................... [ 66%]
........................................................................ [ 66%]
........................................................................ [ 67%]
........................................................................ [ 67%]
........................................................................ [ 67%]
........................................................................ [ 68%]
........................................................................ [ 68%]
........................................................................ [ 69%]
......................................................................... [ 69%]
......................................................................... [ 69%]
.......................................................................... [ 70%]
......................................................................... [ 70%]
.......................................................................... [ 70%]
........................................................................ [ 71%]
........................................................................ [ 71%]
........................................................................ [ 72%]
........................................................................ [ 72%]
........................................................................ [ 72%]
........................................................................ [ 73%]
........................................................................ [ 73%]
........................................................................ [ 74%]
........................................................................ [ 74%]
......................................................................... [ 74%]
........................................................................ [ 75%]
........................................................................ [ 75%]
........................................................................ [ 76%]
........................................................................ [ 76%]
......................................................................... [ 76%]
........................................................................ [ 77%]
........................................................................ [ 77%]
......................................................................... [ 78%]
........................................................................ [ 78%]
........................................................................ [ 78%]
........................................................................ [ 79%]
......................................................................... [ 79%]
........................................................................ [ 79%]
........................................................................ [ 80%]
........................................................................ [ 80%]
......................................................................... [ 81%]
........................................................................ [ 81%]
........................................................................ [ 81%]
........................................................................ [ 82%]
........................................................................ [ 82%]
........................................................................ [ 83%]
........................................................................ [ 83%]
.......................................................................... [ 83%]
......................................................................... [ 84%]
......................................................................... [ 84%]
........................................................................ [ 85%]
........................................................................ [ 85%]
........................................................................ [ 85%]
........................................................................ [ 86%]
........................................................................ [ 86%]
........................................................................ [ 86%]
........................................................................ [ 87%]
........................................................................ [ 87%]
........................................................................ [ 88%]
........................................................................ [ 88%]
........................................................................ [ 88%]
........................................................................ [ 89%]
........................................................................ [ 89%]
........................................................................ [ 90%]
........................................................................ [ 90%]
........................................s...ssss..s..................... [ 90%]
........................................................................ [ 91%]
....................xs......X.......X....x...X.......X......x.......x... [ 91%]
........................................................................ [ 92%]
......................................................................... [ 92%]
.......................................................................... [ 92%]
........................................................................ [ 93%]
........................................................................ [ 93%]
........................................................................ [ 94%]
........................................................................ [ 94%]
......................................................................... [ 94%]
........................................................................ [ 95%]
........................................................................ [ 95%]
........................................................................ [ 95%]
........................................................................ [ 96%]
........................................................................ [ 96%]
...........xx........xxxxx.........................................x.... [ 97%]
........................................................................ [ 97%]
........................................................................ [ 97%]
........................................................................ [ 98%]
........................................................................ [ 98%]
........................................................................ [ 99%]
........................................................................ [ 99%]
........................................................................ [ 99%]
............................                                             [100%]

----------- coverage: platform linux, python 3.7.7-final-0 -----------
Name                                                               Stmts   Miss  Cover
--------------------------------------------------------------------------------------
modin/__init__.py                                                     68     31    54%
modin/_version.py                                                    272    172    37%
modin/apply_license_header.py                                         19     19     0%
modin/backends/__init__.py                                             0      0   100%
modin/backends/base/__init__.py                                        0      0   100%
modin/backends/base/query_compiler.py                                128      0   100%
modin/backends/pandas/__init__.py                                      0      0   100%
modin/backends/pandas/parsers.py                                     118     87    26%
modin/backends/pandas/query_compiler.py                              624     27    96%
modin/data_management/__init__.py                                      0      0   100%
modin/data_management/dispatcher.py                                   78     16    79%
modin/data_management/factories.py                                    84     26    69%
modin/data_management/functions/__init__.py                            7      0   100%
modin/data_management/functions/binary_function.py                    21      0   100%
modin/data_management/functions/foldfunction.py                        6      0   100%
modin/data_management/functions/function.py                            6      1    83%
modin/data_management/functions/groupby_function.py                   55      6    89%
modin/data_management/functions/mapfunction.py                         6      0   100%
modin/data_management/functions/mapreducefunction.py                   9      0   100%
modin/data_management/functions/reductionfunction.py                   6      0   100%
modin/data_management/utils.py                                        32      0   100%
modin/engines/__init__.py                                              0      0   100%
modin/engines/base/__init__.py                                         0      0   100%
modin/engines/base/frame/__init__.py                                   0      0   100%
modin/engines/base/frame/axis_partition.py                            45     10    78%
modin/engines/base/frame/data.py                                     417     25    94%
modin/engines/base/frame/partition.py                                  1      0   100%
modin/engines/base/frame/partition_manager.py                        159     20    87%
modin/engines/base/io/__init__.py                                     11      0   100%
modin/engines/base/io/column_stores/__init__.py                        0      0   100%
modin/engines/base/io/column_stores/column_store_reader.py            40     29    28%
modin/engines/base/io/column_stores/feather_reader.py                  9      5    44%
modin/engines/base/io/column_stores/hdf_reader.py                      3      0   100%
modin/engines/base/io/column_stores/parquet_reader.py                 34     29    15%
modin/engines/base/io/file_reader.py                                  85     66    22%
modin/engines/base/io/io.py                                          108      6    94%
modin/engines/base/io/sql/__init__.py                                  0      0   100%
modin/engines/base/io/sql/sql_reader.py                               39     31    21%
modin/engines/base/io/text/__init__.py                                 0      0   100%
modin/engines/base/io/text/csv_reader.py                             112    105     6%
modin/engines/base/io/text/fwf_reader.py                             115    108     6%
modin/engines/base/io/text/json_reader.py                             50     43    14%
modin/engines/base/io/text/text_file_reader.py                        26     18    31%
modin/engines/base/series/__init__.py                                  0      0   100%
modin/engines/dask/__init__.py                                         0      0   100%
modin/engines/dask/pandas_on_dask/__init__.py                          0      0   100%
modin/engines/dask/pandas_on_dask/frame/__init__.py                    0      0   100%
modin/engines/dask/pandas_on_dask/frame/axis_partition.py             27     27     0%
modin/engines/dask/pandas_on_dask/frame/data.py                       15     15     0%
modin/engines/dask/pandas_on_dask/frame/partition.py                  73     73     0%
modin/engines/dask/pandas_on_dask/frame/partition_manager.py          44     44     0%
modin/engines/dask/pandas_on_dask/io.py                               16     16     0%
modin/engines/dask/pandas_on_dask/series/__init__.py                   0      0   100%
modin/engines/dask/task_wrapper.py                                     9      9     0%
modin/engines/python/__init__.py                                       0      0   100%
modin/engines/python/pandas_on_python/__init__.py                      0      0   100%
modin/engines/python/pandas_on_python/frame/__init__.py                0      0   100%
modin/engines/python/pandas_on_python/frame/axis_partition.py         14      0   100%
modin/engines/python/pandas_on_python/frame/data.py                    4      0   100%
modin/engines/python/pandas_on_python/frame/partition.py              63      4    94%
modin/engines/python/pandas_on_python/frame/partition_manager.py       7      0   100%
modin/engines/python/pandas_on_python/io.py                            6      0   100%
modin/engines/python/pandas_on_python/series/__init__.py               0      0   100%
modin/engines/ray/__init__.py                                          0      0   100%
modin/engines/ray/generic/__init__.py                                  0      0   100%
modin/engines/ray/generic/frame/__init__.py                            0      0   100%
modin/engines/ray/generic/frame/partition_manager.py                  10     10     0%
modin/engines/ray/generic/io.py                                       14     14     0%
modin/engines/ray/generic/series/__init__.py                           0      0   100%
modin/engines/ray/pandas_on_ray/__init__.py                            0      0   100%
modin/engines/ray/pandas_on_ray/frame/__init__.py                      0      0   100%
modin/engines/ray/pandas_on_ray/frame/axis_partition.py               22     22     0%
modin/engines/ray/pandas_on_ray/frame/data.py                         11     11     0%
modin/engines/ray/pandas_on_ray/frame/partition.py                    84     84     0%
modin/engines/ray/pandas_on_ray/frame/partition_manager.py            43     43     0%
modin/engines/ray/pandas_on_ray/io.py                                 17     17     0%
modin/engines/ray/pandas_on_ray/series/__init__.py                     0      0   100%
modin/engines/ray/task_wrapper.py                                      7      7     0%
modin/engines/ray/utils.py                                            10     10     0%
modin/error_message.py                                                22      2    91%
modin/experimental/__init__.py                                         0      0   100%
modin/experimental/engines/__init__.py                                 0      0   100%
modin/experimental/engines/pandas_on_ray/__init__.py                   0      0   100%
modin/experimental/engines/pandas_on_ray/io_exp.py                    38     38     0%
modin/experimental/engines/pandas_on_ray/sql.py                       66     66     0%
modin/experimental/pandas/__init__.py                                  6      6     0%
modin/experimental/pandas/io_exp.py                                    7      7     0%
modin/pandas/__init__.py                                              75     42    44%
modin/pandas/base.py                                                1048     51    95%
modin/pandas/concat.py                                                42      5    88%
modin/pandas/dataframe.py                                            842     96    89%
modin/pandas/datetimes.py                                              7      0   100%
modin/pandas/general.py                                               50      2    96%
modin/pandas/groupby.py                                              287     31    89%
modin/pandas/indexing.py                                             185     30    84%
modin/pandas/io.py                                                   147     11    93%
modin/pandas/iterator.py                                              17      0   100%
modin/pandas/reshape.py                                               30      0   100%
modin/pandas/series.py                                               896     54    94%
modin/pandas/utils.py                                                 28      3    89%
--------------------------------------------------------------------------------------
TOTAL                                                               7082   1730    76%

= 17862 passed, 31 skipped, 104 xfailed, 515 xpassed, 32109 warnings in 849.17s (0:14:09) =

@modin-bot

modin-bot commented Jun 5, 2020

TeamCity Dask test results bot

Tests PASSed

Tests Logs
============================= test session starts ==============================
platform linux -- Python 3.7.7, pytest-5.4.3, py-1.8.1, pluggy-0.13.1
rootdir: /modin, inifile: setup.cfg
plugins: openfiles-0.5.0, remotedata-0.3.2, cov-2.10.0, custom-exit-code-0.3.0, forked-1.1.3, testmon-1.0.2, xdist-1.32.0
collected 90 items

modin/pandas/test/test_io.py .................s.........s............... [ 47%]
...............s..s.X.....s.................ss.                          [100%]

----------- coverage: platform linux, python 3.7.7-final-0 -----------
Name                                                               Stmts   Miss  Cover
--------------------------------------------------------------------------------------
modin/__init__.py                                                     68     31    54%
modin/_version.py                                                    272    172    37%
modin/apply_license_header.py                                         19     19     0%
modin/backends/__init__.py                                             0      0   100%
modin/backends/base/__init__.py                                        0      0   100%
modin/backends/base/query_compiler.py                                128      1    99%
modin/backends/pandas/__init__.py                                      0      0   100%
modin/backends/pandas/parsers.py                                     118     60    49%
modin/backends/pandas/query_compiler.py                              624    352    44%
modin/data_management/__init__.py                                      0      0   100%
modin/data_management/dispatcher.py                                   78     16    79%
modin/data_management/factories.py                                    84     26    69%
modin/data_management/functions/__init__.py                            7      0   100%
modin/data_management/functions/binary_function.py                    21     14    33%
modin/data_management/functions/foldfunction.py                        6      1    83%
modin/data_management/functions/function.py                            6      1    83%
modin/data_management/functions/groupby_function.py                   55     49    11%
modin/data_management/functions/mapfunction.py                         6      1    83%
modin/data_management/functions/mapreducefunction.py                   9      2    78%
modin/data_management/functions/reductionfunction.py                   6      1    83%
modin/data_management/utils.py                                        32     14    56%
modin/engines/__init__.py                                              0      0   100%
modin/engines/base/__init__.py                                         0      0   100%
modin/engines/base/frame/__init__.py                                   0      0   100%
modin/engines/base/frame/axis_partition.py                            45     31    31%
modin/engines/base/frame/data.py                                     417    226    46%
modin/engines/base/frame/partition.py                                  1      0   100%
modin/engines/base/frame/partition_manager.py                        159     80    50%
modin/engines/base/io/__init__.py                                     11      0   100%
modin/engines/base/io/column_stores/__init__.py                        0      0   100%
modin/engines/base/io/column_stores/column_store_reader.py            40      0   100%
modin/engines/base/io/column_stores/feather_reader.py                  9      0   100%
modin/engines/base/io/column_stores/hdf_reader.py                      3      0   100%
modin/engines/base/io/column_stores/parquet_reader.py                 34      1    97%
modin/engines/base/io/file_reader.py                                  85      7    92%
modin/engines/base/io/io.py                                          108     25    77%
modin/engines/base/io/sql/__init__.py                                  0      0   100%
modin/engines/base/io/sql/sql_reader.py                               39      1    97%
modin/engines/base/io/text/__init__.py                                 0      0   100%
modin/engines/base/io/text/csv_reader.py                             112      4    96%
modin/engines/base/io/text/fwf_reader.py                             115    108     6%
modin/engines/base/io/text/json_reader.py                             50      2    96%
modin/engines/base/io/text/text_file_reader.py                        26      1    96%
modin/engines/base/series/__init__.py                                  0      0   100%
modin/engines/dask/__init__.py                                         0      0   100%
modin/engines/dask/pandas_on_dask/__init__.py                          0      0   100%
modin/engines/dask/pandas_on_dask/frame/__init__.py                    0      0   100%
modin/engines/dask/pandas_on_dask/frame/axis_partition.py             27      5    81%
modin/engines/dask/pandas_on_dask/frame/data.py                       15      0   100%
modin/engines/dask/pandas_on_dask/frame/partition.py                  73     21    71%
modin/engines/dask/pandas_on_dask/frame/partition_manager.py          44     30    32%
modin/engines/dask/pandas_on_dask/io.py                               16      0   100%
modin/engines/dask/pandas_on_dask/series/__init__.py                   0      0   100%
modin/engines/dask/task_wrapper.py                                     9      0   100%
modin/engines/python/__init__.py                                       0      0   100%
modin/engines/python/pandas_on_python/__init__.py                      0      0   100%
modin/engines/python/pandas_on_python/frame/__init__.py                0      0   100%
modin/engines/python/pandas_on_python/frame/axis_partition.py         14     14     0%
modin/engines/python/pandas_on_python/frame/data.py                    4      4     0%
modin/engines/python/pandas_on_python/frame/partition.py              63     63     0%
modin/engines/python/pandas_on_python/frame/partition_manager.py       7      7     0%
modin/engines/python/pandas_on_python/io.py                            6      6     0%
modin/engines/python/pandas_on_python/series/__init__.py               0      0   100%
modin/engines/ray/__init__.py                                          0      0   100%
modin/engines/ray/generic/__init__.py                                  0      0   100%
modin/engines/ray/generic/frame/__init__.py                            0      0   100%
modin/engines/ray/generic/frame/partition_manager.py                  10     10     0%
modin/engines/ray/generic/io.py                                       14     14     0%
modin/engines/ray/generic/series/__init__.py                           0      0   100%
modin/engines/ray/pandas_on_ray/__init__.py                            0      0   100%
modin/engines/ray/pandas_on_ray/frame/__init__.py                      0      0   100%
modin/engines/ray/pandas_on_ray/frame/axis_partition.py               22     22     0%
modin/engines/ray/pandas_on_ray/frame/data.py                         11     11     0%
modin/engines/ray/pandas_on_ray/frame/partition.py                    84     84     0%
modin/engines/ray/pandas_on_ray/frame/partition_manager.py            43     43     0%
modin/engines/ray/pandas_on_ray/io.py                                 17     17     0%
modin/engines/ray/pandas_on_ray/series/__init__.py                     0      0   100%
modin/engines/ray/task_wrapper.py                                      7      7     0%
modin/engines/ray/utils.py                                            10     10     0%
modin/error_message.py                                                22      5    77%
modin/experimental/__init__.py                                         0      0   100%
modin/experimental/engines/__init__.py                                 0      0   100%
modin/experimental/engines/pandas_on_ray/__init__.py                   0      0   100%
modin/experimental/engines/pandas_on_ray/io_exp.py                    38     38     0%
modin/experimental/engines/pandas_on_ray/sql.py                       66     66     0%
modin/experimental/pandas/__init__.py                                  6      6     0%
modin/experimental/pandas/io_exp.py                                    7      7     0%
modin/pandas/__init__.py                                              75     43    43%
modin/pandas/base.py                                                1048    808    23%
modin/pandas/concat.py                                                42     36    14%
modin/pandas/dataframe.py                                            842    652    23%
modin/pandas/datetimes.py                                              7      3    57%
modin/pandas/general.py                                               50     32    36%
modin/pandas/groupby.py                                              287    197    31%
modin/pandas/indexing.py                                             185    185     0%
modin/pandas/io.py                                                   147     11    93%
modin/pandas/iterator.py                                              17     11    35%
modin/pandas/reshape.py                                               30     20    33%
modin/pandas/series.py                                               896    578    35%
modin/pandas/utils.py                                                 28      3    89%
--------------------------------------------------------------------------------------
TOTAL                                                               7082   4315    39%


====== 82 passed, 7 skipped, 1 xpassed, 108 warnings in 67.75s (0:01:07) =======
Closing remaining open files:test_write_modin.hdf...donetest_write_pandas.hdf...done
============================= test session starts ==============================
platform linux -- Python 3.7.7, pytest-5.4.3, py-1.8.1, pluggy-0.13.1
rootdir: /modin, inifile: setup.cfg
plugins: openfiles-0.5.0, remotedata-0.3.2, cov-2.10.0, custom-exit-code-0.3.0, forked-1.1.3, testmon-1.0.2, xdist-1.32.0
gw0 I / gw1 I / gw2 I / gw3 I / gw4 I / gw5 I / gw6 I / gw7 I / gw8 I / gw9 I / gw10 I / gw11 I / gw12 I / gw13 I / gw14 I / gw15 I / gw16 I / gw17 I / gw18 I / gw19 I / gw20 I / gw21 I / gw22 I / gw23 I / gw24 I / gw25 I / gw26 I / gw27 I / gw28 I / gw29 I / gw30 I / gw31 I / gw32 I / gw33 I / gw34 I / gw35 I / gw36 I / gw37 I / gw38 I / gw39 I / gw40 I / gw41 I / gw42 I / gw43 I / gw44 I / gw45 I / gw46 I / gw47 I
gw0 [18512] / gw1 [18512] / gw2 [18512] / gw3 [18512] / gw4 [18512] / gw5 [18512] / gw6 [18512] / gw7 [18512] / gw8 [18512] / gw9 [18512] / gw10 [18512] / gw11 [18512] / gw12 [18512] / gw13 [18512] / gw14 [18512] / gw15 [18512] / gw16 [18512] / gw17 [18512] / gw18 [18512] / gw19 [18512] / gw20 [18512] / gw21 [18512] / gw22 [18512] / gw23 [18512] / gw24 [18512] / gw25 [18512] / gw26 [18512] / gw27 [18512] / gw28 [18512] / gw29 [18512] / gw30 [18512] / gw31 [18512] / gw32 [18512] / gw33 [18512] / gw34 [18512] / gw35 [18512] / gw36 [18512] / gw37 [18512] / gw38 [18512] / gw39 [18512] / gw40 [18512] / gw41 [18512] / gw42 [18512] / gw43 [18512] / gw44 [18512] / gw45 [18512] / gw46 [18512] / gw47 [18512]

........................................................................ [  0%]
........................................................................ [  0%]
........................................................................ [  1%]
........................................................................ [  1%]
........................................................................ [  1%]
......................................................................... [  2%]
........................................................................ [  2%]
........................................................................ [  3%]
........................................................................ [  3%]
...................................................s.................... [  3%]
.............x................x......................................... [  4%]
.X.....................................X................................ [  4%]
.........X.......................X.................X......X......X.X.X.. [  5%]
..X..XXX........X.....XX..sX..X.........XX........XX........XX.........X [  5%]
.X..........X.......x.X..x....x.x.....x..........x....xx.x............xx [  5%]
.x.x.....X....x.X........X.....X....x..X......X.X.X...X......XX.X.XX.X... [  6%]
...X..X.XX..XX..X..X....XX....X.X..X.XX....XX.X.X.X..XXX.XXXXXX...XX..X. [  6%]
X..X.X.X..XXXXX..XXxXX..XXXXX..XX..XX...XXxXX.X...XX.XXxX.X.....XX.XXX.X [  7%]
X.xxXXX.xX.xX.XX.XX.X..XXXxxX.Xx.XXX..XX..XX....XX.X.X......XXXX.X...XX. [  7%]
.....XX..X.....XX...XX....X...XX......X........X...........X.....X.......X [  7%]
.......................X.X.......xX.X...............XX.....X.X.XX....... [  8%]
....X....X.......X.X....XX.X......XX..X.X.....X..X.XX..XX.X....X.XX....X [  8%]
.X..X.X.XX.X..XX...XX.X...X......X..X....X.X..X.XX..X.X........X.X.XX... [  8%]
XX.....X.X.X...X.XX...XXX.....X.X..X.......XX..XX..X.X...X........X.X.XX [  9%]
..X.X.....X....X.X..X.X..X..............X....XXxX.........X....x..X...XX. [  9%]
...........X........X..................X.X..x....X.X.............x.X.... [ 10%]
..xX...............xx.X..x..............xX..X....XX.X...........X...X...X [ 10%]
.XX.x...xx..X....XXXX...xx...XXX..X.....x........XX.XX.........X.X..XX... [ 10%]
XX..X...X......X.X...X.....X....X.....X..X...X.xX.XX..X........Xx.X..... [ 11%]
X......xX.xx....X...X.......x.........X...X...X.Xx.....X.....X.....xx.xX [ 11%]
..XX.X..X.......XXxX.....X........X.X..X.X............X....X....X.X.X..x [ 12%]
..X..X...X...X..Xx.xX.....X..X..XX...X.......X.X...X...X.XX.....X.X...X. [ 12%]
....X...XX.XX.X...X...X.s..x....X..X..X.X............X......X....X.X.X.. [ 12%]
......X....X....XX.......X..X........X..X.....x....X........X..........X [ 13%]
..........X..X....X.......X......XX..X..X............X....xXx........... [ 13%]
..............X......X.........X...X........X....x..X...........X....... [ 14%]
............X.............X.......X..........x..x....................... [ 14%]
....X.......................................X.....X.X.............X..X.. [ 14%]
..................XX...........x......X.....x..........X................ [ 15%]
x.X.........X........................X...........X...x.................. [ 15%]
.X...X...........X............X.x.x.X.X....X.............X.....X....s... [ 15%]
X.............sX.X....X.....X..............X.....X.XXX...........x...x.. [ 16%]
.X......x.....x.......X..................X...XX..X..................X... [ 16%]
..............X...X.......................X............X.........X....... [ 17%]
.......X...............X.x..X........X................................... [ 17%]
..X..........X.............................X...........X................ [ 17%]
........X......x....x..xX.....................X............X............ [ 18%]
.........X...........................x...........................X...... [ 18%]
........................................................................ [ 19%]
.......................X................................................ [ 19%]
...........................................................X.............. [ 19%]
................X...............................X........................ [ 20%]
........................................................................ [ 20%]
...................................................................x.... [ 21%]
........................................................................ [ 21%]
.................................................................X...... [ 21%]
........................................................................ [ 22%]
........................................x.X.............................. [ 22%]
.........x..................................................X.......X... [ 23%]
.........X.X...............xX........................X..........X........ [ 23%]
...................................X.........X....X......X.....X........ [ 23%]
X....................................................................... [ 24%]
........................................................................ [ 24%]
........................................................................ [ 24%]
..x.........................................x........................... [ 25%]
...........X..................................................X.......... [ 25%]
..................................x....x........X.............X......... [ 26%]
.....X......X..X.................................X..X................... [ 26%]
......X.X.....X..X......X...X........................................... [ 26%]
........................................................................ [ 27%]
......................................................................... [ 27%]
........................................................................ [ 28%]
........................................................................ [ 28%]
........................................................................ [ 28%]
........................................................................ [ 29%]
........................................................................ [ 29%]
........................................................................ [ 30%]
.......................................x................................ [ 30%]
..........x...............................x............................. [ 30%]
..........x..............................................x.............. [ 31%]
........................................................................ [ 31%]
........................................................................ [ 31%]
........................................................................ [ 32%]
........................................................................ [ 32%]
.s...................................................................... [ 33%]
........................................................................ [ 33%]
........................................................................ [ 33%]
........................................................................ [ 34%]
........................................................................ [ 34%]
..................X.........X..............X...............X..........X. [ 35%]
.................X...................................................... [ 35%]
........................................................................ [ 35%]
......................................................................... [ 36%]
..............................................................X......... [ 36%]
........................................................................ [ 37%]
........................................................................ [ 37%]
......................................................................... [ 37%]
......................................................................... [ 38%]
........................................................................ [ 38%]
........................................................................ [ 38%]
........................................................................ [ 39%]
........................................................................ [ 39%]
......................................................................... [ 40%]
........................................................................ [ 40%]
........................................................................ [ 40%]
........................................................................ [ 41%]
........................................................................ [ 41%]
........................................................................ [ 42%]
...................................................................sss.s [ 42%]
ssssssssssssss.......................................................... [ 42%]
........................................................................ [ 43%]
........................................................................ [ 43%]
......................................................................... [ 44%]
........................................................................ [ 44%]
........................................................................ [ 44%]
......................................................................... [ 45%]
........................................................................ [ 45%]
........................................................................ [ 46%]
........................................................................ [ 46%]
........................................................................ [ 46%]
........................................................................ [ 47%]
........................................................................ [ 47%]
........................................................................ [ 47%]
........................................................................ [ 48%]
........................................................................ [ 48%]
........................................................................ [ 49%]
........................................................................ [ 49%]
........................................................................ [ 49%]
........................................................................ [ 50%]
........................................................................ [ 50%]
........................................................................ [ 51%]
........................................................................ [ 51%]
........................................................................ [ 51%]
........................................................................ [ 52%]
........................................................................ [ 52%]
........................................................................ [ 53%]
........................................................................ [ 53%]
........................................................................ [ 53%]
........................................................................ [ 54%]
........................................................................ [ 54%]
........................................................................ [ 54%]
........................................................................ [ 55%]
........................................................................ [ 55%]
........................................................................ [ 56%]
........................................................................ [ 56%]
........................................................................ [ 56%]
........................................................................ [ 57%]
........................................................................ [ 57%]
........................................................................ [ 58%]
........................................................................ [ 58%]
.......................................................................... [ 58%]
........................................................................ [ 59%]
........................................................................ [ 59%]
........................................................................ [ 60%]
........................................................................ [ 60%]
......................................................................... [ 60%]
........................................................................ [ 61%]
........................................................................ [ 61%]
........................................................................ [ 61%]
......................................................................... [ 62%]
......................................................................... [ 62%]
........................................................................ [ 63%]
........................................................................ [ 63%]
........................................................................ [ 63%]
........................................................................ [ 64%]
......................................................................... [ 64%]
........................................................................ [ 65%]
........................................................................ [ 65%]
........................................................................ [ 65%]
........................................................................ [ 66%]
........................................................................ [ 66%]
........................................................................ [ 67%]
........................................................................ [ 67%]
........................................................................ [ 67%]
........................................................................ [ 68%]
........................................................................ [ 68%]
........................................................................ [ 68%]
........................................................................ [ 69%]
........................................................................ [ 69%]
......................................................................... [ 70%]
........................................................................ [ 70%]
........................................................................ [ 70%]
........................................................................ [ 71%]
........................................................................ [ 71%]
........................................................................ [ 72%]
........................................................................ [ 72%]
........................................................................ [ 72%]
........................................................................ [ 73%]
........................................................................ [ 73%]
........................................................................ [ 74%]
........................................................................ [ 74%]
........................................................................ [ 74%]
........................................................................ [ 75%]
........................................................................ [ 75%]
........................................................................ [ 75%]
........................................................................ [ 76%]
........................................................................ [ 76%]
........................................................................ [ 77%]
........................................................................ [ 77%]
........................................................................ [ 77%]
........................................................................ [ 78%]
........................................................................ [ 78%]
......................................................................... [ 79%]
........................................................................ [ 79%]
........................................................................ [ 79%]
........................................................................ [ 80%]
........................................................................ [ 80%]
........................................................................ [ 81%]
........................................................................ [ 81%]
........................................................................ [ 81%]
........................................................................ [ 82%]
........................................................................ [ 82%]
........................................................................ [ 83%]
........................................................................ [ 83%]
........................................................................ [ 83%]
........................................................................ [ 84%]
........................................................................ [ 84%]
........................................................................ [ 84%]
........................................................................ [ 85%]
......................................................................... [ 85%]
........................................................................ [ 86%]
........................................................................ [ 86%]
........................................................................ [ 86%]
........................................................................ [ 87%]
........................................................................ [ 87%]
........................................................................ [ 88%]
........................................................................ [ 88%]
........................................................................ [ 88%]
........................................................................ [ 89%]
........................................................................ [ 89%]
........................................................................ [ 90%]
........................................................................ [ 90%]
........................................................................ [ 90%]
........................................................................ [ 91%]
...............s.sssss.................................................. [ 91%]
........................................................................ [ 91%]
.............................................................X.......x.. [ 92%]
........x.........X.........x.....x.......X.s............................ [ 92%]
......................................................................... [ 93%]
........................................................................ [ 93%]
........................................................................ [ 93%]
........................................................................ [ 94%]
........................................................................ [ 94%]
........................................................................ [ 95%]
........................................................................ [ 95%]
........................................................................ [ 95%]
........................................................................ [ 96%]
......................................................................... [ 96%]
........................................................................ [ 97%]
................x..........x.......x.x..............x.....x............. [ 97%]
............x.........x..............................X.................. [ 97%]
........................................................................ [ 98%]
........................................................................ [ 98%]
........................................................................ [ 98%]
........................................................................ [ 99%]
........................................................................ [ 99%]
..............................................                           [100%]

----------- coverage: platform linux, python 3.7.7-final-0 -----------
Name                                                               Stmts   Miss  Cover
--------------------------------------------------------------------------------------
modin/__init__.py                                                     68     31    54%
modin/_version.py                                                    272    172    37%
modin/apply_license_header.py                                         19     19     0%
modin/backends/__init__.py                                             0      0   100%
modin/backends/base/__init__.py                                        0      0   100%
modin/backends/base/query_compiler.py                                128      0   100%
modin/backends/pandas/__init__.py                                      0      0   100%
modin/backends/pandas/parsers.py                                     118     60    49%
modin/backends/pandas/query_compiler.py                              624    132    79%
modin/data_management/__init__.py                                      0      0   100%
modin/data_management/dispatcher.py                                   78     16    79%
modin/data_management/factories.py                                    84     26    69%
modin/data_management/functions/__init__.py                            7      0   100%
modin/data_management/functions/binary_function.py                    21      0   100%
modin/data_management/functions/foldfunction.py                        6      0   100%
modin/data_management/functions/function.py                            6      1    83%
modin/data_management/functions/groupby_function.py                   55     38    31%
modin/data_management/functions/mapfunction.py                         6      0   100%
modin/data_management/functions/mapreducefunction.py                   9      0   100%
modin/data_management/functions/reductionfunction.py                   6      0   100%
modin/data_management/utils.py                                        32     14    56%
modin/engines/__init__.py                                              0      0   100%
modin/engines/base/__init__.py                                         0      0   100%
modin/engines/base/frame/__init__.py                                   0      0   100%
modin/engines/base/frame/axis_partition.py                            45     28    38%
modin/engines/base/frame/data.py                                     417     39    91%
modin/engines/base/frame/partition.py                                  1      0   100%
modin/engines/base/frame/partition_manager.py                        159     30    81%
modin/engines/base/io/__init__.py                                     11      0   100%
modin/engines/base/io/column_stores/__init__.py                        0      0   100%
modin/engines/base/io/column_stores/column_store_reader.py            40      0   100%
modin/engines/base/io/column_stores/feather_reader.py                  9      0   100%
modin/engines/base/io/column_stores/hdf_reader.py                      3      0   100%
modin/engines/base/io/column_stores/parquet_reader.py                 34      1    97%
modin/engines/base/io/file_reader.py                                  85      7    92%
modin/engines/base/io/io.py                                          108     25    77%
modin/engines/base/io/sql/__init__.py                                  0      0   100%
modin/engines/base/io/sql/sql_reader.py                               39      1    97%
modin/engines/base/io/text/__init__.py                                 0      0   100%
modin/engines/base/io/text/csv_reader.py                             112      2    98%
modin/engines/base/io/text/fwf_reader.py                             115    108     6%
modin/engines/base/io/text/json_reader.py                             50      2    96%
modin/engines/base/io/text/text_file_reader.py                        26      1    96%
modin/engines/base/series/__init__.py                                  0      0   100%
modin/engines/dask/__init__.py                                         0      0   100%
modin/engines/dask/pandas_on_dask/__init__.py                          0      0   100%
modin/engines/dask/pandas_on_dask/frame/__init__.py                    0      0   100%
modin/engines/dask/pandas_on_dask/frame/axis_partition.py             27      1    96%
modin/engines/dask/pandas_on_dask/frame/data.py                       15      0   100%
modin/engines/dask/pandas_on_dask/frame/partition.py                  73     19    74%
modin/engines/dask/pandas_on_dask/frame/partition_manager.py          44     18    59%
modin/engines/dask/pandas_on_dask/io.py                               16      0   100%
modin/engines/dask/pandas_on_dask/series/__init__.py                   0      0   100%
modin/engines/dask/task_wrapper.py                                     9      0   100%
modin/engines/python/__init__.py                                       0      0   100%
modin/engines/python/pandas_on_python/__init__.py                      0      0   100%
modin/engines/python/pandas_on_python/frame/__init__.py                0      0   100%
modin/engines/python/pandas_on_python/frame/axis_partition.py         14     14     0%
modin/engines/python/pandas_on_python/frame/data.py                    4      4     0%
modin/engines/python/pandas_on_python/frame/partition.py              63     63     0%
modin/engines/python/pandas_on_python/frame/partition_manager.py       7      7     0%
modin/engines/python/pandas_on_python/io.py                            6      6     0%
modin/engines/python/pandas_on_python/series/__init__.py               0      0   100%
modin/engines/ray/__init__.py                                          0      0   100%
modin/engines/ray/generic/__init__.py                                  0      0   100%
modin/engines/ray/generic/frame/__init__.py                            0      0   100%
modin/engines/ray/generic/frame/partition_manager.py                  10     10     0%
modin/engines/ray/generic/io.py                                       14     14     0%
modin/engines/ray/generic/series/__init__.py                           0      0   100%
modin/engines/ray/pandas_on_ray/__init__.py                            0      0   100%
modin/engines/ray/pandas_on_ray/frame/__init__.py                      0      0   100%
modin/engines/ray/pandas_on_ray/frame/axis_partition.py               22     22     0%
modin/engines/ray/pandas_on_ray/frame/data.py                         11     11     0%
modin/engines/ray/pandas_on_ray/frame/partition.py                    84     84     0%
modin/engines/ray/pandas_on_ray/frame/partition_manager.py            43     43     0%
modin/engines/ray/pandas_on_ray/io.py                                 17     17     0%
modin/engines/ray/pandas_on_ray/series/__init__.py                     0      0   100%
modin/engines/ray/task_wrapper.py                                      7      7     0%
modin/engines/ray/utils.py                                            10     10     0%
modin/error_message.py                                                22      2    91%
modin/experimental/__init__.py                                         0      0   100%
modin/experimental/engines/__init__.py                                 0      0   100%
modin/experimental/engines/pandas_on_ray/__init__.py                   0      0   100%
modin/experimental/engines/pandas_on_ray/io_exp.py                    38     38     0%
modin/experimental/engines/pandas_on_ray/sql.py                       66     66     0%
modin/experimental/pandas/__init__.py                                  6      6     0%
modin/experimental/pandas/io_exp.py                                    7      7     0%
modin/pandas/__init__.py                                              75     43    43%
modin/pandas/base.py                                                1048     51    95%
modin/pandas/concat.py                                                42      5    88%
modin/pandas/dataframe.py                                            842     96    89%
modin/pandas/datetimes.py                                              7      0   100%
modin/pandas/general.py                                               50      2    96%
modin/pandas/groupby.py                                              287     31    89%
modin/pandas/indexing.py                                             185     30    84%
modin/pandas/io.py                                                   147     11    93%
modin/pandas/iterator.py                                              17      0   100%
modin/pandas/reshape.py                                               30      0   100%
modin/pandas/series.py                                               896     55    94%
modin/pandas/utils.py                                                 28      3    89%
--------------------------------------------------------------------------------------
TOTAL                                                               7082   1549    78%

= 17862 passed, 31 skipped, 104 xfailed, 515 xpassed, 32175 warnings in 1194.41s (0:19:54) =

@YarShev
Collaborator Author

YarShev commented Jun 7, 2020

If we try to implement value_counts via MapReduceFunction with something like this:

value_counts = MapReduceFunction.register(
    lambda x, *args, **kwargs: x.squeeze().value_counts(**kwargs),
    lambda y, *args, **kwargs: y.squeeze().groupby(y.squeeze().index, sort=False).sum(),
)

I see two problems here. First, the index and row_lengths of the Modin Frame must be recomputed because they can change (NA cells may or may not be counted, depending on the dropna parameter of value_counts). Currently, _map_reduce doesn't allow recomputing index and row_lengths. Second, and more important, groupby itself doesn't consider NA cells. So I don't see how we could do without _apply_full_axis in this case (unless we come up with workarounds that avoid _apply_full_axis and change the inner API, which I don't think we want).
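
To make the first problem concrete, here is a minimal pure-Python sketch of a two-phase count. It is not Modin's implementation: `map_phase`, `reduce_phase`, and the use of `None` as a stand-in for a missing value are assumptions made for illustration only. It shows that the size of the result depends on the data and on the `dropna` flag, so the index and lengths of the input cannot simply be reused.

```python
from collections import Counter


def map_phase(partition, dropna=True):
    """Count values inside one partition (None stands in for a missing value)."""
    kept = (v for v in partition if not (dropna and v is None))
    return Counter(kept)


def reduce_phase(partial_counts):
    """Merge per-partition counts by key, similar to a groupby-sum over the index."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)  # Counter.update adds counts for matching keys
    return total


# Hypothetical data split into three partitions (8 rows in total).
partitions = [[3, 1, None], [3, 2, None], [1, 3]]

with_na = reduce_phase(map_phase(p, dropna=False) for p in partitions)
without_na = reduce_phase(map_phase(p, dropna=True) for p in partitions)

# The result has one row per distinct value, not one row per input row,
# and the number of rows changes with dropna.
print(len(with_na), len(without_na))
```

In this sketch the missing values are only counted because the reduce step merges on every key. A reduce step that silently drops missing keys, as the groupby described above does, would lose that row.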

@devin-petersohn
Collaborator

@YarShev I see, thanks for pointing these out.

The challenge here is that value_counts is very commonly used, and ReductionFunction does not scale as well as MapReduceFunction (one-phase reduce vs. tree reduce). With Series methods, there is no parallelism for ReductionFunction.

I will think about a potential fix for this. There should still be some way to do it in two phases.
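
As a rough illustration of the tree-reduce idea mentioned above, the sketch below merges partial counts pairwise, level by level. It is a plain-Python sketch under assumed names (`tree_reduce`, the sample partitions) and says nothing about how Modin actually schedules its reduce step.

```python
from collections import Counter


def tree_reduce(partials, combine):
    """Combine neighbours pairwise, level by level, until one result remains."""
    level = list(partials)
    if not level:
        raise ValueError("need at least one partial result")
    rounds = 0
    while len(level) > 1:
        merged = [combine(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            merged.append(level[-1])  # the odd one out moves up unchanged
        level = merged
        rounds += 1
    return level[0], rounds


# Hypothetical partitions; each one is counted independently first (map phase).
partitions = [[3, 1], [3, 2], [1, 3], [4], [2, 2]]
partials = [Counter(p) for p in partitions]

# Merges within one level are independent of each other, so they could run in parallel.
total, rounds = tree_reduce(partials, lambda a, b: a + b)
print(total, rounds)
```

With five partial results the merge takes three levels, whereas a one-phase reduce would first have to gather all the data in one place.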

@YarShev
Collaborator Author

YarShev commented Jun 8, 2020

@devin-petersohn, from my point of view, a scalable version of value_counts can't be implemented with the current functionality. It requires changes to the inner API. I will think about this too.

@devin-petersohn
Collaborator

I have a branch here where I have tried a quick and dirty implementation, let me know what you think: https://github.com/devin-petersohn/modin/tree/features/value_counts

@YarShev
Collaborator Author

YarShev commented Jun 22, 2020

@devin-petersohn, I looked through the changes in your branch. I don't think we can implement value_counts that way, and here is why. Reading the docs for pandas value_counts and looking through the examples, I thought that indices were sorted in addition to values. I implemented the corresponding logic, but testing showed that pandas value_counts places indices with equal counts in arbitrary order. I don't think we can predict where indices end up after the groupby and sort_values operations and then move them to the matching "random" locations (the locations pandas uses). If pandas value_counts sorted the indices, it would be okay. Here is an example.

import modin.pandas as pd
import pandas
import numpy as np
NROWS = 2 ** 8
RAND_LOW = 0
RAND_HIGH = 100
random_state = np.random.RandomState(seed=42)
data = random_state.randint(RAND_LOW, RAND_HIGH, size=(NROWS))
ms = pd.Series(data)
ps = pandas.Series(data)
mr = ms.value_counts()
pr = ps.value_counts()
mr[:10] # values sorted in descending order; ties are also sorted by index (descending)
61    12
1      8
87     6
43     6
14     6
89     5
88     5
59     5
58     5
2      5
pr[:10] # values sorted in descending order; ties are not sorted by index
61    12
1      8
43     6
14     6
87     6
88     5
2      5
59     5
58     5
89     5

Regarding the changes that recompute indices in _map_reduce: I think they may be useful for some operations. They might be useful for ReductionFunction as well, but we have _apply_full_axis for that case.

@devin-petersohn, what do you think about all this?

@devin-petersohn
Collaborator

@YarShev I think it is okay if we warn the user that the output order may slightly differ from pandas. We can create an issue to revisit this and if it becomes a huge problem we can revisit how to fix it, but scalability is more important.

I think if users are relying on pandas order, they are using the sort argument. Even though this implementation is also sorting by the value on ties, I think it will be okay as long as we warn the user. In the test we will have to sort the pandas version in the same way we sort in Modin and just link to the issue we will use to track the value_counts order.
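
One way to picture the test-side normalization suggested here: sort both results by count and then break ties by value before comparing. The sketch below uses plain (value, count) pairs taken from the first rows of the two outputs quoted earlier in this thread. The helper name `normalize_ties` is hypothetical, and the descending tie order is an assumption read off the Modin output above.

```python
def normalize_ties(pairs):
    """Sort (value, count) pairs by count descending, then by value descending."""
    return sorted(pairs, key=lambda vc: (-vc[1], -vc[0]))


# First five rows of the Modin and pandas outputs quoted in this thread.
modin_like = [(61, 12), (1, 8), (87, 6), (43, 6), (14, 6)]
pandas_like = [(61, 12), (1, 8), (43, 6), (14, 6), (87, 6)]

# Same counts, different order among the ties...
print(modin_like == pandas_like)
# ...but identical once both are put into one deterministic order.
print(normalize_ties(modin_like) == normalize_ties(pandas_like))
```

Normalizing both sides keeps the comparison independent of how either library happens to order tied values.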

@YarShev
Collaborator Author

YarShev commented Jun 22, 2020

@devin-petersohn, okay, then can I take the changes from your branch and put them into this PR? I will also add the index-sorting changes so they can be reviewed.

@devin-petersohn
Collaborator

@YarShev Yes, feel free to cherry pick those commits over.

YarShev and others added 2 commits June 23, 2020 08:59
@YarShev YarShev force-pushed the dev/yigoshev-value_counts branch 5 times, most recently from bd7b7f7 to bf90699 Compare June 23, 2020 12:40
@YarShev
Collaborator Author

YarShev commented Jun 23, 2020

@devin-petersohn, I combined all the changes. Please review. I also created issue #1650 about the difference in index order between pandas and Modin.

Collaborator

@devin-petersohn devin-petersohn left a comment

Left a couple of questions and documentation requests, thanks @YarShev!

-------
PandasQueryCompiler
"""
if kwargs.get("bins", None) is not None:
Collaborator

Do bins not work with MapReduce style? How is the designator for the bin determined?

Collaborator Author

@YarShev YarShev Jun 23, 2020

The designator for the bin is determined as follows:

import pandas
import numpy as np
s = pandas.Series([3, 1, 2, 3, 4, np.nan])
r = s.value_counts(bins=3)
r
(2.0, 3.0]      2
(0.996, 2.0]    2
(3.0, 4.0]      1
dtype: int64
r.index
IntervalIndex([(2.0, 3.0], (0.996, 2.0], (3.0, 4.0]],
              closed='right',
              dtype='interval[float64]')
r.index[0]
Interval(2.0, 3.0, closed='right')

MapReduce style doesn't work for bins because equal values may be put into different bins at the map stage (when there is more than one partition). Then, when we group by index, we will not be able to merge the bins that contain the same values.
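
A small pure-Python sketch of why per-partition binning breaks down. It uses simple equal-width edges computed from each partition's own min and max. This is a simplification and not pandas' exact binning logic, and `equal_width_edges`, `bin_label`, and the two-partition split are assumptions for illustration.

```python
def equal_width_edges(values, bins):
    """Return bins + 1 equal-width edges spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins)]
    edges.append(float(hi))  # pin the last edge to the maximum exactly
    return edges


def bin_label(value, edges):
    """Return the (left, right) interval that contains value, right-closed."""
    for left, right in zip(edges, edges[1:]):
        if value <= right:
            return (left, right)
    return (edges[-2], edges[-1])


data = [3, 1, 2, 3, 4]
part_a, part_b = [3, 1, 2], [3, 4]  # hypothetical split into two partitions

global_label = bin_label(3, equal_width_edges(data, bins=3))
label_a = bin_label(3, equal_width_edges(part_a, bins=3))
label_b = bin_label(3, equal_width_edges(part_b, bins=3))

# The same value 3 gets a different interval label in each partition,
# so a groupby on the labels in the reduce step cannot merge the two counts.
print(global_label, label_a, label_b)
```

Computing the edges once from the global min and max and then passing them to every partition would avoid this, but that requires a pass over the whole data before the map stage.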

is_end = True
break
if is_range:
new_index = np.concatenate(
Collaborator

It will probably be slightly more performant to concatenate once at the end to avoid multiple malloc calls.

Collaborator Author

I got rid of np.concatenate entirely.


result = result.sort_values(ascending=ascending) if sort else result

def sort_index_for_identical_values(result, ascending):
Collaborator

Can you add a brief comment on why this is needed?

Collaborator Author

Of course, done. Also, I added some brief doc for this function.


return sort_index_for_identical_values(result, ascending)

return MapReduceFunction.register(map_func, reduce_func, preserve_index=False)(
Collaborator

Thinking about how to structure this to keep the query compiler clean of these low-level implementation details: what if we created a file backends/pandas/kernels.py, added these functions there, and imported them? I have been thinking about how to keep things clean. Maybe it is better left to another PR, but what do you think about this?

Collaborator Author

@YarShev YarShev Jun 23, 2020

Yes, this is a great thought. I think a file backends/pandas/kernels.py containing these implementation details is what we need to keep the query compiler clean. Importing functions from that file into the query compiler is a very nice solution.

Collaborator

@devin-petersohn devin-petersohn left a comment

I ran some tests, performance is good. Thanks @YarShev LGTM!

@devin-petersohn devin-petersohn merged commit d2b4a74 into modin-project:master Jun 24, 2020
@YarShev YarShev deleted the dev/yigoshev-value_counts branch June 25, 2020 08:17
@anmyachev anmyachev mentioned this pull request Sep 23, 2020
22 tasks
Development

Successfully merging this pull request may close these issues.

[Util] value_counts implementation value_counts function implementation
4 participants