Sync #1 #4

Merged 172 commits on Feb 18, 2021
3fd16a9
FIX-#2195: fix describe error for datasets with datetimes (#2272)
anmyachev Oct 20, 2020
c680077
FIX-#1906: fixed incorrect behaviour of 'groupby.__getattr' (#2276)
dchigarev Oct 20, 2020
39d69ab
FIX-#2277: applied Title Case to the names of DATASET_SIZE_DICT keys …
dchigarev Oct 20, 2020
544de0d
FIX-#2280: use 32 bytes in secrets.token_hex (#2286)
anmyachev Oct 21, 2020
b514d6f
TEST-#2260: use recommended pandas testing api (#2273)
anmyachev Oct 22, 2020
64b94f5
FIX-#2254: handling dict functions at groupby.agg improved (#2267)
dchigarev Oct 22, 2020
6697e05
FEAT-#2282: support DataFrame.[count|max|min|sum] for OmniSci backend…
ienkovich Oct 22, 2020
55e3459
FIX-#1976: indices matching at reduction functions fixed (#2270)
dchigarev Oct 22, 2020
60d4956
FEAT-#2299: support value_counts in OmniSci backend. (#2300)
ienkovich Oct 23, 2020
a0a3ebb
FIX-#1765: Fix support of s3 in `read_parquet` (#2287)
prutskov Oct 23, 2020
e63c042
FIX-#2285: Default to pandas warning message improved (#2302)
dchigarev Oct 23, 2020
e685b8d
FEAT-#2303: fix OmniSci aggregates and add mean (#2304)
ienkovich Oct 23, 2020
a571e10
FIX-#2258: return 'Commit Message formatting' topic (#2306)
anmyachev Oct 24, 2020
8866ca8
FIX-#2133 #2265: Fix binary operations for modin frames in case when …
prutskov Oct 27, 2020
a7d3093
FIX-#2239: Compute row index start using pandas (#2240)
devin-petersohn Oct 27, 2020
5c5a8e0
FIX-#2253: loc assignment fixed in case of (1, 1) shape frame (#2316)
dchigarev Oct 27, 2020
5cabeb9
FIX-#2311: fixed performance bottleneck at reduction operations (#2314)
dchigarev Oct 28, 2020
ed69517
TEST-#2288: Cover by tests delimiters parameters of read_csv (#2310)
amyskov Oct 28, 2020
23b4408
FIX-#2234: update dask_deps in setup.py (#2325)
anmyachev Oct 28, 2020
e2f628c
FIX-#2326: move s3fs import in _read function (#2327)
anmyachev Oct 28, 2020
9d3ead8
FIX-#2329: TypeError while creating cluster (#2330)
anmyachev Oct 28, 2020
e08386b
FIX-#0000: Indexing regression (#2333)
devin-petersohn Oct 29, 2020
f9cd119
DOCS-#2334: Add tutorials to main repo (#2335)
devin-petersohn Oct 29, 2020
a11e7c9
DOCS-#2193: Add contributing doc in checklist (#2216)
anmyachev Oct 30, 2020
737ec34
REFACTOR-#2343: refactor offset, _read_rows, partitioned_file (#2344)
anmyachev Oct 30, 2020
c86422a
FIX-#1927: Fix performance issue related to `sparse` attribute access…
YarShev Oct 30, 2020
f1f82ee
FIX-#2269: Move `default_to_pandas` logic from API layer to backend (…
YarShev Oct 30, 2020
067b8ac
TEST-#2292: Cover by tests Datetime Handling parameters of read_csv (…
amyskov Oct 30, 2020
776d8e2
FEAT-#2271: Add implementation of `groupby.shift` (#2323)
prutskov Oct 30, 2020
5251135
FIX-#2348: Fix default to pandas warnings (#2349)
YarShev Oct 30, 2020
e924a1f
FIX-#2357: Fix path to documentation for contributing (#2358)
YarShev Nov 2, 2020
6fcaef1
FIX-#2352: remove deprecated option: 'num-redis-shards' (#2353)
anmyachev Nov 2, 2020
11574a8
FIX-#2339: Fix links to documentation (#2361)
YarShev Nov 2, 2020
5382769
FIX-#2354: use conda activate instead of conda run (#2355)
anmyachev Nov 2, 2020
8c00a8f
FEAT-#2363: introduce getter and setter for index name (#2368)
ienkovich Nov 3, 2020
a3e06c7
FEAT-#1844: upgrade pyarrow to 1.0 (#2347)
anmyachev Nov 5, 2020
f4f3a1e
FIX-#2365: Fix `Series.value_counts` when `dropna=False` (#2366)
YarShev Nov 5, 2020
fc34852
FIX-#2369: Update pandas version to 1.1.4 (#2371)
YarShev Nov 5, 2020
a13384c
FIX-#2322: add aligning partition' blocks (#2367)
anmyachev Nov 9, 2020
3395b59
Bump version to 0.8.2 (#2383)
devin-petersohn Nov 9, 2020
c5203b9
FIX-#2386: add new location for import ray functions (#2387)
anmyachev Nov 10, 2020
eba3abc
FIX-#2388: Fixed requirements for omnisci binaries (#2389)
gshimansky Nov 10, 2020
c6a1d93
FIX-#2380: don't ignore lengths parameter for dask engine (#2381)
anmyachev Nov 11, 2020
b424036
FIX-#2390: Fix inserting Series into DataFrame (#2391)
YarShev Nov 11, 2020
45ef859
FIX-2200: Enable Calcite by default in OmniSci backend (#2385)
amyskov Nov 11, 2020
ba006fb
TEST-#2289: Columns, Index Locations and Names parameters of read_csv…
amyskov Nov 12, 2020
e916874
REFACTOR-#2397: remove redundant assigment (#2398)
anmyachev Nov 12, 2020
2b0b755
FEAT-#2363: fix index name setter in OmniSci backend (#2379)
ienkovich Nov 13, 2020
5c9398c
Merged groupby_agg and groupby_dict_agg to implement dictionary funct…
gshimansky Nov 13, 2020
86ebc31
FIX-#2406: filter dictionary aggregation keys to limit them to keys o…
gshimansky Nov 14, 2020
1da5198
DOCS-#2413: Add examples page to documentation (#2414)
devin-petersohn Nov 14, 2020
40ae5a8
DOCS-#2415: Add comparisons section to documentation with stubs (#2416)
devin-petersohn Nov 14, 2020
b01f91f
DOCS-#2417: add sklearn example (#2425)
reshamas Nov 14, 2020
140988e
DOCS-#2421: Fixes bad link on contributing from architecture.rst (#2427)
vfdev-5 Nov 14, 2020
5252282
DOCS-#2419: Updated CONTRIBUTING.rst (#2423)
vfdev-5 Nov 14, 2020
bcf931d
DOCS-#2426,DOCS-#2424: Fixed two issues (#2431)
vfdev-5 Nov 15, 2020
3edf6d2
DOCS-#2420: Changed documentation to numpydoc style (#2429)
mohdkashif93 Nov 15, 2020
3e32d02
DOCS-#2433: Updated README.md with modin_vs_dask.md doc (#2435)
abdulelahsm Nov 15, 2020
5d3f693
FIX-#2450: fix CI recipe (#2449)
dchigarev Nov 17, 2020
54604f2
DOCS-#2437: Add documentation contrasting Modin and Dask (#2441)
devin-petersohn Nov 19, 2020
74acc1b
FEAT-#2444: add docker file for nyc on omnisci (#2445)
anmyachev Nov 20, 2020
03dbbef
FIX-#2458: fix 'psutil' install (#2452)
ashahba Nov 20, 2020
80125c1
FIX-#2456: update taxi queries with .copy usage (#2457)
anmyachev Nov 20, 2020
41d3111
FEAT-#2447: add docker file for census on omnisci (#2448)
anmyachev Nov 20, 2020
f7f1f7a
FIX-#2470: revert b867edf (#2471)
amyskov Nov 23, 2020
0c40d61
FIX-#2473: Some configuration values should not be transformed (#2476)
vnlitvinov Nov 25, 2020
e5556b5
FIX-#2402: Fix read_excel when files come from older windows (#2403)
devin-petersohn Nov 25, 2020
0aada32
REFACTOR-#2467: Convert internal base dataframe objects to ABC (#2468)
devin-petersohn Nov 26, 2020
fd5d476
FIX-#2459: Updated TeamCity tests image to use Ray as base image (#2460)
gshimansky Nov 30, 2020
24678d0
TEST-#2488: Increase commitlint message length limit to 88 characters…
devin-petersohn Nov 30, 2020
ce2bea8
DOCS-#2439: Add Documentation for Modin vs. pandas (#2487)
devin-petersohn Dec 1, 2020
3f65c89
TEST-#2290: Cover by tests General Parsing Configuration parameters o…
amyskov Dec 1, 2020
c863b3d
FIX-#2453: Remove sorting indices for equal values in `Series.value_c…
YarShev Dec 1, 2020
f571f69
TEST-#2291: Cover by tests NA and Missing Data Handling parameters of…
amyskov Dec 1, 2020
7972050
REFACTOR-#2496: Change internal reader names to dispatcher (#2497)
devin-petersohn Dec 2, 2020
372422b
TEST-#2294: add iteration parameters for read_csv tests (#2477)
amyskov Dec 2, 2020
299ba18
FIX-#2463: Added test with callable functions as aggregate argument (…
gshimansky Dec 2, 2020
f273121
TEST-#2296: Error Handling parameters of read_csv (#2501)
amyskov Dec 3, 2020
db794e0
TEST-#2295: Cover by tests Quoting, Compression, and File Format para…
amyskov Dec 3, 2020
7458746
FEAT-#2479: integrate asv (#2484)
anmyachev Dec 3, 2020
c2e7f9e
FIX-#2374: remove extra code; add pandas way to handle duplicate valu…
anmyachev Dec 4, 2020
031f444
TEST-#2297: Cover by tests Internal parameters of read_csv (#2502)
amyskov Dec 7, 2020
d710a16
Ensure excel reader closes file if it is passed as path (#2514)
vnlitvinov Dec 8, 2020
7c46bdd
FEAT-#2375: implementation of multi-column groupby aggregation (#2461)
dchigarev Dec 8, 2020
c6e43c5
FIX-#2442: fixed Series assignment with different indices (#2443)
dchigarev Dec 9, 2020
cd35dd4
FEAT-#2013: merge_asof that is a little more efficient (#2510)
itamarst Dec 9, 2020
e76366b
DOCS-#2436: Explicit local / single node backend (#2483)
raphaelauv Dec 9, 2020
2592849
Fix indices when reading Excel files in parallel (#2526)
vnlitvinov Dec 9, 2020
ecd41e5
FIX-#2527: Use random name for hdf file test, clean file after testin…
vnlitvinov Dec 10, 2020
ebbb6b2
FIX-#2524: Update pandas version to 1.1.5 (#2525)
YarShev Dec 10, 2020
1bf35c9
FIX-#2408: Fix read_csv and read_table args when used inside a decora…
staftermath Dec 11, 2020
1a8cd0a
FIX-#2169: avoid unnecessary index access in groupby (#2469)
dchigarev Dec 11, 2020
787d2b1
FIX-#2313: improved handling non-numeric types at 'mean' when 'axis=1…
dchigarev Dec 15, 2020
da83b62
TEST-#2509: Io tests refactoring (#2523)
amyskov Dec 15, 2020
f98a7b3
FIX-#2540: add __iter__ implementation (#2541)
anmyachev Dec 15, 2020
d13fe74
FEAT-#2520: add most important operations for asv benchmarks (#2539)
anmyachev Dec 16, 2020
d8d58bb
FIX-#2498: Fix possible number of partitions for Dask engine (#2532)
YarShev Dec 16, 2020
126f2a5
FIX-#2550: remove decorators usage for asv tested functions (#2551)
anmyachev Dec 17, 2020
a58dcf6
FEAT-#2236: Handling of space limited Ray Plasma directories (#2547)
amyskov Dec 17, 2020
ab29ed6
DOCS-#2518: add asv usage topic (#2549)
anmyachev Dec 17, 2020
ee39d17
FEAT-#2491: optimized groupby dictionary aggregation (#2534)
dchigarev Dec 17, 2020
bd60284
FEAT-#2553: add ability to run microbenchmarks for old Modin version …
anmyachev Dec 17, 2020
cee481b
Fix .loc[] assignment for Modin Series (#2555)
vnlitvinov Dec 17, 2020
9d9dc29
FIX-#2482: improved handling non-str 'by' (#2548)
dchigarev Dec 18, 2020
6f5cdc6
Fix taxi-runner.py cluster example (#2557)
anmyachev Dec 18, 2020
3e62906
Fix loc/iloc assignments when columns are selected (#2536)
vnlitvinov Dec 18, 2020
43df818
FIX-#2559: Ignore files from /proc/ when detecting file leaks (#2560)
vnlitvinov Dec 18, 2020
c5aac3e
Switch to Ray from conda-forge (#2562)
vnlitvinov Dec 18, 2020
db0f18c
FIX-#2566: Ensure `Series.unique` does not return a scalar when there…
richardlin047 Dec 23, 2020
ba507a2
FIX-#2572: fixed arrow version in OmniSci dependencies (#2571)
dchigarev Dec 29, 2020
af627dd
DOCS-#2578: fix simple typo, parition -> partition (#2573)
timgates42 Jan 5, 2021
439e17d
FIX-#0000: pin xlrd<=1.2.0 (#2594)
anmyachev Jan 11, 2021
3106e49
FIX-#2543: fixed handling 'as_index' at groupby dictionary renaming a…
dchigarev Jan 12, 2021
bcab1cc
Release commit for version 0.8.3 (#2597)
devin-petersohn Jan 12, 2021
9a6695d
REFACTOR-#2580: Move automatic engine init to after data ingestion (#…
devin-petersohn Jan 12, 2021
ff9bdbf
TEST-#2598: Add test for clean install from source (#2599)
devin-petersohn Jan 12, 2021
477c5f6
FIX-#976: add encoding parameter to read_csv call (#2593)
anmyachev Jan 13, 2021
7cfc85c
FEAT-#2342: Add axis partitions API (#2515)
YarShev Jan 13, 2021
d663730
Fixed MultiIndex.from_frame implementation (#2587)
gshimansky Jan 13, 2021
4292d55
FIX-#2608: Disable proxy for commands running inside container (#2609)
gshimansky Jan 14, 2021
d720579
FIX-#2601: reduce data size for some asv tests (#2602)
anmyachev Jan 15, 2021
01fbe9f
FIX-#2611: Fixed crash and sklearn version (#2612)
gshimansky Jan 15, 2021
6d420fb
FEAT-#2604: add docker file with plasticc benchmark on omnisci (#2605)
anmyachev Jan 15, 2021
5f03eb8
DOCS-#2618: Add code of conduct (#2619)
devin-petersohn Jan 15, 2021
1996666
FEAT-#2373: Add distributed xgboost on Modin with Ray (#2545)
prutskov Jan 19, 2021
fb94254
FEAT-2624: Improve performance of read_* methods when file handles ar…
mzjp2 Jan 20, 2021
2f880c1
FIX-#2616: Add config for num partitions, deprecate DEFAULT_NPARTITIO…
devin-petersohn Jan 20, 2021
ad55231
FEAT-#2091: add distributed dataframe compare (#2579)
kvu35 Jan 25, 2021
31d0632
DOCS-#2649: Fix github pr template's dead link. (#2650)
williamma12 Jan 27, 2021
6caa7b4
FEAT-#2606: Support creating DataFrame from remote partitions (#2613)
YarShev Jan 28, 2021
4f26fc1
FIX-#2637: Fix deprecation warnings due to invalid escape sequences. …
tirkarthi Jan 28, 2021
09d7c18
REFACTOR-#2648: Correct uses of MapReduceFunction and metadata manipu…
devin-petersohn Jan 28, 2021
f2a7271
DOCS-2653: Fix links in Modin's documentation (#2654)
prutskov Jan 29, 2021
03ea9b2
FEAT-#2663: Add algebraic operator `from_labels` (#2665)
devin-petersohn Jan 31, 2021
9cb165d
FIX-#2672: pin numpy>=1.16.5,<1.20 (#2673)
anmyachev Feb 1, 2021
e99b629
FEAT-#2675: Added benchmark for sort_values (#2676)
gshimansky Feb 2, 2021
5ad5fa3
FEAT-#2664: Add `to_labels` algebraic operator (#2666)
devin-petersohn Feb 2, 2021
3e1258f
FIX-#1806: Resolved error when reverting to Pandas for Multiindex (#2…
todd-yu Feb 2, 2021
a5417e8
FIX-#2614: Up python version for test jobs (#2615)
YarShev Feb 3, 2021
b0f92b8
DOCS-2633: Add documentation for distributed XGBoost on Modin (#2640)
prutskov Feb 3, 2021
e25a5e0
FIX-#2667: Change names of files for development env (#2668)
prutskov Feb 3, 2021
9495ff7
FIX-#2658: Move backend check in xgb to train/predict (#2659)
prutskov Feb 3, 2021
1b3e9d9
FEAT-#2451: Read multiple csv files simultaneously via glob paths (#2…
williamma12 Feb 3, 2021
90e1183
FIX-#2681: pin numpy<1.20.0 for docker containers with omnisci (#2682)
anmyachev Feb 4, 2021
77d40ce
TEST-#2670: some updates to improve asv tests stability (#2671)
anmyachev Feb 4, 2021
16fa188
TEST-#2686: add fillna benchmark (#2687)
anmyachev Feb 5, 2021
0f4ce8a
TEST-#2692: add drop benchmark (#2693)
anmyachev Feb 5, 2021
5990e06
FIX-#2688: Update ray.ObjectID to ray.ObjectRef for Ray 2.0 (#2695)
devin-petersohn Feb 5, 2021
3a31683
TEST-#2707: add lint check for ASV benchmarks (#2708)
dchigarev Feb 8, 2021
7525f05
TEST-#2699: add append benchmark (#2700)
anmyachev Feb 8, 2021
c637d89
FIX-#2684: Add method level docs for Modin XGBoost (#2685)
prutskov Feb 8, 2021
b93b879
TEST-#2694: add head benchmark (#2696)
anmyachev Feb 8, 2021
e74e012
TEST-#2705: add 'value_counts' benchmarks (#2706)
dchigarev Feb 9, 2021
8a50c4a
FIX-#2709: fixed typo in '_copartition' (#2710)
dchigarev Feb 9, 2021
5cb3283
FIX-#2596: Update pandas version to 1.2.1 (#2600)
YarShev Feb 9, 2021
6f1fe69
TEST-#2690: add astype benchmark (#2691)
anmyachev Feb 10, 2021
a54875a
TEST-#2702: add loc/iloc benchmark (#2703)
anmyachev Feb 10, 2021
569337b
TEST-#2716: add describe bench (#2718)
anmyachev Feb 11, 2021
a1238a0
DOCS-#2717: Fix version of Modin for building latest docs (#2719)
prutskov Feb 11, 2021
ba82360
FEAT-#1611: Add mod operation (#2726)
abykovsk Feb 11, 2021
f14a3c1
TEST-#2725: add index, columns, shape benchmarks (#2727)
anmyachev Feb 12, 2021
04cd912
FIX-#2305: fix handling of renaming aggregation (#2732)
dchigarev Feb 15, 2021
a2ecf31
FIX-#2362: fix key handling in 'Series.__setitem__' (#2731)
dchigarev Feb 15, 2021
33a57e3
TEST-#2722: add ASV read_csv skiprows benchmark (#2724)
amyskov Feb 15, 2021
7935c59
FIX-#2735: move '.reindex' logic about axis dispatching from the base…
dchigarev Feb 15, 2021
aa818f5
TEST-#1496: add tests for setting new column with different from fram…
dchigarev Feb 15, 2021
8ebbad9
REFACTOR-#2739: io tests refactoring (#2740)
amyskov Feb 16, 2021
61c6e99
TEST-#2753: add GroupBy benchmarsk with huge amount of groups (#2754)
dchigarev Feb 18, 2021
1f3b514
FIX-#2362: fix handling slices in 'DataFrame.__setitem__' (#2741)
dchigarev Feb 18, 2021
ad0bcab
FIX-#2742: fix performance degradation for dictionary GroupBy aggrega…
dchigarev Feb 18, 2021
84c9ab5
FIX-#2737: fix handling of dates for read_csv with OmniSci backend (#…
amyskov Feb 18, 2021
d1321f5
DOCS-#2584: Add CODEOWNERS file (#2759)
devin-petersohn Feb 18, 2021
4 changes: 2 additions & 2 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -1,6 +1,6 @@
 <!--
-Thank you for your contribution!
-Please review the contributing docs: https://modin.readthedocs.io/en/latest/contributing.html
+Thank you for your contribution!
+Please review the contributing docs: https://modin.readthedocs.io/en/latest/CONTRIBUTING.html
 if you have questions about contributing.
 -->
