TEST-#7049: Add some sanity tests with pyarrow-backed pandas dataframes #7199

anmyachev · 2024-04-18T14:05:30Z

What do these changes do?

Fixing problematic test cases will be part of the work on #7203.

first commit message and PR title follow format outlined here

NOTE: If you edit the PR title to match this format, you need to add another commit (even if it's empty) or amend your last commit for the CI job that checks the PR title to pick up the new PR title.
passes flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
passes black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
signed commit with git commit -s
Resolves #7049 (Add some sanity tests with pyarrow-backed pandas dataframes to make sure Modin doesn't have any issues with this backend.)
tests added and passing
module layout described at docs/development/architecture.rst is up-to-date

anmyachev · 2024-04-19T12:53:33Z

modin/tests/pandas/dataframe/test_map_metadata.py

+    pa = pytest.importorskip("pyarrow")
+
+    data = [[Decimal("3.19"), None], [None, Decimal("-1.23")]]
+    df_equals(*create_test_dfs(data, dtype=pd.ArrowDtype(pa.decimal128(3, scale=2))))


df_equals is used here deliberately: eval_general offers no benefit because the test compares the results of the constructors directly.

YarShev · 2024-04-22T13:38:25Z

modin/tests/pandas/test_series.py

+    modin_series, pandas_series = create_test_series(
+        [-1.545, 0.211, None], dtype="float32[pyarrow]"
+    )
+    df_equals(modin_series.mean(), pandas_series.mean())


eval_general?

mean() returns a floating point number, so I don't see the need for it
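
The trade-off discussed in the two review threads above can be sketched in plain Python. The helpers below (results_equal, eval_both, mean_skipping_nulls) are hypothetical, simplified stand-ins written for illustration; they are not Modin's df_equals or eval_general, and the assumption is only that an eval_general-style helper runs an operation on both objects and checks that exceptions match before comparing results.

```python
from decimal import Decimal


def results_equal(left, right):
    """Hypothetical stand-in for a df_equals-style check on finished results."""
    assert left == right, f"{left!r} != {right!r}"


def eval_both(left_obj, right_obj, operation, comparator=results_equal):
    """Hypothetical stand-in for an eval_general-style check.

    Runs `operation` on both objects. If the reference (right) side raises,
    the left side must raise the same exception type; otherwise the two
    results are passed to `comparator`.
    """
    try:
        expected = operation(right_obj)
    except Exception as exc:
        try:
            operation(left_obj)
        except type(exc):
            return  # both sides failed the same way
        raise AssertionError(f"only the reference side raised {type(exc).__name__}")
    comparator(operation(left_obj), expected)


def mean_skipping_nulls(values):
    """Average the non-null entries (plain-Python illustration only)."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)


# Constructor-style case: both results already exist, so compare them directly.
data = [[Decimal("3.19"), None], [None, Decimal("-1.23")]]
left_rows = [list(row) for row in data]
right_rows = [list(row) for row in data]
results_equal(left_rows, right_rows)

# Scalar-reduction case: both calls return a number and neither raises,
# so the wrapper reaches the same verdict as the direct comparison.
values = [-1.545, 0.211, None]
results_equal(mean_skipping_nulls(values), mean_skipping_nulls(list(values)))
eval_both(values, list(values), mean_skipping_nulls)

# The wrapper only adds something when the operation can raise.
eval_both([None], [None], mean_skipping_nulls)  # both raise ZeroDivisionError
```

Under these assumptions, the wrapper earns its place only where exception behaviour is part of what is being tested; for constructor results and plain scalar reductions, a direct comparison checks the same thing.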

anmyachev mentioned this pull request Apr 18, 2024

TEST-#7049: Add some sanity tests with pyarrow-backed pandas dataframes #7050

Closed

7 tasks

anmyachev added 2 commits April 18, 2024 16:06

TEST-modin-project#7049: Add some sanity tests with pyarrow-backed pandas dataframes

8e46e4e

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

fixes

6814c6e

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

anmyachev force-pushed the issue7049 branch from 98586a3 to 6814c6e Compare April 18, 2024 14:06

anmyachev mentioned this pull request Apr 18, 2024

FEAT-#6808: Implement __arrow_array__ for Series #7200

Merged

7 tasks

anmyachev added 2 commits April 19, 2024 12:50

fix

e1dbc69

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

Merge branch 'main' of https://github.com/modin-project/modin into issue7049

0241d7f

anmyachev mentioned this pull request Apr 19, 2024

Modin should work correctly with pandas, which uses pyarrow as a backend #7203

Closed

anmyachev added 3 commits April 19, 2024 13:15

cleanup

7b925a5

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

fix comment

23003c5

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

skip some cases for HDK

cc2a5ab

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

anmyachev marked this pull request as ready for review April 19, 2024 12:11

anmyachev requested review from devin-petersohn, mvashishtha, RehanSD, YarShev, vnlitvinov, dchigarev and a team as code owners April 19, 2024 12:11

anmyachev mentioned this pull request Apr 19, 2024

FEAT-#7203: Make sure modin works correctly with pandas, which uses pyarrow as a backend #7204

Merged

7 tasks

anmyachev commented Apr 19, 2024

YarShev reviewed Apr 22, 2024

Apply suggestions from code review

c3cc95a

Co-authored-by: Iaroslav Igoshev <Poolliver868@mail.ru>

YarShev approved these changes Apr 22, 2024

YarShev merged commit 3abd961 into modin-project:main Apr 22, 2024
38 checks passed

anmyachev deleted the issue7049 branch April 22, 2024 18:52
