-
Notifications
You must be signed in to change notification settings - Fork 653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOCS-#3953: Add docs and notebook examples on running Modin with OmniSci #4001
Conversation
Codecov Report
@@ Coverage Diff @@
## master #4001 +/- ##
==========================================
+ Coverage 86.87% 90.36% +3.48%
==========================================
Files 205 207 +2
Lines 16979 19510 +2531
==========================================
+ Hits 14751 17630 +2879
+ Misses 2228 1880 -348
Continue to review full report at Codecov.
|
examples/tutorial/tutorial_notebooks/introduction/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/exercise_2.ipynb
Outdated
Show resolved
Hide resolved
a494a2f
to
e4ff5bf
Compare
@Rubtsowa, when you address comments, please leave your comments that the former are fixed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you run a code formatter on both notebooks, such as black, iota?
This is the extension for jupyterlab: https://jupyterlab-code-formatter.readthedocs.io/en/latest/ but whatever editor you use works.
examples/tutorial/tutorial_notebooks/introduction/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"df.count()" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like count
is broken now in OmniSci:
import numpy as np
import modin.pandas as pd
import modin.config as cfg
cfg.StorageFormat.put("omnisci")
frame_data = np.random.randint(0, 100, size=(2**10, 2**5))
df = pd.DataFrame(frame_data)
print(df.count())
Output:
File "pyarrow/table.pxi", line 2153, in pyarrow.lib.Table.rename_columns
File "stringsource", line 15, in string.from_py.__pyx_convert_string_from_py_std__in_string
TypeError: expected bytes, int found
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting... I believe we have tests for count with Omnisci engine...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a known problem, it's caused by non-string column labels (#3370) I haven't found an issue for this though...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although it's not mentioned anywhere, but the OmniSci backend doesn't support non-string column labels. The problem should go away once you specify valid labels for the frame.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, i see. Looks like issue here can be eliminated by replacement df.add_prefix("col1")
-> df = df.add_prefix("col")
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests with count
was deleted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dchigarev, why doesn't df.add_prefix("col1")
work in-place for OmniSci?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IDK, it works on my machine (c)
>>> df = pd.DataFrame([[1, 2], [3, 4]])
>>> df
0 1
0 1 2
1 3 4
>>> df.columns
RangeIndex(start=0, stop=2, step=1)
>>> df.count()
TypeError: expected bytes, int found
>>> df.add_prefix("col1").count()
col10 2
col11 2
dtype: int32
>>> df.add_prefix("col").count()
col0 2
col1 2
dtype: int32
Both of the add_prefix
should work...
The reproducer above doesn't use add_prefix
at all and that's causes the error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks!
examples/tutorial/tutorial_notebooks/introduction/exercise_2.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/exercise_2.ipynb
Outdated
Show resolved
Hide resolved
f7f810a
to
21c71cf
Compare
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_3.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_3.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_3.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/pandas_on_ray/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_2.ipynb
Outdated
Show resolved
Hide resolved
6fbc3a0
to
c44caf3
Compare
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
...ples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/env_omnisci_for_notebook.yml
Outdated
Show resolved
Hide resolved
...ples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/env_omnisci_for_notebook.yml
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_2.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_1.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/README.md
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/jupyter_omnisci_env.yml
Outdated
Show resolved
Hide resolved
examples/tutorial/tutorial_notebooks/introduction/omnisci_on_native/exercise_2.ipynb
Outdated
Show resolved
Hide resolved
This PR should be updated to be in line with the new structure introduced as part of #4214. |
ea8b341
to
fd12939
Compare
2785742
to
a8504fd
Compare
examples/tutorial/jupyter/execution/omnisci_on_native/local/exercise_3.ipynb
Show resolved
Hide resolved
examples/tutorial/jupyter/execution/omnisci_on_native/local/exercise_3.ipynb
Show resolved
Hide resolved
examples/tutorial/jupyter/execution/omnisci_on_native/local/exercise_3.ipynb
Show resolved
Hide resolved
examples/tutorial/jupyter/execution/omnisci_on_native/local/exercise_1.ipynb
Show resolved
Hide resolved
examples/tutorial/jupyter/execution/omnisci_on_native/local/exercise_3.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/jupyter/execution/omnisci_on_native/local/exercise_3.ipynb
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Rubtsowa, LGTM, thanks!
examples/tutorial/jupyter/execution/omnisci_on_native/local/exercise_2.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/jupyter/execution/omnisci_on_native/local/exercise_2.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/jupyter/execution/omnisci_on_native/local/exercise_2.ipynb
Outdated
Show resolved
Hide resolved
examples/tutorial/jupyter/execution/omnisci_on_native/local/exercise_3.ipynb
Outdated
Show resolved
Hide resolved
6b169f3
to
6a5651c
Compare
Signed-off-by: Maria Rubtsova maria.rubtsova@intel.com
What do these changes do?
flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
git commit -s
docs/developer/architecture.rst
is up-to-date