DOCS-#4290: Add missed changes for OmniSci notebooks (#4291)
Signed-off-by: Maria Rubtsova <maria.rubtsova@intel.com>
Rubtsowa authored Mar 4, 2022
1 parent 3f00e24 commit acd0ff9
Showing 4 changed files with 20 additions and 10 deletions.
1 change: 1 addition & 0 deletions .github/workflows/ci-notebooks.yml
@@ -55,6 +55,7 @@ jobs:
- run: conda install black flake8 flake8-print jupyter nbformat nbconvert -c conda-forge
if: matrix.execution == 'omnisci_on_native'
- run: pip list
if: matrix.execution != 'omnisci_on_native'
- run: |
conda info
conda list
2 changes: 2 additions & 0 deletions docs/release_notes/release_notes-0.14.0.rst
@@ -57,7 +57,9 @@ Key Features and Updates
* DOCS-#4176: Update OmniSci usage section (#4192)
* DOCS-#4027: Add GIF images and chart to Modin README demonstrating speedups (#4232)
* DOCS-#3954: Add Dask example notebooks (#4139)
* DOCS-#3953: Add docs and notebook examples on running Modin with OmniSci (#4001)
* DOCS-#4280: Change links in jupyter notebooks (#4281)
* DOCS-#4290: Add changes for OmniSci notebooks (#4291)
* DOCS-#4241: Update warnings and docs regarding defaulting to pandas (#4242)
* Dependencies
* FIX-#4113, FIX-#4116, FIX-#4115: Apply new `black` formatting, fix pydocstyle check and readthedocs build (#4114)
@@ -87,11 +87,8 @@
"metadata": {},
"outputs": [],
"source": [
"# Note: Do not change this code!\n",
"import numpy as np\n",
"import pandas\n",
"import sys\n",
"import modin"
"import modin.config as cfg\n",
"cfg.StorageFormat.put('omnisci')"
]
},
{
@@ -100,8 +97,11 @@
"metadata": {},
"outputs": [],
"source": [
"import modin.config as cfg\n",
"cfg.StorageFormat.put('omnisci')"
"# Note: Importing notebooks dependencies. Do not change this code!\n",
"import numpy as np\n",
"import pandas\n",
"import sys\n",
"import modin"
]
},
{
@@ -28,11 +28,18 @@
"\n",
"We convert the Modin dataframe to a pyarrow.Table, perform a lazy tree execution in OmniSci, render it as a pyarrow.Table, convert it to pandas to perform the operation, and then convert it back to Modin when complete. These operations will have a large overhead due to the communication involved and will take longer than pandas.\n",
"\n",
"When this is happening, a warning will be given to the user to inform them that this operation will take longer than usual. For example, `DataFrame.apply` is not supported. In this case, when a user tries to use it, they will see this warning:\n",
"When this is happening, a warning will be given to the user to inform them that this operation will take longer than usual. For example, `DataFrame.mask` is not supported. In this case, when a user tries to use it, they will see this warning:\n",
"\n",
"```\n",
"UserWarning: `DataFrame.apply` defaulting to pandas implementation.\n",
"```"
"UserWarning: `DataFrame.mask` defaulting to pandas implementation.\n",
"```\n",
"\n",
"#### Relation engine limitations\n",
"As the `OmnisciOnNative` execution is backed by relation algebra based DB engine, there is a certain set of limitations on operations that could be used in Modin with such an execution. For example arbitrary functions in `DataFrame.apply` are not supported as the OmniSci engine can't execute python callables against its tables, this means that `DataFrame.apply(python_callable)` will **always** be defaulting to pandas. \n",
"\n",
"For more info about `OmnisciOnNative` limitations visit the appropriate section on read-the-docs: [relation algebra limitations](https://modin.readthedocs.io/en/stable/flow/modin/experimental/core/execution/native/implementations/omnisci_on_native/index.html#relational-engine-limitations).\n",
"\n",
"If your flow mainly operates with non-relational algebra operations, you should better choose non-OmniSci execution (for example, `PandasOnRay`)."
]
},
{
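
The notebook text changed in the last hunk says that unsupported operations emit a `UserWarning` containing "defaulting to pandas implementation". A minimal sketch of how a reader might detect that warning programmatically is given below. It uses only the standard `warnings` module. The helper `defaults_to_pandas` and the stand-in `fake_mask` are hypothetical names introduced for illustration and are not part of Modin; `fake_mask` only reproduces the warning text quoted in the notebook so that the sketch runs without Modin installed.

```python
import warnings


def defaults_to_pandas(operation):
    """Run `operation` and report whether it emitted a
    'defaulting to pandas' UserWarning (hypothetical helper)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones already shown once.
        warnings.simplefilter("always")
        operation()
    return any(
        issubclass(w.category, UserWarning)
        and "defaulting to pandas" in str(w.message)
        for w in caught
    )


def fake_mask():
    # Hypothetical stand-in for a Modin call such as `df.mask(...)`.
    # It only emits the warning text quoted in the notebook.
    warnings.warn(
        "`DataFrame.mask` defaulting to pandas implementation.",
        UserWarning,
    )


print(defaults_to_pandas(fake_mask))
print(defaults_to_pandas(lambda: None))
```

With Modin installed, one would presumably pass a callable that performs the real operation on a Modin dataframe in place of `fake_mask`; that usage is an assumption and is not shown in this commit.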
