change README & ci-tests & examples
Rubtsowa committed Mar 1, 2022
1 parent aa95b7f commit a8504fd
Showing 4 changed files with 13 additions and 16 deletions.
4 changes: 4 additions & 0 deletions .github/workflows/ci-notebooks.yml
@@ -55,6 +55,10 @@ jobs:
 - run: conda install black flake8 flake8-print jupyter nbformat nbconvert -c conda-forge
 if: matrix.execution == 'omnisci_on_native'
 - run: pip list
+- run: |
+conda info
+conda list
+if: matrix.execution == 'omnisci_on_native'
 - run: |
 black --check --diff examples/tutorial/jupyter/execution/${{ matrix.execution }}/test/test_notebooks.py
 black --check --diff examples/tutorial/jupyter/execution/test/utils.py
9 changes: 5 additions & 4 deletions examples/tutorial/jupyter/README.md
@@ -4,10 +4,11 @@ Currently we provide tutorial notebooks for the following execution backends:

 - [PandasOnRay](https://modin.readthedocs.io/en/latest/development/using_pandas_on_ray.html)
 - [PandasOnDask](https://modin.readthedocs.io/en/latest/development/using_pandas_on_dask.html)
+- [OmnisciOnNative](https://modin.readthedocs.io/en/latest/development/using_omnisci.html)

 ## Creating a development environment

-To get required dependencies for these Jupyter Notebooks
+To get required dependencies for `PandasOnRay` and `PandasOnDask` Jupyter Notebooks
 you should create a development environment with `pip`
 using `requirements.txt` file located in the respective directory:

@@ -28,11 +29,11 @@ please install every package listed in `requirements.txt` file individually with

 To get required dependencies for `OmnisciOnNative` Jupyter Notebooks
 you should create a development environment with `conda`
-using `jupyter_omnisci_env.yml` file located in the current directory:
+using `jupyter_omnisci_env.yml` file located in the respective directory:

 ```bash
 conda config --set channel_priority strict
-conda env create -f jupyter_omnisci_env.yml
+conda env create -f execution/omnisci_on_native/jupyter_omnisci_env.yml
 ```

 After the environment is created it needs to be activated:
@@ -41,7 +42,7 @@ After the environment is created it needs to be activated:
 conda activate jupyter_modin_on_omnisci
 ```

-**Note:** Notebook for `OmnisciOnNative` working only on Linux.
+**Note:** `Omnisci` engine is available on Linux only for now.

 ## Run Jupyter Notebooks

@@ -176,6 +176,8 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"# When working with non-string column labels it could happen that some backend logic would try to insert a column \n",
+"# with a string name to the frame, so we do add_prefix()\n",
 "df = df.add_prefix(\"col\")"
 ]
 },
@@ -189,16 +191,6 @@
 "df.head(10)"
 ]
 },
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"# Print the DataFrame.\n",
-"df"
-]
-},
 {
 "cell_type": "code",
 "execution_count": null,
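The comment added in the notebook hunk above explains why `add_prefix()` is called: integer column labels can clash with a string-named column that backend logic may insert. As a minimal sketch in plain pandas (the two-column frame is hypothetical and is not the tutorial's data), this is roughly what the call does to integer labels:

```python
import pandas as pd

# Hypothetical frame with default integer column labels (not the tutorial's data).
df = pd.DataFrame([[1, 2], [3, 4]])
labels_before = list(df.columns)

# add_prefix() prepends the given string to every column label,
# so all labels become strings afterwards.
df = df.add_prefix("col")
labels_after = list(df.columns)

print(labels_before, labels_after)
```

The data itself is unchanged; only the column labels are rewritten.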
@@ -190,7 +190,7 @@
 "source": [
 "start = time.time()\n",
 "\n",
-"pandas_sum = (pandas_df[['VendorID', 'passenger_count', 'trip_distance', 'pickup_longitude', 'pickup_latitude', 'RateCodeID', 'dropoff_longitude', 'dropoff_latitude', 'payment_type', 'fare_amount', 'extra', 'mta_tax', 'tip_amount', 'tolls_amount', 'improvement_surcharge', 'total_amount']]).sum()\n",
+"pandas_sum = (pandas_df[:500000]).sum()\n",
 "\n",
 "end = time.time()\n",
 "pandas_duration = end - start\n",
@@ -206,7 +206,7 @@
 "source": [
 "start = time.time()\n",
 "\n",
-"modin_sum = (modin_df[['VendorID', 'passenger_count', 'trip_distance', 'pickup_longitude', 'pickup_latitude', 'RateCodeID', 'dropoff_longitude', 'dropoff_latitude', 'payment_type', 'fare_amount', 'extra', 'mta_tax', 'tip_amount', 'tolls_amount', 'improvement_surcharge', 'total_amount']]).sum()\n",
+"modin_sum = (modin_df[:500000]).sum()\n",
 "\n",
 "end = time.time()\n",
 "modin_duration = end - start\n",
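The last two hunks swap an explicit column selection for a positional slice, so the timed `sum()` now covers the first 500000 rows of every column instead of all rows of the listed columns. A minimal sketch of that difference in plain pandas, assuming a small hypothetical all-numeric frame in place of the taxi dataset:

```python
import pandas as pd

# Hypothetical all-numeric frame standing in for the taxi dataset.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40], "c": [5, 5, 5, 5]})

# Old form: a list of labels selects columns and keeps every row.
by_columns = df[["a", "b"]].sum()

# New form: a slice selects rows by position and keeps every column.
by_rows = df[:2].sum()

print(by_columns.to_dict())
print(by_rows.to_dict())
```

The two forms therefore time different workloads, so durations measured before and after this commit are not directly comparable.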