Adds stub of NarwhalsAdapter #998

Merged (7 commits) on Jul 2, 2024
7 changes: 7 additions & 0 deletions .ci/test.sh
@@ -51,6 +51,13 @@ if [[ ${TASK} == "vaex" ]]; then
    exit 0
fi

if [[ ${TASK} == "narwhals" ]]; then
    pip install -e .
    pip install polars pandas narwhals
    pytest plugin_tests/h_narwhals
    exit 0
fi

if [[ ${TASK} == "tests" ]]; then
    pip install .
    pytest \
18 changes: 18 additions & 0 deletions .circleci/config.yml
@@ -155,3 +155,21 @@ workflows:
          name: integrations-py312
          python-version: '3.12'
          task: integrations
      - test:
          requires:
            - check_for_changes
          name: narwhals-py39
          python-version: '3.9'
          task: narwhals
      - test:
          requires:
            - check_for_changes
          name: narwhals-py310
          python-version: '3.10'
          task: narwhals
      - test:
          requires:
            - check_for_changes
          name: narwhals-py311
          python-version: '3.11'
          task: narwhals
1 change: 1 addition & 0 deletions docs/integrations/index.rst
@@ -26,3 +26,4 @@ This section showcases how Hamilton integrates with popular frameworks.
   Slack <https://github.com/DAGWorks-Inc/hamilton/tree/main/examples/slack>
   Spark <https://github.com/DAGWorks-Inc/hamilton/tree/main/examples/spark>
   Vaex <https://github.com/DAGWorks-Inc/hamilton/tree/main/examples/vaex>
   Narwhals <https://github.com/DAGWorks-Inc/hamilton/tree/main/examples/narwhals>
28 changes: 28 additions & 0 deletions examples/narwhals/README.md
@@ -0,0 +1,28 @@
# Narwhals

[Narwhals](https://narwhals-dev.github.io/narwhals/) is a library that aims
to unify expressions across dataframe libraries. It is meant to be lightweight
and focuses on Python-first dataframe libraries.

This example shows how you can write dataframe-agnostic code
and then load pandas or polars data to use with it.

## Running the example

You can run the example with:

```bash
# cd examples/narwhals/
python example.py
```
This runs the pandas and polars variants one after the other.

Or run the notebook:

```bash
# cd examples/narwhals
jupyter notebook # pip install jupyter if you don't have it
```
Or you can open up the notebook in Colab:

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dagworks-inc/hamilton/blob/main/examples/narwhals/notebook.ipynb)
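
For readers new to how the two variants are selected: `example.py` defines `df__pandas` and `df__polars`, and the `load` config value decides which of them becomes the `df` node. The snippet below is a simplified, standard-library-only illustration of that naming convention. It is not Hamilton's implementation, and `when` and `resolve_nodes` are hypothetical names invented for this sketch.

```python
from typing import Callable, Dict, List


def when(**condition: str) -> Callable[[Callable], Callable]:
    """Hypothetical decorator: record the config condition on the function."""

    def decorator(fn: Callable) -> Callable:
        fn.when = condition
        return fn

    return decorator


def resolve_nodes(functions: List[Callable], config: Dict[str, str]) -> Dict[str, Callable]:
    """Keep functions whose condition matches the config, dropping the `__suffix`."""
    nodes: Dict[str, Callable] = {}
    for fn in functions:
        condition = getattr(fn, "when", {})
        if all(config.get(key) == value for key, value in condition.items()):
            # "df__pandas" and "df__polars" both map to the node name "df".
            node_name = fn.__name__.split("__")[0]
            nodes[node_name] = fn
    return nodes


@when(load="pandas")
def df__pandas() -> str:
    return "pandas frame"


@when(load="polars")
def df__polars() -> str:
    return "polars frame"


pandas_nodes = resolve_nodes([df__pandas, df__polars], {"load": "pandas"})
polars_nodes = resolve_nodes([df__pandas, df__polars], {"load": "polars"})
print(pandas_nodes["df"]())  # pandas frame
print(polars_nodes["df"]())  # polars frame
```

Downstream functions such as `group_by_mean` only ever ask for `df`, which is why the same module can be built once with `{"load": "pandas"}` and once with `{"load": "polars"}`.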
Binary file added examples/narwhals/example.png
70 changes: 70 additions & 0 deletions examples/narwhals/example.py
@@ -0,0 +1,70 @@
import narwhals as nw
import pandas as pd
import polars as pl

from hamilton.function_modifiers import config, tag


@config.when(load="pandas")
def df__pandas() -> nw.DataFrame:
    return pd.DataFrame({"a": [1, 1, 2, 2, 3], "b": [4, 5, 6, 7, 8]})


@config.when(load="pandas")
def series__pandas() -> nw.Series:
    return pd.Series([1, 3])


@config.when(load="polars")
def df__polars() -> nw.DataFrame:
    return pl.DataFrame({"a": [1, 1, 2, 2, 3], "b": [4, 5, 6, 7, 8]})


@config.when(load="polars")
def series__polars() -> nw.Series:
    return pl.Series([1, 3])


@tag(nw_kwargs=["eager_only"])
def example1(df: nw.DataFrame, series: nw.Series, col_name: str) -> int:
    return df.filter(nw.col(col_name).is_in(series.to_numpy())).shape[0]


def group_by_mean(df: nw.DataFrame) -> nw.DataFrame:
    return df.group_by("a").agg(nw.col("b").mean()).sort("a")


if __name__ == "__main__":
    import __main__ as example

    from hamilton import base, driver
    from hamilton.plugins import h_narwhals, h_polars

    # pandas
    dr = (
        driver.Builder()
        .with_config({"load": "pandas"})
        .with_modules(example)
        .with_adapters(
            h_narwhals.NarwhalsAdapter(),
            h_narwhals.NarwhalsDataFrameResultBuilder(base.PandasDataFrameResult()),
        )
        .build()
    )
    r = dr.execute([example.group_by_mean, example.example1], inputs={"col_name": "a"})
    print(r)

    # polars
    dr = (
        driver.Builder()
        .with_config({"load": "polars"})
        .with_modules(example)
        .with_adapters(
            h_narwhals.NarwhalsAdapter(),
            h_narwhals.NarwhalsDataFrameResultBuilder(h_polars.PolarsDataFrameResult()),
        )
        .build()
    )
    r = dr.execute([example.group_by_mean, example.example1], inputs={"col_name": "a"})
    print(r)
    dr.display_all_functions("example.png")