Add docs
jernejfrank committed Oct 31, 2024
1 parent 32d72d5 commit 4691398
Showing 5 changed files with 496 additions and 1 deletion.
14 changes: 13 additions & 1 deletion docs/reference/decorators/with_columns.rst
@@ -2,7 +2,19 @@
with_columns
=======================

** Overview **
Pandas
--------------

The ``with_columns`` decorator runs operations on the columns of a Pandas dataframe and appends the results as new columns.

**Reference Documentation**

.. autoclass:: hamilton.function_modifiers.with_columns
   :special-members: __init__


PySpark
--------------

This is part of the hamilton pyspark integration. To install, run:

Binary file added examples/pandas/with_columns/DAG.png
7 changes: 7 additions & 0 deletions examples/pandas/with_columns/README
@@ -0,0 +1,7 @@
# Using with_columns with Pandas

We show the ability to use the familiar `with_columns` from either `pyspark` or `polars` on a Pandas dataframe.

To see the example, look at the notebook.

![image info](./DAG.png)
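
As a rough illustration of the idea (not Hamilton's actual API), the sketch below uses plain pandas to run column-level functions and append each result as a new column named after the function. The helper `append_columns` and the sample data are hypothetical and exist only for this illustration. Unlike Hamilton, the sketch does not resolve dependencies on its own, so functions must be listed in the order they depend on each other.

```python
import inspect

import pandas as pd


def avg_3wk_spend(spend: pd.Series) -> pd.Series:
    """Rolling 3 week average spend."""
    return spend.rolling(3).mean()


def spend_per_signup(spend: pd.Series, signups: pd.Series) -> pd.Series:
    """The cost per signup in relation to spend."""
    return spend / signups


def append_columns(df: pd.DataFrame, funcs) -> pd.DataFrame:
    """Hypothetical helper (not part of Hamilton): call each function with the
    columns named by its parameters and append the result under the function's name."""
    out = df.copy()  # leave the caller's dataframe untouched
    for func in funcs:
        # Map every parameter name to the column of the same name.
        kwargs = {name: out[name] for name in inspect.signature(func).parameters}
        out[func.__name__] = func(**kwargs)
    return out


# Made-up sample data for the illustration.
df = pd.DataFrame({"spend": [10.0, 20.0, 30.0, 40.0], "signups": [1, 2, 3, 4]})
result = append_columns(df, [avg_3wk_spend, spend_per_signup])
print(result)
```

The notebook shows how the same column functions are wired together by Hamilton itself, which also builds the DAG pictured above.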
38 changes: 38 additions & 0 deletions examples/pandas/with_columns/my_functions.py
@@ -0,0 +1,38 @@
import pandas as pd

"""
Notes:
1. This file is used for all the [ray|dask|spark]/hello_world examples.
2. It therefore showcases how you can write something once and not only scale it, but port it
to different frameworks with ease!
"""


def avg_3wk_spend(spend: pd.Series) -> pd.Series:
    """Rolling 3 week average spend."""
    return spend.rolling(3).mean()


def spend_per_signup(spend: pd.Series, signups: pd.Series) -> pd.Series:
    """The cost per signup in relation to spend."""
    return spend / signups


def spend_mean(spend: pd.Series) -> float:
    """Shows function creating a scalar. In this case it computes the mean of the entire column."""
    return spend.mean()


def spend_zero_mean(spend: pd.Series, spend_mean: float) -> pd.Series:
    """Shows function that takes a scalar. In this case to zero mean spend."""
    return spend - spend_mean


def spend_std_dev(spend: pd.Series) -> float:
    """Function that computes the standard deviation of the spend column."""
    return spend.std()


def spend_zero_mean_unit_variance(spend_zero_mean: pd.Series, spend_std_dev: float) -> pd.Series:
    """Function showing one way to make spend have zero mean and unit variance."""
    return spend_zero_mean / spend_std_dev