docs(python): Add visualisation page to user guide (#13052)
Co-authored-by: Stijn de Gooijer <stijndegooijer@gmail.com>
braaannigan and stinodego authored Feb 3, 2024
1 parent a7a1549 commit 8c8d4fb
Showing 6 changed files with 199 additions and 2 deletions.
4 changes: 2 additions & 2 deletions docs/development/contributing/index.md
@@ -150,9 +150,9 @@ The user guide is maintained in the `docs/user-guide` folder. Before creating a

#### Building and serving the user guide

-The user guide is built using [MkDocs](https://www.mkdocs.org/). You install the dependencies for building the user guide by running `make requirements` in the root of the repo.
+The user guide is built using [MkDocs](https://www.mkdocs.org/). You install the dependencies for building the user guide by running `make build` in the root of the repo.

-Run `mkdocs serve` to build and serve the user guide, so you can view it locally and see updates as you make changes.
+Activate the virtual environment and run `mkdocs serve` to build and serve the user guide, so you can view it locally and see updates as you make changes.

#### Creating a new user guide page

3 changes: 3 additions & 0 deletions docs/requirements.txt
@@ -2,6 +2,9 @@ pandas
pyarrow
graphviz
matplotlib
seaborn
plotly
altair

mkdocs-material==9.5.2
mkdocs-macros-plugin==1.0.4
130 changes: 130 additions & 0 deletions docs/src/python/user-guide/misc/visualization.py
@@ -0,0 +1,130 @@
# --8<-- [start:dataframe]
import polars as pl

path = "docs/data/iris.csv"

df = pl.scan_csv(path).group_by("species").agg(pl.col("petal_length").mean()).collect()
print(df)
# --8<-- [end:dataframe]

"""
# --8<-- [start:hvplot_show_plot]
df.plot.bar(
    x="species",
    y="petal_length",
    width=650,
)
# --8<-- [end:hvplot_show_plot]
"""

# --8<-- [start:hvplot_make_plot]
import hvplot

plot = df.plot.bar(
    x="species",
    y="petal_length",
    width=650,
)
hvplot.save(plot, "docs/images/hvplot_bar.html")
with open("docs/images/hvplot_bar.html", "r") as f:
    chart_html = f.read()
    print(f"{chart_html}")
# --8<-- [end:hvplot_make_plot]

"""
# --8<-- [start:matplotlib_show_plot]
import matplotlib.pyplot as plt
plt.bar(x=df["species"], height=df["petal_length"])
# --8<-- [end:matplotlib_show_plot]
"""

# --8<-- [start:matplotlib_make_plot]
import base64

import matplotlib.pyplot as plt

plt.bar(x=df["species"], height=df["petal_length"])
plt.savefig("docs/images/matplotlib_bar.png")
with open("docs/images/matplotlib_bar.png", "rb") as f:
    png = base64.b64encode(f.read()).decode()
    print(f'<img src="data:image/png;base64, {png}"/>')
# --8<-- [end:matplotlib_make_plot]

"""
# --8<-- [start:seaborn_show_plot]
import seaborn as sns
sns.barplot(
    df,
    x="species",
    y="petal_length",
)
# --8<-- [end:seaborn_show_plot]
"""

# --8<-- [start:seaborn_make_plot]
import seaborn as sns

sns.barplot(
    df,
    x="species",
    y="petal_length",
)
plt.savefig("docs/images/seaborn_bar.png")
with open("docs/images/seaborn_bar.png", "rb") as f:
    png = base64.b64encode(f.read()).decode()
    print(f'<img src="data:image/png;base64, {png}"/>')
# --8<-- [end:seaborn_make_plot]

"""
# --8<-- [start:plotly_show_plot]
import plotly.express as px
px.bar(
    df,
    x="species",
    y="petal_length",
    width=400,
)
# --8<-- [end:plotly_show_plot]
"""

# --8<-- [start:plotly_make_plot]
import plotly.express as px

fig = px.bar(
    df,
    x="species",
    y="petal_length",
    width=650,
)
fig.write_html("docs/images/plotly_bar.html", full_html=False, include_plotlyjs="cdn")
with open("docs/images/plotly_bar.html", "r") as f:
    chart_html = f.read()
    print(f"{chart_html}")
# --8<-- [end:plotly_make_plot]

"""
# --8<-- [start:altair_show_plot]
import altair as alt
alt.Chart(df, width=700).mark_bar().encode(x="species:N", y="petal_length:Q")
# --8<-- [end:altair_show_plot]
"""

# --8<-- [start:altair_make_plot]
import altair as alt

chart = (
    alt.Chart(df, width=600)
    .mark_bar()
    .encode(
        x="species:N",
        y="petal_length:Q",
    )
)
chart.save("docs/images/altair_bar.html")
with open("docs/images/altair_bar.html", "r") as f:
    chart_html = f.read()
    print(f"{chart_html}")
# --8<-- [end:altair_make_plot]
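
The Matplotlib and Seaborn `*_make_plot` snippets in this file share one pattern: save the figure as a PNG, read the bytes back, and print an `<img>` tag whose `src` is a base64 data URI, presumably so the `exec="on"` blocks in `visualization.md` can show the image inline. A minimal standard-library sketch of that pattern follows. The helper name `png_to_img_tag` and the placeholder bytes are hypothetical and not part of the commit.

```python
import base64


def png_to_img_tag(png_bytes: bytes) -> str:
    """Return an HTML <img> tag that embeds the given bytes as a base64 data URI."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f'<img src="data:image/png;base64,{encoded}"/>'


# Placeholder bytes standing in for a file written by plt.savefig(...);
# this is only the 8-byte PNG signature, not a complete image.
fake_png = b"\x89PNG\r\n\x1a\n"
tag = png_to_img_tag(fake_png)
print(tag)
```

Wrapping the encoding step in a function would let the Matplotlib and Seaborn snippets share it, at the cost of making each snippet less self-contained.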
60 changes: 60 additions & 0 deletions docs/user-guide/misc/visualization.md
@@ -0,0 +1,60 @@
# Visualization

Data in a Polars `DataFrame` can be visualized using common visualization libraries.

We illustrate plotting capabilities using the Iris dataset. We scan the CSV file, group by the `species` column, and compute the mean `petal_length` for each species.

{{code_block('user-guide/misc/visualization','dataframe',[])}}

```python exec="on" result="text" session="user-guide/misc/visualization"
--8<-- "python/user-guide/misc/visualization.py:dataframe"
```

## Built-in plotting with hvPlot

A Polars `DataFrame` exposes a `plot` namespace (used as `df.plot.bar(...)` below) that creates interactive plots using [hvPlot](https://hvplot.holoviz.org/).

{{code_block('user-guide/misc/visualization','hvplot_show_plot',[])}}

```python exec="on" session="user-guide/misc/visualization"
--8<-- "python/user-guide/misc/visualization.py:hvplot_make_plot"
```

## Matplotlib

To create a bar chart we can pass columns of a `DataFrame` directly to Matplotlib as a `Series` for each column. Matplotlib does not have explicit support for Polars objects, but it can accept a Polars `Series` because it can convert each `Series` to a NumPy array, which is zero-copy for numeric data without null values.

{{code_block('user-guide/misc/visualization','matplotlib_show_plot',[])}}

```python exec="on" session="user-guide/misc/visualization"
--8<-- "python/user-guide/misc/visualization.py:matplotlib_make_plot"
```

## Seaborn, Plotly & Altair

[Seaborn](https://seaborn.pydata.org/), [Plotly](https://plotly.com/) & [Altair](https://altair-viz.github.io/) can accept a Polars `DataFrame` by leveraging the [dataframe interchange protocol](https://data-apis.org/dataframe-api/), which offers zero-copy conversion where possible.

### Seaborn

{{code_block('user-guide/misc/visualization','seaborn_show_plot',[])}}

```python exec="on" session="user-guide/misc/visualization"
--8<-- "python/user-guide/misc/visualization.py:seaborn_make_plot"
```

### Plotly

{{code_block('user-guide/misc/visualization','plotly_show_plot',[])}}

```python exec="on" session="user-guide/misc/visualization"
--8<-- "python/user-guide/misc/visualization.py:plotly_make_plot"
```

### Altair

{{code_block('user-guide/misc/visualization','altair_show_plot',[])}}

```python exec="on" session="user-guide/misc/visualization"
--8<-- "python/user-guide/misc/visualization.py:altair_make_plot"
```
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -80,6 +80,7 @@ nav:
- user-guide/ecosystem.md
- Misc:
- user-guide/misc/multiprocessing.md
- user-guide/misc/visualization.md
- user-guide/misc/comparison.md

- API reference: api/index.md
3 changes: 3 additions & 0 deletions py-polars/tests/docs/test_user_guide.py
@@ -15,6 +15,9 @@
python_snippets_dir = repo_root / "docs" / "src" / "python"
snippet_paths = list(python_snippets_dir.rglob("*.py"))

# Skip visualization snippets
snippet_paths = [p for p in snippet_paths if "visualization" not in str(p)]


@pytest.fixture(scope="module")
def _change_test_dir() -> Iterator[None]:
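
The filter added in this hunk is a plain substring test on the stringified path, so it skips any snippet whose path contains `visualization`, not only `visualization.py`. A standard-library sketch of that behaviour follows. Only the first path appears in this commit; the other two are hypothetical and used to show what is kept and what is dropped.

```python
from pathlib import Path

# Example paths: the first is from this commit, the other two are hypothetical.
snippet_paths = [
    Path("docs/src/python/user-guide/misc/visualization.py"),
    Path("docs/src/python/user-guide/misc/multiprocessing.py"),
    Path("docs/src/python/user-guide/visualization-extras/example.py"),
]

# Same expression as in test_user_guide.py: drop paths containing the substring.
kept = [p for p in snippet_paths if "visualization" not in str(p)]
print([p.name for p in kept])
```

Matching on `p.stem == "visualization"` would be a narrower alternative if only that one file should be skipped.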
