docs(python): Add visualization page to user guide #13052

Merged
merged 1 commit into from
Feb 3, 2024
4 changes: 2 additions & 2 deletions docs/development/contributing/index.md
@@ -150,9 +150,9 @@ The user guide is maintained in the `docs/user-guide` folder. Before creating a

#### Building and serving the user guide

- The user guide is built using [MkDocs](https://www.mkdocs.org/). You install the dependencies for building the user guide by running `make requirements` in the root of the repo.
+ The user guide is built using [MkDocs](https://www.mkdocs.org/). You install the dependencies for building the user guide by running `make build` in the root of the repo.

- Run `mkdocs serve` to build and serve the user guide, so you can view it locally and see updates as you make changes.
+ Activate the virtual environment and run `mkdocs serve` to build and serve the user guide, so you can view it locally and see updates as you make changes.

#### Creating a new user guide page

3 changes: 3 additions & 0 deletions docs/requirements.txt
@@ -2,6 +2,9 @@ pandas
pyarrow
graphviz
matplotlib
seaborn
plotly
altair

mkdocs-material==9.5.2
mkdocs-macros-plugin==1.0.4
130 changes: 130 additions & 0 deletions docs/src/python/user-guide/misc/visualization.py
@@ -0,0 +1,130 @@
# --8<-- [start:dataframe]
import polars as pl

path = "docs/data/iris.csv"

df = pl.scan_csv(path).group_by("species").agg(pl.col("petal_length").mean()).collect()
print(df)
# --8<-- [end:dataframe]

"""
# --8<-- [start:hvplot_show_plot]
df.plot.bar(
    x="species",
    y="petal_length",
    width=650,
)
# --8<-- [end:hvplot_show_plot]
"""

# --8<-- [start:hvplot_make_plot]
import hvplot

plot = df.plot.bar(
    x="species",
    y="petal_length",
    width=650,
)
hvplot.save(plot, "docs/images/hvplot_bar.html")
with open("docs/images/hvplot_bar.html", "r") as f:
    chart_html = f.read()
    print(f"{chart_html}")
# --8<-- [end:hvplot_make_plot]

"""
# --8<-- [start:matplotlib_show_plot]
import matplotlib.pyplot as plt

plt.bar(x=df["species"], height=df["petal_length"])
# --8<-- [end:matplotlib_show_plot]
"""

# --8<-- [start:matplotlib_make_plot]
import base64

import matplotlib.pyplot as plt

plt.bar(x=df["species"], height=df["petal_length"])
plt.savefig("docs/images/matplotlib_bar.png")
with open("docs/images/matplotlib_bar.png", "rb") as f:
    png = base64.b64encode(f.read()).decode()
    print(f'<img src="data:image/png;base64, {png}"/>')
# --8<-- [end:matplotlib_make_plot]

"""
# --8<-- [start:seaborn_show_plot]
import seaborn as sns
sns.barplot(
    df,
    x="species",
    y="petal_length",
)
# --8<-- [end:seaborn_show_plot]
"""

# --8<-- [start:seaborn_make_plot]
import seaborn as sns

sns.barplot(
    df,
    x="species",
    y="petal_length",
)
plt.savefig("docs/images/seaborn_bar.png")
with open("docs/images/seaborn_bar.png", "rb") as f:
    png = base64.b64encode(f.read()).decode()
    print(f'<img src="data:image/png;base64, {png}"/>')
# --8<-- [end:seaborn_make_plot]

"""
# --8<-- [start:plotly_show_plot]
import plotly.express as px

px.bar(
    df,
    x="species",
    y="petal_length",
    width=400,
)
# --8<-- [end:plotly_show_plot]
"""

# --8<-- [start:plotly_make_plot]
import plotly.express as px

fig = px.bar(
    df,
    x="species",
    y="petal_length",
    width=650,
)
fig.write_html("docs/images/plotly_bar.html", full_html=False, include_plotlyjs="cdn")
with open("docs/images/plotly_bar.html", "r") as f:
    chart_html = f.read()
    print(f"{chart_html}")
# --8<-- [end:plotly_make_plot]

"""
# --8<-- [start:altair_show_plot]
import altair as alt

alt.Chart(df, width=700).mark_bar().encode(x="species:N", y="petal_length:Q")
# --8<-- [end:altair_show_plot]
"""

# --8<-- [start:altair_make_plot]
import altair as alt

chart = (
    alt.Chart(df, width=600)
    .mark_bar()
    .encode(
        x="species:N",
        y="petal_length:Q",
    )
)
chart.save("docs/images/altair_bar.html")
with open("docs/images/altair_bar.html", "r") as f:
    chart_html = f.read()
    print(f"{chart_html}")
# --8<-- [end:altair_make_plot]
60 changes: 60 additions & 0 deletions docs/user-guide/misc/visualization.md
@@ -0,0 +1,60 @@
# Visualization

Data in a Polars `DataFrame` can be visualized using common visualization libraries.

We illustrate the plotting capabilities using the Iris dataset. We scan the CSV file, group by the `species` column, and compute the mean `petal_length` for each species.

{{code_block('user-guide/misc/visualization','dataframe',[])}}

```python exec="on" result="text" session="user-guide/misc/visualization"
--8<-- "python/user-guide/misc/visualization.py:dataframe"
```

## Built-in plotting with hvPlot

Polars has a built-in `plot` namespace on `DataFrame` (used as `df.plot.bar(...)`) to create interactive plots using [hvPlot](https://hvplot.holoviz.org/).

{{code_block('user-guide/misc/visualization','hvplot_show_plot',[])}}

```python exec="on" session="user-guide/misc/visualization"
--8<-- "python/user-guide/misc/visualization.py:hvplot_make_plot"
```

## Matplotlib

To create a bar chart, we can pass each column of a `DataFrame` directly to Matplotlib as a `Series`. Matplotlib has no explicit support for Polars objects, but it accepts a Polars `Series` because it can convert each `Series` to a NumPy array. This conversion is zero-copy for numeric data without null values.

{{code_block('user-guide/misc/visualization','matplotlib_show_plot',[])}}

```python exec="on" session="user-guide/misc/visualization"
--8<-- "python/user-guide/misc/visualization.py:matplotlib_make_plot"
```

## Seaborn, Plotly & Altair

[Seaborn](https://seaborn.pydata.org/), [Plotly](https://plotly.com/) & [Altair](https://altair-viz.github.io/) can accept a Polars `DataFrame` by leveraging the [dataframe interchange protocol](https://data-apis.org/dataframe-api/), which offers zero-copy conversion where possible.

### Seaborn

{{code_block('user-guide/misc/visualization','seaborn_show_plot',[])}}

```python exec="on" session="user-guide/misc/visualization"
--8<-- "python/user-guide/misc/visualization.py:seaborn_make_plot"
```

### Plotly

{{code_block('user-guide/misc/visualization','plotly_show_plot',[])}}

```python exec="on" session="user-guide/misc/visualization"
--8<-- "python/user-guide/misc/visualization.py:plotly_make_plot"
```

### Altair

{{code_block('user-guide/misc/visualization','altair_show_plot',[])}}

```python exec="on" session="user-guide/misc/visualization"
--8<-- "python/user-guide/misc/visualization.py:altair_make_plot"
```
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -80,6 +80,7 @@ nav:
    - user-guide/ecosystem.md
    - Misc:
      - user-guide/misc/multiprocessing.md
      - user-guide/misc/visualization.md
      - user-guide/misc/comparison.md

  - API reference: api/index.md
3 changes: 3 additions & 0 deletions py-polars/tests/docs/test_user_guide.py
@@ -15,6 +15,9 @@
python_snippets_dir = repo_root / "docs" / "src" / "python"
snippet_paths = list(python_snippets_dir.rglob("*.py"))

# Skip visualization snippets
snippet_paths = [p for p in snippet_paths if "visualization" not in str(p)]


@pytest.fixture(scope="module")
def _change_test_dir() -> Iterator[None]:
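
The skip filter in this hunk matches `"visualization"` anywhere in the path string, not only in the file name. The sketch below shows the difference using invented paths (all three are hypothetical); the stem-based variant is an alternative offered as an assumption, not something this PR does.

```python
from pathlib import PurePosixPath

# Hypothetical snippet paths, invented for illustration only.
snippet_paths = [
    PurePosixPath("docs/src/python/user-guide/misc/visualization.py"),
    PurePosixPath("docs/src/python/user-guide/misc/multiprocessing.py"),
    PurePosixPath("docs/src/python/user-guide/visualization/extra.py"),
]

# Same substring filter as in the hunk: drops any path that contains
# "visualization" in a directory name or in the file name.
by_substring = [p for p in snippet_paths if "visualization" not in str(p)]

# Stricter alternative (not part of the PR): compare the file stem only,
# so files under a "visualization" directory are still collected.
by_stem = [p for p in snippet_paths if p.stem != "visualization"]

print([p.name for p in by_substring])
print([p.name for p in by_stem])
```

The substring form is the simpler choice when every snippet under a visualization-related path should be skipped, which appears to be the intent here.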