Add warning to LazyFrame properties
stinodego committed Jun 14, 2024
1 parent 5cde3b8 commit ceda989
Showing 1 changed file with 44 additions and 11 deletions.
55 changes: 44 additions & 11 deletions py-polars/polars/lazyframe/frame.py
@@ -41,6 +41,7 @@
     extend_bool,
     is_bool_sequence,
     is_sequence,
+    issue_warning,
     normalize_filepath,
     parse_percentiles,
 )
@@ -74,6 +75,7 @@
     py_type_to_dtype,
 )
 from polars.dependencies import import_optional, subprocess
+from polars.exceptions import PerformanceWarning
 from polars.lazyframe.group_by import LazyGroupBy
 from polars.lazyframe.in_process import InProcessQuery
 from polars.schema import Schema
@@ -399,13 +401,15 @@ def columns(self) -> list[str]:

         Warnings
         --------
-        Determining the column names of a LazyFrame requires resolving its schema.
-        Resolving the schema of a LazyFrame can be an expensive operation.
-        Avoid accessing this property repeatedly if possible.
+        Determining the column names of a LazyFrame requires resolving its schema,
+        which is a potentially expensive operation.
+        Using :meth:`collect_schema` is the idiomatic way of resolving the schema.
+        This property exists only for symmetry with the DataFrame class.

         See Also
         --------
         collect_schema
+        Schema.names

         Examples
         --------
@@ -419,6 +423,12 @@ def columns(self) -> list[str]:
         >>> lf.columns
         ['foo', 'bar']
         """
+        issue_warning(
+            "Determining the column names of a LazyFrame requires resolving its schema,"
+            " which is a potentially expensive operation. Use `LazyFrame.collect_schema().names()`"
+            " to get the column names without this warning.",
+            category=PerformanceWarning,
+        )
         return self.collect_schema().names()

     @property
@@ -433,13 +443,15 @@ def dtypes(self) -> list[DataType]:

         Warnings
         --------
-        Determining the data types of a LazyFrame requires resolving its schema.
-        Resolving the schema of a LazyFrame can be an expensive operation.
-        Avoid accessing this property repeatedly if possible.
+        Determining the data types of a LazyFrame requires resolving its schema,
+        which is a potentially expensive operation.
+        Using :meth:`collect_schema` is the idiomatic way to resolve the schema.
+        This property exists only for symmetry with the DataFrame class.

         See Also
         --------
         collect_schema
+        Schema.dtypes

         Examples
         --------
@@ -453,6 +465,12 @@ def dtypes(self) -> list[DataType]:
         >>> lf.dtypes
         [Int64, Float64, String]
         """
+        issue_warning(
+            "Determining the data types of a LazyFrame requires resolving its schema,"
+            " which is a potentially expensive operation. Use `LazyFrame.collect_schema().dtypes()`"
+            " to get the data types without this warning.",
+            category=PerformanceWarning,
+        )
         return self.collect_schema().dtypes()

     @property
@@ -462,12 +480,14 @@ def schema(self) -> Schema:

         Warnings
         --------
-        Resolving the schema of a LazyFrame can be an expensive operation.
-        Avoid accessing this property repeatedly if possible.
+        Resolving the schema of a LazyFrame is a potentially expensive operation.
+        Using :meth:`collect_schema` is the idiomatic way to resolve the schema.
+        This property exists only for symmetry with the DataFrame class.

         See Also
         --------
         collect_schema
+        Schema

         Examples
         --------
@@ -481,6 +501,11 @@ def schema(self) -> Schema:
         >>> lf.schema
         Schema({'foo': Int64, 'bar': Float64, 'ham': String})
         """
+        issue_warning(
+            "Resolving the schema of a LazyFrame is a potentially expensive operation."
+            " Use `LazyFrame.collect_schema()` to get the schema without this warning.",
+            category=PerformanceWarning,
+        )
         return self.collect_schema()

     @property
@@ -494,13 +519,15 @@ def width(self) -> int:

         Warnings
         --------
-        Determining the width of a LazyFrame requires resolving its schema.
-        Resolving the schema of a LazyFrame can be an expensive operation.
-        Avoid accessing this property repeatedly if possible.
+        Determining the width of a LazyFrame requires resolving its schema,
+        which is a potentially expensive operation.
+        Using :meth:`collect_schema` is the idiomatic way to resolve the schema.
+        This property exists only for symmetry with the DataFrame class.

         See Also
         --------
         collect_schema
+        Schema.len

         Examples
         --------
@@ -513,6 +540,12 @@ def width(self) -> int:
         >>> lf.width
         2
         """
+        issue_warning(
+            "Determining the width of a LazyFrame requires resolving its schema,"
+            " which is a potentially expensive operation. Use `LazyFrame.collect_schema().len()`"
+            " to get the width without this warning.",
+            category=PerformanceWarning,
+        )
         return self.collect_schema().len()

     def __bool__(self) -> NoReturn:
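
The pattern this commit applies (a convenience property that warns, next to an explicit method that does not) can be sketched with the standard library alone. This is a rough illustration, not the polars implementation: `ToyLazyFrame` and the local `PerformanceWarning` class are hypothetical stand-ins, the schema is a plain dict, and `warnings.warn` is used in place of the `issue_warning` helper imported in the diff.

```python
from __future__ import annotations

import warnings


class PerformanceWarning(Warning):
    """Hypothetical local stand-in for polars.exceptions.PerformanceWarning."""


class ToyLazyFrame:
    """Hypothetical stand-in for a LazyFrame; the schema is just a dict here."""

    def __init__(self, schema: dict[str, str]) -> None:
        self._schema = schema

    def collect_schema(self) -> dict[str, str]:
        # Explicit method: the caller opts in to the (possibly expensive) work.
        return dict(self._schema)

    @property
    def columns(self) -> list[str]:
        # Property kept for symmetry with the eager class; warns on every access.
        warnings.warn(
            "Determining the column names requires resolving the schema,"
            " which is a potentially expensive operation.",
            category=PerformanceWarning,
            stacklevel=2,
        )
        return list(self.collect_schema())


lf = ToyLazyFrame({"foo": "Int64", "bar": "Float64"})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    via_property = lf.columns               # emits PerformanceWarning
    via_method = list(lf.collect_schema())  # same result, no warning

print(via_property, via_method, [w.category.__name__ for w in caught])
```

Both access paths return the same names; only the property access is recorded in `caught`, which mirrors the intent of steering callers toward `collect_schema()`.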