feat(python): Add strict parameter to DataFrame constructor to allow non-strict construction #15034

Merged: 8 commits, Mar 13, 2024

@stinodego (Member) commented Mar 13, 2024

Ref #14427

The DataFrame constructor has no `strict` parameter, while the Series constructor does. This PR adds one. Non-strict construction is useful when the input data is not guaranteed to be clean.

```python
import polars as pl

data = {"a": [-1, 0, 1]}

# Data does not match desired data type, cannot construct dataframe
df = pl.DataFrame(data, schema={"a": pl.UInt8})
# OverflowError

# Pass `strict=False` to handle this cleanly
df = pl.DataFrame(data, schema={"a": pl.UInt8}, strict=False)
print(df)
```

```
shape: (3, 1)
┌──────┐
│ a    │
│ ---  │
│ u8   │
╞══════╡
│ null │
│ 0    │
│ 1    │
└──────┘
```

In this example, the result is similar to a non-strict cast of the data to the desired data type.
This will also allow handling mixed ints/floats cleanly in the future.

The default behavior is unchanged: DataFrame construction was previously always strict, which corresponds to the default `strict=True`.

@github-actions bot added the enhancement and python labels Mar 13, 2024
@stinodego marked this pull request as ready for review March 13, 2024 11:00

codecov bot commented Mar 13, 2024

Codecov Report

Attention: Patch coverage is 91.30435%, with 4 lines in your changes missing coverage. Please review.

Project coverage is 81.00%. Comparing base (1d7b49e) to head (e3437d4).
Report is 4 commits behind head on main.

❗ Current head e3437d4 differs from pull request most recent head 770f9ff. Consider uploading reports for the commit 770f9ff to get more accurate results

| Files | Patch % | Lines |
|---|---|---|
| py-polars/polars/_utils/construction/dataframe.py | 91.30% | 2 Missing and 2 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #15034      +/-   ##
==========================================
- Coverage   81.00%   81.00%   -0.01%     
==========================================
  Files        1337     1338       +1     
  Lines      173326   173603     +277     
  Branches     2460     2461       +1     
==========================================
+ Hits       140403   140624     +221     
- Misses      32454    32509      +55     
- Partials      469      470       +1     


@stinodego merged commit 82932c6 into main Mar 13, 2024 (13 checks passed)
@stinodego deleted the df-strict branch March 13, 2024 15:42
@c-peters added the accepted label Mar 18, 2024