Release Python Polars 1.10.0 · pola-rs/polars

🚀 Performance improvements

Add/fix unordered row decode, change unordered format (#19284)
Fast decision for Parquet dictionary encoding (#19256)
Make date_range / datetime_range ~10x faster for constant durations (#19216)
Batch utf8-validation in csv 18% / 25% on 1.9.0 (#19124)
Use two-pass algorithm for csv to ensure correctness and SIMDize more ~17% (#19088)

✨ Enhancements

Add SQL support for bit_count and bitwise &, |, and xor operators (#19114)
Add credential provider utility classes for AWS, GCP (#19297)
Support decoding Float16 in Parquet (#19278)
Experimental credential_provider argument for scan_parquet (#19271)
Allow DeltaTable input to scan_delta and read_delta (#19229)
New quantile interpolation method & QUANTILE_DISC function in SQL (#19139)
Conserve Parquet SortingColumns for ints (#19251)
Low level flight interface (#19239)
Improved list arithmetic support (#19162)
Add Expr.struct.unnest() as alias for Expr.struct.field("*") (#19212)
Add 'drop_empty_rows' parameter for read_ods (#19202)
Add 'drop_empty_rows' parameter for read_excel (#18253)
Expose LTS CPU in show_versions() (#19193)
Check Python version when deserializing UDFs (#19175)
Raise an error when users try to use Polars API in a fork()-without-execve() child (#19149)
Quantile function in SQL (#18047)
Improve scalar strict message (#19117)
Add Series::{first, last, approx_n_unique} (#19093)
Allow for rolling_*_by to use index count as window (#19071)
Delay deserialization of python function until physical plan (#19069)
Add cum(_min/_max) for pl.Boolean (#19061)

🐞 Bug fixes

Don't produce duplicate column names in Series.to_dummies (#19326)
Use of HAVING outside of GROUP BY should raise a suitable SQLSyntaxError (#19320)
More accurate from_dicts typing/signature (#19322)
Fix empty array gather (#19316)
Merge categorical rev-map in unpivot (#19313)
DataFrame descending sorting by single list element (#19233)
Fix cse union schema (#19305)
Correctly load Parquet statistics for f16 (#19296)
Error on invalid query (#19303)
Fix enum scalar output (#19301)
Fix list gather invalid fast path (#19299)
Fix quoting style of decimal csv output (#19298)
Don't vertically parallelize literal select (#19295)
Fix struct reshape fast path (#19294)
Also split on forward slashes during hive path inference on Windows (#19282)
Don't cse as_struct (#19280)
Only apply string parsing to String dtype (#19222)
Make the SQLAlchemy connection check more robust (#19270)
Ensure that read_database takes advantage of Arrow return from a duckdb_engine connection when using a SQLAlchemy Selectable (#19255)
Compilation error missing use JsonLineReader (#19244)
Don't remember Parquet statistics if filtered (#19248)
Do not check dtypes of non-projected columns for parquet (#19254)
Parquet predicate pushdown for lit(_) != (#19246)
Use all chunks in Series from arrow struct (#19218)
Don't trigger row limit in array construction (#19215)
Fix struct literals (#19214)
Plotting was not interacting well with Altair schema wrappers (#19213)
Fixing infer_schema for DataType::Null (#19201)
Migrate to PyO3 0.22 and released version of rust-numpy crate (#19199)
Add 'drop_empty_rows' parameter for read_excel (#18253)
Don't unwrap() expansion (#19196)
Properly handle non-nullable nested Parquet (#19192)
Fix invalid list collection in expression engine (#19191)
Fix use of "hidden_columns" parameter in write_excel (#19029)
Implement to_arrow functionality properly for Arrays (#19077)
Remove incorrect warning when using an IO[bytes] instance (#19154)
Don't fail test if e.g. jax has been used first, since jax installs a fork handler that warns (#19178)
Fix incorrect (eq|ne)_missing on List/Array types (#19155)
Properly broadcast Struct when then validity (#19148)
Allow partial name overlap in join_where resolution (#19128)
Fix floordiv / modulo with scalar 0 on LHS (#19143)
Ensure aligned chunks in OOC sort (#19118)
Recursively align when converting to ArrowArray (#19097)
Raise on invalid shape of shape 1, empty combination (#19113)
Use two-pass algorithm for csv to ensure correctness and SIMDize more ~17% (#19088)
Allow converting DatetimeOwned to ChunkedArray (#19094)
Throw proper error for empty char params in scan_csv (#19100)
Ensure parquet schema arg is propagated to IR (#19084)
Only rewrite numeric ineq joins (#19083)
Check validity of columns of keys/aggs in dsl->ir (#19082)
Bitwise aggregations should ignore null values (#19067)
Remove failing datetime subclass test (#19068)
Don't ignore multiple columns in LazyFrame.unnest (#19035)

📖 Documentation

Remove ecosystem viz section since there is one in misc already (#18408)
Fix typo in custom expressions docs (#19292)
Add SQL docs for new QUANTILE_CONT and QUANTILE_DISC functions (#19272)
Add marimo to ecosystem.md (#19250)
Improve DataFrame.write_database docstring (#19189)
Link to main website from banner (#19177)
Fix example of as_struct (#19116)
Clarify difference between bitwise/logical ops (#19180)
Add non-equi joins to, and revise, joins docs page (#19127)
Add Series.first,last,approx_n_unique to docs (#19146)
Annotate Config kwarg options (#18988)
Revise and improve 'Concepts' section (#19087)

🛠️ Other improvements

Add/fix unordered row decode, change unordered format (#19284)
Move from parquet-format-safe to polars-parquet-format (#19275)
Skip flaky test (#19242)
Add more tests for list arithmetic (#19225)
Remove unused IPC async (#19223)
Make get_list_builder infallible (#19217)
Migrate to PyO3 0.22 and released version of rust-numpy crate (#19199)
Make expression output type known (#19195)
Revert "feat(python): Raise an error when users try to use Polars API in a fork()-without-execve() child (#19149)" (#19188)
Zero-Field Structs and DataFrame with Height Property (#19123)
Make pl.repeat part of the IR (#19152)
Expose IEJoin IR node to python (#19104)
Clean remove_prefix since python3.9 is now the minimum Python (#19070)
Add new streaming engine to CI (#19051)

Thank you to all our contributors for making this release possible!
@Bidek56, @MarcoGorelli, @Rashik-raj, @adamreeve, @alexander-beedie, @alonme, @balbok0, @coastalwhite, @deanm0000, @dependabot, @dependabot[bot], @eitsupi, @etrotta, @itamarst, @jbutterwick, @joelostblom, @kenkoooo, @khalidmammadov, @laurentS, @mcrumiller, @mscolnick, @nameexhaustion, @orlp, @pomo-mondreganto, @ritchie46, @rodrigogiraoserrao, @siddharth-vi, @stinodego, @sunadase and @wence-
