Python Polars 1.3.0
🚀 Performance improvements
- Ensure metadata flags are maintained on vertical parallelization (#17804)
- Ensure only nodes that are not changed are cached in collapse optimizer (#17791)
- Use bitflags for OptState (#17788)
- Remove async directory auto-detection (#17779)
- Fix accidental quadratic horizontal concat (#17783)
- Batch parquet integer decoding (#17734)
- Use mmap-ed memory if possible in Parquet reader (#17725)
- Use bitflags for function options (#17723)
- Also set target features and tune cpu for CC (#17716)
- Introduce MemReader to file buffer in Parquet reader (#17712)
✨ Enhancements
- Expose binary_elementwise_into_string_amortized for plugin authors, recommend apply_into_string_amortized instead of apply_to_buffer (#17903)
- Expose allocator to capsule (#17817)
- Decompress in CSV / NDJSON scan (#17841)
- Ensure unique names in HConcat (#17884)
- Support authentication with HuggingFace login (#17881)
- Enable collection with gpu engine (#17550)
- Support "BY NAME" qualifier for SQL "INTERSECT" and "EXCEPT" set ops (#17835)
- Write data at table level in write_excel (#17757)
- Support PyCapsule Interface in DataFrame & Series constructors (#17693)
- Implement Arrow PyCapsule Interface for Series/DataFrame export (#17676)
- Raise informative error instead of panicking when passing invalid directives to to_string for Date dtype (#17670)
- Implement forward/backward fill for all types (#17861)
- Implement is_in operation on decimal type (#17832)
- Optimise read_excel when using "calamine" engine with the latest fastexcel (#17735)
- Support hf:// in read_(csv|ipc|ndjson) functions (#17785)
- Allow literals in sort (#17780)
- Expose 'strict' argument to 'is_in' (#17776)
- Release the GIL in collect_schema (#17761)
- Cloud support for NDJSON (#17717)
- Support API token for scanning hf:// (#17682)
🐞 Bug fixes
- Scanning '%' from cloud (#17890)
- Raise suitable error when invalid column passed to get_column_index (#17868)
- Respect glob=False for cloud reads (#17860)
- Properly write nest-nulled values in Parquet (#17845)
- Improve default write_excel int/float format when using a dark "table_style" (#17869)
- Fix from_arrow for struct type (#17839)
- Fix bool/string usage of "column_totals" parameter in write_excel (#17846)
- Infer decimal scales on mixed scale input (#17840)
- Don't ignore timezones in list of dicts constructor (#14211)
- Raise on unsupported fill strategy dtype (#17837)
- Properly write nested NullArray in Parquet (#17807)
- Check input type on list.to_struct (#17834)
- Fix right join schema (#17833)
- Simultaneous usage of named_expr and schema in pl.struct (#17768)
- Fix projection pushdown of literals without names (#17778)
- Don't expand HTTP paths (#17774)
- Check function input len at expansion (#17763)
- Don't panic in invalid agg_groups (#17762)
- Raise empty struct (#17736)
- Fix GC logic in write_ipc (#17752)
- Panic in pl.concat_list and list.concat on empty inputs (#17742)
- Fix out nullability for structs coming from arrow (#17738)
- Percent encode for Hugging Face paths (#17718)
📖 Documentation
- Update the join example input for Rust for consistency with the Python example (#17898)
- Improve filter documentation (#17755)
- Reword "how" param docstring entry for 'semi' and 'anti' join types for clarity (#17843)
- Mention read_* functions in Hugging Face section in user guide (#17799)
- Show return type for Series attributes in API reference (#17759)
- Add function with multiple arguments example to Expr.map_batches (#17789)
- Add Hugging Face section to user guide (#17721)
📦 Build system
- Update Rust toolchain to nightly-2024-07-26 (#17891)
- Correctly reference released package in optional dependencies (#17691)
🛠️ Other improvements
- On Python release, trigger docs build after API reference build (#17904)
- Set uv pip install to verbose (#17901)
- Fix broken typos command in make pre-commit for py-polars folder (#17897)
- Remove HybridRLE iter / batch nested parquet decoding (#17889)
- Add version field for python IR (#17876)
- Pass through missing rolling and string function information in pyir (#17702)
- Make better use of typos configuration features (#17800)
- Better deprecate message for _import_from_c (#17753)
- Rename Unit to Plain in Parquet reader (#17751)
- Unpin setuptools (#17726)
- Update CODEOWNERS (#17707)
Thank you to all our contributors for making this release possible!
@MarcoGorelli, @Object905, @SandroCasagrande, @alexander-beedie, @atigbadr, @coastalwhite, @deanm0000, @delsner, @dependabot, @dependabot[bot], @henryharbeck, @implicit-apparatus, @jparag, @knl, @kylebarron, @lukapeschke, @mcrumiller, @nameexhaustion, @orlp, @ritchie46, @ruihe774, @stinodego, @szepeviktor and @wence-