Python Polars 0.20.7

@github-actions released this 04 Feb 23:33
fd781eb

⚠️ Deprecations

  • Rename threadpool_size to thread_pool_size (#14236)

🚀 Performance improvements

  • prune parquet row groups when is_not_null is used (#14260)
  • Avoid unnecessary copies in Series.to_numpy for boolean/temporal types (#14261)
  • use is_between to skip parquet row groups (#14244)
  • Use a compression API that is designed for this use case (#11699) (#14194)
  • Use UnitVec in polars-plan traversal (#14199)
  • use UnitVec in streaming joins (#14197)
  • improve ChunkId (#14175)
  • improve iteration performance (#14126)
  • elide unneeded work in window? (#14108)
  • run window functions more in parallel (#14095)
  • improve skip row group using statistics condition (#14056)
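
Several of the items above concern skipping Parquet row groups based on column statistics. The idea can be sketched in plain Python as follows. This is a conceptual illustration only, not the Polars implementation: the `RowGroupStats` class and both helper functions are hypothetical names, and the sketch assumes each row group exposes a min, max, null count, and row count for the filtered column.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowGroupStats:
    """Hypothetical per-row-group statistics for one column."""
    min_value: Optional[int]
    max_value: Optional[int]
    null_count: int
    num_rows: int


def may_match_is_not_null(stats: RowGroupStats) -> bool:
    """A group can be skipped only if every value in it is null."""
    return stats.null_count < stats.num_rows


def may_match_is_between(stats: RowGroupStats, lower: int, upper: int) -> bool:
    """Return False only when no row can satisfy lower <= x <= upper."""
    if stats.null_count == stats.num_rows:
        return False  # all nulls: the comparison never evaluates to true
    if stats.min_value is None or stats.max_value is None:
        return True  # statistics missing: be conservative and read the group
    return stats.max_value >= lower and stats.min_value <= upper


groups = [
    RowGroupStats(min_value=0, max_value=99, null_count=0, num_rows=100),
    RowGroupStats(min_value=100, max_value=199, null_count=5, num_rows=100),
    RowGroupStats(min_value=None, max_value=None, null_count=100, num_rows=100),
]

# Indices of row groups that still have to be read for each predicate.
read_for_between = [i for i, g in enumerate(groups) if may_match_is_between(g, 150, 250)]
read_for_not_null = [i for i, g in enumerate(groups) if may_match_is_not_null(g)]
```

The checks are deliberately one-sided: a group is dropped only when the statistics prove that no row can match, so a group with missing statistics is always read.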

✨ Enhancements

  • add u8/i8/u16/i16 parsers to CSV reader (#14241)
  • move F-order data in and out of numpy to polars zero copy (#14259)
  • read arrow-c-interface without requiring pyarrow (#14254)
  • Implements list.gather_every (#14253)
  • Implements prefix/suffix_fields (#14251)
  • Change Series.to_numpy to return f64 for Int32/UInt32 Series with nulls instead of f32 (#14240)
  • Polish decimal arithmetic (#14172)
  • improved read_excel format detection, and support for excel 97-2004 workbooks (#14234)
  • Introduce arr.to_struct (#14202)
  • Support mapping the field names of a struct (#14203)
  • make IdxVec generic as UnitVec (#14196)
  • add new arithmetic kernels (#14026)
  • Supports unique and hash_rows for null column (#14111)
  • Implement arithmetic operations for Null columns (#14107)
  • support pd.Index in from_pandas and elsewhere (#14087)
  • Allow renaming expressions with keyword syntax in group_by (#14071)
  • raise more informative error message if someone lands on Expr.__bool__ (#14067)
  • Adapt extend_constant to function expr architecture and expressify it (#14058)
  • add integer negation (#14049)
  • list & array measures of dispersion (#13245)
  • gc binview when writing ipc (#14035)
  • When calling convert_time_zone on time-zone-naive datetime, convert as if converting from UTC (#13960)
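
One item above changes `Series.to_numpy` to return f64 rather than f32 for Int32/UInt32 Series with nulls (#14240). NumPy integer arrays have no null representation, so nulls are typically converted to NaN, which requires a float dtype. A plausible motivation for choosing f64, stated here as an assumption rather than taken from the release notes, is that a 32-bit float cannot represent every 32-bit integer exactly. The sketch below shows this with the standard library only; the helper names are illustrative.

```python
import struct


def roundtrip_f32(value: int) -> float:
    """Store an integer as a 32-bit float and read it back."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]


def roundtrip_f64(value: int) -> float:
    """Store an integer as a 64-bit float and read it back."""
    return struct.unpack("<d", struct.pack("<d", float(value)))[0]


value = 2**24 + 1  # 16_777_217, well within the Int32 range

lossy = roundtrip_f32(value)  # a 24-bit significand cannot hold this value
exact = roundtrip_f64(value)  # a 53-bit significand holds every 32-bit integer

print(value, lossy, exact)
```

Here `lossy` differs from the original integer while `exact` does not, which is the kind of silent precision loss a wider float type avoids.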

🐞 Bug fixes

  • deduplicate recursive growables (#14264)
  • Fix glimpse overload signature (#14258)
  • allow set operations on list of categoricals (#14110)
  • any/all_horizontal with single input has incorrect type (#14256)
  • load numpy array with np array values #14237 (#14238)
  • Make Series.to_numpy on booleans without nulls return bool type (#14239)
  • fix ufunc in agg (change __ufunc_array__ so it uses is_elementwise=True parameter) (#14135)
  • Fix join validation for String types (#14229)
  • enable windows test coverage for read_excel "calamine" (fastexcel) engine (#14171)
  • make csv parser more robust to edge cases (#14210)
  • Fix for set_operations of binary dtype (#14152)
  • fix read_csv date/datetime inference and parsing (#14113)
  • don't see files as hive partitions (#14128)
  • allow eval on list of categoricals (#14132)
  • Forbid casting from Date to Time and vice versa (#14127)
  • preserve old naming convention for multi-value pivot (this will change in 1.0 to no longer redundantly have the column name in the middle) (#14120)
  • Implements gt/lt cmp for null dtype (#14119)
  • ignore comments at beginning of csv if schema provided (#14115)
  • fix pivot when multiple columns are passed. Output is now aligned with what tidyverse / pandas.pivot_table would do (#14048)
  • multiple read_excel updates (#14039)
  • some temporal conversion errors for datetimes earlier than 1970-01-01 (#14050)
  • Preserve name when casting from categorical (#14085)
  • respect Object dtype designation (#14072)
  • fix cse bug when window function is nested (#14070)
  • Fix melt panic when there are no value vars (#14057)
  • json_encode should respect the logical type (#14063)
  • improve skip row group using statistics condition (#14056)
  • Raise for .dt.epoch and .dt.timestamp for Duration dtype (#13962)
  • handle SliceSink with empty data (#14025)
  • Allow Series.to_pandas for categorical types (#14028)
  • correct field type schema inference (using read_csv) (#14042)
  • Use int formatter for unsigned ints (#14043)

📖 Documentation

  • fix code block in user-guide/lazy/schemas (#14228)
  • Add visualization page to user guide (#13052)
  • Fix typo in contributing guide (#14181)
  • Small improvements to the Ecosystem page (#14176)
  • fix code blocks in user-guide/concepts/data-structures (#14146)
  • Document that Kleene logic is followed in any_horizontal and all_horizontal (#14148)
  • Fix description of return_dtype parameter for map_elements and map_batches (#14114)
  • Fix bullet point formatting in CI contributing guide (#14117)
  • Add documentation on replacement strings to str.replace and str.replace_all (#13382)
  • Replace alternatives page with more objective comparison (#13784)
  • Note that only one name operation is allowed per expression (#14075)
  • Improve deprecation message of dtype_if_empty param (#14068)
  • fix more docstring bullet points (#14065)
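
One of the documentation items above notes that `any_horizontal` and `all_horizontal` follow Kleene logic (#14148). For readers unfamiliar with the term, here is a minimal pure-Python sketch of three-valued any/all, with `None` standing in for null. The function names `kleene_any` and `kleene_all` are hypothetical and are not Polars APIs.

```python
from typing import Iterable, Optional


def kleene_any(values: Iterable[Optional[bool]]) -> Optional[bool]:
    """True if any value is True; otherwise null if any value is null; else False."""
    saw_null = False
    for v in values:
        if v is True:
            return True  # a single True decides the result regardless of nulls
        if v is None:
            saw_null = True
    return None if saw_null else False


def kleene_all(values: Iterable[Optional[bool]]) -> Optional[bool]:
    """False if any value is False; otherwise null if any value is null; else True."""
    saw_null = False
    for v in values:
        if v is False:
            return False  # a single False decides the result regardless of nulls
        if v is None:
            saw_null = True
    return None if saw_null else True


rows = [
    (True, None),
    (False, None),
    (False, False),
    (True, True),
]
any_results = [kleene_any(row) for row in rows]
all_results = [kleene_all(row) for row in rows]
```

A null means "unknown", so it only affects the result when the known values cannot decide it on their own.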

🛠️ Other improvements

  • Reorganize NumPy interop tests (#14257)
  • additional dataframe test coverage (#14243)
  • Remove *args in Series.to_numpy (#14248)
  • Move metadata utils to meta module (#14230)
  • remove unused method DataFrame._from_dicts (#14212)
  • make gather_chunked completely generic (#14195)
  • Add .cargo directory to .gitignore (#14191)
  • take_chunked to polars-ops (#14185)
  • Issue a warning when running doctests on Python 3.11 or lower (#14187)
  • Run cargo update (#14160)
  • merge take kernels (#14137)
  • improve From<Ca> -> Vec (#14123)
  • hoist boolean -> string cast (#14122)
  • remove unused argument (#14014)

Thank you to all our contributors for making this release possible!
@JulianCologne, @MarcoGorelli, @Vincenthays, @Wainberg, @alexander-beedie, @apcamargo, @braaannigan, @c-peters, @deanm0000, @dependabot, @dependabot[bot], @dpinol, @edavisau, @eitsupi, @flisky, @grinya007, @ion-elgreco, @itamarst, @lukemanley, @mcrumiller, @orlp, @r-brink, @reswqa, @ritchie46, @stinodego and @taki-mekhalfa