Python Polars 0.20.8

github-actions released this 14 Feb 02:22
2ae55d5

🏆 Highlights

  • Implemented tree formatting for LogicalPlan (#14221)

⚠️ Deprecations

  • Deprecate positional args in pivot to prepare new functionality (#14428)

🚀 Performance improvements

  • Combine small chunks in sinks for streaming pipelines (#14346)
  • reduce heap allocs in expression/logical-plan iteration (#14440)
  • simplify and speed up cum_sum and cum_prod (#14409)
  • simplify negated predicates to improve row-group skipping (#14370)

✨ Enhancements

  • Increase verbosity of duplicate column error message (#11899)
  • Warn instead of printing when reading CSV from a Python file-like object (#14469)
  • Raise if pivot would introduce duplicate column names (#14431)
  • apply negate in simplify expression pass (#14436)
  • restrict more cloud interop to semaphore budget (#14435)
  • Implement min/max for categorical dtype (#14112)
  • Hide polars.testing.* in pytest stack traces (#14399)
  • expose numpy view to integer types (#14405)
  • Allow column name input in clip (#14410)
  • add boolean rle decoding for parquet (#14403)
  • Allow brackets in SQL join conditions (#14263)
  • Implemented tree formatting for LogicalPlan (#14221)
  • Implement mean_horizontal expression (#14369)
  • support decimal comparison (#14338)
  • Implement arr.shift (#14298)
  • Implement list.n_unique (#14306)
  • Do not panic when casting from an empty Series to pl.Decimal (#14330)
  • unset WRITEABLE flag in zero-copy output (#14283)
  • Support Categorical/Enum in Series.to_numpy (#14275)
  • add parametric testing support for the Array dtype (#14265)

🐞 Bug fixes

  • don't gc after variadic buffers are written (#14473)
  • Increase verbosity of duplicate column error message (#11899)
  • Return appropriate data type for duration mean and median (#14376)
  • Warn instead of printing when reading CSV from a Python file-like object (#14469)
  • Fix regression in out-of-core group-by caused by the new string type (#14464)
  • DataFrame.pivot was returning incorrect results when multiple columns were passed to index and one of them was Struct (#14438)
  • remove literal Series from projection state (#14437)
  • pivot was producing incorrect results when (single) index was Struct (#14308)
  • Error on some invalid clip inputs (#14416)
  • Fix Series.hist panicking on empty or all-null input (#14407)
  • rechunk series when apply_lambda (#14406)
  • Raise if invalid strategy is passed to map_elements (#14397)
  • Require exact checking for Decimals in assertion utils (#14357)
  • fix ufunc for unlimited column args (#14328)
  • Handle chunked Series in Series.to_numpy (#14341)
  • Remove duplicated content in error messages (#8107)
  • Fix set_operation when the input is sliced and broadcast (#14303)
  • Wrap par_iter in list.to_struct by POOL.install (#14304)
  • Do not panic when casting from an empty Series to pl.Decimal (#14330)
  • Preserve name when casting to Enum (#14320)
  • Fix list.get not working on lists of decimals (#14276)
  • relax precision when upscaling (#14270)
  • Allow formatting of object Series with registry (#14272)

📖 Documentation

  • Update read_database docstring note about getting the connection URI string for sqlalchemy (#14461)
  • Fix typo in plugins section (#14402)
  • Add debugging section to contributing docs (#10576)
  • Define what a 'character' means in slice / len_chars (#14395)
  • Clarify behavior of DataFrame.rows_by_key (#14149)
  • Fix some typos (#14394)
  • Realign file structure of user guide (#14360)
  • Rust examples for data structures in user guide (#14339)
  • Add deprecation period policy example for post-1.0.0 (#14184)
  • Add example for Series.bin.contains (#14297)
  • Small clarifications in the contributing guide (#14310)
  • Fix capitalization of user guide references (#14291)
  • Fix explode docstring mentioning String types (#14285)
  • Update deltalake docstrings to new link (#14282)

🛠️ Other improvements

  • Ignore unclosed file warnings for now (#14467)
  • Raise better error in import timings test (#14441)
  • Refactor arg_min/max test case (#14439)
  • Skip some OOC tests that fail randomly in the CI (#14434)
  • Bump release drafter to v6 (#14429)
  • Set specific temp dir for OOC tests (#14420)
  • Bump setup-graphviz action to v2 (#14418)
  • Minor test refactor (#14404)
  • Update make clean command (#14408)
  • Internal rename of _or to or_ in PyO3 (same for _xor/_and) (#14393)
  • Minor refactor of DataFrame.to_numpy structured code (#14348)
  • Update Series.to_numpy to handle Decimal/Time types in Rust (#14296)
  • Add test for Series.to_numpy with timezones (#14337)
  • Bump ruff version to 0.2.0 (#14294)
  • Temporarily fix failing deltalake test (#14288)
  • remove dataframe consortium standard api entrypoint (#14279)

Thank you to all our contributors for making this release possible!
@BGR360, @CaselIT, @MarcoGorelli, @migi, @NedJWestern, @Vincenthays, @alexander-beedie, @deanm0000, @dependabot, @dependabot[bot], @engdoreis, @flisky, @grinya007, @itamarst, @janosh, @kalekundert, @lukemanley, @mbuhidar, @mcrumiller, @petrosbar, @r-brink, @rben01, @reswqa, @ritchie46, @stinodego, @taki-mekhalfa and @thomasfrederikhoeck