Modin 0.12.0

@RehanSD released this 24 Nov 01:48
0.12.0
054e7fb
This release contains a refactor of the codebase that significantly
improves the maintainability of the code, along with a large number
of bugfixes.

This release also introduces a Slack community where Modin users can interact
with Modin developers. Please join us on [Slack](https://modin.org/slack.html)
to continue the conversation!

Key Features and Updates
------------------------
* Stability and bugfixes
  * Allow callables and scalars to be used together in .loc/.iloc (25ea7fd)
  * Ensure .loc with slice and scalar column returns Series (9492878)
  * Fix Modin OmniSci Docker example (b853c51)
  * Ensure Modin OmniSci + Modin Ray Docker containers install packages from conda-forge (032afd6)
  * Determine return type (Series or DataFrame) from one element Series (17ad1f0)
  * Update cloud examples (648b6a0)
  * Fix Modin OmniSci memory leak during `read_csv` (8581ba1)
  * Use `floor` for casting `float` to `int` for OmniSci 5.8.0 (c67a936)
  * Fix .loc on empty DataFrame (2260431)
  * Ensure Modin on Ray does not duplicate writes to disk on `to_csv` when workers die (6178a57)
  * Add support for `storage_options` argument in `read_*` functions except `read_excel` (77a00cc)
  * Ensure Modin Ray correctly raises exceptions when `to_parquet` or `to_csv` fail (8d67cd3)
  * Ensure Modin Ray does not hang when workers crash on `to_csv` (73bf061)
  * Remove platform specific code from `setup.py` to ensure distributions are pure Python (b186e40)
* Refactor codebase
  * Update import of public index classes to import from the `pandas.core.indexes.api` module (488357a)
  * Replace `try...finally` with pytest fixtures (c349a94)
  * Restructure project files (b37bcf8)
  * Use `fsspec` to open files (b8a9c07)
  * Add LGTM Service to CI (b193fef)
  * Remove extraneous `*NUM_THREADS` environment variables from CI (b925625)
  * Update documentation + code + comment language to reflect new project structure (7a81588)
  * Update language to reflect new project structure and add implementation to BaseDataframeAxisPartition (7ab2d90)
  * De-dupe `read_fwf` and `read_csv` code (2f824f8)
  * Reformat entire codebase with `black` and `flake8` (75f698c)
* Pandas API implementations and improvements
  * Add support for `{true|false}_values` for `read_csv` for Modin OmniSci (9cd93f2)
  * Implement `explode` for Series and DataFrame (ddd4afe)
  * Support reading gzipped fwf (a80cb3b)
  * Add support for `to_parquet` on Modin Ray (643596d)
  * Add support for creating an `sqlalchemy` connection with arbitrary arguments (ece98a6, 4a42e04)
  * Add support for `set_index` with different input types (cab37f2)
* XGBoost enhancements
  * Support new DMatrix parameters (4d7f6d4)
* Developer API enhancements
  * Throw custom errors when optional dependencies are missing (53bb047)
  * Improve Modin OmniSci quickstart (167957b)
* Update testing suite
* Documentation improvements
* Dependencies
  * Add `fsspec` (dependency for IO) to dependencies (44e3f10)
  * Make `botocore` import optional (adc15c6)
  * Pin minimum `s3fs` dependency to fix `aiobotocore` issue (8acad95)
  * Update PyArrow to 5.0 and OmniSci to 5.8 (4121358)
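
As a rough illustration of a few of the indexing and API items listed above (`.loc` with a slice and a scalar column, a callable combined with a scalar in `.loc`, and `explode`), here is a minimal sketch. The sample data and variable names are hypothetical. It assumes Modin is installed with a working engine such as Ray, and falls back to plain pandas otherwise, since the behavior shown is the pandas API that Modin mirrors.

```python
try:
    # Assumption: Modin and an execution engine (e.g. Ray) are installed.
    import modin.pandas as pd
except ImportError:
    # Fallback so the sketch still runs without Modin; the API is the same.
    import pandas as pd

# Hypothetical sample data: one scalar column and one list-valued column.
df = pd.DataFrame({"id": [1, 2, 3], "tags": [["a", "b"], ["c"], []]})

# A row slice plus a scalar column label selects one column, so the
# result is a Series rather than a DataFrame.
ids = df.loc[0:1, "id"]

# A callable row indexer combined with a scalar column label.
large_ids = df.loc[lambda frame: frame["id"] > 1, "id"]

# explode produces one row per list element; an empty list becomes
# a single row with a missing value.
exploded = df.explode("tags")

print(type(ids).__name__)
print(list(large_ids))
print(len(exploded))
```

Switching an existing script between pandas and Modin should only require changing the import line, which is why the sketch keeps everything else engine-agnostic.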

Contributors
------------
@ienkovich, @vnlitvinov, @mvashishtha, @devin-petersohn, @dchigarev, @prutskov, @amyskov,
@gshimansky, @anmyachev, @YarShev, @Garra1980, @Rubtsowa, @jeffreykennethli, @RehanSD,
@dorisjlee, @naren-ponder