DOC: clean-up 0.21.0 whatsnew file #18001

Merged
merged 11 commits into from
Oct 27, 2017

jorisvandenbossche
Member

No description provided.

@jorisvandenbossche jorisvandenbossche added this to the 0.21.0 milestone Oct 27, 2017
@jorisvandenbossche jorisvandenbossche mentioned this pull request Oct 27, 2017
64 tasks
@jorisvandenbossche
Member Author

This cleans up the whatsnew a little: it fixes a few small mistakes and formatting errors, adds a parquet section (until now it was only in the highlights), and restructures the new features a bit.

- New user-facing :class:`pandas.api.types.CategoricalDtype` for specifying
categoricals independent of the data, see :ref:`here <whatsnew_0210.enhancements.categorical_dtype>`.
- The behavior of ``sum`` and ``prod`` on all-NaN Series/DataFrames is now consistent and no longer depends on whether `bottleneck <http://berkeleyanalytics.com/bottleneck>`__ is installed, see :ref:`here <whatsnew_0210.api_breaking.bottleneck>`
- Compatibility fixes for pypy, see :ref:`here <whatsnew_0210.pypy>`.
- ``GroupBy`` objects now have a ``pipe`` method, similar to the one on ``DataFrame`` and ``Series``.
This allows for functions that take a ``GroupBy`` to be composed in a clean, readable syntax, see :ref:`here <whatsnew_0210.enhancements.GroupBy_pipe>`.
- Additions to the ``drop``, ``reindex`` and ``rename`` API (see :ref:`here <whatsnew_0210.enhancements.drop_api>`) and new methods ``infer_objects`` (see :ref:`here <whatsnew_0210.enhancements.infer_objects>`) and ``GroupBy.pipe`` (see :ref:`here <whatsnew_0210.enhancements.GroupBy_pipe>`).
Contributor

@TomAugspurger TomAugspurger Oct 27, 2017

Thoughts about splitting this into two? One for drop / reindex / rename (additions to existing APIs) and the second for infer_objects and GroupBy.pipe (new methods).

Member Author

Yep, that's fine. I mainly added it because now it felt a bit strange to speak about pipe and not the other subsections in the 'new features' section. Can also leave out entirely.
But will split and add a bit more context.
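
For readers unfamiliar with the methods named in this thread, here is a minimal sketch of the two groups being separated: additions to existing APIs (the `columns=` keyword on `drop`, the `axis=` keyword on `rename`) and new methods (`infer_objects`, `GroupBy.pipe`). The data and the `spread` helper are made up for illustration and are not taken from the whatsnew file.

```python
import pandas as pd

df = pd.DataFrame({
    "key": ["a", "a", "b", "b"],
    "value": [1, 2, 3, 5],
    "unused": [0, 0, 0, 0],
})

# Additions to existing APIs: drop() accepts `columns=`,
# and rename() accepts a mapper together with `axis=`.
trimmed = df.drop(columns=["unused"])
renamed = trimmed.rename({"value": "amount"}, axis="columns")

# New method: infer_objects() soft-converts object columns to a better dtype.
as_object = pd.DataFrame({"n": [1, 2, 3]}, dtype="object")
inferred = as_object.infer_objects()

# New method: GroupBy.pipe() passes the GroupBy object itself to a function.
def spread(grouped):
    """Hypothetical helper: max minus min within each group."""
    return grouped.max() - grouped.min()

result = renamed.groupby("key")["amount"].pipe(spread)
print(inferred.dtypes)
print(result)
```

Grouping the highlights this way keeps keyword additions apart from methods that did not exist before.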


Integration with `Apache Parquet <https://parquet.apache.org/>`__, including a new top-level :func:`read_parquet` and :func:`DataFrame.to_parquet` method, see :ref:`here <io.parquet>` (:issue:`15838`, :issue:`17438`).

`Apache Parquet <https://parquet.apache.org/>`__ provides a partitioned binary columnar serialization for data frames. It is designed to
Contributor

Perhaps, "It provides a language-agnostic file format for reading and writing data frames efficiently."

Contributor

Or "cross-language" if you prefer that to language-agnostic.


.. _whatsnew_0210.enhancements.other:

Other Enhancements
^^^^^^^^^^^^^^^^^^

- The ``validate`` argument for :func:`merge` now checks whether a merge is one-to-one, one-to-many, many-to-one, or many-to-many. If a merge is found to not be an example of specified merge type, an exception of type ``MergeError`` will be raised. For more, see :ref:`here <merging.validation>` (:issue:`16270`)
- Added support for `PEP 518 <https://www.python.org/dev/peps/pep-0518/>`_ (``pyproject.toml``) to the build system (:issue:`16745`)
New functions or methods:
Contributor

make these sub-sections

Member Author

Checking locally how it looks. The problem is that you can hardly see the difference between the current section and subsection levels, and this would add a subsubsection; I don't want to do that if the difference is barely visible. I can e.g. make it bold instead if the subsubsection doesn't work out nicely.
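
As context for the "Other Enhancements" hunk above, the `validate` argument of `merge` can be sketched as follows. The frames are invented for illustration, and the `pandas.errors.MergeError` import path assumes a recent pandas release; older versions may expose the exception elsewhere.

```python
import pandas as pd
from pandas.errors import MergeError  # assumed import path for recent pandas

left = pd.DataFrame({"key": [1, 2, 3], "x": ["a", "b", "c"]})
right = pd.DataFrame({"key": [1, 2, 2], "y": [10, 20, 30]})

# Keys in `left` are unique, so the one-to-many check passes.
merged = pd.merge(left, right, on="key", validate="one_to_many")

# `right` repeats key 2, so claiming one-to-one raises MergeError.
try:
    pd.merge(left, right, on="key", validate="one_to_one")
    raised = False
except MergeError:
    raised = True

print(len(merged), raised)
```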


Integration with `Apache Parquet <https://parquet.apache.org/>`__, including a new top-level :func:`read_parquet` and :func:`DataFrame.to_parquet` method, see :ref:`here <io.parquet>` (:issue:`15838`, :issue:`17438`).

`Apache Parquet <https://parquet.apache.org/>`__ provides a partitioned binary columnar serialization for data frames. It is designed to
languages easy. Parquet can use a variety of compression techniques to shrink the file size as much as possible
while still maintaining good read performance.
Parquet is designed to faithfully serialize and de-serialize ``DataFrame`` s, supporting all of the pandas
dtypes, including extension dtypes such as datetime with tz.
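
The round trip described in this paragraph can be sketched as below. This is only an illustration under stated assumptions: the frame and file name are made up, and the write is skipped when neither optional engine (`pyarrow` or `fastparquet`) is installed.

```python
import importlib.util
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    "n": [1, 2, 3],
    # A timezone-aware datetime column, one of the pandas extension dtypes.
    "when": pd.date_range("2017-10-27", periods=3, tz="Europe/Brussels"),
})

# Parquet I/O needs an optional engine; only attempt the round trip if one exists.
has_engine = any(
    importlib.util.find_spec(name) is not None
    for name in ("pyarrow", "fastparquet")
)

restored = None
if has_engine:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.parquet")  # hypothetical file name
        df.to_parquet(path)
        restored = pd.read_parquet(path)
    print(restored.dtypes)
```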
Contributor

"tz" -> "timezone"?

2018-03-31 10.0
Freq: 2Q-DEC, dtype: float64
Sum/Prod of all-NaN Series/DataFrames is now consistently NaN
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Member Author

This commit is purely a reordering (e.g. putting the sum of all-NaN first, as this is included in the highlights).

The "list with missing values- indexing" might maybe also deserve a highlight?

Contributor

> The "list with missing values- indexing" might maybe also deserve a highlight?

Yeah, that seems reasonable.

@codecov

codecov bot commented Oct 27, 2017

Codecov Report

❗ No coverage uploaded for pull request base (master@cc7abd9).
The diff coverage is n/a.

@@            Coverage Diff            @@
##             master   #18001   +/-   ##
=========================================
  Coverage          ?   91.23%           
=========================================
  Files             ?      163           
  Lines             ?    50173           
  Branches          ?        0           
=========================================
  Hits              ?    45774           
  Misses            ?     4399           
  Partials          ?        0
Flag Coverage Δ
#multiple 89.04% <ø> (?)
#single 40.29% <ø> (?)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update cc7abd9...1e9e4e8.

@TomAugspurger TomAugspurger merged commit 0091268 into pandas-dev:master Oct 27, 2017
peterpanmj pushed a commit to peterpanmj/pandas that referenced this pull request Oct 31, 2017
* DOC: clean-up 0.21.0 whatsnew file

* literal include warning

* split highlight

* reorder subsections in API changes (somewhat more in order of importance)

* move python 3.4 support drop to section dropped version support of deps

* minor formatting

* add indexing with list of partly missing labels to highlights

* update parquet explanation

* update highlights in release notes

* format as subsubsections

* wording
No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017
3 participants