DEPR: msgpack #30112

Merged
merged 17 commits into from
Dec 12, 2019
Changes from 10 commits
1 change: 0 additions & 1 deletion MANIFEST.in
@@ -20,7 +20,6 @@ global-exclude *.gz
global-exclude *.h5
global-exclude *.html
global-exclude *.json
global-exclude *.msgpack
global-exclude *.pickle
global-exclude *.png
global-exclude *.pyc
32 changes: 0 additions & 32 deletions asv_bench/benchmarks/io/msgpack.py

This file was deleted.

2 changes: 1 addition & 1 deletion asv_bench/benchmarks/io/sas.py
@@ -26,5 +26,5 @@ def setup(self, format):
]
self.f = os.path.join(*paths)

- def time_read_msgpack(self, format):
+ def time_read_sas(self, format):
read_sas(self.f, format=format)
4 changes: 2 additions & 2 deletions ci/code_checks.sh
@@ -94,10 +94,10 @@ if [[ -z "$CHECK" || "$CHECK" == "lint" ]]; then

# We don't lint all C files because we don't want to lint any that are built
# from Cython files nor do we want to lint C files that we didn't modify for
- # this particular codebase (e.g. src/headers, src/klib, src/msgpack). However,
+ # this particular codebase (e.g. src/headers, src/klib). However,
# we can lint all header files since they aren't "generated" like C files are.
MSG='Linting .c and .h' ; echo $MSG
- cpplint --quiet --extensions=c,h --headers=h --recursive --filter=-readability/casting,-runtime/int,-build/include_subdir pandas/_libs/src/*.h pandas/_libs/src/parser pandas/_libs/ujson pandas/_libs/tslibs/src/datetime pandas/io/msgpack pandas/_libs/*.cpp pandas/util
+ cpplint --quiet --extensions=c,h --headers=h --recursive --filter=-readability/casting,-runtime/int,-build/include_subdir pandas/_libs/src/*.h pandas/_libs/src/parser pandas/_libs/ujson pandas/_libs/tslibs/src/datetime pandas/_libs/*.cpp
RET=$(($RET + $?)) ; echo $MSG "DONE"

echo "isort --version-number"
3 changes: 0 additions & 3 deletions doc/redirects.csv
@@ -491,7 +491,6 @@ generated/pandas.DataFrame.to_hdf,../reference/api/pandas.DataFrame.to_hdf
generated/pandas.DataFrame.to,../reference/api/pandas.DataFrame.to
generated/pandas.DataFrame.to_json,../reference/api/pandas.DataFrame.to_json
generated/pandas.DataFrame.to_latex,../reference/api/pandas.DataFrame.to_latex
generated/pandas.DataFrame.to_msgpack,../reference/api/pandas.DataFrame.to_msgpack
generated/pandas.DataFrame.to_numpy,../reference/api/pandas.DataFrame.to_numpy
generated/pandas.DataFrame.to_panel,../reference/api/pandas.DataFrame.to_panel
generated/pandas.DataFrame.to_parquet,../reference/api/pandas.DataFrame.to_parquet
@@ -890,7 +889,6 @@ generated/pandas.read_gbq,../reference/api/pandas.read_gbq
generated/pandas.read_hdf,../reference/api/pandas.read_hdf
generated/pandas.read,../reference/api/pandas.read
generated/pandas.read_json,../reference/api/pandas.read_json
generated/pandas.read_msgpack,../reference/api/pandas.read_msgpack
generated/pandas.read_parquet,../reference/api/pandas.read_parquet
generated/pandas.read_pickle,../reference/api/pandas.read_pickle
generated/pandas.read_sas,../reference/api/pandas.read_sas
@@ -1231,7 +1229,6 @@ generated/pandas.Series.to_json,../reference/api/pandas.Series.to_json
generated/pandas.Series.to_latex,../reference/api/pandas.Series.to_latex
generated/pandas.Series.to_list,../reference/api/pandas.Series.to_list
generated/pandas.Series.tolist,../reference/api/pandas.Series.tolist
generated/pandas.Series.to_msgpack,../reference/api/pandas.Series.to_msgpack
generated/pandas.Series.to_numpy,../reference/api/pandas.Series.to_numpy
generated/pandas.Series.to_period,../reference/api/pandas.Series.to_period
generated/pandas.Series.to_pickle,../reference/api/pandas.Series.to_pickle
1 change: 0 additions & 1 deletion doc/source/development/developer.rst
@@ -125,7 +125,6 @@ The ``metadata`` field is ``None`` except for:
in ``BYTE_ARRAY`` Parquet columns. The encoding can be one of:

* ``'pickle'``
* ``'msgpack'``
* ``'bson'``
* ``'json'``

4 changes: 2 additions & 2 deletions doc/source/getting_started/install.rst
@@ -249,7 +249,7 @@ PyTables 3.4.2 HDF5-based reading / writing
SQLAlchemy 1.1.4 SQL support for databases other than sqlite
SciPy 0.19.0 Miscellaneous statistical functions
XLsxWriter 0.9.8 Excel writing
- blosc Compression for msgpack
+ blosc Compression for HDF5
fastparquet 0.3.2 Parquet reading / writing
gcsfs 0.2.2 Google Cloud Storage access
html5lib HTML parser for read_html (see :ref:`note <optional_html>`)
@@ -269,7 +269,7 @@ xclip Clipboard I/O on linux
xlrd 1.1.0 Excel reading
xlwt 1.2.0 Excel writing
xsel Clipboard I/O on linux
- zlib Compression for msgpack
+ zlib Compression for HDF5
========================= ================== =============================================================

.. _optional_html:
1 change: 0 additions & 1 deletion doc/source/reference/frame.rst
@@ -357,7 +357,6 @@ Serialization / IO / conversion
DataFrame.to_feather
DataFrame.to_latex
DataFrame.to_stata
DataFrame.to_msgpack
DataFrame.to_gbq
DataFrame.to_records
DataFrame.to_string
1 change: 0 additions & 1 deletion doc/source/reference/io.rst
@@ -22,7 +22,6 @@ Flat file
read_table
read_csv
read_fwf
read_msgpack

Clipboard
~~~~~~~~~
1 change: 0 additions & 1 deletion doc/source/reference/series.rst
@@ -574,7 +574,6 @@ Serialization / IO / conversion
Series.to_xarray
Series.to_hdf
Series.to_sql
Series.to_msgpack
Series.to_json
Series.to_string
Series.to_clipboard
2 changes: 1 addition & 1 deletion doc/source/user_guide/cookbook.rst
@@ -1229,7 +1229,7 @@ in the frame:
The offsets of the structure elements may be different depending on the
architecture of the machine on which the file was created. Using a raw
binary file format like this for general data storage is not recommended, as
- it is not cross platform. We recommended either HDF5 or msgpack, both of
+ it is not cross platform. We recommended either HDF5 or parquet, both of
which are supported by pandas' IO facilities.

Computation
88 changes: 0 additions & 88 deletions doc/source/user_guide/io.rst
@@ -28,7 +28,6 @@ The pandas I/O API is a set of top level ``reader`` functions accessed like
binary;`HDF5 Format <https://support.hdfgroup.org/HDF5/whatishdf5.html>`__;:ref:`read_hdf<io.hdf5>`;:ref:`to_hdf<io.hdf5>`
binary;`Feather Format <https://github.com/wesm/feather>`__;:ref:`read_feather<io.feather>`;:ref:`to_feather<io.feather>`
binary;`Parquet Format <https://parquet.apache.org/>`__;:ref:`read_parquet<io.parquet>`;:ref:`to_parquet<io.parquet>`
binary;`Msgpack <https://msgpack.org/index.html>`__;:ref:`read_msgpack<io.msgpack>`;:ref:`to_msgpack<io.msgpack>`
binary;`Stata <https://en.wikipedia.org/wiki/Stata>`__;:ref:`read_stata<io.stata_reader>`;:ref:`to_stata<io.stata_writer>`
binary;`SAS <https://en.wikipedia.org/wiki/SAS_(software)>`__;:ref:`read_sas<io.sas_reader>`;
binary;`SPSS <https://en.wikipedia.org/wiki/SPSS>`__;:ref:`read_spss<io.spss_reader>`;
@@ -3376,93 +3375,6 @@ The default is to 'infer':
os.remove("data.pkl.gz")
os.remove("s1.pkl.bz2")

.. _io.msgpack:

msgpack
-------
Member

Given that this is deprecated/removed on a very short time frame, I think it might be good to keep this title a bit longer with a small note that it was deprecated/removed and how to replace it.

Contributor

why? the whole point is to clean things for 1.0

Member

To provide documentation about this removal, and how to replace it.

For example, there have been issues opened with questions about this which led to added examples in the docstrings, but this has never been published yet in actual documentation.

Contributor

k that’s fair

Member Author

restored this section. LMK if you have suggestions for additional notes to put in here.

any thoughts on the LICENSE files mentioned in the OP?

Member

I wouldn't restore the full section, but rather only keep a short explanation that it was deprecated/removed, and how to replace it (see the docstrings for some content for that)
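
To make the "how to replace it" point in this thread concrete, here is a minimal sketch of round-tripping pandas objects through bytes without msgpack. It is not taken from this PR or from the pandas docstrings. It assumes the standard-library `pickle` module is acceptable for the use case, which only holds for trusted data, and the variable names are illustrative. The docs touched by this diff point to pyarrow, parquet, or HDF5 instead, which may suit on-the-wire or long-term storage better.

```python
import pickle

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(5, 2), columns=list("AB"))
s = pd.Series(np.random.rand(5), index=pd.date_range("20130101", periods=5))

# Formerly df.to_msgpack() / pd.read_msgpack(...): serialize one object to bytes.
# Assumption: pickle is only safe when the bytes come from a trusted source.
payload = pickle.dumps(df, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(payload)

# Formerly pd.to_msgpack(path, df, 'foo', s): bundle several objects,
# pandas and plain Python alike, into one payload.
bundle = pickle.dumps([df, "foo", s], protocol=pickle.HIGHEST_PROTOCOL)
df2, label, s2 = pickle.loads(bundle)
```

For file-based storage, `DataFrame.to_parquet` and `pd.read_parquet` cover the single-DataFrame case, provided an engine such as pyarrow or fastparquet is installed.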


pandas supports the ``msgpack`` format for
object serialization. This is a lightweight portable binary format, similar
to binary JSON, that is highly space efficient, and provides good performance
both on the writing (serialization), and reading (deserialization).

.. warning::

The msgpack format is deprecated as of 0.25 and will be removed in a future version.
It is recommended to use pyarrow for on-the-wire transmission of pandas objects.

.. warning::

:func:`read_msgpack` is only guaranteed backwards compatible back to pandas version 0.20.3

.. ipython:: python
:okwarning:

df = pd.DataFrame(np.random.rand(5, 2), columns=list('AB'))
df.to_msgpack('foo.msg')
pd.read_msgpack('foo.msg')
s = pd.Series(np.random.rand(5), index=pd.date_range('20130101', periods=5))

You can pass a list of objects and you will receive them back on deserialization.

.. ipython:: python
:okwarning:

pd.to_msgpack('foo.msg', df, 'foo', np.array([1, 2, 3]), s)
pd.read_msgpack('foo.msg')

You can pass ``iterator=True`` to iterate over the unpacked results:

.. ipython:: python
:okwarning:

for o in pd.read_msgpack('foo.msg', iterator=True):
print(o)

You can pass ``append=True`` to the writer to append to an existing pack:

.. ipython:: python
:okwarning:

df.to_msgpack('foo.msg', append=True)
pd.read_msgpack('foo.msg')

Unlike other io methods, ``to_msgpack`` is available on both a per-object basis,
``df.to_msgpack()`` and using the top-level ``pd.to_msgpack(...)`` where you
can pack arbitrary collections of Python lists, dicts, scalars, while intermixing
pandas objects.

.. ipython:: python
:okwarning:

pd.to_msgpack('foo2.msg', {'dict': [{'df': df}, {'string': 'foo'},
{'scalar': 1.}, {'s': s}]})
pd.read_msgpack('foo2.msg')

.. ipython:: python
:suppress:
:okexcept:

os.remove('foo.msg')
os.remove('foo2.msg')

Read/write API
''''''''''''''

Msgpacks can also be read from and written to strings.

.. ipython:: python
:okwarning:

df.to_msgpack()

Furthermore you can concatenate the strings to produce a list of the original objects.

.. ipython:: python
:okwarning:

pd.read_msgpack(df.to_msgpack() + s.to_msgpack())

.. _io.hdf5:

HDF5 (PyTables)
8 changes: 3 additions & 5 deletions doc/source/whatsnew/v0.13.0.rst
@@ -822,14 +822,13 @@ Experimental
For more details see the :ref:`the docs<indexing.query>`.

- ``pd.read_msgpack()`` and ``pd.to_msgpack()`` are now a supported method of serialization
- of arbitrary pandas (and python objects) in a lightweight portable binary format. See :ref:`the docs<io.msgpack>`
+ of arbitrary pandas (and python objects) in a lightweight portable binary format.

.. warning::

Since this is an EXPERIMENTAL LIBRARY, the storage format may not be stable until a future release.

- .. ipython:: python
- :okwarning:
+ .. code-block:: python

df = pd.DataFrame(np.random.rand(5, 2), columns=list('AB'))
df.to_msgpack('foo.msg')
@@ -841,8 +840,7 @@ Experimental

You can pass ``iterator=True`` to iterator over the unpacked results

- .. ipython:: python
- :okwarning:
+ .. code-block:: python

for o in pd.read_msgpack('foo.msg', iterator=True):
print(o)
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.0.0.rst
@@ -624,6 +624,7 @@ or ``matplotlib.Axes.plot``. See :ref:`plotting.formatters` for more.
- Removed previously deprecated ``errors`` argument in :meth:`Timestamp.tz_localize`, :meth:`DatetimeIndex.tz_localize`, and :meth:`Series.tz_localize` (:issue:`22644`)
- Changed the default value for ``ordered`` in :class:`CategoricalDtype` from ``None`` to ``False`` (:issue:`26336`)
- :meth:`Series.set_axis` and :meth:`DataFrame.set_axis` now require "labels" as the first argument and "axis" as an optional named parameter (:issue:`30089`)
- Removed the previously deprecated :func:`to_msgpack`, :func:`read_msgpack`, :meth:`DataFrame.to_msgpack`, :meth:`Series.to_msgpack` (:issue:`27103`)
-

.. _whatsnew_1000.performance:
3 changes: 0 additions & 3 deletions pandas/__init__.py
@@ -148,9 +148,6 @@
ExcelFile,
ExcelWriter,
read_excel,
# packers
read_msgpack,
to_msgpack,
# parsers
read_csv,
read_fwf,
103 changes: 0 additions & 103 deletions pandas/_libs/src/msgpack/pack.h

This file was deleted.
