Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WARN: Add FutureWarning to DataFrame.to_html #44451

Closed

Conversation

attack68
Copy link
Contributor

@attack68 attack68 commented Nov 14, 2021

This follows #44411 and has the same purpose.

xref #41693

msg = (
"In future versions `DataFrame.to_html` is expected to utilise the base "
"implementation of `Styler.to_html` for formatting and rendering. "
"The arguments signature may therefore change. It is recommended instead "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we intend to keep DataFrame.to_html but only change the signature, then I think such a general deprecation warning is overly broad (many people doing just df.to_html() will also see the warning while they actually don't need to change anything?)

Would it be an alternative to specifically check for keywords we know will change in the signature, and only warn for those?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an example is #41648 on which kwargs might be removed or changed.

there are quite a few, and to be honest not sure at this point we could commit to knowing with certainty how they will change.

i agree with you that it might be overly broad, the warning does not show in jupyter notebook when rendering a df by default, only when the method to_html is specifically called

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something else i considered was just implementing a DataFrame.to_html2 method temporarily, and then issuing the warning and progressively merging through version cycles?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jorisvandenbossche I have gone through and targeted the kwargs that will change and and only warn when they are input as non-default values. Please review

@attack68
Copy link
Contributor Author

this is getting stale, any further comments / suggestions?

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i like that you are pretty selective about the warnings, this is definitely the way to go. At the same time its hard to say if this is actually working (as the tests suppress the warnings).

Also in the doc-string itself it would be good to give the conversion table (e.g. implied by the warnings).

@@ -16,6 +16,8 @@

import pandas.io.formats.format as fmt

pytestmark = pytest.mark.filterwarnings("ignore::FutureWarning")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this actually filtering out all warnings?

that's the thing, how do we know this is not warning when it shouldn't be?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes that filters the generic warnings but the tests added, test_future_warning and test_no_future_warning explicitly test for the case where the warnings are generated as expected and no warnings are generated for the regular df.to_html() call.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make the filter a bit more specific? (eg match the beginning of the message) Just to ensure we only ignore this warning, and not accidentally suppress warnings we should see.

@jreback jreback added IO HTML read_html, to_html, Styler.apply, Styler.applymap Deprecate Functionality to remove in pandas labels Dec 17, 2021
…r_to_latex

# Conflicts:
#	doc/source/whatsnew/v1.4.0.rst
@attack68 attack68 mentioned this pull request Dec 21, 2021
@jreback jreback added this to the 1.4 milestone Dec 22, 2021
@jreback
Copy link
Contributor

jreback commented Dec 22, 2021

@jorisvandenbossche ok here?

@jreback jreback added the Blocker for rc Blocking issue or pull request for release candidate label Dec 23, 2021
@jorisvandenbossche
Copy link
Member

@attack68 thanks for the updates and adding more specific warnings, and sorry for the slow response.

Warnings are shown in 1.4.0 in preparation for signature changes in 2.0.0

Typically, if we deprecate keywords in favor of new ones, the new one is already added so that users have a direct alternative and can use the new keyword to not get the deprecation warning.

Is there a reason we can't do that here?

@attack68
Copy link
Contributor Author

Is there a reason we can't do that here?

Yeh I think so. Its a special situation where the implementation we want to transition to already exists via another object (which the user can immediately transition to if preferred), but where the treatment of each argument is a bit different, e.g:

To avoid the warnings the user could (and maybe should follow the warning) and switch to using df.style.format().to_html() instead of df.to_html()

@jorisvandenbossche
Copy link
Member

jorisvandenbossche commented Dec 23, 2021

Its a special situation where the implementation we want to transition to already exists via another object (which the user can immediately transition to if preferred), but where the treatment of each argument is a bit different,

But so what's the plan in the future for those new keyword arguments? They will then be passed through to Styler.to_html which gets used under the hood in DataFrame.to_html?
Is that something we could already do now if those new keywords get specified?

@jorisvandenbossche
Copy link
Member

Something else: in general, I think we should also make a clear decision on the future of DataFrame.to_html now. Currently, the docs you added are quite vague indicating it might go away in the future. I don't think that's a very useful message to the user.

If we think we will keep DataFrame.to_html as a light wrapper around Styler.to_html, IMO we don't need to be that vague in the docs.

@attack68
Copy link
Contributor Author

Is that something we could already do now if those new keywords get specified?

The goal is to remove the necessity for DataFrameHTMLFormatter and DataFrameLaTexFormatter, and parse the inputs to DataFrame.to_html() and DataFrame.to_latex() into a format which can utilise the Styler implementation to return the result.

As far as I am currently aware all the development is in place to make the transition as of now. See #41648 as a complete example of this for to_latex and #43161 as an unfinished attempt for to_html.

These PRs were closed because the opinion was that a deprecation cycle was needed before making argument signature changes, which I do agree with.

As far as the 'vagueness' goes in the docs, which I agree is there, I am confident that this can be done and I am happy to implement it, but my uncertainty is based on the approval of the community..

@jorisvandenbossche
Copy link
Member

These PRs were closed because the opinion was that a deprecation cycle was needed before making argument signature changes, which I do agree with.

Yes, but doing the deprecation ànd already providing the new option can be done in the same release. If you specify one of the new keywords, we could make it use the new implementation. That should be both backwards compatible (all existing uses continue to use the old implementation, only when you change your code (eg to adjust for deprecation warnings), you might get the new implementation), and already make the future behaviour possible.

As far as the 'vagueness' goes in the docs, which I agree is there, I am confident that this can be done and I am happy to implement it, but my uncertainty is based on the approval of the community..

Yes, and so I know that's not a concrete comment for you to solve in the code, but I still think we (as pandas community) should first decide on this.
My personal opinion: I would leave DataFrame.to_html just as it is: 1) many use cases will not use one of the more advanced keywords, and for those there is not much reason to ask them to change their code, 2) we have a bunch of to_.. methods on the DataFrame, having to_html as well is good for both consistency and discoverability (and of course we still refer to style.to_html() if you want more advanced control).

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some inline comments as well

brevity's sake. See :func:`~pandas.core.frame.DataFrame.to_html` for the
full set of options.
DataFrame *and* Styler objects currently have a ``to_html`` method. We recommend
using the `Styler.to_html() <../reference/api/pandas.io.formats.style.Styler.to_html.rst>`__ method
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use the sphinx cross-reference syntax instead of the path-based link? (:meth:`Styler.to_html() <pandas.io.formats.style.Styler.to_html>` for the long format and controlling which text is changes, but I think in this case you can also use :meth:`.Styler.to_html` to get exactly the same)

(and the same for similar cases below)

@@ -275,6 +275,7 @@ column names and dtypes. That's because Dask hasn't actually read the data yet.
Rather than executing immediately, doing operations build up a **task graph**.

.. ipython:: python
:okwarning:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unrelated? (if so, can you remove it again)

@@ -333,6 +334,7 @@ known automatically. In this case, since we created the parquet files manually,
we need to supply the divisions manually.

.. ipython:: python
:okwarning:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

``thousands``.
- ``border``, likely removed due to deprecated HTML specification.
- ``col_space``, likely removed in favour of CSS solutions.
- ``justify``, likely removed in favour of CSS solutions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have a clear alternative for each of these cases. This can also just be an example how to achieve the same effect with styler

- ``border``, likely removed due to deprecated HTML specification.
- ``col_space``, likely removed in favour of CSS solutions.
- ``justify``, likely removed in favour of CSS solutions.
- ``render_links``, likely removed due to limited functionality.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this actually quite useful, if you have a DataFrame with urls and want to use it in a notebook setting to have clickable links.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was not included in Styler, but I have added a separate PR for it #45058. Called it hyperlinks instead of render_links.

if locals()[kwarg] is not None:
warning_flag = True
warning_msg += f"\n `{kwarg}`, which may be {msg}."
for kwarg, msg in warnings_default_false.items():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For those cases where the default is not None (but True or False), we will also need to raise a deprecation warning if the user specified the keyword explicitly but with the default (as that will also stop working if the keyword is removed in the future).
Typically, to solve this we change the default value in the signature to None and then if it is None, change it to True/False (depending on the actual default).

@@ -16,6 +16,8 @@

import pandas.io.formats.format as fmt

pytestmark = pytest.mark.filterwarnings("ignore::FutureWarning")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make the filter a bit more specific? (eg match the beginning of the message) Just to ensure we only ignore this warning, and not accidentally suppress warnings we should see.

@jorisvandenbossche
Copy link
Member

As far as the 'vagueness' goes in the docs, which I agree is there, I am confident that this can be done and I am happy to implement it, but my uncertainty is based on the approval of the community..

Yes, and so I know that's not a concrete comment for you to solve in the code, but I still think we (as pandas community) should first decide on this.

Is there an issue that already has some discussion on this topic?

@attack68
Copy link
Contributor Author

Yes, but doing the deprecation ànd already providing the new option can be done in the same release. If you specify one of the new keywords, we could make it use the new implementation. That should be both backwards compatible (all existing uses continue to use the old implementation, only when you change your code (eg to adjust for deprecation warnings), you might get the new implementation), and already make the future behaviour possible.

I gave this a go in #45090. That might be a better route..

…r_to_latex

# Conflicts:
#	doc/source/whatsnew/v1.4.0.rst
@lithomas1 lithomas1 removed the Blocker for rc Blocking issue or pull request for release candidate label Jan 2, 2022
@simonjayhawkins
Copy link
Member

moving to 1.5

@simonjayhawkins simonjayhawkins modified the milestones: 1.4, 1.5 Jan 6, 2022
@attack68
Copy link
Contributor Author

attack68 commented Jan 6, 2022

will close this for now in favour of #45090. Can revive later if needed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas IO HTML read_html, to_html, Styler.apply, Styler.applymap
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants