Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: default for pd.Interval of closed/inclusive changed on main from right to both #47365

Closed
3 tasks done
phofl opened this issue Jun 15, 2022 · 19 comments
Closed
3 tasks done
Labels
Blocker for rc Blocking issue or pull request for release candidate Bug Interval Interval data type
Milestone

Comments

@phofl
Copy link
Member

phofl commented Jun 15, 2022

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

print(pd.__version__)
i = pd.Interval(pd.Timestamp("2020-01-01"), pd.Timestamp("2020-01-02"))
print(repr(i))
print(i)

1.4.2
Interval('2020-01-01', '2020-01-02', closed='right')
(2020-01-01, 2020-01-02]

1.5.0.dev0+958.gf7be58a477
Interval('2020-01-01', '2020-01-02', inclusive='both')
[2020-01-01, 2020-01-02]

Issue Description

This isn't mentioned in the release notes. We should decide on how to proceed.

Expected Behavior

cc @simonjayhawkins

Installed Versions

Replace this line with the output of pd.show_versions()

@phofl phofl added Bug Needs Triage Issue that has not been reviewed by a pandas team member Interval Interval data type labels Jun 15, 2022
@phofl phofl added this to the 1.5 milestone Jun 15, 2022
@jreback
Copy link
Contributor

jreback commented Jun 15, 2022

on the PR changing to use inclusive this must have been accid changed
should change it back for the default o these cases

@phofl
Copy link
Member Author

phofl commented Jun 15, 2022

One additional thing: The class attributes were also renamend, e.g.

x = pd.IntervalDtype(subtype='int64')

x.closed

raises now. Are these considered public or private?

@jreback
Copy link
Contributor

jreback commented Jun 15, 2022

hmm let's add that back (and add a depreciation warning for it)

thanks!

@simonjayhawkins simonjayhawkins removed the Needs Triage Issue that has not been reviewed by a pandas team member label Jun 15, 2022
@simonjayhawkins
Copy link
Member

is it ok for the repr change in a minor release or should this be changed back too?

@jreback
Copy link
Contributor

jreback commented Jun 15, 2022

no this is for 1.5 only (as keyword change / deprecation)

@phofl
Copy link
Member Author

phofl commented Jun 15, 2022

We will change the attribute back too.

One other thing: The argument order changed, but I think we can live with that?

@simonjayhawkins
Copy link
Member

One other thing: The argument order changed, but I think we can live with that?

not sure, it's only the third argument, so maybe a few users will be using positional here.

looking through #46522, i'm wondering whether to revert for now and redo it in smaller parts.

@phofl
Copy link
Member Author

phofl commented Jun 15, 2022

I would like to give it a try on Friday, if this does not work out, reverting is probably the best option

@simonjayhawkins
Copy link
Member

One other thing: The argument order changed, but I think we can live with that?

not sure, it's only the third argument, so maybe a few users will be using positional here.

if the default is changed then this may not be an issue. The inclusive argument accepts a None value as well as the strings ‘both’, ‘neither’, ‘left’, ‘right’. These strings have the same meaning as the closed argument so effectively inclusive is just an alias for closed and should not be an issue.

ideally inclusive should have been lib.no_default (for the checking that both parameters are not passed at the same time) so that users don't explicitly pass None.

There is also

        if closed is None:
            inclusive = "both"

in _warning_interval that also maybe allows users to now pass None explicitly but this will probably be removed when changing the defaults.

@jorisvandenbossche
Copy link
Member

jorisvandenbossche commented Jun 22, 2022

Should we maybe reconsider the actual deprecation of closed to start with (instead of fixing some issues caused by the deprecation)?

I understand the strive for consistency in keyword arguments (and that certainly makes sense for things like highlight_between, between and between_time that those are consistent with each other), but specifically for the Interval-related methods, we already were consistently using the "closed" terminology. In addition:

  • "closed" is standard terminology for intervals (and more than "inclusive" I think, or at least the rename is not fixing some unclear naming)
  • It's not just a keyword argument rename, but also several attributes and methods on several classes that need to be renamed (eg set_closed should then also be deprecated/renamed?)
  • Reading a parquet or pickle file written with a previous version of pandas no longer works (this can certainly be fixed, but is additional overhead that the deprecation causes)
  • The other methods (highlight_between, between and between_time) are not related to the Interval data type, so I think consistency between those methods and interval-related methods is less important as consistency within both groups.

@simonjayhawkins
Copy link
Member

+1 for closed for Interval etc. (i.e. +1 to revert all of #46522)

@phofl
Copy link
Member Author

phofl commented Jun 22, 2022

I don't have any preference there. I just saw that the default changed, hence this pr.

There where some cleanup prs after the initial one we would also have to revert

@jreback
Copy link
Contributor

jreback commented Jun 22, 2022

i don't think we should revert

inclusive is a much better and more consistent keyword and it's already been used in a bunch of issues

so +1 in this change
and -1 on reverting the entire thing

@simonjayhawkins
Copy link
Member

it seems weird to me that the docs state

inclusive{‘both’, ‘neither’, ‘left’, ‘right’}, default ‘both’
Whether the interval is closed on the left-side, right-side, both or neither. See the Notes for more detailed explanation.

i.e. we explain inclusive in terms of closed. strange.

@jorisvandenbossche jorisvandenbossche added the Blocker for rc Blocking issue or pull request for release candidate label Jul 17, 2022
@jbrockmendel
Copy link
Member

I agree with Joris here.

@mroeschke
Copy link
Member

xref #40245 with @attack68's issue with standardizing the keyword argument

Personally, I think inclusive is a clearer name than closed (closed_side possibly even better ;)).

Generally I would be +0 to standardize to inclusive. + for consistency for arguments that take both, left, right, neither in this context and 0 for this just being more of a style change that induces churn.

@mroeschke mroeschke mentioned this issue Aug 15, 2022
@mroeschke
Copy link
Member

Taking stock, appears we have:

If we are aiming to release the 1.5rc next week (#45223 (comment)), it appears we should revert this deprecation unless there are any other considerations

@jreback
Copy link
Contributor

jreback commented Aug 15, 2022

ok to revert and discuss later

@mroeschke
Copy link
Member

This has been reverted in #48116, and the discussion to change closed to inclusive can be discussed in #40245 so closing in favor of that issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Blocker for rc Blocking issue or pull request for release candidate Bug Interval Interval data type
Projects
None yet
Development

No branches or pull requests

6 participants