Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PDEP-1 first revision (scope) #51417

Merged
merged 20 commits into from
Apr 14, 2023
Merged

Conversation

noatamir
Copy link
Member

@noatamir noatamir commented Feb 15, 2023

This proposal revises the PDEP definition and guidelines. Feedback is welcome.

Context and summary of the changes:

  • Clarify the scope and purpose of PDEPs to make it clearer when a PDEP would be needed and when not (this is a discussion we basically didn't really have in the initial PR (PDEP-1: Purpose and guidelines for pandas enhancement proposals #47444), and this wasn't very clear for several people involved): this is done by expanding the "PDEP definition, purpose and scope" section, and by adding a list of examples at the bottom of the page.
  • Updated the "PDEP authors" section to expand the explanation for non-core-team members and include they need a sponsor. IIRC the reason for this is two-fold: to have some form of quality assurance (to not be "spammed" by many drive-by PDEPs that might not need to be a PDEP) and to help the non-core-team submitter (the sponsor would be a core-team member who takes on the responsibility of guiding the PDEP author through the process).

Co-authored-by: @MarcoGorelli, @rhshadrach, @Dr-Irv, @jorisvandenbossche, @mroeschke

Copy link
Contributor

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly comments related to formatting.

web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
@noatamir
Copy link
Member Author

Thanks @Dr-Irv! I'm fixing it up. Should pandas-stubs also be pandas-stubs?

@Dr-Irv
Copy link
Contributor

Dr-Irv commented Feb 15, 2023

Thanks @Dr-Irv! I'm fixing it up. Should pandas-stubs also be pandas-stubs?

Yes. And I may have missed a few other places where backquotes are needed.

@noatamir
Copy link
Member Author

Thanks @Dr-Irv! I'm fixing it up. Should pandas-stubs also be pandas-stubs?

Yes. And I may have missed a few other places where backquotes are needed.

Yeah, I'm adding a bunch more following your examples =)

@Dr-Irv
Copy link
Contributor

Dr-Irv commented Feb 15, 2023

Yeah, I'm adding a bunch more following your examples =)

Let me know when you're done with your edits, and I'll take another look. Thanks!

@lithomas1
Copy link
Member

I haven't been to a governance meeting in a while(had a time conflict), but is there a procedure for revising a PDEP/deferring parts of a PDEP?

I am asking since I am going to open PDEP-7 soon, and there are some parts where I would like to survey users, but not have that block acceptance of a PDEP.

It would also be nice to have since this PR itself revises a PDEP :).

@lithomas1
Copy link
Member

I haven't been to a governance meeting in a while(had a time conflict), but is there a procedure for revising a PDEP/deferring parts of a PDEP?

I am asking since I am going to open PDEP-7 soon, and there are some parts where I would like to survey users, but not have that block acceptance of a PDEP.

It would also be nice to have since this PR itself revises a PDEP :).

Looking through the existing text, it looks like there's some process for this, but it would be nice to clarify whether changes would need to be accepted again and who can propose revisions.

Copy link
Member

@lithomas1 lithomas1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for writing this up. This definitely clears up a lot of ambiguity/confusion that I had about PDEPs.

web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
@MarcoGorelli
Copy link
Member

Nice, thanks for writing this up! 🙌

@noatamir
Copy link
Member Author

Looking through the existing text, it looks like there's some process for this, but it would be nice to clarify whether changes would need to be accepted again and who can propose revisions.

I believe that as long as you mean changes during the discussion phase, you don't need a process for amending your PDEP. Only once it has been accepted you'd need to follow a revision process.

We are working on a definition for the decision making process (e.g. voting and acceptance) at the moment.

@jorisvandenbossche jorisvandenbossche requested a review from a team February 16, 2023 20:11
@jorisvandenbossche jorisvandenbossche changed the title PDEP-1 first revision PDEP-1 first revision (scope) Feb 16, 2023
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for opening the PR!

web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
@jorisvandenbossche
Copy link
Member

Looking through the existing text, it looks like there's some process for this, but it would be nice to clarify whether changes would need to be accepted again and who can propose revisions.

I believe that as long as you mean changes during the discussion phase, you don't need a process for amending your PDEP. Only once it has been accepted you'd need to follow a revision process.

There is indeed some text around revisions currently. But it's maybe something we should discuss a bit more to what extent small revisions (after acceptance) would need to follow the full normal decision process.
For example, Marco has made some small revisions to PDEP4 while implementing it and from initial feedback with the implementation (https://pandas.pydata.org/pdeps/0004-consistent-to-datetime-parsing.html, see at the bottom there is a "revision history"). It might again become subjective call on whether something is a "significant" change or not.

@datapythonista
Copy link
Member

Preview docs at: https://pandas.pydata.org/preview/51417/

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, added some minor comments.

web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
@mroeschke mroeschke added the PDEP pandas enhancement proposal label Feb 16, 2023
@simonjayhawkins
Copy link
Member

This proposal revises the PDEP definition and guidelines.

Why?

PDEPs are predominantly to engage the community in significant changes to pandas? So this should also apply to PDEP revisions too? It's not clear to me what the motivation is for revising the PDEP.

@MarcoGorelli
Copy link
Member

It's just to clarify a few things

PDEPs are predominantly to engage the community in significant changes to pandas?

That, and to have a process with which to make significant changes

@simonjayhawkins
Copy link
Member

It's just to clarify a few things

Thanks @MarcoGorelli

I think the bar for revising a PDEP should be as high as the initial PDEP itself and therefore the motivation for the changes should be transparent (to the community).

maybe the revision status should indicate what needed clarifying. And also the change to requiring core-team sponsors explicitly mentioned?

@MarcoGorelli
Copy link
Member

MarcoGorelli commented Feb 20, 2023

Sure, good point

For context, the revision comes out of the biweekly governance meeting, quite some discussion went into it

And yes a brief summary of the revision would be good (e.g. in pdep4 there's a sentence to say why it was revised, even though that particular revision was really minor)

@simonjayhawkins
Copy link
Member

For context, the revision comes out of the biweekly governance meeting, quite some discussion went into it

Yes. I do read the minutes. But we are now transferring that outcome to the wider community.

We introduced the PDEP as a formal process for pandas changes. I don't want to see that process compromised by allowing or encouraging revisions that would not have the same scrutiny or oversight.

(to be clear. I'm not trying to block the changes in this PR per se.)

@jorisvandenbossche
Copy link
Member

jorisvandenbossche commented Feb 20, 2023

The top post could indeed have provided a brief summary and rationale of the proposed changes, to give some context to the diff. Now, I think apart from clarification of existing text (what Marco mentioned), the "sponsor" seems to be the only substantial change. Potential summary:

  • Clarify the scope and purpose of PDEPs to make it clearer when a PDEP would be needed and when not (this is a discussion we basically didn't really have in the initial PR (PDEP-1: Purpose and guidelines for pandas enhancement proposals #47444), and this wasn't very clear for several people involved): this is done by expanding the "PDEP definition, purpose and scope" section, and by adding a list of examples at the bottom of the page.
  • Updated the "PDEP authors" section to expand the explanation for non-core-team members and include they need a sponsor. IIRC the reason for this is two-fold: to have some form of quality assurance (to not be "spammed" by many drive-by PDEPs that might not need to be a PDEP) and to help the non-core-team submitter (the sponsor would be a core-team member who takes on the responsibility of guiding the PDEP author through the process).

The "sponsor" term is inspired on the PEP process (https://peps.python.org/pep-0001/#submitting-a-pep), but we should probably explain that in a bit more detail in the text here.

@noatamir if that sounds about right, I (or you) can update the top post to include this summary.

We introduced the PDEP as a formal process for pandas changes. I don't want to see that process compromised by allowing or encouraging revisions that would not have the same scrutiny or oversight.

The first PDEP-1 was done as a PR, this revision is using the exact same process: here is a PR with proposed changes that is open for discussion and needs to be approved by the core team.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, just added a suggestion for the headers.

web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
Copy link
Contributor

@topper-123 topper-123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small suggestion.

web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
@jbrockmendel
Copy link
Member

PDEPs are appropriate for user facing changes

In #52140 @Dr-Irv wrote "Given that you are proposing a change in behavior (which would require a deprecation cycle anyway), I believe that a PDEP is warranted."

The idea that any behavior change warrants a PDEP is absurd. This is not viable.

@MarcoGorelli
Copy link
Member

I understand your concerns Brock, and if the suggestion was that any behaviour change warranted a PDEP, then that would indeed be absurd. But if a change is fundamental, then it could benefit from the PDEP process.

In #52140 specifically, the issue starts with "These changes are somewhat fundamental", and so I'm inclined to agree that it should be a PDEP.

Regarding the changes being proposed in this PR - are you on board with them? If not, that's fine, but could you please point out which specific parts you'd like to see changed?

Looking forward, we're going to have to make big decisions around nullable dtype and pyarrow, which I know you care about. By improving governance and decision-making, the expectation is that we will all together be able to make pandas even better

Co-authored-by: Terji Petersen <contribute@tensortable.com>
@jorisvandenbossche
Copy link
Member

@jbrockmendel can you respond to Marco's answer? Does that answer your concern, or do you still have concrete feedback for this PR?

@jbrockmendel
Copy link
Member

The problematic sentence is not made worse by this PR, so no objection. I only commented here because irv pointed to the PR specifically as relevant.

Copy link
Member

@simonjayhawkins simonjayhawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more suggestions. (but my approval still stands)

web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
### Clear examples for potential PDEPs:

- Adding a new parameter to many existing methods, or deprecating one in many places. For example:
- The `numeric_only` deprecation affected many methods and could have been a PDEP.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this have a GH ref like most of the others?

- The `numeric_only` deprecation affected many methods and could have been a PDEP.
- Adding a new data type has impact on a variety of places that need to handle the data type.
Such wide-ranging impact would require a PDEP. For example:
- `Categorical` ([GH-7217][7217], [GH-8074][8074]), `StringDtype` ([GH-8640][8640]), `ArrowDtype`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same for ArrowDtype

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't directly find a clear issue for ArrowDtype, only various PRs implementing parts of it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no probs. not a blocker.

web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
web/pandas/pdeps/0001-purpose-and-guidelines.md Outdated Show resolved Hide resolved
- On the other hand, `DataFrame.assign()` might. While it is a single method without backwards
compatibility concerns, it is also a core feature and the discussion should be highly visible.
- Deprecating or removing a single method would not require a PDEP in most cases. For example:
- `DataFrame.xs` ([GH-6249][6249]) is an example of deprecations on core features that would be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should maybe be a "On the other hand" case with an additional example that supports the "Deprecating or removing a single method would not require a PDEP" case. (to be consistent with the "Adding new methods or parameters to an existing method typically will not require a PDEP" examples above.)

Copy link
Member

@jorisvandenbossche jorisvandenbossche Apr 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I would maybe change the example to DataFrame.append deprecation, which I think was a clear example of a controversial deprecation (at least for users) that already happened in the past (while I think for xs there is no much active discussion about it, the linked github issue's last activity is from a few years ago)

jorisvandenbossche and others added 2 commits April 11, 2023 18:39
Co-authored-by: Simon Hawkins <simonjayhawkins@gmail.com>
Co-authored-by: Simon Hawkins <simonjayhawkins@gmail.com>
Comment on lines 172 to 173
- Deprecating or removing a single method would not require a PDEP in most cases. For example:
- That said, `DataFrame.append` ([GH-35407][35407]) is an example of deprecations on core
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps also need to remove the For example: to make this scan better if not adding a case where a PDEP is not required.

@simonjayhawkins
Copy link
Member

Is there anything stopping this being merged now? (we don't have the voting procedure until the next revision of this document is accepted)

@jorisvandenbossche
Copy link
Member

No, after the edit for your last comment yesterday, I was just waiting for CI to pass ;) So going to merge now.

@jorisvandenbossche jorisvandenbossche merged commit 80001a0 into pandas-dev:main Apr 14, 2023
@simonjayhawkins
Copy link
Member

Thanks @noatamir and others.

@jorisvandenbossche
Copy link
Member

Thanks @noatamir for leading the effort, and thanks to everyone who contributed feedback both in the meetings as here on the PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PDEP pandas enhancement proposal
Projects
None yet
Development

Successfully merging this pull request may close these issues.