Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add scripts for facilitating generating release notes #3973

Merged
merged 6 commits into from
Oct 15, 2021

Conversation

fmassa
Copy link
Member

@fmassa fmassa commented Jun 5, 2021

Mostly copy-pasted from https://github.com/pytorch/pytorch/blob/master/scripts/release_notes/common.py with minor modifications to make it suited to torchvision

This implements the approach discussed in #3351, which gathers the GitHub labels for all the PRs between two commits, and outputs it in a json file, and was used to generate the release notes.

There is still some extra clean-up that needs to be done, including adding some functionality from some jupyter notebooks that we used to trim a bit more the data, but I'm sending this here as it can already be helpful.

cc @seemethere

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @fmassa , LGTM.

Perhaps this could benefit from a few instructions? Like what should the commit hash be, etc.

Regarding the notebook, maybe we can just push it as a Github Gist and link to it from this file. It's likely that the notebook code will change a bit with every release, so I would avoid adding it in the repo.

import os
import json

categories = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed that some categories/topics we use on TorchVision are missing. For example code quality, tests, ci etc. Shall we add them?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more like we should remove categories and topics as they're not used at all in this script ^^

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should maintain them because they allow us to keep track of Better Engineering efforts. We could update the script to remap them under an umbrella topic related to improving infra.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could update the script to remap them under an umbrella topic related to improving infra.

What do you mean by that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI those categories were taken as is from PyTorch, and we haven't used them yet in our scripts. I can clean up this script further (it's mostly a dump of what I used for generating the release notes, not really cleaned up yet)

@NicolasHug NicolasHug marked this pull request as ready for review October 14, 2021 17:14
@NicolasHug
Copy link
Member

NicolasHug commented Oct 14, 2021

@fmassa @datumbox I took the liberty to push to this PR,

I removed some unused variables as agreed above, and added minimal (but I think sufficient) instructions on how to use the script.

As you can see I'm linking to https://nbviewer.org/gist/NicolasHug/248a72aa086fd874936f16eb56240226 which is the notebook I've been using. I'd rather not include the notebook in the repo, because its content will change often, and notebook + github = headache (diffs are unreadable, changes are thousands of lines long, etc.)

@datumbox
Copy link
Contributor

@NicolasHug Your modifications look fine to me. I agree that adding the notebook is messy. How do you feel about adding just the python code of the notebook?

@NicolasHug
Copy link
Member

I feel like we would need to copy/paste the code back into a notebook for it to be useful, but maybe there's a usage that I'm missing?

@datumbox
Copy link
Contributor

datumbox commented Oct 14, 2021

You are right, though you could just print stuff on output. The only reason I advocate adding them on the repo is because it will maintain history. A link on a notebook or a gist isn't as reliable. Not a blocker for me either way.

@@ -0,0 +1,121 @@
# In[1]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the script with these @datumbox , some IDEs like vscode and maybe pycharm understand these as a notebook so the .py file can be used directly as a notebook withing the IDE

@oke-aditya
Copy link
Contributor

Hey, The notebooks won't create a big diff I think, we ignore those using linguist documentation in .gitattributes.

https://github.com/pytorch/vision/blob/main/.gitattributes

@oke-aditya
Copy link
Contributor

E.g PR #3594 This did not flood with diffs from notebook

@NicolasHug
Copy link
Member

You're right @oke-aditya , thanks for noting this. But the fact that we completely hide the notebooks diffs also means that they're impossible to review :p

It's not just because of the diff that I'm reluctant to include the notebook though: notebooks are usually much larger files than pure text .py files as they contain meta-data + the output of the cells, which can be fairly big (in this case we output long dataframes), increasing the repo's size for little benefit.

@NicolasHug NicolasHug merged commit 3d7244b into pytorch:main Oct 15, 2021
@github-actions
Copy link

Hey @NicolasHug!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

@fmassa fmassa deleted the scripts-release-notes branch October 15, 2021 10:33
mszhanyi pushed a commit to mszhanyi/vision that referenced this pull request Oct 19, 2021
* Add scripts for facilitating generating release notes

* remove unused lists, and added instructions, applied balck

* renamed file

* Added pandas script directly

Co-authored-by: Nicolas Hug <nicolashug@fb.com>
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
facebook-github-bot pushed a commit that referenced this pull request Oct 19, 2021
Summary:
* Add scripts for facilitating generating release notes

* remove unused lists, and added instructions, applied balck

* renamed file

* Added pandas script directly

Reviewed By: NicolasHug

Differential Revision: D31758306

fbshipit-source-id: 58675f491b811dab8a58bca9e4235551684145bd

Co-authored-by: Nicolas Hug <nicolashug@fb.com>
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
cyyever pushed a commit to cyyever/vision that referenced this pull request Nov 16, 2021
* Add scripts for facilitating generating release notes

* remove unused lists, and added instructions, applied balck

* renamed file

* Added pandas script directly

Co-authored-by: Nicolas Hug <nicolashug@fb.com>
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants