Let reuse look inside git submodules #118445

Merged
bors merged 10 commits into rust-lang:master from ferrocene:jp-support-reuse-in-submodules
Dec 12, 2023

jonathanpallant
Contributor

Changes `collect-license-metadata` and `generate-copyright` so they can now look at the git submodules.

Unfortunately `reuse` chokes on the LLVM submodule - it finds the word "Copyright" or the Unicode copyright symbol in all kinds of places, including UTF-8 test cases. The `reuse` tool expressly won't let you ignore folders, so we let it scan everything and then strip out the LLVM sub-folder in post-processing. Instead, we add in a hand-curated list of copyright information gleaned by reading the LLVM codebase carefully, which is stored in `.reuse/dep5` in Debian format where `reuse` can find and use it.

The `.reuse/dep5` file continues to track copyright info for files in the tree that do not have SPDX metadata in them (i.e. all of them).
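
As a rough illustration of the "scan everything, then strip the LLVM sub-folder" step described above, here is a minimal sketch in Rust. It is not the code from this PR: the function name `strip_subtree`, the constant `LLVM_PREFIX`, and the sample paths are all hypothetical, and the submodule location `src/llvm-project` is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Assumed location of the LLVM submodule, relative to the repository root.
/// (Hypothetical constant for this sketch; not taken from the PR.)
const LLVM_PREFIX: &str = "src/llvm-project";

/// Keeps only the paths that are NOT inside `prefix`.
/// `Path::starts_with` compares whole path components, so a sibling such as
/// `src/llvm-project-notes.md` is not mistaken for part of the subtree.
fn strip_subtree(paths: Vec<PathBuf>, prefix: &Path) -> Vec<PathBuf> {
    paths
        .into_iter()
        .filter(|p| !p.starts_with(prefix))
        .collect()
}

fn main() {
    // Hypothetical list of files reported by a full scan.
    let scanned = vec![
        PathBuf::from("compiler/rustc/src/main.rs"),
        PathBuf::from("src/llvm-project/llvm/LICENSE.TXT"),
        PathBuf::from("src/llvm-project-notes.md"),
    ];

    let kept = strip_subtree(scanned, Path::new(LLVM_PREFIX));
    for path in &kept {
        println!("{}", path.display());
    }
}
```

Filtering on path components rather than on a string prefix is a deliberate choice in this sketch: it avoids dropping files whose names merely begin with the same characters as the submodule directory.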

jonathanpallant and others added 9 commits November 21, 2023 12:49
Required because spdx-rs 0.5.3 added support for SPDX 2.3 documents and made these fields optional.
LLVM copyrights are now condensed to those reported in the .reuse/dep5 file.
A `reuse --include-submodules lint` now passes.
The LLVM project is a combination of some NCSA licensed files and some Apache-2.0 WITH LLVM-exception licensed files, so you should follow both licences when using the combined work.

Also clarified where I got the copyright years and git hash from.
@rustbot
Collaborator

rustbot commented Nov 29, 2023

r? @Mark-Simulacrum

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added A-meta Area: Issues & PRs about the rust-lang/rust repository itself A-testsuite Area: The testsuite used to check the correctness of rustc S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. labels Nov 29, 2023
@rustbot
Collaborator

rustbot commented Nov 29, 2023

rust-analyzer is developed in its own repository. If possible, consider making this change to rust-lang/rust-analyzer instead.

cc @rust-lang/rust-analyzer

Comment on lines +108 to +109
# any time LLVM is updated, please revisit this section. The copyrights are
# taken from the relevant LLVM sub-folders: llvm, lld, lldb, compiler-rt and libunwind.
Member

It seems likely that we're going to not do this - especially without a clearer story on what "revisit" means.

Can we either remove this ask or try to make it more clear-cut to comply with it? Maybe we should be pushing LLVM upstream to improve metadata tracking of some kind?

Contributor Author

@jonathanpallant jonathanpallant Dec 1, 2023

LLVM express their copyright through a combination of the CREDITS.txt file, and a collection of files called LICENSE.txt (example) dotted in various subfolders. There's no automated way to grok all that, so I had to do it manually.

The note was there as a reminder that, if someone bumps the LLVM subtree, they have to go and re-read the LICENSE.txt files and see if any of the copyrights have been updated or the LLVM license has changed. For example, at some point the NCSA license will be dropped and it will be Apache-with-exceptions only.
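
To make the "re-read the LICENSE.txt files" step a little more concrete, here is a minimal sketch that lists every such file under a directory so a reviewer knows what to re-read. It is only an illustration, not part of this PR: `find_license_files` is a hypothetical helper, and matching the file name case-insensitively is an assumption made so that both `LICENSE.txt` and `LICENSE.TXT` are found. Reading and interpreting the files remains a manual job.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects every file named `LICENSE.txt` (compared
/// case-insensitively) below `dir`. Symlinked directories are not followed,
/// because `DirEntry::file_type` does not traverse symlinks.
fn find_license_files(dir: &Path, found: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        if entry.file_type()?.is_dir() {
            find_license_files(&path, found)?;
        } else if path
            .file_name()
            .and_then(|name| name.to_str())
            .map_or(false, |name| name.eq_ignore_ascii_case("LICENSE.txt"))
        {
            found.push(path);
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Directory to scan; defaults to the current directory.
    let root = std::env::args().nth(1).unwrap_or_else(|| ".".to_string());

    let mut found = Vec::new();
    find_license_files(Path::new(&root), &mut found)?;
    found.sort();

    for path in &found {
        println!("{}", path.display());
    }
    Ok(())
}
```

Pointing it at a checkout of the LLVM subtree would print the licence files to review by hand.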

Member

I see. It seems likely that this will not happen on LLVM tree bumps (those are done far too often for someone to do this analysis). That's probably OK; it just means we'll want to eventually schedule some audit of this file (yearly or quarterly perhaps) that goes through and updates this sort of thing.

@Mark-Simulacrum
Member

r=me with the names added in the rustdoc file case

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 9, 2023
@jonathanpallant
Contributor Author

@bors r=Mark-Simulacrum

@bors
Contributor

bors commented Dec 11, 2023

@jonathanpallant: 🔑 Insufficient privileges: Not in reviewers

@pietroalbini
Member

@bors r=Mark-Simulacrum

@bors
Contributor

bors commented Dec 11, 2023

📌 Commit eba02ab has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Dec 11, 2023
compiler-errors added a commit to compiler-errors/rust that referenced this pull request Dec 12, 2023
…odules, r=Mark-Simulacrum

Let `reuse` look inside git submodules

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 12, 2023
…mpiler-errors

Rollup of 7 pull requests

Successful merges:

 - rust-lang#118445 (Let `reuse` look inside git submodules)
 - rust-lang#118534 (codegen: panic when trying to compute size/align of extern type)
 - rust-lang#118756 (use bold magenta instead of bold white for highlighting)
 - rust-lang#118797 (End locals' live range before suspending coroutine)
 - rust-lang#118840 (remove some redundant clones)
 - rust-lang#118844 (Monomorphize args while building Instance body in StableMIR)
 - rust-lang#118848 (Add myself back to review rotation)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 12, 2023
…iaskrgr

Rollup of 7 pull requests

Successful merges:

 - rust-lang#118445 (Let `reuse` look inside git submodules)
 - rust-lang#118756 (use bold magenta instead of bold white for highlighting)
 - rust-lang#118797 (End locals' live range before suspending coroutine)
 - rust-lang#118840 (remove some redundant clones)
 - rust-lang#118844 (Monomorphize args while building Instance body in StableMIR)
 - rust-lang#118846 (Fix BinOp `ty()` assertion and `fn_sig()` for closures)
 - rust-lang#118848 (Add myself back to review rotation)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 1ee8327 into rust-lang:master Dec 12, 2023
11 checks passed
@rustbot rustbot added this to the 1.76.0 milestone Dec 12, 2023
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Dec 12, 2023
Rollup merge of rust-lang#118445 - ferrocene:jp-support-reuse-in-submodules, r=Mark-Simulacrum

Let `reuse` look inside git submodules

@pietroalbini pietroalbini deleted the jp-support-reuse-in-submodules branch February 21, 2024 15:31
Labels
A-meta Area: Issues & PRs about the rust-lang/rust repository itself A-testsuite Area: The testsuite used to check the correctness of rustc S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue.
6 participants