Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: go mod download should follow symlinks #39417

Closed
jba opened this issue Jun 5, 2020 · 5 comments
Closed

cmd/go: go mod download should follow symlinks #39417

jba opened this issue Jun 5, 2020 · 5 comments

Comments

@jba
Copy link
Contributor

jba commented Jun 5, 2020

When constructing a module zip, follow symlinks and include the linked files.

One application: nested modules can symlink to the LICENSE file at repo root instead of copying it, to avoid skew when the root file is modified.

Example: https://github.com/apache/arrow, which isn't exclusively Go code, has a module at go/arrow. The license at repo root is intended to cover the module, but since it isn't in the module tree, it's not in the zip, so the module appears to have no license. The maintainer would like to symlink the license at repo root (apache/arrow#7351).

@jayconrod
Copy link
Contributor

Unfortunately, we can't make any changes to the set of files that get packed into a zip file for a given version. That would change the hashes of the resulting zip files. Even if we gated this on the go version in go.mod, older versions of the go command would disagree with newer versions.

For reference, before #27093, the go command followed symbolic links to files and directories. They were stored and extracted as regular files and directories (symbolic links can't be reliably represented in the zip format, and they can't be extracted on all platforms). This caused problems for tests, so it was decided to ignore them. The fix for that changed hashes between 1.11.3 and 1.11.4, which haunted us for a long time.

About LICENSE: the go command will pull in a file named LICENSE from the repository root directory when building the zip for any module in a subdirectory. Unfortunately for github.com/apache/arrow, their file is named LICENSE.txt, so it doesn't get pulled in. They could either rename the file to LICENSE or copy LICENSE.txt to the go/arrow directory.

@josh-newman
Copy link

josh-newman commented Jun 6, 2020

@jayconrod, would it be possible for go to recognize and package LICENSE.txt (or LICENSE*) from the repository root, too? In the case of #39318, the Arrow maintainers were quick to help make pkg.go.dev work, but more flexibility would help other projects.

As a very rough metric, I counted ~3000 Github repositories containing Go code that use LICENSE.txt (vs ~80000 with LICENSE) using Bigquery. Or ~2000 (vs ~61000) with Go listed as their first language, in case that means most lines of code (also Bigquery).

@jayconrod
Copy link
Contributor

I won't say it's impossible, but it's very difficult to make any change to the set of files that get included in a module .zip file.

If different versions of Go choose different sets of files, they won't generate the same hashes, so when a module is downloaded by an old version of the go command it will look like someone has tampered with the module after it was released. That could be somewhat mitigated by gating the change on a new go version in go.mod. We'd have to choose a version far enough in the future (at least a year, probably more) that by the time it was current, versions of Go that don't support this are no longer supported.

It's also hard to decide what files to actually include: LICENSE, LICENSE.txt, sure. How about COPYING.html? copyright-license.pdf? pkg.go.dev has a list of names they use, and we'd probably use the same list. Not sure what they do if there are multiple matching files.

@jba
Copy link
Contributor Author

jba commented Jun 8, 2020

Not sure what they do if there are multiple matching files.

We remember all of them. In this case, the proxy would copy all of them.

@jayconrod
Copy link
Contributor

Just to close the loop on this: @bcmills, @matloob, and I discussed this last week, and we decided not to change anything here. Some mitigations are possible, but fundamentally, this would cause different versions of the go command to produce different sums for the same module, which would cause security errors. The cost of that outweighs the benefit.

iredelmeier added a commit to iredelmeier/lilliput that referenced this issue Aug 14, 2020
Go modules don't handle symlinks well due to
golang/go#39417.

This hacky fix works around the issue by replacing symlinks with regular
files.
iredelmeier added a commit to discord/lilliput that referenced this issue Aug 15, 2020
Replace dep symlinks with copies of the target

Go modules don't handle symlinks well due to
golang/go#39417.

This hacky fix works around the issue by replacing symlinks with regular
files.
@golang golang locked and limited conversation to collaborators Jun 16, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants