Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v2: list dependencies from either import path or go binary #71
v2: list dependencies from either import path or go binary #71
Changes from 27 commits
7f8ce88
e020f62
55b8538
a571fd5
3ea68bd
2eafb6e
682cc82
5bd2c4a
b85f547
acfe58a
6f64046
f885c2c
b9da23c
6973c71
517e811
3cd4f5f
0749a9a
cee56ce
4f9aff3
1f90f0e
fff0db5
89f12c3
8d1a5b8
1fb170d
7b29d8f
6371c33
2357868
2d930b0
801af76
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This causes cause problems if the current module context doesn't match the same version as the binary. I think we'll need to rely on
go list -m
(or some library equivalent if available) instead.To test this I used a modified cli02 binary using
golang.org/x/tools@v0.1.3
, with a go-licenses@v2 usinggolang.org/x/tools@v0.1.4
(diff)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, I didn't know we could use
go list -json -m <module@version>
to also download the module and get metadata at the same time. And I just tried, it's not restricted by the current module dir. I can run the command anywhere to get the modules.I think we are now very close to lift the restriction of go module dir, if we just
go list -json -m <the-entire-list-of-modules>
to get all the modules.There's one remaining challenge to me, the main module to build a binary is usually not versioned. How can we know what it is if outside of the go working dir?
For clarification, it was expected behavior in my design. How do we know where the main module is and whether it is the correct version? If we are in a working dir and found go modules version has a mismatch, a more serious problem is that most likely the main module's version is incorrect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can pull this out from
go version
, as long as the binary was built from a released module version. e.g.(
+b7a85e0003
corresponds to the go tool version, not the module version, so we can't infer the source commit from there)I don't know how we can infer version information from
devel
sources. 😞 The answer might just be we can't for now and just throw an error for the time being telling people they need togo install @version
if they want it to work.Assuming main module == binary module here, I think this is expected. The binary module is going to be different from the local working directory module, unless the local working directory module is the module that produces the binary (but if that were the case, my expectation is you would run go-licenses on the source package itself rather than the binary). That's why I don't think we can join the modules here, not only because of version mismatches, but also different replacements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My use-case is exactly running go-licenses in the same working directory that produces the binary, that's a prerequisite for running the tool.
The reason I chose to run the tool on the binary instead of source code is because some go module dependencies (test / tool dependencies) are trimmed when compiling to the binary, so we can check license only for the modules included in the binary.
When you said running go-licenses on the source package itself, I assume you are talking about using packages.Load like the same logic in go-licenses v1. I think both solutions work, if you have strong opinions, I can try to use packages.Load instead.
I personally feel that it's better to rely on the modules metainfo from built binary, because it's better for single source of truth -- the go compiler checks for all dependencies, compile to a binary and record the modules metainfo in the binary.
On the contrary, if we use packages.Load to get all the modules from a package, the go compiler will collect all dependencies once, and packages.Load will get all the modules for the second time. How do we guarantee the two are giving the same results? I think there are potential risks for:
If we read the modules meta info directly from the binary, these risks are removed, because only the go compiler decides what modules are pulled in.
I believe we are making a trade off here.
Option 1 - the tool only runs in the same workdir used to build the binary.
Option 2 - the tool only works for binaries built by
go install module@version
.Considering the major user scenario should be module authors adding license info to binaries built by themselves, I think option 1 is better.
After some experimentation using the new tool in github.com/kubeflow/pipelines, my typical licensing management workflow is like the following:
go-licenses csv
and verify the license csv is up-to-date. When there's an error, the PR author should check licenses for updated dependencies and update the licenses csv.go-licenses save
the notices to the container image containing the binaryIn all the scenarios, the person/CICD tool running go-licenses should be using it in exactly the same workdir as where the binary is built.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the additional context! I was under the assumption you wanted to include additional binary tools as part of the license output, or to generate a license summary from a standalone unrelated binary. IIUC, you want to use the binary metadata as a shortcut to generate a minimal license summary that only includes dependencies that would be included in binaries that you are distributing.
Test dependencies are already excluded unless
Test: true
is included in packages.Config (which is currently not enabled for go-licenses). Related: #62Tool dependencies (assuming this is following https://github.com/golang/go/wiki/Modules#how-can-i-track-tool-dependencies-for-a-module) would be treated like any other dependency and only be present if they are included transitively.
I think this would cover most cases to generate a minimal license summary, but if you have any examples in your project that don't fit well let me know! Would happy to dive deeper here.
I'm not particularly worried about go compiler vs packages.Load version skew. Go has a pretty good track record when it comes to backwards compatibility. packages.Load is used by other tools in golang.org/x/tools, so I would imagine/hope any breaking changes / incompatibilities would be noticed prior to major Go releases.
What I am more worried about is version skew between the code and the binary, since the binary could have been built from a different version than what is currently present in the repo / go.mod. I'd rather lean on the code as the source of truth as much as possible, since that's what
go.mod
is referring to.Although go-licenses does not directly accept build flags on its own CLI, we do support build flags via GOFLAGS.
If we wanted, we could pass through flags via
packages.Config
.Agreed - I also prefer Option 1, but would prefer to use the source code as the source of truth rather than the binary.
FWIW - projects I'm involved in follows an identical pattern to what you're describing in your release workflow using the existing tool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @wlynch, I think the new update will be what you want.
As we discussed initially, we can support both getting dependencies from the binary or from code + import path.
I've included another implementation to get all the dependencies and their modules using packages.Visit.
Glad to know, that gives me more confidence of the UX.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-P0 idea:
By the way, after thinking a bit more about the use-case for analyzing completely unknown go binaries.
I think it can be a good additional feature.
As discussed in #71 (comment), the current limitation is that we do not know the exact version of the main module. However, the limitations seem to be a good trade off.
e.g. consider this UX:
I use go-licenses to analyze random binaries taken from containers, go-licenses can get exact version of all the dependencies and get HEAD version of the main repo. The result is slightly inaccurate for the main repo, so we show a warning message that we do not know version of the main repo. In this case, we can just check HEAD version.
Despite the limitation, the overall result is good-enough, I have a very good idea of which licenses are used in the binary and only in rare cases will there be a mismatch. From what I heard, when a project changes license, most of the cases, it's switching from a permissive license to a more restrictive license, so analyzing HEAD version of the main repo is unlikely to give false negatives. False positives are easy to deal with, because people can just check them manually.