-
Notifications
You must be signed in to change notification settings - Fork 446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
License scanning for VMR #17442
License scanning for VMR #17442
Conversation
There are assumptions in the test infrastructure that all tests are based around testing the SDK output. But that's not always the case. This refactors the test suite to have a new TestBase class which can be used for tests that don't target the SDK and SmokeTest class has been renamed to SdkTests for those tests which do test the SDK. Also factored out the logic for parsing exclusion files into the Utilities class.
Here's an example of the output files from the scan for the installer repo: logs.zip |
...d/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/assets/baselines/Licenses.arcade.json
Outdated
Show resolved
Hide resolved
src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/LicenseScanTests.cs
Outdated
Show resolved
Hide resolved
src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/LicenseScanTests.cs
Outdated
Show resolved
Hide resolved
src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/LicenseScanTests.cs
Outdated
Show resolved
Hide resolved
...ourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/assets/LicenseExclusions.txt
Show resolved
Hide resolved
...ourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/assets/LicenseExclusions.txt
Outdated
Show resolved
Hide resolved
...ourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/assets/LicenseExclusions.txt
Show resolved
Hide resolved
...ourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/assets/LicenseExclusions.txt
Show resolved
Hide resolved
...ourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/assets/LicenseExclusions.txt
Show resolved
Hide resolved
...ourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/assets/LicenseExclusions.txt
Outdated
Show resolved
Hide resolved
.../content/test/Microsoft.DotNet.SourceBuild.SmokeTests/assets/baselines/Licenses.runtime.json
Outdated
Show resolved
Hide resolved
Thanks for working on this Matt. Do you see this detection as always living in the VMR or do you think this is something that can/should be pushed to the product repos? This feels like a type of compliance that repo owners should be responsible for. I worry about the burden this add to the source-build folks to monitor and push for resolution on license violations from individual repos. This may be something we would continue to run on the VMR but can this be backported to arcade and run on the repo level in addition? The idea is then the exclusions/exceptions would be owned by the individual repos. |
src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/LicenseScanTests.cs
Show resolved
Hide resolved
src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/LicenseScanTests.cs
Outdated
Show resolved
Hide resolved
src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/ScancodeResults.cs
Outdated
Show resolved
Hide resolved
...ourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/SourceBuiltArtifactsTests.cs
Show resolved
Hide resolved
I couldn't agree more. This is amazing! Thanks Matt!
That's a great point. A few reasons that make me less excited about product repos owning this:
|
...icrosoft.DotNet.SourceBuild.SmokeTests/assets/baselines/Licenses.source-build-externals.json
Outdated
Show resolved
Hide resolved
I share your concern about ensuring correctness and yes it feels similar to prebuilts. It is just that this feels like a compliance item better owned by the repo owners with the appropriate checks/sign-offs in place. |
Please capture some guidance documentation for how devs can run the license scanning test for a particular repo. This will be helpful in the future when logging repo issues about licensing issues. |
I've added documentation in this commit: bfe0446 |
I'm going to be updating the test to have some additional validation for the exclusion file. One of my concerns is with the maintenance of that file and it getting filled with obsolete content. While it's not going to be the perfect answer, I want to have some validation to check for whether paths listed in there actually exist in the VMR. It can only do this up to a point, considering that those paths can contain wildcards. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
.../test/Microsoft.DotNet.SourceBuild.SmokeTests/assets/baselines/licenses/Licenses.arcade.json
Outdated
Show resolved
Hide resolved
src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/BaselineHelper.cs
Outdated
Show resolved
Hide resolved
@@ -60,6 +86,11 @@ src/fsharp/setup/resources/eula/*.rtf | |||
# False positive | |||
src/installer/src/core-sdk-tasks/BuildFPMToolPreReqs.cs|json | |||
src/installer/src/redist/targets/packaging/osx/clisdk/resources/en.lproj/welcome.html|cecill-c | |||
src/installer/THIRD-PARTY-NOTICES|proprietary-license |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reported as aboutcode-org/scancode-toolkit#3543
src/arcade/src/Microsoft.DotNet.Build.Tasks.Installers/src/BuildFPMToolPreReqs.cs|json | ||
src/arcade/src/Microsoft.DotNet.Build.Tasks.Installers/build/rpm_templates/copyright|cecill-c | ||
src/arcade/src/SignCheck/SignCheck/THIRD-PARTY-NOTICES.TXT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am trying out a quick fix to see if we can delete this file: dotnet/arcade#14090
/backport to main |
Started backporting to main: https://github.com/dotnet/installer/actions/runs/6497054013 |
@mthalman backporting to main failed, the patch most likely resulted in conflicts: $ git am --3way --ignore-whitespace --keep-non-patch changes.patch
Applying: Test suite refactoring
.git/rebase-apply/patch:358: trailing whitespace.
warning: 1 line adds whitespace errors.
Using index info to reconstruct a base tree...
M src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/BaselineHelper.cs
M src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/BasicScenarioTests.cs
M src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/DotNetFormatTests.cs
M src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/DotNetHelper.cs
M src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/DotNetWatchTests.cs
M src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/OmniSharpTests.cs
A src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/SourcelinkTests.cs
M src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/Utilities.cs
M src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/WebScenarioTests.cs
Falling back to patching base and 3-way merge...
Auto-merging src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/WebScenarioTests.cs
Auto-merging src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/Utilities.cs
CONFLICT (content): Merge conflict in src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/Utilities.cs
CONFLICT (modify/delete): src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/SourcelinkTests.cs deleted in HEAD and modified in Test suite refactoring. Version Test suite refactoring of src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/SourcelinkTests.cs left in tree.
Auto-merging src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/OmniSharpTests.cs
Auto-merging src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/DotNetWatchTests.cs
Auto-merging src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/DotNetHelper.cs
Auto-merging src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/DotNetFormatTests.cs
Auto-merging src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/BasicScenarioTests.cs
Auto-merging src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/BaselineHelper.cs
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Test suite refactoring
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
Error: The process '/usr/bin/git' failed with exit code 128 Please backport manually! |
@mthalman an error occurred while backporting to main, please check the run log for details! Error: git am failed, most likely due to a merge conflict. |
Updates the source-build smoke tests to include a test which scans the sub-repos of the VMR, searching for license references. The intent is to be aware of any non-open-source license being introduced. The tool used for scanning for licenses is https://github.com/nexB/scancode-toolkit.
Each sub-repo of the VMR is scanned separately because of the amount of time it takes. If the entire VMR were to be scanned with one invocation of the tool, it would take ~12 hrs. By splitting it into separate jobs it takes ~6hrs.
When scanning is run, the test provides a list of files for the scanner to ignore. These include binary file types. It also includes
.il
/.ildump
file types which are massive, causing the scanner to choke and don't include license references anyway.Once the scanner returns the results, a filtering process occurs. First, any detected license that is in the allowed list of licenses is filtered out. The test defines a list of such licenses that all represent open-source licenses. Next, a license exclusions file is applied to the filtering. This file contains a set of paths for which certain detected licenses are to be ignored. Such a path can be defined to ignore all detected licenses or specific ones. These exclusions are useful for ignoring false positives where the scanning tool has detected something in the file that makes it think it's a license reference when that's not actually the intent. Other cases that are excluded are when the license is meant as configuration or test data and not actually applying to the code. These exclusions further filter down the set of the detected licenses for each file. Everything that's left at this point is reported. It gets compared to a baseline file (which is defined for each sub-repo). If the filtered results differ from what's defined in the baseline, the test fails.
Each job that runs this test will store pipeline artifacts to be used for logging purposes. It consists of two files:
The test suite was also refactored as part of these changes. Those changes have been included as a separate commit. See the commit description for more details on that: 109c3d8.
Fixes dotnet/source-build#3070
/cc @omajid