Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix handling of NuGet transitive dependencies #10449

Merged
merged 1 commit into from
Aug 16, 2024

Conversation

brettfo
Copy link
Contributor

@brettfo brettfo commented Aug 15, 2024

### Contains a temporary commit to redirect the smoke test branch. This commit needs to be reverted prior to merging, but is here to show the updated smoke tests pass.

Related smoke test PR: dependabot/smoke-tests#221

Fixes handling of transitive dependencies in the NuGet updater

The handling of multiple dependencies, including transitive dependencies, wasn't done correctly for NuGet. An example best illustrates this:

  1. file_parser.rb returns 3 dependencies:
    • Dependency1, name=Newtonsoft.Json, Version=12, locations={FileA.csproj}
    • Dependency2, name=Newtonsoft.Json, Version=11, locations={} // this was a transitive dependency, no location given to conform to expectations in the common code, but in reality, this transitive dependency came from FileB.csproj
    • Dependency3, name=Newtonsoft.Json, Version=13, locations={FileC.csproj}
  2. These dependencies are then combined via the DependencySet type into one single dependency, note that the first dependency's version was retained:
    • Dependency, name=Newtonsoft.Json, Version=12, locations={FileA.csproj, FileC.csproj}
  3. Eventually file_updater.rb is called and at that point we attempt to do the following updates:
    • Update Newtonsoft.Json from 12 to 13 in FileA.csproj => success
    • Update Newtonsoft.Json from 12 to 13 in FileB.csproj is skipped because it's a transitive dependency; this is possibly ok
    • Update Newtonsoft.Json from 12 to 13 in FileC.csproj => failure, because FileC.csproj already has 13, nothing to do

Now consider the exact scenario above, but this is a security update job. The difference is that we do want to update Dependency2 in FileB.csproj, but the combined dependency from step 2 doesn't retain any information about this, so we can't do anything.

Without rewriting a lot of the core logic, we can't get around the use of DependencySet and how it combines dependencies based on the name, but not the version, so to allow the NuGet updater to work the following changes were made:

  1. Gather all dependencies by name, but before we return them up to the core, we set the "common" version to the minimum found. This will ensure that if by chance the first dependency/version found happened to be the latest, we don't accidentally think that everything is up-to-date; we want to report the worst case scenario so we have an opportunity to respond.
  2. Retain in the metadata if the dependency was originally a transitive dependency.
  3. When update_checker.rb is called, add a previous_requirement metadata item. This is because we can no longer do a 1:1 mapping between requirements and previous_requirements to match the versions, we can really only trust the requirements array, so it has to contain all relevant update data.
  4. Also in update_checker.rb we need to know if this was a security update job, because that's important later, so we add a vulnerable: true item to the dependency's metadata.

Now the dependency when passed to file_updater.rb looks like this:

Dependency, name=Newtonsoft.Json, Version=11, locations={FileA.csproj, FileC.csproj}, metadata: {vulnerable: true}

We see the vulnerable: true metadata and know that the dependency given to us isn't complete, so at this point we rehydrate our initial dependency discovery and use those locations, including all with is_transitive: true. The result is that we do the correct update on each of the three files: FileA.csproj, FileB.csproj, and FileC.csproj.

There is one more change here. If an update job was started specifically requesting a certain package be updated and that dependency only exists as a transitive dependency (and thus isn't reported upstream), but the update isn't a security update job, we'll enter the file updater where vulnerable is false, but the list of files to update is empty. In that case we also rehydrate the original dependency locations because we were explicitly told to do something, but didn't have anything to do.

One bit of fallout that might be considered "incorrect", but I think is fine pertains to the example above, where we're trying to update a package version 11 to 13 in one location, and version 12 to 13 in another. When the PR description is created, we'll only report that version 11 is being updated to 13; there will be no mention that in another location version 12 was updated to 13. I think this is ok given that the reported changes fully encompass what's happening.

@brettfo brettfo added the L: dotnet:nuget NuGet packages via nuget or dotnet label Aug 15, 2024
@brettfo brettfo requested review from a team as code owners August 15, 2024 18:49
checked_files.add("#{downstream_file}-#{dependency.name}#{dependency.version}")
# rubocop:disable Metrics/AbcSize
sig { returns(T::Array[DependencyDetails]) }
def expanded_dependency_details
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was pulled out into a separate method because it made it really easy to unit test.

Copy link
Contributor

@kbukum1 kbukum1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
L: dotnet:nuget NuGet packages via nuget or dotnet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants