Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🌱 Reduce number of variable sources #460

Closed

Conversation

m1kola
Copy link
Member

@m1kola m1kola commented Oct 13, 2023

Description

This PR reduces number of variable sources to one.

Closes #437

Reviewer Checklist

  • API Go Documentation
  • Tests: Unit Tests (and E2E Tests, if appropriate)
  • Comprehensive Commit Messages
  • Links to related GitHub Issue(s)

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 13, 2023
@codecov
Copy link

codecov bot commented Oct 13, 2023

Codecov Report

Attention: 21 lines in your changes are missing coverage. Please review.

Comparison is base (7156464) 83.75% compared to head (c9ee1dd) 83.60%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #460      +/-   ##
==========================================
- Coverage   83.75%   83.60%   -0.16%     
==========================================
  Files          23       20       -3     
  Lines         868      805      -63     
==========================================
- Hits          727      673      -54     
+ Misses         96       91       -5     
+ Partials       45       41       -4     
Flag Coverage Δ
e2e 63.35% <69.74%> (-1.74%) ⬇️
unit 78.10% <80.50%> (-0.35%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
cmd/manager/main.go 72.72% <100.00%> (ø)
internal/resolution/variablesources/bundle.go 100.00% <100.00%> (ø)
...al/resolution/variablesources/bundle_uniqueness.go 100.00% <100.00%> (ø)
...nal/resolution/variablesources/required_package.go 100.00% <100.00%> (+25.58%) ⬆️
...al/resolution/variablesources/installed_package.go 68.00% <64.00%> (-3.43%) ⬇️
.../resolution/variablesources/olm_variable_source.go 64.70% <64.70%> (ø)

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 17, 2023
@openshift-merge-robot openshift-merge-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Oct 17, 2023
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 20, 2023
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 21, 2023
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 23, 2023
@m1kola m1kola changed the title PoC: refactor variable sources Reduce number of variable sources Oct 23, 2023
@m1kola m1kola force-pushed the shrink_variable_sources branch 8 times, most recently from f55d83b to 5d43ee6 Compare October 27, 2023 09:58
@@ -1,63 +1,34 @@
package variablesources
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially had everything in this package in two files: olm.go and olm_test.go. While olm.go was is still manageable, navigating olm_test.go was a bit difficult. So I decided to split into multiple files.

It resulted in a more complicated diff (while the files themselves are straightforward).

@m1kola m1kola marked this pull request as ready for review October 27, 2023 10:14
@m1kola m1kola requested a review from a team as a code owner October 27, 2023 10:14
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 27, 2023
}, nil
result := make([]*olmvariables.InstalledPackageVariable, 0, len(bundleDeployments))
processed := sets.Set[string]{}
for _, bundleDeployment := range bundleDeployments {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I create a bundleDeployment directly and it happens to reference an image that's in a catalog, now OLM is going to start including it in resolution?

That seems incorrect. We should only be looking at installed bundles and successors for the Operator objects that exist.

Copy link
Member Author

@m1kola m1kola Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how it currently works in main and I'm trying not to introduce any behaviours in this PR (I know, the size of the PR makes it hard to figure out what happens, but I didn't manage to split it into smaller consumable PRs).

That seems incorrect. We should only be looking at installed bundles and successors for the Operator objects that exist.

It raises other questions such as - what is desired behaviour in case of conflicts (e.g. conflicting CRDs or dependencies)? How do we make OLM safe when conflicting with untracked BDs?

I think #443 is related to this. Should we leave the behaviour as in main and track it in #443 & address separately?

Edit: here is the relevant code in main.

@ncdc
Copy link
Member

ncdc commented Oct 30, 2023

I would prefer we remove the VariableSource concept entirely and instead create a DeppySolver on the fly when you need it (i.e. not thread safe/can't store as a field in a controller struct), and call something like AddVariables(...) 1 or more times when you want to give it inputs. I believe this will be easier to maintain, easier to test, easier to reason about, and it makes the set of inputs static.

I realize you have put a lot of time into this PR @m1kola, but I believe not merging this as-is and instead going in the direction proposed above will have significant long-term benefits.

Comment on lines 57 to 55
bundles, err := BundleVariables(allBundles, requiredPackages, installedPackages)
if err != nil {
return nil, err
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels off to me that we're using variables to compute more variables. For me at least, this structure makes it really hard to build intuition about what each variable is actually adding to the problem.

Without diving too deeply, it appears to me like we:

  • Figure out what bundles are required based on the Operator objects that exist (requiredPackages)
  • Figure out what bundles are required based on existing BDs and their successors (but this is for all BundleDeployments, not just BDs managed by Operators. (installedPackages)
  • For requiredPackage bundles and installedPackage bundles combined, build a set of Bundle variables that:
    1. include all requiredPackage and installedPakcage bundles (so bundles is a superset?)
    2. include all possible dependencies, recursively, based on the roots from (i)

Am I getting this correct? If so, what purpose do the requiredPackages and installedPackages variables actually serve for the resolver? If none (because bundles is a superset?), can we refactor such that requiredPackages and installedPackages are not actually variables and are instead just lists that we use to filter allBundles to bundles?

Copy link
Member Author

@m1kola m1kola Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Figure out what bundles are required based on the Operator objects that exist (requiredPackages)

Correct. It is like a desired state state set by an user.

  • Figure out what bundles are required based on existing BDs and their successors (but this is for all BundleDeployments, not just BDs managed by Operators. (installedPackages)

Yes, these are installed packages. More like an actual state of the cluster. This bit needs furthere discussion related to #460 (comment) + #443.

  • For requiredPackage bundles and installedPackage bundles combined, build a set of Bundle variables that:
    1. include all requiredPackage and installedPakcage bundles (so bundles is a superset?)
    2. include all possible dependencies, recursively, based on the roots from (i)

I think this is correct

Am I getting this correct? If so, what purpose do the requiredPackages and installedPackages variables actually serve for the resolver?

Am I getting this correct? If so, what purpose do the requiredPackages and installedPackages variables actually serve for the resolver? If none (because bundles is a superset?), can we refactor such that requiredPackages and installedPackages are not actually variables and are instead just lists that we use to filter allBundles to bundles?

Both RequiredPackageVariable and InstalledPackageVariable represent bundles so is BundleVariable, but they have different semantics/meaning:

  • RequiredPackageVariable - what user explicitly requested
  • InstalledPackageVariable - what we already have on the cluster
  • BundleVariable - everything else

@joelanford I think we can refactor as you suggest we can achieve the same result on successfull resolution. However on failed resolution error messages will be harder to understand: there will be no difference between a dependency of a dependency of a dependecny and a thing which user explicitly requested.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However on failed resolution error messages will be harder to understand: there will be no difference between a dependency of a dependency of a dependecny and a thing which user explicitly requested.

I'm not convinced that's true (or at least necessarily has to be the case). I'd imagine something like this is what is (or could be) fed into the resolver:

  • OperatorVariable(Package: a, DependsOnOneOf(aCurrent, aSuccessor), Mandatory: true)
  • BundleVariable(Bundle: aCurrent, DependsOnOneOf(bCandidate), Mandatory: false)
  • BundleVariable(Bundle: aSuccessor, DependsOnOneOf(bCandidate), Mandatory: false)
  • BundleVariable(Bundle: bCandidate, DependsOnOneOf(cCandidate), Mandatory: false)
  • BundleVariable(Bundle: cCandidate, Mandatory: false)

Any resolution failure that involves cCandidate being in conflict will only be unsat in this case because of the mandatory Operator variable for package a. And the unsat error message would traverse this dependency chain, right?

Copy link
Member Author

@m1kola m1kola Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both RequiredPackageVariable and InstalledPackageVariable represent bundles so is BundleVariable

Just realised - this is incorrect. RequiredPackageVariable and InstalledPackageVariable represent packages which depend on specific bundles.

I'd imagine something like this is what is (or could be) fed into the resolver

Do I understand correctly that it will be an equivalent of combining RequiredPackageVariable and InstalledPackageVariable together into OperatorVariable? At first glance it makes sense and might work, but we need to look into it in more depth.

I suggest that we do it as a separate effort: in this PR leave the structure of the resolution problem as it currently is in main (this PR is already large) and separately look into combining variables.

I'll split #437 into two issues: geting rid of variable sources and combining variables. Edit: Done. I created a separate issue for reducing number of variable types and linked to this thread - #494

@m1kola
Copy link
Member Author

m1kola commented Oct 30, 2023

I would prefer we remove the VariableSource concept entirely

@ncdc I had similar thoughts recently. We removed the concept of entity sources from deppy and I think we can remove the abstraction of varaible sources from it too. Deppy can accept a slice of inputs and it should be good enough.

I'll create an issue in Deppy for this so we can discuss in it. Edit: created operator-framework/deppy#153

I realize you have put a lot of time into this PR @m1kola, but I believe not merging this as-is and instead going in the direction proposed above will have significant long-term benefits.

I think we should proceed with this PR. Splitting variable sources into separate functions will make the transition to deppy version without entity sources easier: we will be able to use the same functions which are not aware of variable source concept (and are tested separately). Also this change I think will help me to complete the remaining bit of implementation for initial upgrade support (#449) a bit easier: I won't need to shatter the implementation between multiple variable soruces (I will be focusing on #449 before anything else most likely).

@joelanford
Copy link
Member

I added some comments to this PR which will hopefully help navigate this change. I'll think more about how I can split/reduce this diff, but at the moment nothing comes to mind.

I do agree with @stevekuznetsov that this PR could have been made easier to review. I happen to be somewhat familiar with the state on main, so I was able to focus on the new code with my background knowledge of the old code.

One suggestion would be to have a commit that literally just takes the original code and moves it into the destination files; so no changes other than moving code. Then another commit that actually changes structures and function signatures, then finally any other commits with net-new changes.

@m1kola
Copy link
Member Author

m1kola commented Oct 31, 2023

One suggestion would be to have a commit that literally just takes the original code and moves it into the destination files; so no changes other than moving code. Then another commit that actually changes structures and function signatures, then finally any other commits with net-new changes.

I tried to move variablesources/crd_constraints.go to variablesources/bundle_uniqueness.go as the latter uses a big chunk from from the former. I can probably do the same for variablesources/bundles_and_dependencies.go & variablesources/bundle.go (it appears in the diff like a move anyway). But it did not seem very helpful to me (probably because some variable sources were combined together).

At the moment I'm trying to split it into a train of small incremental PRs, but it is taking a lot time. Bear with me.

Signed-off-by: Mikalai Radchuk <mradchuk@redhat.com>
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 3, 2023
@openshift-merge-robot
Copy link

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ncdc ncdc changed the title Reduce number of variable sources 🌱 Reduce number of variable sources Nov 6, 2023
@m1kola
Copy link
Member Author

m1kola commented Nov 6, 2023

Closing this now as everyhing from this PR got into one of the above mentioned PRs in one form or the other.

@m1kola m1kola closed this Nov 6, 2023
@m1kola m1kola deleted the shrink_variable_sources branch November 6, 2023 16:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reduce number of variable sources involved in resolution process
6 participants