Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Per-component dependency solving #4087

Open
grayjay opened this issue Nov 5, 2016 · 16 comments
Open

Per-component dependency solving #4087

grayjay opened this issue Nov 5, 2016 · 16 comments

Comments

@grayjay
Copy link
Collaborator

grayjay commented Nov 5, 2016

I don't think there is a clear design for component-based solving yet, so I wanted to create an issue where we can discuss it. #1575 addresses circular dependencies in test suites but is moving towards a different solution.

Related issues:
#779, #3978, #3492 (comment) - solver doesn't reject configurations that require unbuildable/non-existent components
#1575 - cabal can't handle cycles between packages that are not cycles between components
#2725, #5413 - cabal requires dependencies of components that aren't being built
#3662 - more per-component support
#3263 - fine-grained dependencies

@grayjay
Copy link
Collaborator Author

grayjay commented Nov 5, 2016

Here are my thoughts, so far. At a minimum, the solver should keep track of dependencies between packages' components. (I think that it currently tracks dependencies from components to packages.) Then it can stop requiring dependencies of components that aren't needed, and ensure that required components are buildable. The solver should still ensure that all components within a package are consistent (have the same version and flag assignments), unless they have different qualifiers. Maybe we can add a flag or use private dependencies to relax that restriction later.

@ezyang
Copy link
Contributor

ezyang commented Nov 7, 2016

@grayjay, thanks for making this ticket. Your plan seems eminently reasonable. Go for it!

@grayjay
Copy link
Collaborator Author

grayjay commented Mar 20, 2017

Here is a first pass at a design:

Summary

  • This design isn't 100% component-based: We keep package variables and add one
    boolean variable for each component. However, the nodes of the dependency
    graph are components.
  • We transform packages' dependency trees to handle all combinations of setup
    Cabal version, Buildable field value, component enabled/disabled, and
    --enable-tests/benchmarks. This design would also fix Non-buildable components don't work with new-build and pre-1.24 Custom setups #3881.
  • We add several search tree transformations to enforce new component-related
    constraints.

Tree structure

  • Per-component solving requires some changes to the variables used by the
    solver, which means that we also need to change the goals, as well as the
    choice nodes created for those goals.

Variables

  • Types of variables (including those that already exist):
    • Package:
      • The instance chosen for a specific package, as before.
    • Flag:
      • The boolean value for a specific flag, as before.
    • Flag-like (I'm not sure whether these should be implemented as flag
      variables, stanza variables, or something new):
      • Each has two possible values.
      • They are used in conditionals to control dependencies, just like flags,
        except that they aren't in the .cabal file. They need to be added when
        the solver converts the package indexes to the solver-specific index
        format.
      • Three types of flag-like variables per package:
        • Tests/Benchmarks enabled:
          • Two variables per package, just like current test and benchmark
            stanza variables.
          • The default value is disabled.
        • Component on/off:
          • There is one variable per component in the package.
          • The two choices are Required and Not Required On and Off.
          • Required On:
            • It means that this component must be built. Either it is a
              top-level target, or it is a dependency of another component.
            • Buildable: False is treated as a failure.
          • Not Required Off:
            • The component does not need to be built. It may be built anyway,
              because of an old Cabal setup version or
              --enable-tests/benchmarks.
          • The default value is off.
        • Setup script features:
          • There is only one variable per package. The two choices are New and
            Old.
          • New:
            • It means that the package has a simple build type or uses
              Cabal >= 1.24 for the Custom build type.
            • The solver can choose to build a subset of components.
            • The solver can ignore dependencies of unbuildable components.
          • Old:
            • It means that the package can use Cabal < 1.24 for the Custom build
              type.
            • The solver must choose dependencies for all lib/exe components,
              though the components are allowed to have Buildable: False.
            • The solver must also choose dependencies for all test (or benchmark)
              components when tests (or benchmarks) are enabled, though they may
              also be unbuildable.
            • Dependencies are required for unbuildable components (dependencies are checked even if buildable is False #1725).
          • The default value is "new".
          • TODO: How do we know when a setup script supports per-component
            builds? Do we only need the setup Cabal version and build type?
  • Why do we still have package variables?
    • Per-component solving needs to work with Custom setup scripts that don't
      support per-component builds.
    • Constraints from one component should still apply to others. Otherwise, we
      might break packages that only constrain dependencies for one component
      (usually the library).
    • --enable-tests/benchmarks needs to find all tests/benchmarks in the
      package.
    • We only need to choose the package instance once for all components. If we
      ever need to build components with conflicting dependencies, we can use
      qualified goals.
  • Each time we choose a package instance while building the tree, we immediately
    add all flag-like goals for that instance to the set of remaining goals.
    TODO: How do we handle component goals for installed packages? Do we know
    which executables are available? Can we find all components by package name
    and version in the installed package index?
  • We add package goals as we discover new dependencies while building the tree,
    as before.

Index conversion

  • This phase adds flags to the dependency tree for each package, in order to
    implement the behavior of the flag-like goals described above.
  • This isn't a big change in design. The solver already transforms the
    dependency tree in order to ignore dependencies of unbuildable components
    (Tests for #2731 (Ignore dependencies that are not Buildable) #3039).
  • Exe/lib component transformation

    Say x represents the contents of an executable stanza (The same applies to
    libraries). x would be transformed into:
executable my-exe
  if flag(my-exe-component-on)
    x, with "Buildable: False" fields converted to failures
  else
    if flag(old-setup-script-features)
      x
    else
      x, transformed as in #3039 (in order to ignore dependencies when
      the component is unbuildable) and all dependencies converted to
      qualified constraints
  • Test/benchmark component transformation:

    x represents the contents of a test suite stanza. The transformation is
    like the lib/exe transformation, except that it also uses the tests-enabled
    flag.
test-suite my-test
  if flag(my-test-component-on)
    x, with "Buildable: False" fields converted to failures
  else
    if flag(tests-enabled)
      if flag(old-setup-script-features)
        x
      else
        x, transformed as in #3039 (in order to ignore dependencies when
        the component is unbuildable) and all dependencies converted to
        qualified constraints
    else
      empty test-suite stanza

Dependency graph

Custom setup validation steps

  • We add two search tree transformations to ensure that the value of the
    "setup script features" variable is consistent with the actual features of
    the setup script. We could combine these two transformations if it turns out
    to be simpler.
  • One transformation disables the "New" "setup script features" choice for each
    package that has Custom build type and a setup Cabal version that is less than
    1.24.
  • The other transformation enforces Cabal >= 1.24 for the setup script for each
    package that chose "New" for the "setup script features" variable.
  • (There are two transformations because the solver can assign the variables in
    either order.)

Component dependency validation steps

  • We add two more tree transformations. These could be combined with the
    existing validation phase transformations.
  • Note that we need to perform two checks for each dependency on a component:
    The package instance must contain the right component, and the component must
    be enabled.
  • First transformation:
    • Create a set of required components while descending the search tree, by
      looking at build-depends, tool-depends, and build targets.
    • For each package instance choice, prune all instances that don't contain all
      of the components that are required so far.
    • For each component choice, if the component is required, prune the "Off"
      choice.
  • Second transformation:
    • Create a set of visited packages and a set of available components while
      descending the search tree:
      • For each package instance choice, add the package to the package set and
        add all components provided by that instance to the component set.
      • For each component choice that turns off a component, remove that
        component from the component set.
    • Every time a dependency is introduced by a build-depends or
      tool-depends, check that if the package was visited then the component is
      available. If the component is not available, prune the choice that
      introduced the dependency.

Enabling tests/benchmarks

  • --enable-tests/benchmarks doesn't need to change much. The current stanza
    variables have the same behavior as the flag-like variables I described above.

Internal dependencies

  • The solver also tracks dependencies between a single package's components. A
    dependency between components is handled as a dependency on an exact package
    version, so that both components come from the same package instance.

More questions

/cc @kosmikus @ezyang @Ericson2314 @dcoutts @edsko @hvr @23Skidoo

@ezyang
Copy link
Contributor

ezyang commented Mar 20, 2017

Thanks, this looks great.

How do we know when a setup script supports per-component builds? Do we only need the setup Cabal version and build type?

Per-component occurs if and only if:

  • We are new-build
  • build-type is not Custom
  • cabal-version >= 1.8
  • --disable-per-component is not specified

The canonical source of truth is why_not_per_component in ProjectPlanning. Probably best to just cross-reference two sites for now.

Constraints from one component should still apply to others. Otherwise, we
might break packages that only constrain dependencies for one component
(usually the library).

But note that to fix things like #3732 we'll have to relax how we do the constraints here. I guess this is covered by what you say later: "If we
ever need to build components with conflicting dependencies, we can use
qualified goals."

How do we handle component goals for installed packages? Do we know
which executables are available? Can we find all components by package name
and version in the installed package index?

At the moment, we won't know anything about executables, but we don't have to, because new-build will never have those as "installed" packages.

Internal libraries pose a special problem, because they show up in the installed package index (I recently committed a hack to handle this case). With our current featureset, we are allowed to assume that any internal library in the installed package index can only be reached by a corresponding "public" library from the package (so the "correct" way to handle internal libraries is to bundle them all up together with the public library into one 'package node'). The hacked code doesn't actually do this but that's what should be done morally.

What if a user specifies an unbuildable component as a build target? The
solver could try to build the component by flipping flags, but then the
install plan would depend on the list of targets.

Perhaps there should be a distinction made between "build targets" (what am I going to build this round) and "solver targets" (please work hard to dep solve so that X is buildable.) The latter is stable and shouldn't change even if you decide to build different things. This should solve the next bullet point too.

@grayjay
Copy link
Collaborator Author

grayjay commented Mar 25, 2017

Thanks for the feedback!

Per-component occurs if and only if:

We are new-build
build-type is not Custom
cabal-version >= 1.8
--disable-per-component is not specified

I see that I had this part backwards. These rules simplify things, because the solver doesn't need to choose whether to use per-component build for each package; there is only one choice. In that case, I think that we don't need to add the "setup script features" variables, for now.

I had also incorrectly assumed that every package that ignores dependencies of unbuildable components also supports per-component build. Since the two attributes are not linked, I think we should handle #3881 separately.

But note that to fix things like #3732 we'll have to relax how we do the constraints here. I guess this is > covered by what you say later: "If we
ever need to build components with conflicting dependencies, we can use
qualified goals."

I'm not sure that we should make cabal find a solution for the example at the top of #3732, which only constrains base in one component. It looks the same as a package where the author meant for the constraint on base to apply to all components but wanted to avoid duplication. I haven't searched Hackage to see how common that pattern is, though.

Per-component solving would still allow more flexibility in less extreme cases, even if it always applied all build-depends constraints. Take this example:

executable exe1
  build-depends: x > 2

executable exe2
  build-depends: y < 3

Say that x > 2 and y < 3 have conflicting constraints on z. Then per-component solving would allow exe1 and exe2 to be installed separately while the current solver wouldn't find a solution.

Qualified goals would only help by allowing components that were independently solvable to be solvable together with different qualifiers (possibly using different versions of the package containing the components).

At the moment, we won't know anything about executables, but we don't have to, because new-build will never have those as "installed" packages.

Okay, so cabal will behave as if those package instances provided libraries but not executables.

Internal libraries pose a special problem, because they show up in the installed package index (I recently committed a hack to handle this case). With our current featureset, we are allowed to assume that any internal library in the installed package index can only be reached by a corresponding "public" library from the package (so the "correct" way to handle internal libraries is to bundle them all up together with the public library into one 'package node'). The hacked code doesn't actually do this but that's what should be done morally.

I'm not sure I understand this part. Can we always reach all of the internal libraries if we have the main library from the installed package index? I think that is all the solver needs to do.

Perhaps there should be a distinction made between "build targets" (what am I going to build this round) and "solver targets" (please work hard to dep solve so that X is buildable.) The latter is stable and shouldn't change even if you decide to build different things. This should solve the next bullet point too.

That distinction makes sense. I still don't know how we should choose those two sets of targets, though.

For "solver targets", maybe we should just give the solver a list of packages to solve for and apply a preference to build all components within each of those packages. Is there any situation where we would want to give the solver a subset of a package's components as targets?

@hvr
Copy link
Member

hvr commented Mar 25, 2017

re

I'm not sure that we should make cabal find a solution for the example at the top of #3732, which only constrains base in one component. It looks the same as a package where the author meant for the constraint on base to apply to all components but wanted to avoid duplication.

IMO a better way to avoid duplication is to make it more explicit that multiple components ought to share common dependencies/constraints via the likes of #2832, and then we could unlock/enable per-component solving also in the aforementioned case when a high enough cabal-version: value is set (c.f. to how the semantics for build-depends & multiple components changed between cabal-version 1.6 and 1.8)

@ezyang
Copy link
Contributor

ezyang commented Mar 27, 2017

I had also incorrectly assumed that every package that ignores dependencies of unbuildable components also supports per-component build.

Yeah, but I think the converse is true (per-component = correct handling of non-buildable components.)

I'm not sure I understand this part. Can we always reach all of the internal libraries if we have the main library from the installed package index? I think that is all the solver needs to do.

Not necessarily. For example, we may have registered an internal library which was only linked by the test suite, but not by the actual public library of a package. @dcoutts would probably say that this just means we shouldn't have registered it but it seems better to me to not assume any relationship without tracing the dependency relations.

For "solver targets", maybe we should just give the solver a list of packages to solve for and apply a preference to build all components within each of those packages. Is there any situation where we would want to give the solver a subset of a package's components as targets?

Yeah I'm not sure what the default should be. What we have right now is best effort which seems to work decently well, so maybe every component would have a designation like, "never", "best-effort", "always", with everything defaulting to best-effort.

grayjay added a commit to grayjay/cabal that referenced this issue Jan 13, 2018
…askell#4161).

The solver already detected cycles involving more than one package, but it
allowed dependencies between components within a package.  This commit treats a
dependency between a package's setup script and library as a cycle in order to
allow the solver to backtrack and try to break the cycle.  A more thorough
solution would involve tracking all dependencies between components, as in haskell#4087.

This commit also fixes the internal error in issue haskell#4980.
grayjay added a commit to grayjay/cabal that referenced this issue Jan 17, 2018
…askell#4161).

The solver already detected cycles involving more than one package, but it
allowed dependencies between components within a package.  This commit treats a
dependency between a package's setup script and library as a cycle in order to
allow the solver to backtrack and try to break the cycle.  A more thorough
solution would involve tracking all dependencies between components, as in haskell#4087.

This commit also fixes the internal error in issue haskell#4980.
grayjay added a commit to grayjay/cabal that referenced this issue May 6, 2018
This commit generalizes the fix for issue haskell#4781
(e86f838) by tracking dependencies on
components instead of dependencies on executables.  Associating each dependency
with a component also moves towards the design for component-based dependency
solving described in issue haskell#4087.
grayjay added a commit to grayjay/cabal that referenced this issue May 6, 2018
This commit generalizes the fix for issue haskell#4781
(e86f838) by tracking dependencies on
components instead of dependencies on executables.  That means that the solver
always checks whether a package contains a library before using it to satisfy a
build-depends dependency.  If a version of a package doesn't contain a library,
the solver can try other versions.  Associating each dependency with a component
also moves towards the design for component-based dependency solving described
in issue haskell#4087.
23Skidoo pushed a commit that referenced this issue May 9, 2018
This commit generalizes the fix for issue #4781
(e86f838) by tracking dependencies on
components instead of dependencies on executables.  That means that the solver
always checks whether a package contains a library before using it to satisfy a
build-depends dependency.  If a version of a package doesn't contain a library,
the solver can try other versions.  Associating each dependency with a component
also moves towards the design for component-based dependency solving described
in issue #4087.
@grayjay
Copy link
Collaborator Author

grayjay commented May 13, 2018

After #4884 and #5304, the solver stores the two components that are involved in each dependency. The dependent component is a D.Solver.Types.ComponentDeps.Component, and the depended upon component is a library or executable. The two PRs also implement the part of "Component dependency validation steps" from #4087 (comment) that checks for the existence of components, though the solver doesn't track whether components are enabled yet.

hvr pushed a commit that referenced this issue Jun 8, 2018
This commit generalizes the fix for issue #4781
(e86f838) by tracking dependencies on
components instead of dependencies on executables.  That means that the solver
always checks whether a package contains a library before using it to satisfy a
build-depends dependency.  If a version of a package doesn't contain a library,
the solver can try other versions.  Associating each dependency with a component
also moves towards the design for component-based dependency solving described
in issue #4087.

(cherry picked from commit 6efb5e2)
@phadej
Copy link
Collaborator

phadej commented May 16, 2019

Lack of this feature required to split of test-suites and benchmarks from containers. And probably will need to do the same exercise for time. Also I cannot use QuickCheck in the tests of splitmix anymore, as QuickCheck depends on splitmix. This list grows as v2-build gets adapted.

IMO, not having this in 3.0 makes v2-build unconvinient for packages deep down in the dependency tree.

Is there anything one can do to make this happen sooner, than later.

@Ericson2314
Copy link
Collaborator

I would be very happy to teach Cabal cross immediately if only this were done, too.

@grayjay
Copy link
Collaborator Author

grayjay commented May 30, 2019

I'm less familiar with the change that would be required to allow cycles between packages, but I think it could be implemented separately from the other parts of this feature, like the changes to flag variables. Unless I'm missing something, it would mainly require changes to cycle detection and the conversion from the solver's dependency graph to the one used outside of the solver:

I'm not sure how much code outside of the solver will also need to be changed to maintain the dependencies between components and ensure that components are built in the correct order. I'm also not sure if there will be other issues with having cycles between packages in the solver, such as infinite loops.

I don't think I'll have time to work on this feature in the near future, though I could try to answer questions or review code if anyone else wants to implement it.

@Ericson2314
Copy link
Collaborator

@grayjay

Should we only relax cycle detection for tests and benchmarks for now?

I think the idea is that the cycles really and actually go away once we track on the level of components. No special-cases are needed.

I'm not sure what @phadej was thinking but I believe we need private dependencies for changes containers to not cause a rebuilding of the testing libraries using containers. Solving that without private dependencies would, however, be a special case for benchmarks and test suites.

@grayjay
Copy link
Collaborator Author

grayjay commented Jun 9, 2019

Should we only relax cycle detection for tests and benchmarks for now?

I think the idea is that the cycles really and actually go away once we track on the level of components. No special-cases are needed.

I meant that I was wondering whether there is a use case for other types of cycles now and whether we should disallow them in order to simplify install plans. We can always remove the restriction later, but it would be very hard to move in the opposite direction.

I'm not sure what @phadej was thinking but I believe we need private dependencies for changes containers to not cause a rebuilding of the testing libraries using containers. Solving that without private dependencies would, however, be a special case for benchmarks and test suites.

I'm not sure I understand. I think that private dependencies is a separate feature that doesn't require per-component dependency solving. The approach taken in #3422 is like a special case of private dependencies. I think that per-component solving would only help solve the test suite cycle issue if we used it to change the solver's cycle detection to use components instead of packages. It would cause rebuilds of the test framework, though.

@Ericson2314
Copy link
Collaborator

I meant that I was wondering whether there is a use case for other types of cycles now and whether we should disallow them in order to simplify install plans. We can always remove the restriction later, but it would be very hard to move in the opposite direction.

I agree let's not overcommit to allowing things we will want to band later. I'm not fan of "Postel's law"'s over-application with these matters :).

The approach taken in #3422 is like a special case of private dependencies.

OK glad we agree on that.

I'm not sure I understand. I think that private dependencies is a separate feature that doesn't require per-component dependency solving.

Right if #3422 is a special case of cycles and private dependencies, I rather grow it into "private dependencies in general" rather than "cycles in general".

I think that per-component solving would only help solve the test suite cycle issue if we used it to change the solver's cycle detection to use components instead of packages. It would cause rebuilds of the test framework, though.

Totally agreed. @phadej's changes to containers (make a package under a different name in other cabal project) both avoid the cycles and avoid the extra test rebuilds. I wanted to point out that per-component therefore is not enough on its own to obviate the renamed package---need the private dependencies part to avoid the rebuilds too or there's a usability regression.

@isovector
Copy link
Contributor

I just ran into this while trying to make polysemy depend on polysemy-plugin, but polysemy-plugin:test depend on polysemy. It's a frustrating limitation, and one that totally breaks the abstraction that these things form a graph. How much work is left to be done here? Could you use an engineering hand?

@grayjay
Copy link
Collaborator Author

grayjay commented Oct 8, 2019

@isovector None of the changes to cycle detection have been implemented yet, so we could definitely use a hand! I gave a summary of what I think needs to be done to only update the cycle detection in #4087 (comment). I'm not sure if it would require more changes outside of the solver, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants