Proposal for provides-mechanism #3061

Open
hvr opened this issue Jan 18, 2016 · 19 comments

@hvr
Member

hvr commented Jan 18, 2016

Problem Statement

Currently there is no satisfying way to express packages which
conflict with each other in the same install-plan.

Some limited form of conflicts can be expressed when two packages
depend directly on each other via build-depends relations.

However, this is not enough to handle cases where it is necessary to
make sure that two or more packages are mutually exclusive to each
other.

The need for such mutual exclusivity arises when entities which
may need to be unique throughout the install-plan are involved, such
as (desugared-into) typeclasses or relocated orphan instances.

Motivating Use-cases

Relocating orphan instances into compat-packages

One recent use-case which can't be solved properly currently is the
following:

  • semigroupoids-4.3 conditionally defines an orphan instance Foldable (Either a) (for base<4.8)
  • Starting with base-4.8, base started providing instance Foldable (Either a)
  • Then base-orphans-0.1 started providing such an orphan instance as well, like semigroupoids-4.3
  • semigroupoids-5.0 now depends on base-orphans

However, there's currently no way to express that the combination

  • base >= 4.8
  • semigroupoids < 5, and
  • base-orphans

is forbidden to occur in an install-plan.

Alternative base/Preludes replacements with incompatible typeclass hierarchies

This use-case refers to a future, not-yet-implemented
ExportedRebindableSyntax extension, which would allow the Prelude
module currently in scope to export desugaring rules for things like
do-syntax, and thereby make it possible to provide seamless
Haskell2010 support (with a Monad class different from base's
Monad class) in recent GHCs.

Some packages could then opt in to explicitly support the
haskell2010 package by depending on haskell2010 or base
conditionally in their .cabal file:

if flag(haskell2010)
   build-depends: haskell2010 == 1.2.*
else
   build-depends: base >= 4.3 && < 4.10

However, since each package would have its own haskell2010 flag,
and cabal does not synchronise these flags, there is currently no way
to express that haskell2010 and base are mutually exclusive.

Proposed enhancement

Provide a new provides: <string-token> declaration (allowed
everywhere a build-depends: declaration is currently allowed).

The <string-token> would denote a token in a global namespace which
at most one package selected in a given install-plan may provide.

Application for orphans scenario

This way, semigroupoids-4.3's cabal file could be edited and have a

provides: orphans-base-foldable-either

Likewise, base-orphans (and possibly other packages such as
base-compat which also contain such orphans) would declare
provides: orphans-base-foldable-either as well in their respective
.cabal files.
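
The rule described above (at most one package in an install-plan may provide a given token) can be sketched as a small check. This is only an illustration, not how the cabal solver works: the function name, the plan representation, and the package ids are assumptions made for the example.

```python
from collections import defaultdict

def provides_violations(plan, provides):
    """Return {token: providers} for every token provided more than once.

    plan:     iterable of package ids (hypothetical ids for illustration)
    provides: dict mapping package id -> set of tokens it declares
    """
    providers = defaultdict(list)
    for pkg in plan:
        for token in provides.get(pkg, ()):
            providers[token].append(pkg)
    return {t: sorted(ps) for t, ps in providers.items() if len(ps) > 1}

# Assumed metadata: both packages declare the same provides token.
PROVIDES = {
    "semigroupoids-4.3": {"orphans-base-foldable-either"},
    "base-orphans-0.1": {"orphans-base-foldable-either"},
}

bad_plan = ["semigroupoids-4.3", "base-orphans-0.1"]
good_plan = ["semigroupoids-5.0", "base-orphans-0.1"]

print(provides_violations(bad_plan, PROVIDES))   # one token with two providers
print(provides_violations(good_plan, PROVIDES))  # no violations
```

A plan would be acceptable exactly when the returned dictionary is empty.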

Application for base/Prelude replacements

base.cabal simply declares something like

provides: core-primitives

and every other package which provides a base library which replaces
the core primitives (i.e. core typeclasses, and other entities
wired-in into the language via desugaring rules) and therefore don't
interoperate with base would state the same provides declaration.

Alternative constraint-mechanism

One could allow declaring constraint: properties in .cabal files.

This way, the respective base-orphans package versions could start
declaring all package version ranges containing orphans that were
adopted into base-orphans, e.g. base-orphans.cabal could specify

constraint: semigroupoids >= 5

or dually, semigroupoids.cabal (for semigroupoids-4.3 and other
versions affected) could declare

constraint: base-orphans < 0

if there is no base-orphans version compatible with
semigroupoids-4.3, and therefore base-orphans is incompatible.

However, constraint is much more powerful than the proposed
provides-mechanism. Being more general, it may be more complex to
implement in the solver. Moreover, when abused it can result in
confusing install-plan results.

Finally, when modelling the situation where more than 2 packages need
to be mutually exclusive, expressing this in the
constraint-formulation becomes more complicated and doesn't scale, as
each involved package needs to know about all other conflicting
packages. Specifically, the base-replacements use-case with multiple
packages becomes impractical to express.

The proposed provides-mechanism is weaker, makes it easy to express
mutual exclusivity among N packages, and is IMO easier to implement
as well as easier to reason about.
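
To make the scaling argument concrete, here is a rough count of how many declarations each formulation needs for N mutually exclusive packages. The package names other than base and haskell2010 are hypothetical, and the helper functions are just for illustration.

```python
from itertools import combinations

def pairwise_declarations(packages):
    """One constraint declaration per unordered pair of packages."""
    return list(combinations(sorted(packages), 2))

def provides_declarations(packages, token):
    """One provides declaration per package, all naming the same token."""
    return [(pkg, token) for pkg in sorted(packages)]

# Hypothetical set of mutually incompatible base replacements.
replacements = ["base", "haskell2010", "alt-base-a", "alt-base-b", "alt-base-c"]

print(len(pairwise_declarations(replacements)))                     # N*(N-1)/2
print(len(provides_declarations(replacements, "core-primitives")))  # N
```

With five packages the pairwise formulation needs ten declarations against five for the token formulation, and adding a sixth package means touching metadata that mentions all five existing ones.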

Alternative conflicts:-mechanism

During the discussion a variant of constraint: was suggested, specifically to express conflicts directly via conflicts:, which denotes the inverse of constraint:. I.e. rather than saying

 constraint: semigroupoids >= 5
 constraint: base-orphans < 0

one specifies

conflicts: semigroupoids < 5
conflicts: base-orphans

which is easier to reason about for the use-case of specifying conflicting package versions.
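
As a sketch of what checking such conflicts: declarations against a candidate install-plan might look like, under heavy simplifying assumptions: versions are dotted integers, ranges are half-open intervals, and a conflict applies to every version of the declaring package. None of the names below are part of Cabal.

```python
def parse_version(text):
    """Turn "4.3" into (4, 3) so versions compare component-wise."""
    return tuple(int(part) for part in text.split("."))

def in_range(version, lower=None, upper=None):
    """True when lower <= version < upper; a bound of None is unbounded."""
    v = parse_version(version)
    if lower is not None and v < parse_version(lower):
        return False
    if upper is not None and v >= parse_version(upper):
        return False
    return True

def violated_conflicts(plan, conflicts):
    """List (declaring package, conflicting package) pairs present in the plan.

    plan:      dict of package name -> chosen version string
    conflicts: dict of package name -> list of (other name, lower, upper)
    """
    hits = []
    for name in sorted(plan):
        for other, lower, upper in conflicts.get(name, []):
            if other in plan and in_range(plan[other], lower, upper):
                hits.append((name, other))
    return hits

# Assumed encoding of "conflicts: semigroupoids < 5" declared by base-orphans.
CONFLICTS = {"base-orphans": [("semigroupoids", None, "5")]}

print(violated_conflicts({"base-orphans": "0.1", "semigroupoids": "4.3"}, CONFLICTS))
print(violated_conflicts({"base-orphans": "0.1", "semigroupoids": "5.0"}, CONFLICTS))
```

The first plan is rejected because semigroupoids-4.3 falls inside the conflicting range; the second passes.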

Alternative orphan-instances:-mechanism

This would be a variant of the provides mechanism specialised to orphan instances, by providing a machine-verifiable (i.e. tooling can help making sure the enumeration is accurate) enumeration of orphan instances provided by a package.

See #3061 (comment)

/cc @kosmikus @ezyang @dcoutts @nomeata @bergmark @23Skidoo @ttuegel @RyanGlScott

@RyanGlScott
Member

I'm a bit wary of the provided-mechanism for three reasons:

  1. There's no relation between the exported string tokens and the things they represent (be it orphan instances or core primitives) other than the name, so it will be very tricky to get other package maintainers to use a common string token vocabulary that will ensure that different packages exporting the same orphan instances can't be built together. Granted, the Hackage maintainers can retroactively edit .cabal files to use the agreed-upon string tokens, but that feels more like patchwork than a robust solution.
  2. For base-orphans in particular, enumerating every single orphan instance that it provides is going to be a nightmare. By looking at the source of Data.Orphans, you'll see some convoluted CPP guards such as #if __GLASGOW_HASKELL__ < 710 && !defined(mingw32_HOST_OS) && !defined(__GHCJS__). Trying to encode the same information into the .cabal file is highly redundant and quite challenging to get right with Cabal flags.
  3. The core-primitives string token in particular seems somewhat suspect. I regularly use packages that provide alternative base libraries and Preludes in combination with other modules that just use base. It seems strangely prohibitive to disallow this.

For these reasons, I'd be much more inclined to support a constraint-based mechanism.

@hvr
Member Author

hvr commented Jan 18, 2016

Well, the vocabulary issue can be addressed by maintaining a dictionary with descriptions in Hackage (which package uploads are validated against). This dictionary could be maintained by Hackage trustees. Hackage would be able to list all packages/versions which provide the given token.

As for the granularity, if you have a dictionary with descriptions, you don't need to enumerate each single instance. You could group them into logical units. In a couple of years we may have a proper solution with backpack (@ezyang, right? :-) ), but we need a solution now and independent of package-sets (a feature that isn't available yet either), as the install-plans are starting to decay now, and I see no way to fix the meta-data properly.

As for having to replicate CPP at the .cabal-level, you'd have that kind of problem with constraints as well. We already have it for other meta-data that needs to be accurate for the cabal solver to pick up on (*-extensions and build-depends). Otoh, we should try to reduce CPP usage anyway, and move more logic into .cabal files (which then allows subsuming CPP usage in some cases). But this is another topic altogether...

Finally, the core-primitives token is not meant for alternative preludes which can be used alongside base, but rather for preludes that are incompatible with base! Without this, I see no feasible way to bring back proper haskell2010-support to future GHCs. More importantly, the constraint-mechanism simply doesn't work here, as we'd need something in the order of O(N^2) constraint-definitions to describe the mutual exclusivity (cf. complete graph), so constraint is not an option here anyway.

@RyanGlScott
Member

Well, the vocabulary issue can be addressed by maintaining a dictionary with descriptions in Hackage (which package uploads are validated against). This dictionary could be maintained by Hackage trustees. Hackage would be able to list all packages/versions which provide the given token.

It's not clear to me how flexible this solution is. For example, suppose I am a package author that wants to export an orphan instance. Do I need to ask the Hackage maintainers for a string token first? If such a string token exists, how would I look it up easily? (there are some pretty strange-looking instances out in the wild, after all). If I fail to discover an existing instance string token and upload my package with a different, unapproved token, would my package be rejected? Would its .cabal metadata be revised? Something else?

We'd also need to be wary of different datatypes with the same name. For example, both base and old versions of transformers have an Identity datatype, and base-orphans and transformers-compat have orphan instances for both Identitys that have the same implementation (but are technically different).

As for the granularity, if you have a dictionary with descriptions, you don't need to enumerate each single instance. You could group into logical units.

That sounds like a good idea. (For my needs, it will probably be impossible to come up with a coherent group of tokens due to the sheer number of GHC version differences, but base-orphans is definitely the edge case.)

Finally, the core-primitives token is not meant for alternative preludes which can be used alongside base, but rather for preludes that are incompatible with base! Without this, I see no feasible way to bring back proper haskell2010-support to future GHCs. More importantly, the constraint-mechanism simply doesn't work here, as we'd need something in the order of O(N^2) constraint-definitions to describe the mutual exclusivity (cf. complete graph), so constraint is not an option here anyway.

You're right, I didn't read the phrase "don't interoperate" closely enough.

And it's true that constraint is definitely not a perfect solution for all of these scenarios. For base-orphans, however, I think it might be just the ticket, and at first glance it appears to be a much simpler solution than provides.

@ezyang
Contributor

ezyang commented Jan 18, 2016

  • Chatting with @hvr on IRC, I discovered why we can't just release a new bugfix version of semigroupoids-4.3 which depends on base-orphans: Herbert is tripping over this problem because he is expanding the matrix builder to test more versions of a package than just the latest, to discover when metadata is incorrect.
  • On the naming front, I don't think we should call this provides, since Debian-style provides means something different: it implies you can depend on one of these tokens (they are "virtual" packages, for which the solver picks an implementation to use).
  • I think it might not be a bad idea to enforce a semantic naming convention immediately. Maybe we can relax it later but it seems quite sensible for now. Something like orphan-instances: base:Data.Either.Either base:Data.Foldable.Foldable (a package name, the module name, and the entity name. It's like a GHC original name but the unit ID is not fully resolved). If we have this, fancy orphan handling machinery can be built off of it later. Yes, this is not a general "provides" mechanism, but since we're just looking for an immediately actionable fix I'd prefer something more specialized. (Yes, this won't work for alternative-base; I'm not sure how strenuously we should attempt to support this use case)
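
The naming convention suggested in the last bullet (package name, module name, entity name) could be parsed along these lines. The type and function names here are made up for illustration, and the sketch assumes the entity is the part after the last dot, which would not hold for operator names containing dots.

```python
from typing import NamedTuple

class OriginalName(NamedTuple):
    package: str
    module: str
    entity: str

def parse_original_name(text):
    """Split "pkg:Module.Path.Entity" into its three components."""
    package, colon, qualified = text.partition(":")
    module, dot, entity = qualified.rpartition(".")
    if not (colon and dot and package and module and entity):
        raise ValueError(f"expected pkg:Module.Entity, got {text!r}")
    return OriginalName(package, module, entity)

# The two names from the example field value above.
field_value = "base:Data.Either.Either base:Data.Foldable.Foldable"
names = [parse_original_name(word) for word in field_value.split()]
print(names)
```

Comparing such structured names, rather than free-form tokens, is what would let tooling detect two packages declaring the same orphan instance.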

@hvr
Member Author

hvr commented Jan 18, 2016

to test more versions of a package than just the latest

Minor clarification, currently I mostly test the "otherwise unconstrained" solutions the cabal solver comes up with (when only constraining pkg, pkg-version, and GHC version, but nothing else) for the primary package report matrices.

However, that solution can easily shift into a different optimum with the slightest additional constraint. Also, if a package is built as a build-dependency of another package, this often gives us yet another point in the configuration space. And the solutions become even more interesting if I start varying a single dependency (by forcing a specific version of that package) to test whether the lower/upper bounds for a single dependency are accurate. And the plan is to try to sample more such points in the configuration space than matrix.h.h.o does currently...

@nomeata
Contributor

nomeata commented Jan 18, 2016

  • I agree with @ezyang that the naming is suboptimal. It is not a provides in the sense that something can depend on it. It is more a conflict; in this case a conflict not with a specific other package, but rather a set of packages defined by, well, the same conflict. Maybe mutex would be a good name?
  • I’d be very wary of adding arbitrary conflicts to cabal. It makes searching for solutions much harder, and a bunch of nice properties will no longer hold (e.g. that you can take two install-plans that agree on their intersection, and their union is a valid install plan).
  • We’d be making cabal hell hotter. Is it really worth the complication and confusion on the user side? Are there no other ways to contain the problem of shifting base instances? Is it feasible to expect library authors depending on both semigroupoids and base-orphans to put in tight enough version ranges? If this is a problem that does not occur often, then putting a bit of load on library authors might be better than making cabal yet more complicated for its developers and its users.
  • I dislike that the provides mechanism above needs cooperation from all affected parties. This might not always work well (a maintainer might disagree or be unresponsive, and updating base is non-trivial). It would be better to have a system where one party (or at least all but one) can declare “here be dragons” (and specify what exactly here is). The constraints mechanism is better in that respect.
  • If this is in favour of the constraints scheme, and we need to express mutual exclusiveness in constraints, we could add syntax for that.
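
The lost "union" property mentioned in the second bullet can be illustrated with a toy model. This is a sketch only: it ignores dependency closure, treats a plan as a name-to-version mapping, and the mutex token is hypothetical.

```python
def agree_on_intersection(plan_a, plan_b):
    """True when every package in both plans has the same version in each."""
    return all(plan_b[p] == v for p, v in plan_a.items() if p in plan_b)

def mutex_ok(plan, tokens):
    """True when no mutex token is declared by two packages in the plan."""
    seen = set()
    for name, version in plan.items():
        for token in tokens.get((name, version), ()):
            if token in seen:
                return False
            seen.add(token)
    return True

# Hypothetical token shared by the two orphan-defining packages.
TOKENS = {
    ("semigroupoids", "4.3"): {"orphans-base-foldable-either"},
    ("base-orphans", "0.1"): {"orphans-base-foldable-either"},
}

plan_a = {"base": "4.7", "semigroupoids": "4.3"}
plan_b = {"base": "4.7", "base-orphans": "0.1"}
union = {**plan_a, **plan_b}

print(agree_on_intersection(plan_a, plan_b))                # plans are compatible
print(mutex_ok(plan_a, TOKENS), mutex_ok(plan_b, TOKENS))   # each is fine alone
print(mutex_ok(union, TOKENS))                              # the union is not
```

Both plans pass on their own and agree on base, yet their union is rejected, which is the property the comment says would no longer hold.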

@23Skidoo
Member

I agree with @nomeata and @ezyang that this looks more like a mutual exclusion mechanism than APT's virtual packages. So conflicts: or build-conflicts: is a better name. And I think it should accept a list of package version ranges (like build-depends: does) instead of tokens.

@nomeata:
It makes searching for solutions much harder, and a bunch of nice properties will no longer hold

Would be nice if @kosmikus could comment on this. BTW, APT also supports specifying conflicts between packages, so maybe it's not too bad?

We’d be making cabal hell hotter. Is it really worth the complication and confusion on the user side?

I think that this feature will be used relatively rarely and only by experts. As an end user, I never found conflicts in Debian confusing.

@RyanGlScott
Member

BTW, APT also supports specifying conflicts between packages, so maybe it's not too bad?

That's a very interesting option that might be worth its own subproposal. @hvr, would a conflicts field solve the problems you have?

@hvr
Member Author

hvr commented Jan 19, 2016

@nomeata

We’d be making cabal hell hotter.

On the contrary, we need some facility (be it constraint, conflicts, provides, or something similar) to avoid cabal hell, as otherwise we have no sensible means to tell the cabal solver which packages are not allowed to be used in the same install-plan (due to entities with singleton-nature). Hackage trustees aren't able to do their job of fixing up meta-data otherwise (mission statement: have cabal install foobar-x.y.z never run into compile errors (i.e. cabal hell) & avoid cabal edits that retroactively destroy previously valid/sensible install-plans)

Is it feasible to expect library authors depending on both semigroupoids and base-orphans to put in tight enough version ranges?

That would only help for packages that directly depend on both. There are a lot of packages that depend on semigroupoids and base-orphans indirectly, moreover, not all valid install-plans may actually pull in base-orphans at all. So no, I don't think this is feasible, as we'd have to add the transitive dependency closure containing such conflicting packages everywhere, and we'd be trying to fix the problem/symptoms at the wrong end, rather than (literally) at the root of the problem.

@RyanGlScott conflicts is essentially a negated constraint field? So it's exactly as powerful as a constraint field, but expresses the intent more clearly.

So here's what we'd need to do for the simple case of base-compat, base-orphans, semigroupoids:

make sure all base-orphans >= 0.1 releases have the following in their base-orphans.cabal
(the conflicts may actually be made more granular via if impl(ghc ...) so this is just a first approximation):

conflicts: base-compat >= 0.3 && < 0.8
conflicts: semigroupoids >= 4.2 && < 5

Now, base-compat versions between 0.3 and 0.8 need to state

conflicts: semigroupoids >= 4.2 && < 5

If it turns out there's another package defining such orphan instances, this would result in 3 more conflicts declarations (one can choose where to place those, depending on which variant results in the least amount of cabal edits -- it's probably also desirable from the solver's POV to have the smallest amount of conflicts specifications in scope).

However, I'm not sure how easy this is to implement, as the solver would need to take into account the meta-data of already installed packages and honour their conflicts-specifications (otoh, same problem applies to the provides-mechanism).

@phadej
Collaborator

phadej commented Jan 19, 2016

On the naming front, I don't think we should call this provides, since Debian-style provides means something different: it implies you can depend on one of these tokens (they are "virtual" packages, for which the solver picks an implementation to use).

Actually, virtual packages might make sense for library authors. If the naming of (orphan) instance providers is somehow standardised, then as a library author I could depend on instance-semigroup-hashmap and not worry about whether there is an install plan (containing base or semigroups and unordered-containers) which doesn't have this instance.

@nomeata
Contributor

nomeata commented Jan 19, 2016

While chatting with hvr on IRC, I came up with a solution to this problem that relies only on existing Cabal features. Let's call it the “empty mutex package solution” in the further discussion.

Remember that we want to prevent an install plan that has

base >= 4.8
semigroupoids < 5, and
base-orphans

We can enforce that by uploading a package base-orphan-mutex (or better name) in two versions (1 and 2). This package is empty and has no dependencies.

Then semigroupoids < 5 depends on base-orphan-mutex == 1 conditionally (using a flag) when we have base >= 4.8. Similarly, base-orphans depends on base-orphan-mutex == 2 when base >= 4.8.

This clearly avoids the problematic situation, works without new features in Cabal, and hence even works with old versions of Cabal.

The downside is an empty package installed in some install plans, and that hackage would have to allow such metadata edits on existing packages (including adding a flag and new dependencies).
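
Why this works can be shown with a toy model: an install plan picks a single version of each package, so two packages pinning different versions of the mutex package cannot be satisfied together. The function below is a hypothetical sketch, not the cabal solver, and the package ids are illustrative.

```python
def merge_requirements(requirements):
    """Combine exact-version pins into one choice per dependency.

    requirements: list of (requiring package, dependency, pinned version)
    Returns a dict dependency -> version, or None when two packages pin
    different versions of the same dependency.
    """
    chosen = {}
    for _requirer, dep, version in requirements:
        if chosen.setdefault(dep, version) != version:
            return None
    return chosen

# Pins as described above, assuming base >= 4.8 is in the plan.
old_semigroupoids = ("semigroupoids-4.3", "base-orphan-mutex", "1")
base_orphans = ("base-orphans-0.1", "base-orphan-mutex", "2")

print(merge_requirements([old_semigroupoids]))                # satisfiable
print(merge_requirements([base_orphans]))                     # satisfiable
print(merge_requirements([old_semigroupoids, base_orphans]))  # None: clash
```

Each package can be installed on its own, but no single version of base-orphan-mutex satisfies both pins at once.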

@phadej
Collaborator

phadej commented Jan 19, 2016

@nomeata Your comment also suggests that the virtual package approach might be a good way to solve this problem. Having a concrete package is the backwards-compatible option, but Hackage could provide stubs for old Cabal versions for those.

Yet provides: is still clearer if there are N package-version-ranges which should be mutually exclusive.

One more use case for virtual packages: moving modules from one package to another

build-depends: virtual-module-data-semigroup

module MyModule where
import Data.Semigroup

@nomeata
Contributor

nomeata commented Jan 19, 2016

Yet provides: is still clearer if there are N package-version-ranges which should be mutually exclusive.

Right. But can we, just to avoid confusion with what Provides means in other packaging contexts, call this mutex:?

One more use case for virtual packages: Moving modules from package to another

We should be careful not to reinvent Backpack here.. :-)

@nomeata
Contributor

nomeata commented Jan 19, 2016

Another idea from IRC: solve this problem properly and semantically explicitly. Let’s call this the “orphan instance declaration approach”.

  • We’d add a field orphan-instances: to the library section of a .cabal file that lists all orphan instances declared by this package.
  • cabal build fails if this list is not accurate.
  • cabal-install rejects build plans where multiple packages provide the same orphan instance.

It is quite similar to the mutex approach, but restricted to one particular use case, and explicit about this.

Pros: It makes sure that the metadata is in place in every involved package, even before you know that there might be another package providing the same orphan instance. It makes authors aware of orphan instances, and nudges them to avoid them.

Cons: Needs changes to Cabal, adds more manual chores to authors. Might be tricky to get a suitable syntax for the description of the orphans (What about qualified package imports? What if the home module of a type (a usually hidden information) changes?)

(I’m not particularly fond of that approach, just including it for completeness.)

@ezyang
Contributor

ezyang commented Jan 19, 2016

We should be careful not to reinvent Backpack here.. :-)

I am more than happy for you guys to explore the design space here! As it stands, Backpack has very little to say about how explicit interfaces are supposed to integrate with the dependency solver. It's all new ground for us.

@23Skidoo
Member

(Just a self-cc. Ignore me.)

There's a "Subscribe" button on the right side of the page.

@BardurArantsson
Collaborator

For some weird reason it always shows up as "Unsubscribe" even when I'm not receiving notifications (other than seeing posts on the main GH page.) Not sure why. I guess I could try going the Unsubscribe->Subscribe route next time...

@hvr
Member Author

hvr commented Jun 10, 2016

Just an update to inform you and document how the specific case of base-orphans was worked around:

@RyanGlScott uploaded a base-orphans-0 version to Hackage, which is essentially an empty package. The idea is described in https://github.com/haskell-compat/base-orphans/tree/base-orphans-0#about-base-orphans-0 :

About base-orphans-0

base-orphans-0 is a special release that intentionally does not export any modules. base-orphans-0 is used when retroactively adding base-orphans dependencies to older Hackage libraries to ensure that they cannot be built in combination with more recent versions of base-orphans that would cause them to break. (For example, if a package defines an orphan instance which clashes with one in base-orphans.)

Also, base-orphans was whitelisted (besides base which we already had whitelisted) on Hackage (haskell/hackage-server#472)

So now we can, at least in the case of base-orphans, retroactively add base-orphans dependencies, and only need to make sure to allow base-orphans-0 (i.e. the empty package, which has the same effect as excluding base-orphans altogether from the install-plan).

@hvr
Member Author

hvr commented Aug 27, 2016

Another minor update: here's a similar current-tech workaround for the network-uri/network case:

The network-uri-flag package

@andreabedini andreabedini added the old-milestone: ⊥ Moved from https://github.com/haskell/cabal/milestone/5 label Oct 17, 2023