Proposal for Generics #36

Closed
wants to merge 320 commits into from

Contributor

@josh11b josh11b commented May 30, 2020

These are the docs with our current thinking on how to implement generics in Carbon. Follow-on to #24. Much of the background and many of the alternatives considered are still missing. This is definitely still in a rough state, but it seems better to make it visible early rather than wait for more polish.

@jonmeow jonmeow added the proposal (A proposal) and WIP labels Jun 3, 2020
@chandlerc chandlerc changed the base branch from master to trunk July 2, 2020 03:21
look at the body of the `SortArray` function. These are in fact the main
differences between generics and templates:

- We can completely typecheck a generic definition without information from the
Contributor

This goal of being able to independently typecheck generic definitions and uses is excellent.
However, by itself, it does not solve the problem of long build times that we see with C++ templates.
To reduce build times, one needs to have the option (compiler flag?) of separate compilation of generics.
In generics-terminology.md, I see that Carbon generics "may also support separate compilation".
I'm worried about the "may". I advocate that Carbon generics support separate compilation.

Contributor Author

It should be possible to compile a function parameterized by a generic completely separately if you are willing to accept some overhead, e.g. from boxing generic values. My expectation is that you will not want that overhead in a release build, but there may be a build mode optimized for quick compiles that compiles generics separately.

I think the critical thing here is not whether you get to execute the separately “compiled code” but whether the results of the work of type-checking the generic (and resolving any overloads—if we must have those—etc.) is done and saved in a form that can be used later—either executed directly or used for monomorphization. This is not just an efficiency argument, either: anything that would require type-checking after monomorphization opens the door to all kinds of unprincipled things that are better avoided. Happy to discuss in more detail later.
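
To make the trade-off in this thread concrete, here is a rough C++ sketch (not Carbon, and not taken from the proposal) contrasting a function compiled once against a table of operations with one that is stamped out per type. `AddableWitness`, `AddInto`, `WitnessFor`, `SumErased`, and `SumTemplate` are hypothetical names invented for illustration. The erased version pays for an indirect call per element, which is the kind of overhead mentioned above.

```cpp
#include <cstddef>

// Hypothetical table of operations for an "Addable" interface.
struct AddableWitness {
  std::size_t size;                                // bytes per value
  void (*add_into)(void* acc, const void* value);  // performs *acc += *value
};

// Compiled once, independent of any element type: every operation on the
// values goes through the table, so callers only need this declaration.
inline void SumErased(void* acc, const void* data, std::size_t count,
                      const AddableWitness& w) {
  const unsigned char* bytes = static_cast<const unsigned char*>(data);
  for (std::size_t i = 0; i < count; ++i) {
    w.add_into(acc, bytes + i * w.size);
  }
}

// Monomorphized: a fresh copy of the body is generated for every T used.
template <typename T>
T SumTemplate(const T* data, std::size_t count) {
  T acc{};
  for (std::size_t i = 0; i < count; ++i) {
    acc += data[i];
  }
  return acc;
}

// Glue that builds the table for a concrete type.
template <typename T>
void AddInto(void* acc, const void* value) {
  *static_cast<T*>(acc) += *static_cast<const T*>(value);
}

template <typename T>
AddableWitness WitnessFor() {
  return AddableWitness{sizeof(T), &AddInto<T>};
}
```

Under this reading, both forms could come from the same checked definition; only the final step (emit one erased body, or specialize per type) differs.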

is an error to perform an operation the compiler can't verify. For templates,
name lookup and type checking may only be able to be resolved using information
from the call site.

Contributor

This description makes it sound like early checking of the body is all-or-nothing (which would be fine by me),
but that doesn't seem to jibe with the design decision to choose between generic/template on a per-parameter basis.

Contributor Author

Early checking generally can't be complete if there is any template parameter, but an expression in the body can be checked if it doesn't involve anything dependent on a template parameter.
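
C++ already behaves roughly this way, which may be a useful reference point: in a template body, names that do not depend on a template parameter are bound when the definition is seen, while dependent calls wait for instantiation. A small sketch, assuming a compiler that implements standard two-phase lookup; `Describe`, `Probe`, and `app::Widget` are made-up names, and nothing here is Carbon syntax.

```cpp
#include <string>

// The only overload visible where the template is defined.
inline std::string Describe(long) { return "long"; }

template <typename T>
std::string Probe(T value) {
  // Non-dependent call: the argument type does not involve T, so the call
  // is resolved right here, at the definition, to Describe(long).
  std::string fixed = Describe(0);
  // Dependent call: resolution waits for instantiation, where
  // argument-dependent lookup can see overloads in T's namespace.
  std::string late = Describe(value);
  return fixed + "/" + late;
}

namespace app {
struct Widget {};
inline std::string Describe(Widget) { return "widget"; }
}  // namespace app

// Declared after the template, so the non-dependent call above never sees it.
inline std::string Describe(int) { return "int"; }
```

Calling `Probe(app::Widget{})` gives `"long/widget"`: the first half was fixed at the definition, and the second half needed the call site.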

Contributor

I'm wary of partial early checking. It's not just an optimization that lets high-quality implementations give errors earlier; it's a real semantic change with user-visible effects, and often those effects are not great.

C++'s "if no valid specialization can be generated for a template, the program is ill-formed, no diagnostic required" rule is a pretty good demonstration of how this works out in practice. It ends up being one of the tricky and unintuitive parts of C++ templates -- it's yet another thing users have to keep in mind when writing template code, and it sometimes makes natural-looking code invalid for subtle reasons. I'd rather stick with all-or-nothing checking, which is simple, easy to understand, and doesn't have those traps for the unwary.

@dabrahams dabrahams Jan 28, 2021

Please see my suggestion here

computed from generic/template arguments, or other things that are effectively
constant and/or available at compile time.

A generic function member allows you to define a local constant:
Contributor

@jsiek jsiek Jul 3, 2020

The terminology "generic function member" is confusing to me (coming from C++ background).
The use of "member" is what is throwing me... functions don't have members (like classes do).
In this example, does it matter whether the function is generic or not? Probably not...
It looks like what is going on here is the declaration of a local variable that must be initialized
with a compile-time known value.

Contributor Author

The terminology here does indeed make more sense for types than functions. Suggestions welcome!

Contributor

Can the part about local variables with compile-time initializers
just be deleted from the generics proposal and be dealt with elsewhere?
Or is the generics proposal supposed to deal with all things that
happen at compile time?

Contributor Author

I renamed them local constants. It would be reasonable to move this text as part of a refactoring, but right now I don't have another home for it.
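
For readers coming from C++, the closest analogue is probably a `constexpr` local whose initializer is computed from template arguments. A minimal sketch under that assumption; `PaddedBytes` and the 16-byte alignment are invented for the example and do not come from the proposal.

```cpp
#include <array>
#include <cstddef>

template <typename T, std::size_t N>
constexpr std::size_t PaddedBytes(const std::array<T, N>&) {
  // Local constants: each initializer must be known at compile time, and
  // here each one is derived from the template arguments T and N.
  constexpr std::size_t kAlign = 16;
  constexpr std::size_t kRaw = sizeof(T) * N;
  constexpr std::size_t kPadded = (kRaw + kAlign - 1) / kAlign * kAlign;
  static_assert(kPadded % kAlign == 0,
                "padding must land on an alignment boundary");
  return kPadded;
}
```

As noted in the thread, nothing about this requires the enclosing function to be generic; the template arguments are just one source of compile-time values.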


### Programming model proposals

(Quoting heavily from the linked docs.)
Contributor

There is a third model that should be considered, in which interfaces are used as constraints on types.
This is closely related to the model of interfaces as types of types, however it is more general, and the
convenient type-of-type syntax can be easily supported via desugaring.
Examples of the interfaces-as-constraints model include Haskell type classes and the C++0x concepts proposals.
The nice thing about interfaces as constraints is that it can conveniently express multiple constraints on a single type, constraints involving multiple types, and constraints on associated types.

Contributor Author

That model is approximately represented by "Carbon: types as function tables, interfaces as type-types", which unfortunately hasn't been ported to GitHub yet but is briefly summarized below. We switched to facets for a couple of reasons:

  • Facets keep the implementation of each interface in separate namespaces to avoid name collisions, say from one type implementing multiple interfaces, particularly an issue if you ever want to modify interfaces.
  • Facets more naturally model semantic conformance (which we are leaning towards with generics) over structural conformance (which you can opt into by using templates).
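
The first bullet can be sketched in C++ by keying each implementation on an (interface, type) pair, so two interfaces that both declare a `Draw` method never collide. This is only an analogy for facets, not how Carbon implements them; `Impl`, `Drawable`, `Gunslinger`, `Cowboy`, `Render`, and `Fire` are hypothetical names.

```cpp
#include <string>

// Interface tags: each one names a separate namespace of methods.
struct Drawable {};
struct Gunslinger {};

struct Cowboy {};

// One implementation per (interface, type) pair; the primary template is
// intentionally left undefined so a missing implementation fails to compile.
template <typename Interface, typename T>
struct Impl;

template <>
struct Impl<Drawable, Cowboy> {
  static std::string Draw(const Cowboy&) { return "renders a cowboy"; }
};

template <>
struct Impl<Gunslinger, Cowboy> {
  static std::string Draw(const Cowboy&) { return "draws a revolver"; }
};

// Generic code picks the interface it means, so the two Draw methods
// cannot be confused even though one type implements both.
template <typename T>
std::string Render(const T& value) {
  return Impl<Drawable, T>::Draw(value);
}

template <typename T>
std::string Fire(const T& value) {
  return Impl<Gunslinger, T>::Draw(value);
}
```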

Contributor Author

I think "interfaces are constraints on types" means something like "Type Self satisfies interface ConvertibleToString if it has a method ToString(Self)->String." I believe that is discussed in:

https://github.com/josh11b/carbon-lang/blob/generics-docs/docs/design/generics/terminology.md#semantic-vs-structural-interfaces

I also wrote something about facet types vs. other choices here:

https://github.com/josh11b/carbon-lang/blob/generics-docs/docs/design/generics/terminology.md#invoking-interface-methods

though that text doesn't discuss options for structural constraints.
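
As a rough illustration of the semantic vs. structural distinction in the linked terminology doc, the C++ sketch below detects structural conformance from a type's shape and models semantic conformance as an explicit opt-in. The trait names, the sample types, and the `ToString` method are assumptions made for the example; Carbon's actual mechanism may differ.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Small helper equivalent to C++17's std::void_t.
template <typename... Ts>
struct MakeVoid { using type = void; };
template <typename... Ts>
using VoidT = typename MakeVoid<Ts...>::type;

// Structural conformance: any type with a suitable ToString() qualifies.
template <typename T, typename = void>
struct HasToString : std::false_type {};

template <typename T>
struct HasToString<
    T, VoidT<decltype(std::string(std::declval<const T&>().ToString()))>>
    : std::true_type {};

// Semantic conformance: false until someone explicitly declares it.
template <typename T>
struct ImplementsConvertibleToString : std::false_type {};

struct Point {
  int x;
  int y;
  std::string ToString() const {
    return std::to_string(x) + "," + std::to_string(y);
  }
};

struct Token {
  std::string text;
  // Same shape as Point::ToString, but Token never opts in below.
  std::string ToString() const { return text; }
};

// The explicit opt-in for Point.
template <>
struct ImplementsConvertibleToString<Point> : std::true_type {};
```

Here `Token` conforms structurally but not semantically, which is the case where the two models disagree.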

@googlebot googlebot added the cla: yes label (PR meets CLA requirements according to bot) Aug 6, 2020
@josh11b josh11b requested a review from a team August 28, 2020 18:41
Contributor

@jonmeow jonmeow left a comment

Small style comment

.codespell_ignore (outdated)
@dabrahams

This change doesn't seem(?) to be linked into the larger TOC structure. At least I can't find it by running Jekyll and browsing.

after seeing the call sites (and you know which specializations are needed).

Read more here:
[Carbon Generics: Terminology and Problem Statement: "Generic vs. template arguments" section](https://github.com/josh11b/carbon-lang/blob/generics-docs/docs/design/generics/terminology.md#generic-vs-template-arguments).

Should this link explicitly to a different repository? Perhaps you could make this link relative.

```
interface Iterator { ... }
interface Container {
// This does not make sense as an parameter to the container interface,
```

Suggested change:
- // This does not make sense as an parameter to the container interface,
+ // This does not make sense as a parameter to the container interface,

differences from templates. Additionally, the "Carbon Generic" doc has
"[What are generics?](https://github.com/josh11b/carbon-lang/blob/generics-docs/docs/design/generics/overview.md#what-are-generics)"
and
"[Goals: Generics](https://github.com/josh11b/carbon-lang/blob/generics-docs/docs/design/generics/overview.md#goals-generics)"

I don't know whether this is the place for me to make this case, but IMO unless we want to end up with an unmanageably complex language (hello, C++, I'm lookin' at you!), we should be thinking about unifying templates and generics in some way. At the very least, let's start thinking about “constrained and unconstrained generics” so they become, in our minds, two flavors of the same thing. This will also help us develop the tools to gradually migrate people from one to the other.

Having thought this through a bit more, I am unconvinced we need to be able to create unconstrained generics at all. I think a constrained generic can serve perfectly well as a C++ template for interop purposes.

Static-dispatch witness table output looks similar to monomorphization output.
However, monomorphization does not require a witness table. The risk is that by
conceptualizing the implementation as monomorphization we may unintentionally
introduce cases that cannot be represented as dynamic-dispatch witness tables.
@dabrahams dabrahams Jan 28, 2021

I am pretty certain that no such case exists. On the other hand, the converse does hold: a dynamic dispatch system can express non-monomorphizable programs. A classic from Swift:

struct X<T> {}

func f<U>(_ a: U, depth: UInt) {
  if depth != 0 { f(X<U>(), depth: depth - 1) }
}

f("", depth: UInt(CommandLine.arguments[1])!) // "main"

How many levels of nested X's get created around String (X<String>, X<X<String>>, …) depends on the command-line argument.

The important question we need to ask ourselves here is, do we need to support non-monomorphizable programs? Non-monomorphizability has some pretty serious negative implications for what it takes to support efficient generic programming and so far the only argument in favor I've seen is that it makes some categories of generic code easier to write, so I'm inclined to say no… but I haven't personally bumped up against those limits, so I don't have a good read on whether constrained generics become difficult when forced to be monomorphizable. A Rust programmer would probably have better information. I would be interested in discussing this issue in depth.
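
One way to see why the Swift example needs runtime support: if types are represented by runtime values, a single compiled body can handle any nesting depth, whereas a C++ template written the same way would try to instantiate `f<X<U>>` without bound and fail to compile. A minimal C++ sketch, where `TypeInfo` is a hypothetical stand-in for runtime type metadata (it is not Swift's or Carbon's actual representation), and `Name` and `F` are names invented for the example:

```cpp
#include <string>

// Hypothetical runtime description of a type: either a leaf such as
// "String" (leaf != nullptr) or X<inner> (inner != nullptr).
struct TypeInfo {
  const char* leaf;
  const TypeInfo* inner;
};

// Spells out the type a TypeInfo value describes, e.g. "X<X<String>>".
inline std::string Name(const TypeInfo& t) {
  return t.inner ? "X<" + Name(*t.inner) + ">" : std::string(t.leaf);
}

// One compiled body serves every nesting level: the "type argument" is an
// ordinary runtime value, so the depth can come from user input.
inline std::string F(const TypeInfo& u, unsigned depth) {
  if (depth == 0) {
    return Name(u);  // report the type reached at the bottom
  }
  TypeInfo wrapped{nullptr, &u};  // plays the role of X<U>
  return F(wrapped, depth - 1);
}
```

With a leaf of `"String"` and a depth of 2, `F` reports `X<X<String>>`, mirroring the nesting described in the comment.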

@google-cla google-cla bot added the cla: no label (PR does not meet CLA requirements according to bot) and removed the cla: yes label (PR meets CLA requirements according to bot) Nov 30, 2021
@github-actions

We triage inactive PRs and issues in order to make it easier to find active work. If this PR should remain active, please comment or remove the inactive label.
This PR is labeled inactive because the last activity was over 90 days ago. This PR will be closed and archived after 14 additional days without activity.

@github-actions

We triage inactive PRs and issues in order to make it easier to find active work. If this PR should remain active or becomes active again, please reopen it.
This PR was closed and archived because there has been no new activity in the 14 days since the inactive label was added.

Labels
- cla: no (PR does not meet CLA requirements according to bot)
- inactive (Issues and PRs which have been inactive for at least 90 days)
- proposal deferred (Decision made, proposal deferred)
- proposal (A proposal)

6 participants