- Problem
- Background
- Proposal
- Rationale based on Carbon's goals
- Alternatives considered
We need facilities for defining new nominal record types in Carbon. They will need to support object-oriented features such as methods.
This is a follow up to #561: Basic classes: use cases, struct literals, struct types, and future work which laid a foundation.
The proposal is to update docs/design/classes.md as described in this PR.
This particular proposal is focusing on the Carbon goal that code is "easy to read, understand, and write." Future proposals will address other aspects of the class type design such as performance.
The proposed method syntax was decided in question-for-leads issue #494: Method syntax.
A wide variety of alternatives were considered:
- Using a different introducer such as
method
to distinguish methods from functions. - Putting a pattern to match the receiver before the method name.
- Omitting either the receiver name or the receiver type.
- Allowing the user to specify the receiver name instead of using a parameter
named
me
to indicate it is a method. - Using syntax between
fn
and the method name to indicate that it is a method, and whether the method takes a value or the address of the receiver. - Using a keyword to indicate that the first parameter is the receiver, as in
C++'s "Deducing
this
" proposal. - Using a delimiter in the parameter list after the receiver type, but the
main choice of
;
seemed pretty subtle. - Putting the receiver pattern in a separate set of brackets. This could be an
additional set of parens
(
...)
before the explicit parameter list. Or the deduced parameters could be put in angle brackets<
...>
and the receiver in square brackets[
...]
.
Examples:
method (this: Self*) Set(n: Int);
fn [Me*].Set(n: Int);
fn ->Set(n: Int);
fn& Set(n: Int);
fn &Set(s: Self*, n: Int);
fn Set(this self: Self*, n: Int);
fn Set[this self: Self*](n: Int);
fn Set(self: Self*; n: Int);
fn Set(self: Self*)(n: Int);
Note: These examples are written using the parameter syntax decided in issue #542, though that does not match the syntax used in the discussion at the time.
We wanted the option to specify the type of the receiver. In C++, the type of
this is controlled by way of special syntax, such as putting a const
keyword
at the end of the method declaration. We wanted to treat the receiver type more
consistently with other parameter types, like is being considered in
C++'s "Deducing this
" proposal.
That proposal also showed the value of including the entire type, rather than
just part such as whether the parameter would be passed by value or address. We
already plan to use this feature to conditionally include methods based on
whether parameters to the type satisfy additional constraints. In this example,
FixedArray(T, N)
has a Print()
method if T
is Printable
:
class FixedArray(T:! Type, N:! Int) {
// ...
fn Print[me: FixedArray(P:! Printable, N)]() { ... }
}
We did consider the option of making the type of the receiver optional. This is not what we decided to try first, but is an idea we would consider in the future.
Using the same introducer fn
as functions, as decided in
issue #463, along
with the method name immediately afterwards means that function and method names
line up in the same column visually.
struct IntContainer {
// Non-methods for building instances
fn MakeFromInts ... -> IntContainer;
fn MakeRepeating ... -> IntContainer;
// Methods
fn Size ... -> Int;
fn First ... -> Int;
fn Clear ...;
fn Append ...;
}
This is to ease scanning and to put the most important information up front.
We also considered using another 2-character introducer that would distinguish
methods from associated functions, like me
, but it did not garner much
support. The name me
was not evocative enough of "method", and people
generally preferred matching other function declarations. We also considered
using different introducers to communicate whether the receiver was passed by
value or by address, for example ro
for "read-only" and rw
for "read-write".
Putting the receiver pattern inside the explicit parameter list, as is done in
Python, would lead to an unfortunate mismatch between the arguments listed at
the call and the function declaration. That motivated putting the receiver
pattern inside square brackets ([
...]
) with deduced parameters. There were
concerns, however, that this parameter did not act like deduced parameters.
Having the receiver name be fixed is helpful for consistency. This benefits readers, and simplifies copying or moving code between functions. A fixed receiver name also allows us to identify the receiver pattern in the function declaration, and identify it as a method instead of an ordinary function associated with the type, like static methods in C++.
Finally, we wanted the bodies of methods to access members of the object using
an explicit member access. This means we need the name used to access members to
be short, and so we chose the name me
.
We considered using just whether the receiver type was a pointer type to indicate whether the receiver was passed by address, but we preferred type information to flow in one direction. Otherwise this would cause a problem if we want to allow deduction of the object parameter type. This deduction can't be used to select between pointer and not pointer.
We also considered supporting reference types that could bind to lvalues without explicitly taking the address of the object, but references would have added a lot of complexity to the type system and we believed that they are not otherwise necessary.
We decided to adopt an approach where the parameter declaration can contain a marker for matching an argument's value category separately from its type. This marker would be present for methods that mutate the object, and so requires that the receiver be an lvalue to match the pattern. The marker means "first take the address of the argument and then match the rest of the pattern." We considered a few different possible ways to write this:
fn Set[&me: Self*](n: Int);
fn Set[*(me: Self*)](n: Int);
fn Set[ref me: Self*](n: Int);
We eventually settled on using a keyword addr
.
fn Set[addr me: Self*](n: Int);
This doesn't use up a symbol, makes it easier to find in search engines, and
allows us to add more keywords in the future for other calling conventions, such
as inout
. We may even find that those other keywords have preferable semantics
so we may eventually drop addr
.
This keyword approach also is consistent with how we are marking template
parameters with the template
keyword, as decided in
issue #565 on generic syntax.
We considered making it visible at the call site when the receiver was passed by
address, with the address-of operator &
.
(&x).Set(4);
This would in effect make the mutating methods be methods on the pointer type rather than the class type.
Question-for-leads issue #494: Method syntax also considered what would be different between functions and methods:
- Methods use a different calling syntax:
x.F(n)
. - Methods distinguish between taking the receiver (
x
) by value or by pointer, without changing the call syntax. - Methods have to be declared in the body of the class.
- Methods and associated functions both have access to private members of the class.
- Only methods can opt in to using dynamic dispatch.
- The receiver parameter to a method varies covariantly in inheritance unlike other parameter types.
We considered various access modifiers such as internal
or private.internal
that would enforce internal linkage in addition to restricted visibility. We
decided to postpone such considerations for the time being.
Even if we do need explicit controls here, we did not feel the need to front-load that complexity right now and with relatively less time or experience to evaluate the tradeoffs. If and when linkage becomes a visible issue and important to discuss, we can add it. Until then, we can see how much mileage we can get out of purely implementation techniques.
We needed some way of opting a nominal class into the fieldwise behavior of a
data class. In Kotlin, you
precede the class
declaration with a data
keyword. Carbon already has a way
of marking types as having specific semantic properties, implementing
interfaces. This leverages the support for interfaces in the language to be able
to express things like "here is a blanket implementation of an interface for all
data classes" in a consistent way.
We chose the name Data
rather than DataClass
for the interface since tuples
implicitly implement the interface and were "product types" rather than
"classes".
We wanted a consistent interpretation for let
declarations in classes and
function bodies. Experience from C++ const int
variables, where we try
constant-evaluation and then give the program different semantics based on
whether such evaluation happened to work, suggested we wanted to instead require
:!
in all places where a value that's usable during compilation is introduced.