Automatically generating missing methods #631

Open
eernstg opened this issue Oct 18, 2019 · 18 comments
Labels
feature Proposed language feature that solves one or more problems

Comments

@eernstg
Member

eernstg commented Oct 18, 2019

This issue is a proposal for supporting a simple automatic class member generation feature. It would enable automatic generation of a set of instance members of a class with an implementation that is structurally identical for sets of several methods. Being a language mechanism, it would not involve a separate code generation step, nor any specialized tools: the code generation would be performed by the front end such that all tools (analyzer, compilers, runtime) would see the generated code as-if written manually.

For instance, it could support forwarding by implementing all members of a given interface such that they call forwardee.g as the implementation of a getter named g, forwardee.m(a1) as the implementation of a method m taking one argument, etc.

This mechanism is not intended to support a more elaborate static meta-programming system where each generated method could have an implementation which is computed via a powerful meta-level computation. For instance, with this example in D, setField has a body that contains a switch where the list of cases is computed based on the members of the given type denoted by Object. That's a non-goal for this proposal.

However, this mechanism does allow for automatically and concisely obtaining (and maintaining) a set of member implementations that have a syntactically nearly identical implementation, and that's a kind of feature that Dart does not have currently.

Static Meta-programming

The need for some amount of static meta-programming has come up many times in Dart language debates. For example, #418 requests better support for proxy objects, #370 and #493 request a concise syntax for passing all parameters from one constructor to another one, #582 mentions property delegation, many requests have been made for concise declarations of immutable classes, etc. It would be possible to use an approach based on static meta-programming for all of these requests (and, of course, there are also other approaches).

Static meta-programming as in Rust would be great, but that is a large feature, and it may not fit well into Dart because the languages are so different.

However, in one particular area, static meta-programming may be a low-hanging fruit for Dart: We can re-use the machinery known as 'noSuchMethod forwarders'. This is a proposal to do just that.

Rely on noSuchMethod Forwarding

In general, noSuchMethod is invoked whenever an instance member invocation is attempted, but no implementation of the requested kind (method, getter, setter) and with the given name exists.

In order to enforce normal typing constraints (and for performance reasons), this feature has been implemented using a mechanism known as noSuchMethod forwarders: In a concrete class with an implementation of noSuchMethod different from the one in Object, a member is generated for every member in the interface of the enclosing class that has no implementation.
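As a minimal sketch of what exists today (the names Greeter and Recording are made up for illustration), a concrete class that overrides noSuchMethod may leave interface members unimplemented, and the compiler generates a forwarder for each of them:

```dart
abstract class Greeter {
  String greet(String name);
  int get count;
}

// Concrete, yet it implements neither `greet` nor `count`: because
// noSuchMethod is overridden, forwarders are generated for both members.
class Recording implements Greeter {
  final List<Symbol> calls = [];

  @override
  dynamic noSuchMethod(Invocation invocation) {
    calls.add(invocation.memberName);
    if (invocation.isGetter) return 0; // result for the `count` getter
    return 'stub'; // result for the `greet` method
  }
}

void main() {
  final recording = Recording();
  final Greeter g = recording;
  print(g.greet('Ada')); // stub
  print(g.count); // 0
  print(recording.calls.length); // 2
}
```

Both calls are ordinary statically checked invocations; they reach noSuchMethod only through the generated forwarders.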

This proposal re-uses that approach: Let C be a concrete class with interface I. Currently, it is an error unless C declares or inherits an implementation of every member in I. However, this proposal allows C to declare a method template, a getter template, and a setter template, and they are then used to generate code for any such missing implementations.

Templates

The member templates would have the following form:

// Method template.
<targets>? template R name(P) ...

// Getter template.
<targets>? template R get name ...

// Setter template.
<targets>? template void set name(P) ...

The body of each template is represented as .... It would be a regular function body, possibly using R, P, and name. These meta-names are detected based on the syntactic form of the template, so with template Ret m(Parameters) ... they would be Ret rather than R, and so on.

The <targets> part can be used to specify that the template is used for specific member names or specific interfaces. It would be a comma separated list where each element is an <identifier> or a <typeName>. For instance, I template R get name => throw "Not supported"; would give rise to generation of throwing implementations of all otherwise unimplemented members of I. A template matches a given member if it has the right kind (method, getter, setter), and if either its basename occurs in the target list, or the target list contains a type name that denotes an immediate superinterface whose interface contains that member. If multiple templates match a given member then the one that occurs textually first will be used.
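The matching rule can be sketched as a small selection function. This is only an illustration of the rule as stated above: Kind, Template, Member and select are hypothetical names, and the sketch assumes that a template without a <targets> part matches every member of its kind.

```dart
enum Kind { method, getter, setter }

class Template {
  final String label; // only used to identify the template in the output
  final Kind kind;
  final Set<String> targets; // member basenames and/or interface names
  const Template(this.label, this.kind, [this.targets = const <String>{}]);
}

class Member {
  final String basename;
  final Kind kind;
  // Immediate superinterfaces whose interface contains this member.
  final Set<String> declaringInterfaces;
  const Member(this.basename, this.kind, this.declaringInterfaces);
}

bool matches(Template t, Member m) {
  if (t.kind != m.kind) return false;
  if (t.targets.isEmpty) return true; // assumption: no targets means "all"
  return t.targets.contains(m.basename) ||
      t.targets.any(m.declaringInterfaces.contains);
}

// The textually first matching template wins; null means "no template",
// which would be a compile-time error for a concrete class.
Template? select(List<Template> templates, Member m) {
  for (final t in templates) {
    if (matches(t, m)) return t;
  }
  return null;
}

void main() {
  final templates = [
    Template('A-methods', Kind.method, {'A'}),
    Template('bar-getter', Kind.getter, {'bar'}),
    Template('any-getter', Kind.getter),
  ];
  print(select(templates, Member('foo1', Kind.method, {'A'}))?.label); // A-methods
  print(select(templates, Member('bar', Kind.getter, {'B'}))?.label); // bar-getter
  print(select(templates, Member('baz', Kind.getter, {'B'}))?.label); // any-getter
  print(select(templates, Member('qux', Kind.setter, {'B'}))?.label); // null
}
```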

The body is subject to normal parsing, except that the meta-name that denotes the parameters in a method can only be used as an actual argument list. The setter does not have this restriction: P in the setter above expands to the name of the parameter in the body, which is just an expression.

Code generation for a method named m replaces name by m, R by the return type of m in the interface of the class, and P by the list of formal parameters as arguments. (So when generating code for void foo(int i, {bool b = false}), bar(P) in the body would expand to bar(i, b: b).) Operators are methods, and are treated as such. Getters and setters are treated similarly as methods.
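To make the expansion concrete, here is a hand-written version of what the method template would presumably generate for that signature. Api, RealApi and ApiForwarder are hypothetical names; the template syntax itself is not valid Dart today, so only the expanded member is shown.

```dart
abstract class Api {
  void foo(int i, {bool b = false});
}

class RealApi implements Api {
  final List<String> log = [];

  @override
  void foo(int i, {bool b = false}) => log.add('foo($i, b: $b)');
}

class ApiForwarder implements Api {
  final Api forwardee;
  ApiForwarder(this.forwardee);

  // Hand-written equivalent of `template R name(P) => forwardee.name(P);`
  // for `foo`: R becomes void, name becomes foo, and P in the body
  // becomes the argument list `i, b: b`.
  @override
  void foo(int i, {bool b = false}) => forwardee.foo(i, b: b);
}

void main() {
  final real = RealApi();
  final forwarder = ApiForwarder(real);
  forwarder.foo(1, b: true);
  forwarder.foo(2);
  print(real.log); // [foo(1, b: true), foo(2, b: false)]
}
```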

The code generation step for a class iterates over all methods in the interface of the given class, and performs code generation for each member that has no implementation. (In particular, for any specific unimplemented member where we do not wish to have that implementation, we simply write a normal member declaration.) If a member m is not implemented, and it is a method, and there is no method template, a compile-time error occurs. (This is not new, it simply occurs because the class is concrete, but does not fully implement its interface.) Similarly for getters and setters.

The code generation step for a mixin iterates over all members that occur in the combined implements interface, and not in the on interface, nor in the body of the mixin itself, and otherwise works the same as with a class.
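The mixin rule amounts to a set difference over member names. A rough sketch, with a hypothetical helper name and made-up member names:

```dart
// Members that get generated code in a mixin: those in the combined
// `implements` interface, minus those in the `on` interface, minus those
// declared in the mixin body itself.
Set<String> mixinMembersToGenerate({
  required Set<String> implementsMembers,
  required Set<String> onMembers,
  required Set<String> declaredInBody,
}) =>
    implementsMembers.difference(onMembers).difference(declaredInBody);

void main() {
  final generated = mixinMembersToGenerate(
    implementsMembers: {'foo', 'bar', 'baz'},
    onMembers: {'bar'},
    declaredInBody: {'foo', 'forwardee'},
  );
  print(generated); // {baz}
}
```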

A class or mixin with templates may contain all the regular kinds of member declarations, including regular instance member declarations, and static methods and variables.

Example

Consider the case where we wish to create a class that holds a final reference to some other object, forwardee, and forwards all invocations to forwardee:

class Cforwarder implements C {
  final C forwardee;
  Cforwarder(this.forwardee);

  template R name(P) => forwardee.name(P);
  template R get name => forwardee.name;
  template void set name(P) => forwardee.name = P;
}
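For comparison, this is roughly what has to be written by hand today to get the same effect. The class C below and its members are assumptions made for the sake of a runnable example; with the proposal, the three overrides would be produced from the three templates.

```dart
class C {
  int value = 0;
  int twice(int x) => 2 * x;
}

class CForwarder implements C {
  final C forwardee;
  CForwarder(this.forwardee);

  // What the method template would yield for `twice`.
  @override
  int twice(int x) => forwardee.twice(x);

  // What the getter template would yield for `value`.
  @override
  int get value => forwardee.value;

  // What the setter template would yield for `value`.
  @override
  set value(int v) => forwardee.value = v;
}

void main() {
  final c = C();
  final C f = CForwarder(c);
  f.value = 21;
  print(c.value); // 21
  print(f.twice(f.value)); // 42
}
```

Every member added to C later would need another hand-written override here, which is the maintenance burden the templates are meant to remove.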

Consider the case where the class Foo does not implement an interface Bar, but with suitable imports it is possible to get access to extension methods such that every member of Bar can be invoked, with some invocations calling instance methods of Foo and others calling some extension method. The exception is foo, which is included only as an example of how to provide a hand-written implementation of any additional members if needed.

We can bridge the gap by generating code that will perform all these invocations in a setting where the extension methods are available, and then we'll get a wrapper object that actually implements Bar, providing the oddball foo method manually:

class FooImplementingBar implements Foo, Bar {
  final Foo forwardee;
  FooImplementingBar(this.forwardee);

  // This is a normal member declaration.
  void foo(String s) { ... }

  template R name(P) => forwardee.name(P);
  template R get name => forwardee.name;
  template void set name(P) => forwardee.name = P;
}

mixin FooImplementingBarMixin on Foo implements Bar {
  Foo get forwardee; // Must be implemented by subclass.

  // This is a normal member declaration.
  void foo(String s) { ... }

  template R name(P) => forwardee.name(P);
  template R get name => forwardee.name;
  template void set name(P) => forwardee.name = P;
}

In the case where different sets of members require a different template, it is possible to split the templates using targets:

abstract class A { void foo1(); int foo2(int i); }
abstract class B { int get bar; int get baz; }

class C1 implements A, B {
  final A a; // A is abstract, so the instance is passed in.
  C1(this.a);

  A template R name(P) => a.name(P);
  // Yields the following:
  //   void foo1() => a.foo1();
  //   int foo2(int i) => a.foo2(i);

  B template R get name => 42;
  // Yields the following:
  //   int get bar => 42;
  //   int get baz => 42;
}

Enhancements

Obviously, it would be possible to generalize the templates in a thousand ways. However, keeping them minimal would allow us to get this feature with a moderate amount of work, and it would still be rather powerful.

Conversely, it would be easy to allow the parameter meta-variable in a method to be used in other syntactic locations where a list of expressions is allowed, and make it a compile-time error if there are any named parameters, or simply omitting them if the parameter meta-variable is not used as an actual argument list of a function call. This would make it possible to create a list literal containing all the parameters. If all parameters are named then it might be possible to use them to obtain a map literal. Similarly, we could allow f(x, P, y: 42) to denote an invocation of f that passes a longer list of actual arguments than the one which is denoted by the meta-variable P.

It would require a more involved approach to meta-syntax if we were to allow for more elaborate patterns. For instance, a template could specify that only certain return types match, or only certain parameter list shapes match, and that template would then be skipped for non-matching members:

abstract class I { int foo(int i); String bar(); }

class C implements I {
  template (R:int) name(P) => ...; // Matches `foo` and not `bar`.
  template R name() => ...; // Matches `bar` and not `foo`.
}

The name of a member could be made available as a symbol or as a string literal, which could be relevant for situations like the ones discussed in #251. We could use special "functions" like stringOf(name) and symbolOf(name) to obtain such strings or symbols.

It could also be useful to declare that some or all members should be generated, even if they are already implemented:

class C ... // Some useful class out there.

class LoggingC extends C {
  // Suppose we can use `override` to request generation of an implementation
  // when and only when an implementation of that member is inherited.
  override template R name(P) {
    print("Invoking ${stringOf(name)}.");
    return super.name(P);
  }
  // And similarly for getters and setters.
}
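A hand-expanded version of that idea, as a sketch only: Counter and LoggingCounter are hypothetical, the string literal stands in for stringOf(name), and the message is collected in a list instead of being printed so that the effect is easy to inspect.

```dart
final List<String> trace = [];

class Counter {
  int total = 0;
  void add(int by) {
    total += by;
  }
}

class LoggingCounter extends Counter {
  // What `override template R name(P) { ... }` would presumably yield for
  // the inherited method `add`. Returning a void expression from a void
  // function is allowed, so the template body works unchanged.
  @override
  void add(int by) {
    trace.add('Invoking add.');
    return super.add(by);
  }
}

void main() {
  final counter = LoggingCounter();
  counter.add(2);
  counter.add(3);
  print(counter.total); // 5
  print(trace); // [Invoking add., Invoking add.]
}
```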
@eernstg eernstg added the feature Proposed language feature that solves one or more problems label Oct 18, 2019
@eernstg eernstg changed the title Automatically generated methods Automatically generating missing methods Oct 21, 2019
@icatalud

icatalud commented Nov 18, 2019

++50, this is very useful, I'm enthusiastically looking forward to seeing this implemented.

If you add this feature, could you make it possible to implement generics? I want to create a library that wraps any class created by a user of the library. This could allow creating generic Proxies, for example:

class MyProxy<T> implements T {
  T forwardee;

  template R name(P) {
    ...do something here
    return forwardee.name(P);
  }
  
  template R get name {
    ...do something here
    return forwardee.name;
  }

  template void set name(P) {
    ... do something ...
    forwardee.name = P;
  }
}

As alternative keyword names I propose "implement" or "default" instead of "template". What do you think? In any case I think it would be best to list all possibilities and hear others' opinions.

Update:
An even more ambitious feature that could be "bundled in" with this one is that sometimes you do not only want a forwardee, you actually want to wrap the forwardee object itself in such a way that "this" becomes the wrapper object and the original object can only be accessed by the wrapper. From the VM perspective this would be like forwarding the pointer of the reference. For example:

class A {
  int x = 1;
  int get y => x;
}

wrapper class B implements A {
  A forwardee;

  template R name(P) {
    ...do something here
    return forwardee.name(P);
  }
}

A a = A();
B.wraps(a);

a.x;   // calls B.x, then B.x calls A.x
a.y;   // calls B.y, then B calls A.y, then A.y calls B.x and B calls a.x

This feature could be very useful to wrap libraries or frameworks. At the very least sometimes it is desired to catch every call to methods of a certain object where you have no access or control over its creation.

@eernstg
Member Author

eernstg commented Nov 18, 2019

@icatalud wrote:

could you make it possible to implement generics?

class MyProxy<T> implements T {
  ...
}

That idea is considerably more radical. In particular, the value of the type argument passed when an instance of MyProxy is created is not decidable (that's a general property of type arguments), so we'd have to generate code for MyProxy such that an instance of MyProxy<T> would implement a set of types that we can't know anything about at compile-time, because we can't know T. We get in trouble even in very simple cases (in this case we know statically that the type argument is String, but we can't detect that in general):

T f<T>() => MyProxy<T>();

void main() {
  String s = f<String>();
  print(s.substring(1));
}

We can't expect s to be able to handle the invocation of substring, because the compiler couldn't know that we would want MyProxy to implement String (which is by the way also an error in the first place). So this kind of mechanism could only work with extra restrictions on MyProxy such that it would always be guaranteed to receive statically known type arguments, and that's a completely new concept for Dart. We could do that, but this is basically stepping into the domain of a complete macro system for Dart, and that's a much bigger thing than what's intended with this proposal. Currently, this kind of need is addressed via a separate code generation step.

"implement" or "default" instead of "template"

That's certainly possible. I believe all of these choices would be rather easy to disambiguate for the parser; "default" matches the semantics quite well in the case where we use these templates when there is no implementation.

wrap the forwardee object itself

That is again a useful feature (transparently wrapping an object). It would require the runtime to support manipulations of the identity of an object (similarly to become: in Smalltalk), and that is again a very deep change.

So in both cases I can see how the extensions would be useful, but they are so radical that it would become a quite different proposal. I think I'll keep this proposal simple, such that it is obvious that it wouldn't be hard to implement, and it wouldn't have any semi-unknown performance implications.

@icatalud

icatalud commented Nov 18, 2019

Being a language mechanism, it would not involve a separate code generation step, nor any specialized tools: the code generation would be performed by the front end such that all tools (analyzer, compilers, runtime) would see the generated code as-if written manually.

I would consider then to actually generate the code underneath, maybe it's even simpler to implement, more general and more transparent on what is actually happening. In that case the class itself should have a preceding “template” or “generate” word, like:

template<C> class Cforwarder implements C {
  final C forwardee;
  Cforwarder(this.forwardee);

  template R name(P) => forwardee.name(P);
  ...

About the original solution

We can't expect s to be able to handle the invocation of substring, because the compiler couldn't know that we would want MyProxy to implement String (which is by the way also an error in the first place).

In theory currently if you implement a class:

class B {
  @override
  noSuchMethod(Invocation invocation) {
    ...
  }
}

B is a class that implements all interfaces, but the language's static analysis currently does not recognize it as such. It should be easy to create a special interface "SuperInterface" that implements all interfaces (it could also be a "SuperClass"). Implementing that interface would allow a class to pass static type analysis. As long as the class implements noSuchMethod, the class implements SuperInterface. I think that should be easy to do; correct me if I'm wrong.

If you have that superinterface then you could define the proxy as:

class MyProxy<T> implements SuperInterface {
  …
  // note that noSuchMethod must always be implemented.
}

In that case MyProxy is actually a generic proxy, the problem is that the code auto-completion and static analysis does not work as desired, because SuperInterface implements all methods.

If it is currently possible to easily do that, all that is required is to modify the static analysis a little bit to help direct to methods that are generated by template. Alternatively, noSuchMethod could not be implemented if the SuperInterface is allowed to be generic, such that:

class MyProxy<T> implements SuperInterface<T> {
  …
  // note that noSuchMethod must always be implemented.
}

And the static analysis recognizes that MyProxy<T> implements whatever SuperInterface<T> implements, which is T; any non-existing method in T is forwarded to noSuchMethod.

Essentially, SuperInterface would be for classes what noSuchMethod is for class methods: noSuchMethod refers to all methods, and SuperInterface refers to all classes.

@eernstg
Member Author

eernstg commented Nov 19, 2019

@icatalud wrote:

I would consider then to actually generate the code underneath

It would be a transformation on the intermediate representation that we call 'kernel'. That's a language which is similar to Dart, but lower level, and with explicit representation of all the implicit elements of Dart (e.g., inferred types and dynamic checks are expressed explicitly). So apart from the fact that we're working on abstract syntax trees, I'm aiming at something that you could certainly call "actually generating the code".

the class itself should have a preceding “template” or “generate” word

That's again a step further than I aimed for: This implies that one class declaration can give rise to multiple classes (like a macro that we can expand more than once). I'm proposing a simpler mechanism where a class is a class, but some classes can be partially filled in by the compiler (because we specify templates that it would use to generate all the specified methods, typically: methods in the interface that have no implementation).

B is a class that implements all interfaces

We used to have a slice of that, but it has been eliminated. It is quite costly in terms of run-time performance, and it's very unlikely that there would be support for going back to anything like that. ;-)

Concretely, we used to have support for special treatment of a class marked with @proxy: Access was granted to any member on any type with that annotation, or with a supertype that has that annotation. For instance x.foo() was OK when the type T of x was a @proxy no matter whether the interface of T had a foo method or not. But this mechanism didn't even allow that object to be stored in a variable of type S unless S was a supertype of its run-time type. So we never had a "dynamic object" that could magically be allowed to go everywhere, and we are not likely to get it. (Also, it's inherently a huge loophole in the type safety properties of Dart, and we don't want that, either).

any non-existing method in T is forwarded to noSuchMethod

Dart actually invokes every member access which is statically checked as an invocation of the specified method (there are no checks for "if this receiver doesn't have that method then call noSuchMethod"). This is good for performance, and it is possible because we generate 'noSuchMethod forwarders' such that every object implements the full interface of its type. One of the consequences is that normal (non-dynamic) method invocations can use techniques such as vtables (and variants thereof to handle "interface calls").

If we were to support a "SuperClass" then these implementation techniques could not be used, and normal method/getter/setter invocations would have significantly worse performance.

@tatumizer wrote:

is this code valid?

template R name(P) { print([P]); return forwardee.name(P); };  // sample 1

The proposal does not have all details at this point, but my intention was that the meta-variable that stands for the parameter list in the declaration is mapped to an actual argument list in the body. It could not in general be used as in [P], because that could yield things like [a1, x: x].

The ideal trade-off is "simple, but sufficient to be useful". That's a never-ending tug of war, and I'm sure we can have fun discussing whether [P] should just be allowed and cause an error if the result is "bad" in some sense, or we should prohibit it entirely because it might be bad. ;-)

Probably not. But then, why is this code valid?

template void set name(P) => forwardee.name = P;  // sample 2

Yes, that's valid. For a setter the parameter list is guaranteed to have exactly one parameter, and it is not named, so P in the body of a setter is guaranteed to expand to a single identifier. No problem.

@eernstg
Member Author

eernstg commented Nov 19, 2019

Right, I was aiming for the low-hanging fruit here. We already have the ability to generate noSuchMethod forwarders, that is, we can already get the compiler to generate all the "missing" methods for a given class, with a specific body (that sets up an appropriate Invocation and calls noSuchMethod). Allowing that machinery to be used for forwardee makes sense, but (in terms of language complexity, performance implications, implementation effort, and more) it's a small step extra to allow for some developer control over the generated method bodies. The forwarding mechanism would be yet another fixed method body generation rule, and some notion of templates would allow developers to write the method body themselves.

One could also be worried about the 1-2-infinite rule: NoSuchMethod forwarders is 1 kind of auto-generated method, forwarders makes it 2. Having 2 kinds of anything isn't great, and it won't take long before we want a third one. With templates developers can write forwarders and other things as they want, so — even though this is not a very powerful mechanism — it does allow us to write infinitely many different code generation schemes, not just a fixed number.

@icatalud

icatalud commented Nov 19, 2019

@eernstg:

Dart actually invokes every member access which is statically checked as an invocation of the specified method (there are no checks for "if this receiver doesn't have that method then call noSuchMethod").

I assume that the exception to that rule is when a method is invoked on a receiver of type dynamic:

dynamic o1 = Object();
o1.foo();    // runtime noSuchMethod exception
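A runnable variant of the same observation, as a small sketch (throwsNoSuchMethod and HasFoo are made-up names): a dynamic invocation is looked up at run time, and Object's noSuchMethod throws a NoSuchMethodError when the member is missing.

```dart
bool throwsNoSuchMethod(dynamic receiver) {
  try {
    receiver.foo(); // dynamic invocation: resolved at run time
    return false;
  } on NoSuchMethodError {
    return true;
  }
}

class HasFoo {
  void foo() {}
}

void main() {
  print(throwsNoSuchMethod(Object())); // true
  print(throwsNoSuchMethod(HasFoo())); // false
}
```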

If we were to support a "SuperClass" then these implementation techniques could not be used, and normal method/getter/setter invocations would have significantly worse performance.

So the reason the VM is not prepared to handle a SuperClass<T> is that, although SuperClass has a type, a receiver of that type would have to be treated like dynamic (checking at run time whether a certain method is implemented). That is slower, and it means that any class implementing SuperClass would be treated differently by the VM, which is undesirable because interface definitions should not go beyond static type checks (they shouldn't affect run-time behavior).

Taking into account all your considerations I think it's better to come back to discuss the original class MyClass<T> implements T.

Generic interfaces could be inferred at compile time creating separate classes for every generic parameter, which shouldn’t be very complicated (that’s the only requirement that a generic interface would need).
One class should be created per generic inferred type during static analysis. They all should extend the dynamic version of the generic interface which has whichever methods were defined in the class + the generated object interface by this template feature (depending on whether this feature decides to consider Object class methods as part of the interface or not).

I don’t know how generics are currently implemented in Dart, but it could have been done by creating different classes for every generic parameter inferred during static analysis instead of storing the generic type data as variable and compare for types (I’m assuming that is how it is implemented). If generic classes were implemented as different classes, generic interface would be a trivial extension of what already exists. Implementing classes with generics as different classes should definitely take more memory, but it would have to be measured what percentage of the VM base memory is actually taken by the class definitions (maybe it’s low enough to consider it insignificant).

Segmenting the issue

I think there are two issues that can be separated:

  • Template function generation
  • Generic interfaces

I believe the two are closely related; more precisely, each makes the other considerably more useful. Currently, generic interfaces would only be useful with the mirrors (reflection) package or for creating mocks, and in both cases noSuchMethod would need to be implemented. On their own, generic interfaces are an odd feature without clear semantics, because they force the implementation of noSuchMethod. With the addition of template functions, however, generic interfaces become very intuitive: they would almost exclusively be created using the proposed function templates.

I’m always rooting for this feature to be released with a generic interface approach, although I agree it’s a different problem which can be developed separately (but hopefully coordinated). This is a very interesting feature, I think many Java developers have dreamed for decades for something like this. It essentially allows creating libraries that have never been achievable at the grammar layer of static typed languages. It opens a wide variety of possibilities for the creation of very useful libraries that require class wrappers like Mocks, Profilers or any alternative side effect that someone can come up with. Libraries can only be used through predefined APIs, but this could allow the user to define the API and have the library “inject” behavior, allowing a flip in the way libraries are designed. In mockito for example the users could create a mock like A mock = Mock<A>() without the extra verbosity of having to create a special class MockA (which is more annoying than one might think when some of the Object class default methods have been overridden with optional parameters).

Essentially it would make Dart a language with a versatility close to what dynamically typed languages provide, but with the convenience of static typing. Features like this are ground-breaking, and they are the ones that attract people to a language (something different and very useful).

Another approach (plain code generation)

After taking into account your considerations I came to the conclusion that plain code generation is more general and it works on a different layer without incorporating complexity to the language syntax itself, which is probably preferable.

Default code generation should be supported and come built in with the dart engine, because it is just too convenient (the explanation above applies). It is possible to incorporate code generation with minimal semantic complexity (leave the language almost intact).

By allowing general code generation, the code generation is not constrained to a certain predefined template in the language, it allows developers to generate static typed code with arbitrary templates. It is the code generators responsibility to verify the inputs and create valid consistent classes. Good useful generators would naturally be reliable and become widely used. The way I think this could be done is by using the word external or generate external, which currently already exists for accessing the VM base classes. For example:

import 'package:generate/generate.dart' as gen;

class A { foo() {} bar() {} }

external class ProxyA extends gen.Proxy<A> implements A = gen.proxy(A, onMethodInvocation: (String methodName) => print('$methodName invoked') );

A a = ProxyA(A());
a.foo(); // prints 'foo invoked'

external class WrapperA extends A = gen.wrap(A, onMethodInvocation: (String methodName) => print('$methodName invoked') );

a = WrapperA();
a.foo();   // prints 'foo invoked'

Where gen.proxy and gen.wrap are methods that return a String. In theory any function that returns a string could be used as an external generator, so it’s probably better for the generating library to specify that a certain function is a class generator. The analyzer should detect external classes first and append the class string to the file (maybe creating a hidden generated part of the file). This would provide both generic interfaces and function templates (although it would still be necessary to define a class for mockito, for example).

library generate;

generator proxy(Type type, onMethodInvocation: Function(String) ) {}  // What is the type of a class?
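To illustrate the "function that returns a String" idea in plain Dart that runs today, here is a rough sketch. proxySource is a hypothetical helper, not part of any existing package; it takes the class name and method names as strings because no class-description API is assumed, and it only handles parameterless methods.

```dart
// Builds the source text of a proxy class that logs and forwards each call.
String proxySource(String className, List<String> methodNames) {
  final buffer = StringBuffer()
    ..writeln('class Proxy$className implements $className {')
    ..writeln('  final $className forwardee;')
    ..writeln('  Proxy$className(this.forwardee);');
  for (final m in methodNames) {
    buffer
      ..writeln('  @override')
      ..writeln("  dynamic $m() { print('$m invoked'); return forwardee.$m(); }");
  }
  buffer.write('}');
  return buffer.toString();
}

void main() {
  // Prints the source of a ProxyA class with logging `foo` and `bar` methods.
  print(proxySource('A', ['foo', 'bar']));
}
```

The generated text would still have to be checked by the analyzer like any other source, which is where errors in a generator would surface.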

factory would be quite suitable for this kind of code generation, since it’s actually a “class factory”. It should be possible to use generate as a function that returns a string. That kind of solution would generalize the class code generation, it does not require reflection of any kind if some kind of class package exists, like the one proposed in this issue.

With this approach the template proposition of this issue is direct to address and common patterns like that one could come bundled with sdk generators.

The main disadvantage of this solution is that generators could give bad String inputs which are not detectable prior to running the code, but the same can be said about using an external library, it can always fail during run-time because there are bugs.

@eernstg
Member Author

eernstg commented Nov 20, 2019

@tatumizer wrote:

it's OK to have 2, 3 and more ways to do the same things
assuming that some of them are specialized versions of others

I didn't sign any documents on the philosophy, so I'll respond more concretely, without promising to have a complete underlying conceptual framework. ;-)

However, I believe the specialized versions of features that already exist were syntactic shortcuts. For instance, void foo(T x) => x.someMethodWithASideEffect(); is more concise than the form using {...}, and we have it just for that reason. It breaks the rule that function return statements in a void function shouldn't return the value of an expression with a non-void static type, and we decided that this wasn't so bad here, because the return type void is right there on the screen, because => is used for small functions. If you do write a 50 line function with => you're probably violating the style. All the null-aware constructs are similar: You could always expand x?.foo() by writing an explicit null test. I believe that those specialized versions are very much about brevity.

With the templates proposed here we are considering more or less powerful variants. Having just forwarding seems (to me) to be too restricted: We can have much more power for a small investment.

Having a semantics where a single class C ... {...} gives rise to the creation of an unlimited number of classes is far beyond my original proposal (and it certainly has a number of implications that we'd need to study and handle carefully), and I think that ought to be discussed separately.

In that sense it all comes down to 'low-hanging fruit'.

1) how to use P? 2) how to use R?

Right, that is a crucial point. I do not propose that P or R should have a type. They are purely syntactic devices, and they'd be used to control the code generation step, and the resulting code will be type checked just like it would have been type checked if it had been written by hand. But this means that P denotes a comma-separated list of terms, each of which is the name of a formal positional parameter, p, or n: n when the parameter is named, in the declaration order of the method which is currently being generated, which means that it is created according to a member signature of the interface of the enclosing class. (If we have MyInterface template .. then it is also a member of MyInterface, but the general rule is that only members of the interface of the enclosing classes are considered, and MyInterface template .. is only used to select a subset thereof).

P can certainly be passed as an actual argument (so we can have foo(P) and bar(this, P) and possibly baz(42, P, 43), which will be an error whenever P ends in one or more named arguments).

R denotes a type and it can be used in certain positions where a <type> can occur.

All issues concerned with the semantics or static analysis of a template are irrelevant: They are used to guide code generation, and only the resulting member declarations are subject to static analysis and have a semantics.

This is a reasonable approach because this mechanism isn't an abstraction over classes: For each class C (or generic class, e.g., C<X extends B>) with a template, the compiler will fill in the details and make it a regular class, possibly generic. As opposed to, say, C++ templates, this mechanism does not support the creation of new classes by some notion of template instantiation, and this means that we don't have the problem where a templated declaration is buggy, but the bugs only emerge when the template is instantiated somewhere else.
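To make the roles of P and R concrete, here is a hand-written sketch of what a class containing `template R m(P) => forwardee.m(P);` might expand to. The `Shape`, `Square` and `ForwardingShape` names are made up for this illustration, and the expansion is only my reading of the description above, not output from any tool.

```dart
abstract class Shape {
  double area();
  String describe(String prefix, {required int precision});
}

class Square implements Shape {
  final double side;
  Square(this.side);

  @override
  double area() => side * side;

  @override
  String describe(String prefix, {required int precision}) =>
      '$prefix ${area().toStringAsFixed(precision)}';
}

// Hand-written stand-in for the generated members. R becomes the declared
// return type; P becomes the parameter names in declaration order, with
// `n: n` for each named parameter.
class ForwardingShape implements Shape {
  final Shape forwardee;
  ForwardingShape(this.forwardee);

  // R = double, P = (empty)
  @override
  double area() => forwardee.area();

  // R = String, P = prefix, precision: precision
  @override
  String describe(String prefix, {required int precision}) =>
      forwardee.describe(prefix, precision: precision);
}

void main() {
  final s = ForwardingShape(Square(2));
  print(s.describe('area:', precision: 1)); // area: 4.0
}
```

Only the expanded members are type checked, so an error such as a missing named parameter on the forwardee would be reported against the generated `describe`, not against the template.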

@icatalud

@tatumizer, @eernstg:

I can see three propositions over the discussion. From simplest to more general it would be:

  1. Direct forwarding support: final C forwardee provides /* or even "implements" */ C;.
  2. Template forwarding support (allow creating methods)
  3. Class constructor from a string

I believe both 1 and 2 would be far more useful if generic interfaces also existed.

The advantage of 1 is that it is simple and clear. It directly solves the forwarding problem and it doesn’t add strange complexity to the language (just one keyword).

The advantage of 2 is that it is more general: in theory it allows creating more complex things. However, this more complex method generation would require introducing and understanding a sub-grammar inside the language. If the sub-grammar is left just as originally proposed, there is not much else that can be done besides invoking a fixed method beforehand or invoking methods conditionally by checking the name of the method (a non-general use case).
I think almost all of what can be achieved with this solution would be achievable with number 1 (forwardee) if it were generalized so that it was possible to forward within method calls by implementing noSuchMethod, as proposed in #418. Template functions are semantically more explicit and transparent; doing the logic inside noSuchMethod would allow using the whole Dart language (it doesn’t introduce a sub-grammar), but it is less efficient. If this solution were to be implemented, it would have to be decided which is more appropriate (either this or #418 ‘Delegate’). I lean towards this one, but I would also add a 'generate' keyword to denote that it is a class that has to be generated at compile time. This distinction immediately makes generic interfaces possible. generate class MyGeneratedClass<T> implements T { template... }
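For comparison, this is roughly what the noSuchMethod route looks like in Dart as it exists today. It is only a sketch: `Greeter`, `RealGreeter` and `LoggingGreeter` are invented names, and without reflection the dispatch on the member name has to be written by hand, which is part of the cost mentioned above.

```dart
abstract class Greeter {
  String greet(String name);
  int get greetings;
}

class RealGreeter implements Greeter {
  @override
  int greetings = 0;

  @override
  String greet(String name) {
    greetings++;
    return 'Hello, $name';
  }
}

// Declaring noSuchMethod lets this concrete class leave the Greeter members
// unimplemented; calls to them are routed to noSuchMethod at run time.
class LoggingGreeter implements Greeter {
  final Greeter _target;
  final List<Symbol> calls = [];
  LoggingGreeter(this._target);

  @override
  dynamic noSuchMethod(Invocation invocation) {
    calls.add(invocation.memberName); // the shared "decoration" step
    if (invocation.memberName == #greet) {
      return Function.apply(_target.greet, invocation.positionalArguments,
          invocation.namedArguments);
    }
    if (invocation.memberName == #greetings) return _target.greetings;
    return super.noSuchMethod(invocation);
  }
}

void main() {
  final g = LoggingGreeter(RealGreeter());
  print(g.greet('Dart')); // Hello, Dart
  print(g.greetings); // 1
  print(g.calls.length); // 2
}
```

Every call pays for building an Invocation and for the dynamic dispatch, and a misspelled symbol is only discovered at run time, whereas generated forwarders would be ordinary statically checked methods.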

Number 3 is interesting because it generalizes static class generation. It has unlimited applications. Like option 1, it has minimal interference with the grammar. Essentially it says: there is a class MyClass that has this interface and it comes from an external source, so don’t ask, just assume this interface is available. The one strange thing that could occur is that it would be possible to access variables hidden from the interface by using the object through a dynamic variable, which would otherwise be impossible to access.
With this kind of generation, proxy generation could come bundled with the SDK, as it is the simplest and most direct use case.
But a number of other very useful use cases come to mind: creating Models, Mocks, Profilers, REST APIs, etc. There are plenty of code generation libraries for Models and APIs, because currently that is the only path available to generalize when there are static type constraints.
On the other hand, I think the idea of the language having some kind of sub-grammar for statically generating classes is better if it is complete. Do you think it's possible that this templating grammar would allow creating any possible generic class that could be created with a plain string? (Assume the plain string is only generating the inside of the methods, because that is the interface available to the consumer.) I don't see why not, if it is possible to conditionally check all the details of the parameters, like names, types, etc., which is the same information a string generator would have available.

What are your opinions and preferences of each of them? They are not exclusive, but they can solve the commonly required Proxy-Forwarding pattern.

@eernstg
Member Author

eernstg commented Nov 20, 2019

@tatumizer wrote:

forwardee.m(42, P). Turns out, you can't do it because P, as a
whole, was supposed to match the signature required by the
interface, which includes ALL parameters,

That's not quite true: In the body of a template class C, P expands to an actual argument list containing the formal parameters of the current method in order, based on the interface of C. There is nothing in my proposal that requires the forwardee to have any particular interface, so if P expands to a, b, c: c and forwardee.m(42, a, b, c: c) is type correct then forwardee.m(42, P) will just work. This requires the interfaces of C and of the forwardee to be in sync, but you can turn that around and say that it ensures by a static check that they are in sync, and then you avoid writing all those methods that do the "forwarding with extra argument".

Forwarding with an extra argument is a well-known technique to encode true delegation (the extra argument, self, is the receiver for "self sends", that is, method invocations on the logical object which is the delegation network).
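A small sketch of forwarding with an extra `self` argument, written out by hand. The classes are invented for this illustration; `CounterFacade` stands in for what templates such as `template R m(P) => forwardee.m(this, P);` might generate. The point is that a "self send" inside the forwardee goes back to the facade, so an override on the facade side is respected.

```dart
abstract class Counter {
  int increment(int by);
  int get value;
}

/// The forwardee takes the original receiver as an extra first argument.
class CounterImpl {
  int _value = 0;

  int increment(Counter self, int by) {
    _value += by;
    return self.value; // self send: resolved on the facade, not here
  }

  int value(Counter self) => _value;
}

// Hand-written stand-in for the generated forwarders.
class CounterFacade implements Counter {
  final CounterImpl forwardee;
  CounterFacade(this.forwardee);

  @override
  int increment(int by) => forwardee.increment(this, by);

  @override
  int get value => forwardee.value(this);
}

/// A facade that overrides `value`; the self send above picks this up.
class DoublingFacade extends CounterFacade {
  DoublingFacade(CounterImpl forwardee) : super(forwardee);

  @override
  int get value => super.value * 2;
}

void main() {
  print(CounterFacade(CounterImpl()).increment(3)); // 3
  print(DoublingFacade(CounterImpl()).increment(3)); // 6
}
```

With plain forwarding (no `self`), `CounterImpl.increment` could only read its own state and the second call would also print 3.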

Also, you can write mixins in order to have a specific template for a selected set of methods and then create a combination by mixing in different mixins, and you can use SomeInterface template similarly. For example, you could forward all I methods to forwardeeForI and all the J methods to forwardeeForJ. So it's not true that we can't express anything other than simple forwarding.

(@icatalud, I will respond, but I have to run now. ;-)

@icatalud

icatalud commented Nov 23, 2019

@tatumizer

I think we need examples from real-life code - e.g. by demonstrating that some (Flutter?) program would benefit from the template mechanism, but the same cannot be achieved by "simple forwarding".

People would definitely find some use cases for this feature. By default it serves to decorate functions (all with the same method) and it could be used to “decorate” without forwarding, just plainly extending the class. Something that comes to my mind is a “switch”, where the forwarding is done only if the switch is on, or it could have two forwarders and switch from one to the other. It could be used to make a “concentrator”, a class that forwards to multiple objects (the concentrator has the same interface), etc.
Another use case can be Mocks, where they could be configured to let certain functions pass and others not.
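As a rough illustration of the “switch” and “concentrator” ideas, here they are written by hand for a tiny interface. All names (`Recorder`, `ListRecorder`, `SwitchedRecorder`, `Concentrator`) are hypothetical. Each class repeats the same body shape for every member, which is the repetition a template would be meant to remove.

```dart
abstract class Recorder {
  void add(String entry);
  void clear();
}

class ListRecorder implements Recorder {
  final List<String> entries = [];

  @override
  void add(String entry) => entries.add(entry);

  @override
  void clear() => entries.clear();
}

/// "Switch": forwards only while [enabled] is true.
class SwitchedRecorder implements Recorder {
  final Recorder forwardee;
  bool enabled = true;
  SwitchedRecorder(this.forwardee);

  @override
  void add(String entry) {
    if (enabled) forwardee.add(entry);
  }

  @override
  void clear() {
    if (enabled) forwardee.clear();
  }
}

/// "Concentrator": forwards every call to all targets.
class Concentrator implements Recorder {
  final List<Recorder> targets;
  Concentrator(this.targets);

  @override
  void add(String entry) {
    for (final t in targets) t.add(entry);
  }

  @override
  void clear() {
    for (final t in targets) t.clear();
  }
}

void main() {
  final a = ListRecorder();
  final b = ListRecorder();
  final gate = SwitchedRecorder(b);
  final all = Concentrator([a, gate]);
  all.add('one');
  gate.enabled = false;
  all.add('two');
  print(a.entries); // [one, two]
  print(b.entries); // [one]
}
```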

@eernstg:

As an exercise in extending the template to see what could be done, an “if” would be needed by default:

template R name(P) {
  in case P.len == 0:
    return myMethod0();
  in case P.len == 1:
    return myMethod1(P[0]);

  for (int i = 0; i < P.len; i++) {
    in case P[i] == 'arg':     /// this cannot be valid, i cannot be used
  }
}

It would be something like the C++ #define. But limitations quickly appear: one may want to define the function by looping over all the parameters of P, or to split a long definition into sub-functions (define inside define), and Dart variables can be confused with the P variables. Generality becomes complex in this templating language.

This leads me to believe that having a class generator might be the best option, because it introduces minimal extra syntax and leaves the complexity of generating classes on a second plane. A more organized class generator in “static terms” than a function that returns a plain string could be a special ClassGenerator class.

class ClassFactory {
  static Class<T> createClass<T>(ClassGenerator generator);
}

class ProxyGenerator<T> extends ClassGenerator { … }

external class Proxy<T> = ClassFactory.createClass<T>(ProxyGenerator<T>());

Note that Proxy has a generic interface. The interface of ClassGenerator has to be defined, but it could be either that or a String; both achieve the same.

@eernstg
Member Author

eernstg commented Nov 25, 2019

@icatalud, I've been thinking about the possible extensions of my original proposal in the direction of supporting class abstractions. As I mentioned, it is a delicate balance, and I certainly don't think we want to have anything that works like a traditional (unhygienic) macro, but I think we could find a sweet spot where a mixin can be used to abstract over templates. It's used in the example below.

@tatumizer wrote:

Counter-argument: then the most likely place to pass self would
be: in a constructor of forwardee class, as opposed to doing it in
each individual method :-)

That won't quite work. If you are emulating delegation in a language where ordinary (static, class based) inheritance is already available, you surely want to model it in a way that offers something new. In particular, it shouldn't be taken for granted that a delegation graph is immutable, and it shouldn't be taken for granted that there is only one entry point (that is, only one "facade object" which has the desired interface and represents an object identity for the delegation network).

Here is an example where I've spelled out the emulation of a delegation mechanism. The example delegation network (created in main) has two entry points and one shared object. Objects that are shared by multiple entry points can be used for many purposes, but a typical one would be to model "semi static" state, in the sense that some objects share this state, almost as if it were static members of their class, but it is determined by the concrete networks exactly which objects are sharing this state.

The following uses getters in order to illustrate a difficulty that I noticed: When we wish to add an argument in a forwarding invocation we will need to change a getter to a method, and a setter is changed to a method taking two arguments. This creates a name clash, and that motivates some kind of identifier computation mechanism. I'm using m##Set below in order to allow the current member name (denoted by the metavariable m) to be extended with the suffix 'Set'; this means that the generated method body for the name getter can call a name method, and the generated method body for the name setter can call a nameSet method.

I'm using a template mixin to enable the expression of delegating methods once and for all (and then any class that wants to be a delegator to T would use with Delegator<T>, and any class that wants to be a delegatee to T uses with Delegatee<T>).

If we allow that kind of mixin then it allows us to avoid writing a near-identical copy of the template methods for forwarding and for delegation for each class which is supposed to be a delegator/delegatee, but it does raise a number of questions about software engineering (e.g., maintainability and readability).

If we don't want to do that then we just need to write the template methods into the classes that currently have the with clause.

In any case, note that the templates will automatically give rise to generation of any number of method implementations for otherwise unimplemented methods (so when a new method is added to any relevant interface it will automatically be implemented in each delegator/delegatee class). For instance, DelegatorPerson will only need to specify an implementation for the methods that it actually implements, and the rest are generated automatically without mentioning their names.

abstract class Person {
  String name;
  String address;
}

template mixin Delegator<X> {
  X get delegatee;
  template R get m => delegatee.m(this);
  template set m(P) => delegatee.m##Set(this, P);
}

class DelegatorPerson with Delegator<DelegateePerson> implements Person {
  DelegateePerson delegatee;
  DelegatorPerson(this.delegatee);
  toString() => "$name, $address";
}

abstract class DelegateePerson {
  DelegateePerson next;
  DelegateePerson(this.next);
  String name(Person self);
  void nameSet(Person self, String value);
  String address(Person self);
  void addressSet(Person self, String value);
}

template mixin Delegatee<X> implements X {
  X get next;
  template R m(P) =>
      next != null ? next.m(P) : throw "'stringOf(m)' unimplemented";
}

class DelegateeAddress with Delegatee<DelegateePerson> {
  String _address;
  DelegateeAddress(DelegateePerson next, this._address): super(next);
  String address(self) => _address;
  void addressSet(self, value) => _address = value;
}

class DelegateeName with Delegatee<DelegateePerson> {
  String _name;
  DelegateeName(DelegateePerson next, this._name): super(next);
  String name(self) => _name;
  void nameSet(self, value) => _name = value;
}

void main() {
  var house = DelegateeAddress(null, "5th Avenue");
  var adamName = DelegateeName(house, "Adam");
  var eveName = DelegateeName(house, "Eve");
  var adam = DelegatorPerson(adamName);
  var eve = DelegatorPerson(eveName);
  print("$eve & $adam."); // 'Eve, 5th Avenue & Adam, 5th Avenue.'
  eve.address = "Broadway";
  print("$eve & $adam."); // 'Eve, Broadway & Adam, Broadway.'
}

The intended semantics is illustrated by a desugared version of the above example here.

The reason why it won't work to associate each delegatee with the original receiver (by storing self in a field) is that the delegatee that handles the address is shared by both adam and eve, so it can't know which original receiver to choose for each method invocation. A similar problem arises if a delegation graph is modified.
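For readers without access to the linked desugared version, the following is my own hand-written approximation in current null-safe Dart, not the linked code. One assumption is made loudly: the members that the Delegatee template would generate are placed in the `DelegateePerson` base class (with a private helper `_next`, a name invented here) instead of in a mixin, so that the example compiles today.

```dart
abstract class Person {
  String get name;
  set name(String value);
  String get address;
  set address(String value);
}

// Assumption: the "pass it along or fail" members that the Delegatee
// template would generate live in this base class.
abstract class DelegateePerson {
  final DelegateePerson? next;
  DelegateePerson(this.next);

  DelegateePerson _next(String member) =>
      next ?? (throw UnsupportedError("'$member' unimplemented"));

  String name(Person self) => _next('name').name(self);
  void nameSet(Person self, String value) =>
      _next('nameSet').nameSet(self, value);
  String address(Person self) => _next('address').address(self);
  void addressSet(Person self, String value) =>
      _next('addressSet').addressSet(self, value);
}

class DelegateeAddress extends DelegateePerson {
  String _address;
  DelegateeAddress(DelegateePerson? next, this._address) : super(next);
  @override
  String address(Person self) => _address;
  @override
  void addressSet(Person self, String value) => _address = value;
}

class DelegateeName extends DelegateePerson {
  String _name;
  DelegateeName(DelegateePerson? next, this._name) : super(next);
  @override
  String name(Person self) => _name;
  @override
  void nameSet(Person self, String value) => _name = value;
}

// Stand-in for the members generated by the Delegator template.
class DelegatorPerson implements Person {
  DelegateePerson delegatee;
  DelegatorPerson(this.delegatee);
  @override
  String get name => delegatee.name(this);
  @override
  set name(String value) => delegatee.nameSet(this, value);
  @override
  String get address => delegatee.address(this);
  @override
  set address(String value) => delegatee.addressSet(this, value);
  @override
  String toString() => '$name, $address';
}

void main() {
  final house = DelegateeAddress(null, '5th Avenue');
  final adam = DelegatorPerson(DelegateeName(house, 'Adam'));
  final eve = DelegatorPerson(DelegateeName(house, 'Eve'));
  print('$eve & $adam.'); // Eve, 5th Avenue & Adam, 5th Avenue.
  eve.address = 'Broadway';
  print('$eve & $adam.'); // Eve, Broadway & Adam, Broadway.
}
```

The shared `house` object receives `self` on every call, which is why it needs no stored reference to either `adam` or `eve`.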

@adjentz

adjentz commented Dec 12, 2019

Want to +1 this issue, but also suggest widening its scope to support a full macro system.

Metaprogramming has been with Dart forever (transformers, build runners, etc.), but it has always required an extra build step and configuration from the user that wasn't always transparent.

The goal with macros should be to (statically)

  1. evaluate/crawl dart AST in order to..
  2. emit real dart code.

This should happen at compile time (no user interaction or configuration needed).
However, emitted functions/members should also be tool discoverable (perhaps by doing an initial pass of macro results and caching the results to make available to the analyzer).

For example, you might have a macro
macro JsonSerializable<T>(T instance) { // evaluate instance AST and emit valid dart code } (example syntax, not actually being suggested)

which might (transparently to the user) emit something like:

extension JsonMethods<T> on T {
   String toJson() {
       // impl filled by macro
    }
}

and then in user-code

Point p = Point(3, 7);
print(p.toJson()) // toJson available/suggested in intelliSense. 
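To show the kind of code such a macro might emit for one concrete type, here is a hand-written equivalent that runs in Dart today. The `Point` class and the `PointJson` extension are assumptions for this sketch; a real macro would derive the field list from the AST instead of having it typed out.

```dart
import 'dart:convert';

class Point {
  final int x;
  final int y;
  const Point(this.x, this.y);
}

// Stand-in for what a hypothetical JsonSerializable macro might emit.
extension PointJson on Point {
  Map<String, Object?> toMap() => {'x': x, 'y': y};

  String toJson() => jsonEncode(toMap());
}

void main() {
  final p = Point(3, 7);
  print(p.toJson()); // {"x":3,"y":7}
}
```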

Possible use-cases:

  • static/type-safe JSON (de)serialization
  • Dependency Injection
  • Equatability checking
  • Compile-time reflection (get member names/types/metadata)
  • Immutability (e.g. Object.freeze)
  • Generating HTML
  • Generating SQL queries

Reading/inspiration from other languages: nim, rust

@eernstg
Member Author

eernstg commented Dec 20, 2019

@adjentz wrote:

support a full macro system

@tatumizer wrote:

a single "template" statement can be applied to everything

The topic here is the generality of the code generation mechanism. We obviously have a whole spectrum of expressive power to explore, and the initial sketch of a proposal here is absolutely minimal.

A very expressive model could be achieved by relying on kernel transformations (Dart is translated to an intermediate representation where all implicit elements have been made explicit, e.g., all inferred types are now specified explicitly, generic function instantiation has an explicit syntax, etc). A kernel transformation could be given access to static information (in particular: types), and it could use arbitrary computations (in Dart, presumably) to manipulate explicit representations of kernel code. This mechanism could transform any program to any program, and it may or may not have a closed world perspective. Hence, this would offer a maximal amount of expressive power. Conversely, it's a very delicate matter to transform kernel code, because the implementations of the whole Dart tool chain may rely on a set of invariants, and all sorts of breakage could occur if they can be violated.

In contrast, my initial proposal here is minimal: There is no access to static information, only the syntax is available, and the template mechanism only allows for expressing a small set of variants of the code. The only mechanism available is to introduce meta-variables (each of which stands for a snippet of code), and use meta-variables. For instance template R get g => other.g; introduces R and g in the member signature R get g, and it uses g in the body. A list of formal parameter declarations can be introduced as P in template R m(P), and usages of P would be a transformation of the syntax matched by P: The formal parameter declarations are transformed into an actual parameter list.

We could easily walk back and forth on part of this spectrum of expressive power. For instance, we could introduce metavariables explicitly (say, using #<identifier>#), such that template int get #g# would introduce a meta-variable g for the name of the getter, and the template would only match getters with return type int (because we didn't say #int#, or, more reasonably #R#). That is, we could specify parts of the member signature and thus require that the template is used if and only if the member under consideration matches the given pattern.

Other parts of the spectrum of expressive power are not so easily within reach: It is well known that the ability to compute code in a type safe manner is a complex enterprise (e.g., check out MetaOCaml). So we're probably not going to define mechanisms for type checking templates before they are expanded. Similarly, it's probably completely out of reach for a kernel transformation mechanism to ensure that every kernel transformation will only generate code that doesn't have compile-time errors. This may be OK for a mechanism where the template is located in the same context where the generated code lives, but if it amounts to a non-hygienic macro mechanism then it's not likely (in my opinion) to be robust enough for serious software development.

So there is indeed a very important debate to be taken about how expressive the code generation mechanism should be. I started out with a minimal idea, but it's quite likely that we'd want something which is a bit more powerful.

@eernstg
Member Author

eernstg commented Dec 20, 2019

I think the ability of a template to abstract over methods, getters, and setters may be useful in order to avoid duplicating code.

But it would get harder to write the body of such templates. You mention that we could allow for calling a getter using method call syntax, but I'm not convinced that this feature would be particularly useful (for anything else), and it would still require us to have a richer template language (such that we could express that a single template for a member m should call something.m(P) when the template matches a method, but it should call something.getterFor##m() when the template matches a getter).

Unification is always a noble goal (because it's quite likely that a single mechanism that handles several jobs will automatically ensure greater consistency, greater orthogonality, and thus greater expressive power), but in this case I think the unification itself will be so delicate that it may not be worth the trouble.

Another thing to keep in mind is that the template language is (probably, as I see it) more likely to get more expressive rather than less expressive, and the ability to special case methods/getters/setters is just a tiny bit of pattern matching.

@icatalud

This proposal is a complete solution to the template function for generic interfaces using an approach that uses regular Dart code.

Section (i) is only relevant to understand the optimizations, the proposal starts at (ii).

i. Techniques to rewrite code

If a computer program depends only on constant values it will always produce the same output. By applying this principle to every section of the source, it is possible to improve it by:

  1. Values: The output of any code path that depends on processing only constant values can be computed as a constant.
  2. Branching: Conditional statements that depend on constant expressions can be resolved to the valid path.
  3. Roots: Code paths whose result always points to the same field on a certain data structure can be simplified to the field.
  4. Iteration: Iterating a constant range can be inline-expanded (replace the loop with repeated code).
  5. Recursion: Function calls that have constant recursive paths can be inlined.
final A a1 = A();
final A a2 = A(a: a1);     // constructor: const A({this.a});

var x;

foo() {
  var r = 0;
  // 1. Values
  for(var i=0; i<3; i++) r += i;     // var r = 3;
  // 2. Branching
  if(r<1) ...    // eliminated
  else// 3. Roots
  a2.a;            //   a1;
  // 4. Iteration
  for(var i=0; i<2; i++)  x += 1;    //   x += 1; x += 1;
  // 5. Recursion
  bar(2, x);        // baz(x); baz(x);
}

bar(int i, int v) {
  if(i<1) return;
  baz(v); bar(i-1, v);
}
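A runnable sketch of the same idea, comparing an "original" function with a hand-rewritten one. This is only an illustration under my own assumptions: `baz` is given a body that records its argument so the two versions can be compared, and the rewrite is done manually, not by any compiler pass.

```dart
var x = 0;
final calls = <int>[];

// Records each call so both versions can be compared.
void baz(int v) => calls.add(v);

void bar(int i, int v) {
  if (i < 1) return;
  baz(v);
  bar(i - 1, v);
}

/// Original form: loop bounds, branch condition and recursion depth all
/// depend only on constants.
int original() {
  var r = 0;
  for (var i = 0; i < 3; i++) r += i; // 1. Values
  if (r < 1) r = -1; // 2. Branching (never taken)
  for (var i = 0; i < 2; i++) x += 1; // 4. Iteration
  bar(2, x); // 5. Recursion
  return r;
}

/// Hand-rewritten form after applying the techniques.
int rewritten() {
  x += 1;
  x += 1;
  baz(x);
  baz(x);
  return 3;
}

void main() {
  x = 0;
  calls.clear();
  final r1 = original();
  final x1 = x;
  final c1 = List.of(calls);

  x = 0;
  calls.clear();
  final r2 = rewritten();

  // Same result, same final state, same sequence of baz calls.
  print(r1 == r2 && x1 == x && c1.toString() == calls.toString()); // true
}
```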

ii. Prerequisite: Source inspection and generic method invocation (#701)

A keyword that can be used on anything that can be referenced from within the code that translates into an object that allows retrieving information related to their typed definition. These objects should also provide methods inspect(x).invoke('foo')==x.foo() and inspect(x).value==x.

iii. Proposal: Macro type <M>

A macro type <M> represents a group of types. The return types of functions can be groups of types that depend on macros received as parameters, but they cannot return self-generated macro types.

var x = 0;

main() {
  <M> m = foo(#baz, 'baz'); 
  m = foo(m, 'baz');           // invalid
  <N> n = foo(m, 'baz');
}

<List<<M>>, T, int> foo<<M>, T>(<M> a, T b) {
  if(x>0) return b;
  if(x==0) return [a];
  return 0;
}

<<M>> bar(int x) {}       // invalid

Code optimization of macro types

Access to fields on macro objects that are immutable resolves to their root values by (i.3). If the origin of an immutable macro object is static, then by (i.2), unnecessary type checks on the roots are resolved. By (i.4) iterations that depend on the length of these data structures can be inline expanded. Collapsing types by using macro vars can be equally efficient as typing them separately.

iv. Generic implementations (template functions)

Basic implementation

This generic implementation is compatible with any method:

class Foo<T> implements T {
  <<M>> genericInvocation<<M>>(InspectMethod<<N>> method, List<<P>> args, List<<Q>> optionalArgs, Map<String, <R>> namedArgs) { ... }
}

Special implementation

  <M> <method>(...args, [...optionalArgs], ...namedArgs) { ... }

Having the inspect type variable in the spot of the method definition reflects that the var can only be InspectMethod constants from the interface T. The notation represents that the lists and map args cannot be arbitrary like the parameters in a normal method, they always wrap roots with a static origin.

A. Optimization

Applying the techniques defined in (i).

Code paths that only use constant values (notably values of Inspect types):

  • Computations resolve to constants.
  • Conditional statements resolve to the valid path.

Paths that manipulate roots:

  • They must resolve either to a constant, a root or an object that wraps roots. These objects always have the same structure because they are generated from constant values. Any access to a leaf field in these data structures resolves to a constant or one of the roots.
  • Because the root values are not used, iteration ranges must be constant. They are inline expanded.

Paths that depend on root values:

  • Unnecessary type checks are resolved.
  • Function calls where their recursive paths do not depend on roots are inlined.

The rewritten instructions will either:

a. Perform a root invocation (create a root)
b. Iterate according to a value computed from the roots
c. Compute a local value initialized from a root or inside an iteration
d. Branch conditionally to a root or a computed value
e. Invoke a function that has a recursive nature

In order for another set of instructions to have identical behavior:

a. If it is assumed that root invocations can have unknown side effects, the same root invocations must be performed, in the same order.
b. If it were iterated on a different range, then the computations inside the iterations would not produce the same results.
c. If it is assumed that all computations are necessary and performed optimally, then instructions that are computing a value cannot be rewritten into more efficient ones.
d. Not branching on the same conditions does not produce the same output for every possible input.
e. A recursive path can only be emulated with another recursive path.

Changing instructions from the rewritten code would produce either a different behavior or a less efficient function.

Conclusion

Generic implementations have access to the same static data that any code generator has available. All function behaviors that are created with generated code can be emulated with generics.

By applying the code rewriting techniques it was proven that generic implementations are as efficient as any generated code could be.

Bonus feature

Collapsing function parameters: The macro method arguments definition from (iv) could be generalized to any method, making it possible to collapse arguments using foo(...List<<M>> args).

@eernstg
Member Author

eernstg commented Jan 7, 2020

@tatumizer wrote:

Consider <targets>? template R name(P) ...
It says: for each method 'name' apply this template ...

Right, I can see that we would most likely be able to unify the templates. In a template instantiation where name matches a getter T get g, the occurrence of forwardee.name(P) would be transformed into forwardee.g; and in an instantiation where name matches void m(T1 t1, T2 t2), it would be transformed into forwardee.m(t1, t2), and I already assumed that something similar would occur for operators. Based on the general rule that unification brings additional generality and expressive power, it could be a good idea. It could also be convenient if it would allow us to write some templates which are more concise.
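A hand-written sketch of how one unified template, `template R name(P) => forwardee.name(P);`, might expand for different member kinds. `Vec` and `VecProxy` are invented for this illustration, and the expansions follow my reading of the paragraph above: a getter drops the argument list, a method keeps it, and an operator uses operator syntax.

```dart
class Vec {
  final int x, y;
  const Vec(this.x, this.y);

  int get length2 => x * x + y * y;
  Vec scale(int kx, int ky) => Vec(x * kx, y * ky);
  Vec operator +(Vec other) => Vec(x + other.x, y + other.y);

  @override
  String toString() => 'Vec($x, $y)';
}

// Stand-in for the generated members. toString is inherited from Object,
// so it is already implemented and would not be generated.
class VecProxy implements Vec {
  final Vec forwardee;
  const VecProxy(this.forwardee);

  // Getters: forwardee.name(P) becomes forwardee.g
  @override
  int get x => forwardee.x;
  @override
  int get y => forwardee.y;
  @override
  int get length2 => forwardee.length2;

  // Method: forwardee.name(P) becomes forwardee.m(t1, t2)
  @override
  Vec scale(int kx, int ky) => forwardee.scale(kx, ky);

  // Operator: forwardee.name(P) becomes forwardee + other
  @override
  Vec operator +(Vec other) => forwardee + other;
}

void main() {
  final p = VecProxy(Vec(1, 2));
  print(p.length2); // 5
  print(p.scale(2, 3)); // Vec(2, 6)
  print(p + Vec(1, 1)); // Vec(2, 3)
}
```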

A counterpoint could be that this generalized member template is a bit harder to read and understand than a separate getter, setter, and method template. It would at least make sense to make it syntactically explicit that a given template is a general 'member' template, not just a method template.

This would also make it easier to allow for separate getter/setter/method templates if we make the template sub-language a bit more powerful.

@Levi-Lesches

Levi-Lesches commented Apr 7, 2021

With all the discussion of macros at #1482, is this still being worked on? I feel like it's a lot of extra syntax/implications to introduce and then replace with a full macro system. Macros are looking simpler with #678 and some deeper inspection of the analyzer package.

@eernstg
Member Author

eernstg commented Apr 8, 2021

This proposal is a very lightweight approach: it simply amounts to generating some member declarations based on a simple template. We wouldn't add this to Dart if we were to have a system like those under scrutiny in #1482, but if we wish to go for something that's guaranteed to have low cost in terms of dependencies and compilation times/sizes, and we can live with the lower level of expressive power, then this proposal would still be useful.
