Proposal: An algorithm for composable rewriting of expressions #3633
Replies: 28 comments
-
I think these same comments apply here. This proposal has the same issues, IMO. If |
-
Is it really that distasteful? After all, the compiler is already doing this quite heavily. If I place a |
-
I think those cases are very different since the code generation is done by the compiler itself and in a very prescribed manner where the semantics of the code written are preserved. That wouldn't be the case with these source generators. |
-
Yes, but so what if a source generator rewrites the code? If you've opted in to using a source generator you are already signalling your acceptance of this. It's not that different to using a code analyzer to detect issues in your code - you've opted in, you accept the diagnostics. Roslyn already allows for syntax rewriters. Expanding on this to make it part of the compilation phase of code doesn't seem like a bad thing. It could be a move towards supporting functionality like C++ metaclasses, or some sort of hygienic macro functionality.
-
These are the reasons alluded to by the language team as to why
I don't think that the language team would agree with that notion at all. There's a big difference between interrogating an analyzer that offers a code fix and opting-in to using its suggestion vs. having that suggestion automatically and silently applied to the code during compilation. I'd personally love to see some AOP capabilities return to source generators. I think for any proposal to be successful it will need to determine exactly what aspects of those AOP generators made it unacceptable to the language team and try to navigate what concessions such a proposal would need to allay those concerns. |
-
It's unclear to me if there really were language objections for not doing it or the team's current stance on that is just a ret-con to make the outcome more palatable. Prior to unveiling the current version of source generators, every time the team had previously discussed |
-
Genuine question: which IDE experience has to work seamlessly in order for a language feature to be considered/implemented? It sounds a bit like the tail wagging the dog!
-
The general experience has to be achievable. You should be able to edit C# code without major frustrations from editor and intellisense slowness. Debugging the code should be possible, looking at the actual code that is running and how it maps to the original. It's not as hard a requirement, but we also consider whether ENC/interactive experiences will need work. C# is a product, and we think the whole experience is part of that product. We consider the end-to-end ramifications of every change to make sure that we're not regressing the whole product with any language change.
-
I think you're trying to avoid saying "Visual Studio" in your answer, which suggests that if Microsoft is unable to support a language feature in its product then it doesn't go in.
-
The VS was implied... I didn't realize that we supported other editors 😅. As to the latter part: yes, if we don't think that we can make the full experience good, then we don't ship a feature. The editing experience is absolutely part of C#. It may not be part of the language definition, but it's absolutely part of the experience of using the language. |
-
Well, there's also VSCode...
Sorry, it was my understanding that the C# language design was independent of Microsoft, even though its steering group may contain a large number of Microsoft employees. These days a large number of people aren't developing C# applications on Windows, they're doing so on macOS and Linux, too. Having the future direction of the language dictated by a Windows IDE suggests your priorities are to a commercial product first, and the language second 😢
-
I think you're confusing "IDE experience" with "a specific product supported by Microsoft". What @333fred is talking about are conceptual issues for any IDE, owned by Microsoft or others. The problem with open ended rewriting of code is that in order to provide code completion, signature help, tooltips or light-bulb refactorings the IDE has to invoke all rewriters. You need to conceptually solve the problem when to invoke them and in what order. Yes, we are saying that one bar for language features is that they can, in principle, be tooled by an IDE. In fact, 99% of said tooling isn't some mysterious code, it's part of the same repo that contains the compiler, precisely because we think of C# as "language + tooling". |
-
I do my full-time job in VSCode on Arch Linux, I'm well aware other platforms exist. Additionally, remember that VS includes VS for Mac, which is certainly not a Windows IDE. Omnisharp would also likely be able to leverage the work that would need to be done in the Roslyn IDE layer, whatever that is. We think that, regardless of the editor, there are serious concerns here. Having good tooling is an inescapable part of having a good language, and while we do look at that work through the lens of what would need to be done in Roslyn, we don't shy away from hard work. Nullable reference types, for example, has probably 100+ dev-years worth of effort in it at this point. However, this started from a point where we could see an end experience. We believed that it was possible to make the whole experience good. Our concern with source rewriters is that, no matter how much effort we put into them, we aren't sure if we can tool them in a good fashion.
-
I would argue that the design of the C# language does (and should) take tooling into account. For example: we could propose a lot of features that make the language use less characters on the basis that users don't need to type as much. That argument is not generally accepted though as things like OmniSharp (which supports many editors such as Sublime, Vim, Emacs, etc.) and Rider exist. It is assumed that most developers will interact with the language via some tooling at this point. If we take a language feature that reduces typing it is normally because it does so incidentally and the real value is in reducing the number of concepts needed to represent something (auto props feel like a good example of this). We are also unlikely to ever change the property of the language that the body of a method cannot change the type definitions for things outside it. We know that would break pretty much every optimization tooling devs have done to make completion work across the eco-system. The problem I have with source generators that re-write code isn't that it makes more work for the Visual Studio team. All changes to the language do that. It is that I cannot articulate any model that would allow any tools developers to deliver a good experience. That's always been the reason I am worried about doing this. @foxesknow is there a language you use that has this feature and has a great developer experience around it? I would love to learn about other approaches that have been done. |
-
That wasn't really the case with expression-bodied members, though. Now the language has two ways of expressing a method/property, all in order to save writing some braces and the return keyword 😒
C++ template meta programming allows you to get involved in what the compiler sees and generates, particularly once you start using Rust has support for macros. Whether you define it as a "great developer experience" is hard to quantify. Nullable reference types aren't a great developer experience, but they're part of C# now!
-
The proposal here does try to take into account the IDE experience. I wonder which parts would be considered adequate, and which require more work? |
-
Before, you needed to create a statement context (a block) even if your method body only contained expressions. Expression-bodied members allow you to remove the statement context from your methods if that context is not needed (as you could always do with lambdas). The feature was not about reducing typing but about making the language more consistent across different constructs (methods, local functions, and lambdas).
Shapes is currently exploring something around this (and would be my preferred way to tackle some of the same problems that templating solves in C++). I would not like to replicate template meta-programming as it exists in C++ today, as explaining to the user why a C++ template failed to compile is a very difficult problem. I am excited to see how C++20's Concepts implementation works in practice as it's adopted by all the C++ compilers, as this will hopefully solve some of the footguns that exist today with template meta-programming.
Do you enjoy using them and find the development experience pleasant? I am fine with at least getting your opinion. |
-
My understanding of this proposal is that typing inside a method body now requires a potentially unbounded number of re-writers to run before we can begin to analyze the semantics of a method body. |
-
I think that was the major concern at the time, yes, but the conversations in Gitter seem to imply that there was a suspicion around AOP-generators affecting the written source and how that would impact the developer expectations and IDE experience in situations like debugging. If I had to guess the reason the team went with the current flavor of generators is that they have zero impact on the language spec so it would be easier to evolve them over time or call bankruptcy and remove them entirely without leaving weird warts in the language.
I don't recall the
This is definitely true, both in which order the generators themselves run (and whether they can influence one another) and then again when it comes to the order in which the compiler weaves the replaced methods back together. The behavior would need to be deterministic in both cases to ensure a good experience when composing aspects. These concerns don't seem nearly as insurmountable as the IDE issues described above, though. In either case, I do hope that if/when the tooling experience around additive source generators is deemed satisfactory that the team will be willing to reconsider AOP or rewriting options again. |
-
Rust's approach (and tooling experience) has been mentioned before: #107 (comment) I wonder if it has changed since. |
-
Expression-body members don't fix this as they only allow one expression, not a sequence of expressions. If you need to change an expression-body member to have an The Shapes work is interesting, and it's good to see new features that aren't just syntax fluff! It's a pity that C# won't commit to better support for meta-programming. I find this particularly strange as C# has strong support for generating code on the fly, either via direct Emit or by using the Expression framework, and also as so much good work has gone into the Roslyn framework.
The few times I've used them, yes. In my distant past I used Scheme macros also and didn't have any issues with them. Mind you, using Scheme isn't exactly a pleasant experience either!
-
Those are statements, not expressions. You had the same limitations with lambdas. Sounds like you do not like the feature. That is fine. There are parts of this feature I do not like either.
C# isn't going to commit to any language feature. If that is what you want you are going to be disappointed.
Can you describe what you liked about using them specifically? What problems did they help you solve?
-
Interestingly enough, many of the 'Shapes' proposals are just syntax fluff. i.e. they allow you to express something you can already do today, albeit in a much terser and ergonomic fashion. The same is true for most of every release. It's rare that a lang feature is truly allowing something that couldn't be done at all before. The only recent stuff that comes to mind is a lot of the Turns out, syntactic fluffery is what makes a language, and what generally provides the most benefit and enjoyment for customers release on release. |
-
It was some time back, and was part of a DSL for a banking application. What I liked is that they allowed you to make user defined actions appear as if they were first-class parts of the language. What I didn't like was that the syntax could quickly become difficult to understand, but that's probably a complaint about Scheme in general! Most of my meta programming experience has been in C++ where I've used it in low latency applications to optimize message parsing, select implementation based on type etc etc. It's not an answer to every problem, but it's an answer to some. Herb Sutter's proposal to add meta-classes to the language is an interesting extension to this.
"commit" was probably a bad choice of word on my part.
-
Yes, but recently C# is getting particularly bad at adding so much fluff you'll need a delinter 😀 For example in C# 9 we've got:
C# 8 gave us:
C# 6/7 gave us
For top level programs I especially like the reasoning that:
Believe me, having to stick a class with a
Yes, that's fair. But look at the relational pattern and logical pattern stuff going into C# 9. Seriously, how many slightly different ways do we need to express the same thing? Don't get me wrong, there's been plenty of good stuff over the years:
-
Well the language is Turing complete so technically no language features will ever add new capabilities :) Meta-programming via new type-system concepts like Shapes and potentially higher-kinded-types is how I would like to solve a lot of these problems. Allowing generators to be a catch-all for these is not the correct approach in my opinion.
I don't think any argument like "People find x overwhelming" is very convincing because you can never predict what overwhelms the person you are talking to. Though this is something we observed in people learning programming for the first time. I am in favor of top-level-programs because it helps us bridge script-like syntax dialects with the main language. Using C# in more script-like contexts is a useful expansion of the domains the language can be used in imho. But you are still welcome to not like it. |
-
Your statement presumes that these features are bad. That's your own personal view based on what you do or do not find valuable in a language. We believe that the addition of these features is a good thing for the lang and ecosystem and they'll help make it easier for people to write better programs more productively. All the things you mentioned in C# 6-7 are all things we use ourselves extensively. And we will do the same with the C# 8 features now that they're available.
You'd be very very very surprised about the different communities and user bases out there, and just what can be problematic for them. The trick is to not assume that all devs are like yourself :)
Given that you couldn't do any of that stuff (which is super useful) inside patterns. We went from 0 ways to do it to one way... |
-
Well, this is anecdotal, but this very thing happened to me. I only ended up learning C# instead of some other language because the thing that I actually started with was tied to the .NET ecosystem. For me it is a question of relative complexity. Why would you choose to learn something that opens up like this:

    using System;
    namespace ConsoleApp1
    {
        class Program
        {
            static void Main(string[] args)
            {
                Console.WriteLine("Hello World!");
            }
        }
    }

over this:

    System.Console.WriteLine("Hello World!");

To me it would seem natural for people without prior knowledge to assume that the 1st language is just inherently that much more complex, so why even try?
-
Currently source generators are additive only. This is great for some scenarios, but significantly less so for others, where one would hope for 'macro' style functionality which can rewrite existing code. This issue will explore an algorithm which might be suitable for such a source rewriter.
In truth this is a suggestion for tooling, rather than the language, but I think csharplang is a better place to discuss such proposals than roslyn, since such source rewriting would help solve many of the currently open issues on this repo.
Motivating examples
I'm going to focus on two examples here:
1. Replacing await expression with await expression.ConfigureAwait(false) when expression has a suitable ConfigureAwait method. We'll call this "AwaitRewriter".
2. Replacing logger.Debug(expression) with if (logger.IsEnabled(LogLevel.Debug)) logger.Debug(expression). We'll call this "LogRewriter".
Prerequisites.
Because this proposal is about rewriting expressions, we will require expressions to be capable of expressing almost any code, such as conditionals. Specifically this will require some form of expression blocks to work. I will use the syntax suggested in #3086, namely a block surrounded by {} with no ; in the final line, e.g. var x = { var a = 42; a };
Constraints
The suggested solution should adhere to the following constraints:
1. Composability
It is quite likely that multiple source rewriters will act on the same code. This must work well, not just not fail.
For example, let's say we originally have some code like this (Api is made up):
AwaitRewriter rewrites this to:
LogRewriter rewrites this to:
We must make sure that this is then rewritten to:
But what we don't want is for the LogRewriter to then see that the code isn't exactly how it would expect it to be and rewrite it again, leading to an infinite loop.
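The four stages of this example can be sketched as follows. This is only a plausible reconstruction, assuming the starting point is a single awaited logger.DebugAsync(expression) call (the identifiers DebugAsync and IsEnabledAsync come from the walkthrough later in this proposal). The IAsyncLogger interface, the RecordingLogger helper and the method names are hypothetical and exist purely for illustration.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical logging API, made up for illustration only.
public enum LogLevel { Debug }

public interface IAsyncLogger
{
    Task<bool> IsEnabledAsync(LogLevel level);
    Task DebugAsync(string message);
}

// Small in-memory logger so the stages below can be exercised.
public sealed class RecordingLogger : IAsyncLogger
{
    private readonly bool enabled;
    public List<string> Messages { get; } = new List<string>();

    public RecordingLogger(bool enabled) { this.enabled = enabled; }

    public Task<bool> IsEnabledAsync(LogLevel level) => Task.FromResult(enabled);

    public Task DebugAsync(string message)
    {
        Messages.Add(message);
        return Task.CompletedTask;
    }
}

public static class ComposabilityExample
{
    // The code as the user wrote it (assumed starting point).
    public static async Task Original(IAsyncLogger logger, string expression)
    {
        await logger.DebugAsync(expression);
    }

    // What AwaitRewriter alone would produce.
    public static async Task AfterAwaitRewriter(IAsyncLogger logger, string expression)
    {
        await logger.DebugAsync(expression).ConfigureAwait(false);
    }

    // What LogRewriter alone would produce.
    public static async Task AfterLogRewriter(IAsyncLogger logger, string expression)
    {
        if (await logger.IsEnabledAsync(LogLevel.Debug))
            await logger.DebugAsync(expression);
    }

    // The composed result: the await introduced by LogRewriter
    // has also been picked up by AwaitRewriter.
    public static async Task AfterBoth(IAsyncLogger logger, string expression)
    {
        if (await logger.IsEnabledAsync(LogLevel.Debug).ConfigureAwait(false))
            await logger.DebugAsync(expression).ConfigureAwait(false);
    }
}
```

The point of the last method is that neither rewriter had to know about the other: each only needs to see the expressions the other one introduced, exactly once.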
At the same time it is not an aim of this proposal to deal with source rewriters which are inherently incompatible. For example, if one source rewriter always adds ConfigureAwait to tasks, and another source rewriter always wraps ConfiguredTaskAwaitables in a Task, then these will not work together, and the compiler will presumably run for some maximum number of iterations before giving up and producing an error.
2. Debuggability of emitted code
Step in, step over, setting breakpoints etc. should all work fine on code which has been rewritten. Ideally ENC should work as well.
It should also be possible to view the final rewritten code and step/set breakpoints on it. This means the output of the source rewriter should ideally be legal code.
3. Debuggability of the rewriting process
It should be possible to view the set of transformations that happened on a piece of code, and view the final output, in order to see what may have gone wrong. This means the output of the source rewriter should ideally be legal code.
4. Deal with the order in which source rewriters run
If multiple source rewriters run on the same expression, then one of two things should happen:
a. there is a well defined predictable order in which the rewriters run
b. the order in which the rewriters run shouldn't matter
5. Discourage arbitrary rewriting of code
It should not be encouraged to use source rewriters to create arbitrary DSLs, by allowing arbitrary rewriting of code. The original code that was written should strongly correspond to the final code which is output.
Proposal
Source Rewriter API
The source rewriter API should be designed to allow the following usage:
The source rewriter can visit an expression. In the act of visiting an expression the original_expression is given to the source rewriter. The source rewriter can return the original_expression unchanged, or can wrap one or more usages of the original_expression inside a larger expression.
The original_expression cannot be modified, nor can the source rewriter return an expression not containing at least one usage of the original_expression.
There are various ways such an API could be designed. I will not suggest any specific one here, since they all have merits and costs, and I would like to focus here on the wider proposal, not the specifics of the API.
For now pretend every source rewriter overrides the following method:
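A minimal sketch of what such a method and its contract might look like, using a toy Expr class as a stand-in for Roslyn's ExpressionSyntax so that the example is self-contained. The names Expr, SourceRewriter, Rewrite, Visit, LogRewriter and ReplacingRewriter are all assumptions made for illustration; the proposal deliberately does not fix an API. The base class enforces the rule that the returned expression must contain the original one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy stand-in for ExpressionSyntax (an assumption, not the Roslyn type).
public sealed class Expr
{
    public string Kind { get; }
    public IReadOnlyList<Expr> Children { get; }

    public Expr(string kind, params Expr[] children)
    {
        Kind = kind;
        Children = children;
    }

    // True if 'other' is this node or one of its descendants (by reference).
    public bool Contains(Expr other) =>
        ReferenceEquals(this, other) || Children.Any(c => c.Contains(other));

    public override string ToString() =>
        Children.Count == 0 ? Kind : $"{Kind}({string.Join(", ", Children)})";
}

public abstract class SourceRewriter
{
    // Hypothetical hook: return 'original' unchanged, or a larger
    // expression that wraps one or more usages of it.
    protected abstract Expr Rewrite(Expr original);

    // Enforces the contract: the original expression may not be dropped.
    public Expr Visit(Expr original)
    {
        Expr result = Rewrite(original);
        if (!result.Contains(original))
            throw new InvalidOperationException(
                "The rewritten expression must contain the original expression.");
        return result;
    }
}

// Toy LogRewriter: wraps a logger.Debug call in an IsEnabled guard.
public sealed class LogRewriter : SourceRewriter
{
    protected override Expr Rewrite(Expr original) =>
        original.Kind == "logger.Debug"
            ? new Expr("if", new Expr("logger.IsEnabled"), original)
            : original;
}

// A rewriter that violates the contract by discarding the original.
public sealed class ReplacingRewriter : SourceRewriter
{
    protected override Expr Rewrite(Expr original) => new Expr("somethingElse");
}
```

With this shape, new LogRewriter().Visit(call) returns the guarded expression, while new ReplacingRewriter().Visit(call) throws, because its result no longer contains the expression it was given.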
The purpose of this API is to help with constraints 2, 3, and 5.
Since all expressions in the original code must appear in the rewritten code, source mapping should be relatively straightforward, and e.g. setting a breakpoint on an expression in the original code should have a straightforward meaning.
Since the API returns an ExpressionSyntax, it should be easy to convert it back into code which would be displayable in an IDE.
Since all expressions in the original code must appear in the rewritten code, arbitrary rewriting of the code is discouraged.
Of course it would be possible to write a source rewriter which replaces every expression with { if (false) expression } or something, and then proceeds to generate whatever code it wants, but this is clearly a misuse of the source rewriter, and so less likely to be used.
Source Rewriting Algorithm
When rewriting a member, we have a set of applicable source rewriters (for efficiency, maybe a source rewriter is checked to see whether it's applicable to a member before we use it).
The member is parsed into a syntax tree. Every single ExpressionSyntax in the syntax tree must be visited in bottom up order by all applicable source rewriters. Specifically, an ExpressionSyntax cannot be visited by any source rewriters until all descendant ExpressionSyntaxes have been visited by all applicable source rewriters, and an ExpressionSyntax cannot be visited twice by the same source rewriter.
When a source rewriter visits an original_expression, that original_expression may be wrapped in a larger rewritten_expression. The original_expression in the member syntax tree is replaced with the rewritten_expression. The original_expression is marked as visited by that source rewriter, but the rewritten_expression and its other descendant ExpressionSyntaxes are not, and must be visited by all source rewriters before the parent of the original_expression is visited. When the parent is visited, it is the rewritten parent which is visited, with the rewritten_expression as a child, not the original_expression.
This process continues until all ExpressionSyntaxes have been visited.
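The visiting rules above can be sketched as a small recursive driver. This is a toy model under stated assumptions, not a Roslyn implementation: Expr is a mutable stand-in for ExpressionSyntax, rewriters are plain Func<Expr, Expr> delegates, and the names RewriteDriver, ToyRewriters, GuardLog and AddConfigureAwait are hypothetical. There is also no iteration cap here, so incompatible rewriters that keep wrapping each other's output would not terminate.

```csharp
using System;
using System.Collections.Generic;

// Toy, mutable stand-in for ExpressionSyntax (an assumption for illustration).
public sealed class Expr
{
    public string Kind { get; }
    public List<Expr> Children { get; }

    public Expr(string kind, params Expr[] children)
    {
        Kind = kind;
        Children = new List<Expr>(children);
    }

    public override string ToString() =>
        Children.Count == 0 ? Kind : $"{Kind}({string.Join(", ", Children)})";
}

public static class RewriteDriver
{
    public static Expr Run(Expr root, IReadOnlyList<Func<Expr, Expr>> rewriters)
    {
        // Remembers which (rewriter, node) pairs have already been visited.
        var visited = new HashSet<(Func<Expr, Expr>, Expr)>();
        return Process(root, rewriters, visited);
    }

    private static Expr Process(
        Expr node,
        IReadOnlyList<Func<Expr, Expr>> rewriters,
        HashSet<(Func<Expr, Expr>, Expr)> visited)
    {
        // Bottom-up: every descendant is fully processed before this node.
        for (int i = 0; i < node.Children.Count; i++)
            node.Children[i] = Process(node.Children[i], rewriters, visited);

        foreach (var rewriter in rewriters)
        {
            // Never let the same rewriter visit the same node twice.
            if (!visited.Add((rewriter, node)))
                continue;

            Expr result = rewriter(node);
            if (!ReferenceEquals(result, node))
            {
                // The node was wrapped. Process the wrapper instead: its new
                // descendants get visited by all rewriters, and this node gets
                // visited by the rewriters that have not seen it yet.
                return Process(result, rewriters, visited);
            }
        }
        return node;
    }
}

public static class ToyRewriters
{
    // Wraps a DebugAsync call in a guard that introduces IsEnabledAsync.
    public static readonly Func<Expr, Expr> GuardLog = e =>
        e.Kind == "DebugAsync" ? new Expr("Guard", new Expr("IsEnabledAsync"), e) : e;

    // Wraps any node whose kind ends in "Async" in ConfigureAwait.
    public static readonly Func<Expr, Expr> AddConfigureAwait = e =>
        e.Kind.EndsWith("Async") ? new Expr("ConfigureAwait", e) : e;
}
```

Running the driver on DebugAsync(expression) with GuardLog listed before AddConfigureAwait shows the second rewriter picking up the IsEnabledAsync node that the first one introduced, while neither revisits a node it has already seen.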
The order in which the nodes are visited is arbitrary but deterministic, so long as the bottom up invariant is maintained. Similarly the order in which each source rewriter runs on each node is arbitrary but deterministic. It would be perfectly ok for one source rewriter to run on nodes in a different order from another, so long as the bottom up invariant is maintained. In order to discourage people from accidentally coming to rely on the order in which the ExpressionSyntaxes are visited, or the order in which the source rewriters run, I would suggest that the order is randomized based on a hash of all syntax trees in the entire compilation. This will partially help with point 4, by discouraging people from writing source rewriters that are dependent on the order in which they might be run.
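One way the "randomized but deterministic" ordering might be derived is sketched below. It assumes rewriters are identified by name and that the compilation is represented by its source texts; RewriterOrdering, HashCompilation and Order are hypothetical names, and SHA-256 is used only as a convenient stable hash, not because the proposal calls for it. Each rewriter is sorted by a hash of the compilation hash combined with its name, so the order changes when any source text changes but is reproducible for an unchanged compilation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class RewriterOrdering
{
    // Stable hash over all source texts; sorting first makes the result
    // independent of the order in which the texts are supplied.
    public static string HashCompilation(IEnumerable<string> sourceTexts)
    {
        using (var sha = SHA256.Create())
        {
            string joined = string.Join(
                "\0", sourceTexts.OrderBy(s => s, StringComparer.Ordinal));
            return BitConverter.ToString(
                sha.ComputeHash(Encoding.UTF8.GetBytes(joined)));
        }
    }

    // Orders rewriter names by a key derived from the compilation hash,
    // giving an order that looks arbitrary but is fully reproducible.
    public static IReadOnlyList<string> Order(
        IEnumerable<string> rewriterNames, string compilationHash)
    {
        using (var sha = SHA256.Create())
        {
            return rewriterNames
                .OrderBy(
                    name => BitConverter.ToString(sha.ComputeHash(
                        Encoding.UTF8.GetBytes(compilationHash + ":" + name))),
                    StringComparer.Ordinal)
                .ToList();
        }
    }
}
```

The same idea could be applied per node (mixing a node position into the key) if the order should also vary from expression to expression.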
This algorithm is designed to allow source rewriters to compose easily (constraint 1). All source rewriters will visit any new expression syntaxes that are generated by other source rewriters, whilst skipping over nodes that they have already visited.
In the example given in constraint 1, the following might occur:
logger.DebugAsync(expression)
logger.DebugAsync(expression)
:logger.IsEnabledAsync(LogLevel.Debug)
logger.DebugAsync(expression)
so it does not visit it again. There are no other expressions here that would trigger it. Similarly for AwaitRewriter.
Open Issues
1. What about methods?
It is also often useful to wrap entire methods rather than individual expressions. By treating an entire method as a single expression, it might be possible to use the same API to do so. However I think it would make more sense to use a similar, but more specific API for wrapping methods.
2. Should you have to mark a member to allow source rewriting?
Should you have to mark a method in some way (e.g. with a modifier) before it can be rewritten, to give a visual cue that something is being rewritten? For some source rewriters that seems to make sense. For example, with the log rewriter you might wonder why a breakpoint on log.Debug wasn't getting hit, and a marker on the method might help remind you that it's being rewritten. On the other hand it would be a pain to add a marker to every async method, which would be required if you wanted to use the AwaitRewriter.
3. Should diagnostics/analyzers run on the original or rewritten code?
I think this should be configurable per analyzer. Style analyzers should definitely run before. Analyzers warning about incorrect usages of APIs (e.g. Obsolete) should definitely run after. Others, like warning on unused variables are more questionable.