Haxe AST -> C# AST -> C# Printer #13

Open
jeremyfa opened this issue Feb 21, 2024 · 5 comments

Comments

@jeremyfa
Collaborator

I started migrating the current code from DirectToStringCompiler to GenericCompiler, and making it use a printer instead of simply returning strings for expressions. That (unfinished) work is in the generic-and-printer branch.

The initial rationale was to print C# code directly from the Haxe AST, on the assumption that the two languages are close enough that differences could be handled by manipulating the Haxe AST directly. However, @Simn advised against that on Discord and suggested adding a step that generates a C# AST and printing from it, so that we can be sure to carry the information we need and avoid making wrong assumptions.

Simon's explanation for reference:

What I can tell you is that the late C# target went for this "make the Haxe-AST more C#-compliant" approach with its gencommon framework, and that turned into a bit of an assumption hell. Things would frequently break when those assumptions turned out to no longer hold, and overall there was always this shadow of uncertainty because the data structure would not accurately match the output.
As for concrete examples, IIRC one major problem with C# in particular is that its generics require type parameter information on calls which the Haxe AST doesn't carry. I don't remember the details, I think you had to figure out how concrete types mapped onto declared type parameters. You'll have to do this either way, of course, but doing it in a single, well-defined place instead of somehow trying to keep this information in the Haxe AST would just be way more pleasant.
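
To make the "single, well-defined place" idea concrete, here is a minimal sketch in Python (used only for illustration; this is not how the Haxe compiler or the old C# target does it). All names here (`TypeParam`, `Inst`, `bind`, `infer_type_args`, `render`) are hypothetical. It walks the declared parameter types of a generic function alongside the concrete argument types and records which concrete type each declared type parameter maps to, so the call can be printed with explicit type arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeParam:
    """A declared type parameter such as T (hypothetical model)."""
    name: str


@dataclass(frozen=True)
class Inst:
    """A named type with optional type arguments, e.g. Array<T>."""
    name: str
    args: tuple = ()


def bind(declared, concrete, bindings):
    """Match a declared type against a concrete one, filling `bindings`."""
    if isinstance(declared, TypeParam):
        previous = bindings.setdefault(declared.name, concrete)
        if previous != concrete:
            raise TypeError(f"conflicting bindings for {declared.name}")
        return
    if (not isinstance(concrete, Inst)
            or declared.name != concrete.name
            or len(declared.args) != len(concrete.args)):
        raise TypeError(f"cannot match {declared} against {concrete}")
    for d, c in zip(declared.args, concrete.args):
        bind(d, c, bindings)


def infer_type_args(type_params, declared_params, concrete_args):
    """Return the concrete type for each declared type parameter, in order.

    Raises KeyError if a type parameter cannot be inferred from the arguments.
    """
    bindings = {}
    for d, c in zip(declared_params, concrete_args):
        bind(d, c, bindings)
    return [bindings[name] for name in type_params]


def render(t):
    """Print a type in C#-like syntax."""
    if isinstance(t, TypeParam) or not t.args:
        return t.name
    return f"{t.name}<{', '.join(render(a) for a in t.args)}>"


# Declared (hypothetically) as: first<T>(items: Array<T>): T
T = TypeParam("T")
type_args = infer_type_args(
    ["T"],
    [Inst("Array", (T,))],
    [Inst("Array", (Inst("string"),))],
)
call = f"first<{', '.join(render(a) for a in type_args)}>(names)"
```

With the mapping computed in one function, the result can be stored on the C# call node rather than being smuggled through the Haxe AST.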

As we don't want to repeat the mistakes of the previous C# target, I decided to follow that advice and will work on generating an actual C# AST and printing from it. This certainly requires more work than going without the intermediate C# AST, but I agree it's for the best in the long term.

Thus I'm pausing what I started in the generic-and-printer branch (there are still a lot of C# printing snippets that will be useful for later), and will start working on a Haxe AST -> C# AST -> C# Printer version (on a new branch).

This issue is here to let you know I'm working on this and to keep track of progress.

Current plan

(might change depending on how the implementation goes or on feedback in this issue):

  • Add a CSTypedExpr typedef which will be very similar to Haxe's TypedExpr, but modified as needed to match the intended C# output exactly. Do the same for any AST type needed to represent the C# hierarchy (like ModuleType, ClassType, etc.).
  • Use a GenericCompiler that is typed to return C# AST instead of strings.
  • Create a CSPrinter class that outputs C# code from this C# AST.
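
The three steps above can be sketched roughly as follows. This is a toy illustration in Python, not the planned Haxe implementation: the `Hx*` classes stand in for Haxe's typed AST, and `CSLiteral`, `CSIdent`, `CSBinary`, and `compile_expr` are hypothetical names. Only `CSPrinter` is a name taken from the plan, and its method here is invented.

```python
from dataclasses import dataclass


# --- Stand-ins for the input (Haxe-side) typed AST ---
@dataclass
class HxConst:
    value: object


@dataclass
class HxLocal:
    name: str


@dataclass
class HxBinop:
    op: str
    left: object
    right: object


# --- Hypothetical C# AST nodes, shaped to match the intended output ---
@dataclass
class CSLiteral:
    value: object


@dataclass
class CSIdent:
    name: str


@dataclass
class CSBinary:
    op: str
    left: object
    right: object


def compile_expr(e):
    """Step 2: the compiler returns C# AST nodes, never strings."""
    if isinstance(e, HxConst):
        return CSLiteral(e.value)
    if isinstance(e, HxLocal):
        return CSIdent(e.name)
    if isinstance(e, HxBinop):
        return CSBinary(e.op, compile_expr(e.left), compile_expr(e.right))
    raise NotImplementedError(type(e).__name__)


class CSPrinter:
    """Step 3: the only place that turns C# AST into text."""

    def print_expr(self, e):
        if isinstance(e, CSLiteral):
            v = e.value
            if v is None:
                return "null"
            if isinstance(v, bool):  # check bool before other values
                return "true" if v else "false"
            if isinstance(v, str):
                escaped = v.replace("\\", "\\\\").replace('"', '\\"')
                return f'"{escaped}"'
            return str(v)
        if isinstance(e, CSIdent):
            return e.name
        if isinstance(e, CSBinary):
            left = self.print_expr(e.left)
            right = self.print_expr(e.right)
            return f"({left} {e.op} {right})"
        raise NotImplementedError(type(e).__name__)


cs_ast = compile_expr(HxBinop("+", HxLocal("count"), HxConst(1)))
printed = CSPrinter().print_expr(cs_ast)
```

The point of the split is that any C#-specific decision is made while building `cs_ast`, so the printer never has to guess what the Haxe node meant.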

Timeline

I want to take the time to get this right, so there is no plan to rush it. However, this project is very important to me (because I need it for Ceramic), so I'll dedicate a good amount of time to working on this C# target in 2024. The initial iteration of the C# AST implementation might take several weeks. From that point, if it goes well, it will be merged into the development branch, and contributions from more people should be easier to manage without too much overlap.

A lot of work ahead, but exciting as well 😄

@jeremyfa jeremyfa added the TODO Big todo feature label Feb 21, 2024
@jeremyfa jeremyfa self-assigned this Feb 21, 2024
@SomeRanDev
Owner

Sounds good, let's do it!

@SomeRanDev
Owner

SomeRanDev commented Feb 22, 2024

Notes about my contributions so far:

  • Using classes instead of typedefs for the AST. Not sure if it matters one way or the other, but having methods would be nice?
  • Separated expressions and statements, since not everything is an expression in C# (at least not in the version of C# we're targeting?).
  • Renamed the compiler component classes so that CSClass and CSEnum can be used for AST structures.
  • See my notes on CSType. Right now it stores a reference to CSClass/CSEnum, but that's a lot of unnecessary data. We could possibly convert those to just storing a dot-path for a type? We can figure it out as we go.
  • Let's just develop on the development branch. I made a v0.1 branch to retain our "working" version, and I turned off the nightly build actions so haxelib git will not be affected by our failing changes.
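
To illustrate why the expression/statement split matters, here is a small Python sketch with hypothetical node names (`CSRaw`, `CSVarDecl`, `CSAssign`, `CSIf`, `lower_if_value`); none of them come from the actual repository. In Haxe, `var x = if (flag) 1 else 2;` is valid because `if` is an expression. In C#, `if` is a statement, so one general way to lower it is to declare the variable first and assign in each branch.

```python
from dataclasses import dataclass


@dataclass
class CSRaw:
    """Expression node; holds already-printed text to keep the sketch short."""
    code: str


# Statement nodes are a separate family from expression nodes.
@dataclass
class CSVarDecl:
    type_name: str
    name: str


@dataclass
class CSAssign:
    name: str
    value: CSRaw


@dataclass
class CSIf:
    cond: CSRaw
    then_body: list
    else_body: list


def lower_if_value(type_name, name, cond, then_value, else_value):
    """Lower `var name = if (cond) a else b;` into C#-style statements."""
    return [
        CSVarDecl(type_name, name),
        CSIf(cond, [CSAssign(name, then_value)], [CSAssign(name, else_value)]),
    ]


def print_stmt(s, indent=0):
    pad = "    " * indent
    if isinstance(s, CSVarDecl):
        return f"{pad}{s.type_name} {s.name};"
    if isinstance(s, CSAssign):
        return f"{pad}{s.name} = {s.value.code};"
    if isinstance(s, CSIf):
        then_lines = "\n".join(print_stmt(x, indent + 1) for x in s.then_body)
        else_lines = "\n".join(print_stmt(x, indent + 1) for x in s.else_body)
        return (
            f"{pad}if ({s.cond.code}) {{\n{then_lines}\n"
            f"{pad}}} else {{\n{else_lines}\n{pad}}}"
        )
    raise NotImplementedError(type(s).__name__)


stmts = lower_if_value("int", "x", CSRaw("flag"), CSRaw("1"), CSRaw("2"))
output = "\n".join(print_stmt(s) for s in stmts)
```

For a case this simple a real compiler could emit the conditional operator (`flag ? 1 : 2`) instead; the statement form is the fallback needed when a branch contains a block or other statements.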

I'm open to changes for anything, so don't hesitate to make big structure changes if necessary. It's hard to tell what is or isn't going to be necessary until it's being printed, so we'll just have to feel it out as we go.

@jeremyfa
Collaborator Author

That was fast!

This starting point looks good to me, and OK, let's stick to the development branch then.

Not sure about using classes for the AST though, because you get some free perks when using typedefs, like being able to print their structure to the console instead of just getting cscompiler.ast.CSStatement, for example. For additional methods, we could always use static extensions, no? The other option is to use classes, but then we'll probably have to provide a way to easily print the AST structure to the console to make debugging easier. Maybe @Simn has insights on what is best to use here?
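
If classes are used, the debugging concern could be addressed with one small reflective helper. The following Python sketch only illustrates the idea (a Haxe version would look different); `dump` and the node classes are hypothetical names. Plain class instances print as an opaque object by default, and `dump` recovers a structural view by walking each node's fields.

```python
class CSIdent:
    def __init__(self, name):
        self.name = name


class CSReturn:
    def __init__(self, value):
        self.value = value


def dump(node):
    """Recursively print any class-based node as `TypeName { field: value }`."""
    if isinstance(node, (list, tuple)):
        return "[" + ", ".join(dump(n) for n in node) + "]"
    if hasattr(node, "__dict__"):
        fields = ", ".join(f"{k}: {dump(v)}" for k, v in vars(node).items())
        return f"{type(node).__name__} {{ {fields} }}"
    return repr(node)


node = CSReturn(CSIdent("x"))
opaque = repr(node)       # default: class name and address, no fields
structural = dump(node)   # "CSReturn { value: CSIdent { name: 'x' } }"
```

A single helper like this covers every node type, so choosing classes would not require writing a custom printer per class.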

I've read your note on CSType. I guess in places where we don't need to point to a fully parsed type (class or enum), we could use a CSRef<CSType> if that becomes necessary (like what Haxe does with Ref<T>). That said, I would assume it's OK to generate C# AST for an extern ClassType. That seems more convenient, and fields may have meta like @:native() that will affect what we output, so we'll have to keep that information for each field in a CSClass type anyway. At least there are usually no method bodies to parse in externs, so it should still be cheaper to parse than a regular ClassType.
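
As a rough picture of what such a reference could look like, here is a minimal Python sketch. Haxe's `Ref<T>` exposes `get()` and `toString()`; everything else here (the `CSRef` constructor, the `resolver` callback, the caching) is an assumption made for illustration. The reference carries only a dot-path until someone asks for the full type.

```python
class CSRef:
    """Lazy reference to a type: cheap to hold, resolved on first use."""

    def __init__(self, path, resolver):
        self.path = path
        self._resolver = resolver  # hypothetical callback: path -> full type
        self._value = None
        self._resolved = False

    def get(self):
        # Build the full type only once, then reuse the cached result.
        if not self._resolved:
            self._value = self._resolver(self.path)
            self._resolved = True
        return self._value

    def __str__(self):
        # Printing a reference never forces resolution.
        return self.path


calls = []


def build_type(path):
    calls.append(path)
    return {"path": path, "fields": []}


ref = CSRef("my.pack.ExternThing", build_type)
printed_path = str(ref)   # no resolution needed to print the name
first = ref.get()
second = ref.get()        # served from the cache
```

Code that only passes a type through, such as an extern instance handed from one function to another, would then never trigger the full parse.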

@jeremyfa
Collaborator Author

On second thought, you are probably right that we may not need to parse a whole extern class that we don't need to generate, especially if we are just passing an instance of an extern type through some Haxe code without inheriting from it. We can decide when we actually implement that.

@Simn

Simn commented Feb 22, 2024

I usually use classes for "heavy" data which might even be built and refined over time, and which is supposed to have an identity. So this seems like a good choice here.

Regarding CSType, I'd make sure to design it in a way that allows you to easily implement checkCast(from:CSType, to:CSType), which might require the ability to check class relationships and such.
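
A minimal sketch of what that could involve, in Python and under stated assumptions: `CSClassInfo`, `CastKind`, `is_subtype`, and `check_cast` are hypothetical names, and the rules are deliberately simplified (real C# cast rules also cover interfaces on unsealed classes, value types, user-defined conversions, and generics). The key requirement is that a type can reach its superclass and interfaces, which a bare dot-path string would not allow.

```python
from dataclasses import dataclass, field
from enum import Enum


class CastKind(Enum):
    IDENTITY = "identity"   # same type, no cast needed
    UPCAST = "upcast"       # implicit conversion, always safe
    DOWNCAST = "downcast"   # needs an explicit cast, may fail at runtime
    INVALID = "invalid"     # unrelated types


@dataclass(eq=False)  # eq=False keeps identity-based comparison
class CSClassInfo:
    path: str
    super_class: "CSClassInfo | None" = None
    interfaces: list = field(default_factory=list)


def is_subtype(sub, sup):
    """True if `sub` is `sup` or reaches it via superclasses or interfaces."""
    current = sub
    while current is not None:
        if current is sup:
            return True
        if any(is_subtype(i, sup) for i in current.interfaces):
            return True
        current = current.super_class
    return False


def check_cast(from_type, to_type):
    if from_type is to_type:
        return CastKind.IDENTITY
    if is_subtype(from_type, to_type):
        return CastKind.UPCAST
    if is_subtype(to_type, from_type):
        return CastKind.DOWNCAST
    return CastKind.INVALID


animal = CSClassInfo("zoo.Animal")
walker = CSClassInfo("zoo.IWalker")
dog = CSClassInfo("zoo.Dog", super_class=animal, interfaces=[walker])
cat = CSClassInfo("zoo.Cat", super_class=animal)
```

Comparing types by identity here lines up with the point above about classes being suited to data that is supposed to have an identity.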
