Any need for a contributor? #11

Closed
ChenNingCong opened this issue Feb 19, 2022 · 5 comments

Comments

@ChenNingCong

Hi, I wonder whether this project needs student contributors. I am an undergraduate student from Peking University (in China). I am mainly interested in rewriting the lowering pipeline (defined in julia-syntax.scm) of Julia's frontend. Previously I was working on static compilation of Julia (using LLVM JITLink) and I found that it would be helpful to rewrite Julia's frontend (to get rid of some gensyms and record necessary information) if a more static approach is needed. But people working on this told me that Julia is unlikely to move in this direction and will stick to solutions like system image + parallel compilation, which renders my idea unnecessary. Besides, rewriting the frontend is time-consuming and hard to review.

The developers pointed me to this project. This project is really great and looks really promising! I would like to devote my free time to this project, mainly the lowering part (it seems that currently you are focusing on the parser part). I am looking forward to hearing your thoughts on it.

@c42f
Member

c42f commented Feb 20, 2022

Hi @ChenNingCong!

I'd definitely like to tackle lowering as part of JuliaSyntax.jl.

Currently JuliaSyntax.jl doesn't have the data structures in place to enable anyone to start rewriting lowering — the first step will be to figure out what these should be with some prototyping. My plan for such prototyping is roughly:

  • Choose a very restricted subset of Julia syntax (so we don't need to write all of lowering right away)
  • Choose some distinct parts of lowering and write a working sketch of the code so that we get a feel for
    • What the data structures need to be
    • Which algorithms/macros we need to expressively and efficiently write lowering passes
    • How the data flow between lowering passes will work. In particular -
      • How will we preserve precise source code locations as the code is transformed during lowering?
      • How can we write reusable lowering passes which can somehow be used to annotate the original surface-level AST with metadata? (For example, this will be required for refactoring tools.)
  • Read about how other compilers or refactoring tools deal with metadata and AST annotation. rust-analyzer has been a good source of inspiration so far. It may also be useful to read about the Roslyn APIs.

We have been discussing some of these questions already on the JuliaLang Zulip (https://julialang.zulipchat.com) in the #compiler-frontend channel — please join there and read the discussion so far for some context.

In particular

  • We've discussed AST pattern matching and @BenChung has started working on tools for that which will have a big effect on what lowering (and AST manipulation in general) looks like.
  • We've discussed annotating ASTs with metadata using side-tables, potentially using an Entity-Component-System (ECS) -like data structure.
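
As a rough illustration of the side-table idea, here is a minimal sketch in Python (used only to keep the example short; it is not JuliaSyntax.jl code). The `Node` and `SideTables` names, the integer node ids, and the `source_range` and `scope` components are all hypothetical. The point is only that the tree stays immutable while each kind of metadata lives in its own table keyed by node identity, which is roughly what an ECS-like layout would give.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class Node:
    """Immutable syntax node. `id` is a hypothetical stable identity
    that serves as the key into every side table."""
    id: int
    kind: str
    children: Tuple["Node", ...] = ()


@dataclass
class SideTables:
    """One dictionary per metadata 'component', each keyed by node id."""
    tables: Dict[str, Dict[int, Any]] = field(default_factory=dict)

    def set(self, component: str, node: Node, value: Any) -> None:
        self.tables.setdefault(component, {})[node.id] = value

    def get(self, component: str, node: Node, default: Any = None) -> Any:
        return self.tables.get(component, {}).get(node.id, default)


# Tree for the source text "a + b" (byte offsets, half-open ranges).
a = Node(1, "Identifier")
b = Node(2, "Identifier")
call = Node(3, "call", (a, b))

meta = SideTables()
meta.set("source_range", a, (0, 1))
meta.set("source_range", b, (4, 5))
meta.set("source_range", call, (0, 5))

# A later pass can add a new component without touching the tree
# or the tables written by earlier passes.
meta.set("scope", a, "global")
```

A pass that only needs source ranges reads one table and ignores the rest, and a node with no entry for a component simply returns the default.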

To summarize, it'd be great to have people working on lowering but just beware that we might need to try and discard several different prototypes before we settle on a satisfying way to do this. Once we're convinced we've got the right data structures, we'll be able to start porting the whole reference compiler frontend from flisp to Julia.

@ChenNingCong
Author

ChenNingCong commented Feb 20, 2022

Thanks for your quick reply @c42f !

Previously I was trying to use CSTParser.jl as my starting point to rewrite the lowering pipeline (so I directly lowered EXPR to Julia IR). But I quickly found that CSTParser.jl is problematic because it is too type-unstable, which makes standalone compilation impossible (check this issue). Tokenize.jl is a good example of static compilation; I have produced linkable object files from it. Another problem is that the source location information is mainly represented by span and trivia, which are a bit inconvenient to manipulate. Before I knew about this project, I thought the basic design of CSTParser.jl was still fine, if we don't care about performance or the aforementioned problems.

It seems that currently JuliaSyntax.jl is focusing on design of surface AST item, like macro expansion/parsing, which is mainly for better IDE support (correct me if I am wrong). I am more interested in semantic analysis of Julia code. Currently Julia's frontend has become a major obstacle for downstream code analysis. One notable example is debugging. Another is type inference: even if JET.jl can detect the existence of type instability, it's still hard for programmers to locate its cause because information is lost during lowering.

So I am mainly concerned with these two points you have raised above:

  1. How the data flow between lowering passes will work. In particular -
  2. How will we preserve precise source code locations as the code is transformed during lowering?

After reading the conversations on Zulip, I found that they are mainly concerned with macros and the AST matcher (the prerequisites of a reusable frontend), not lowering. I personally don't think AST matcher is a must for the lowering pipeline, we can directly manipulate AST just like the old flisp implementation (that's how I work with CSTParser.jl). Macro expansion has its own problems, but they are hard to solve with a new frontend. The difficulty is intrinsic; it has been discussed by many PL researchers, but the solutions are still unsatisfying. The description on GitHub is also biased toward parsing and doesn't emphasize lowering.

I think it would be helpful to first have a rough working prototype of the lowering pipeline, from which we can derive what information is needed from the parser. Of course, this prototype needs iterations to get the final product but it can provide a platform for further experiments. This is a somehow demand-driven design and I think this is more natural. That's why I ask whether contribution on lowering is needed.

Note: The lowering pipeline consists of two parts. The first part is expand-forms, which outputs Expr and removes syntactic sugar in Julia. The second part is the one that actually emits Julia IR. I think we can first try to port the expand-forms part since it's closely related to the syntax. Emitting of Julia IR needs to modify Julia's IR and will be a huge project.
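
To make the "remove syntax sugar while keeping source locations" idea concrete, here is a minimal sketch in Python (chosen only for brevity; this is neither the flisp `expand-forms` code nor a JuliaSyntax.jl API). The `Node` type, the `span` field, and the `desugar` function are hypothetical. It rewrites an update-assignment such as `x += 1` into `x = x + 1` and gives the generated nodes the span of the form they came from, which is one simple policy for keeping locations through a desugaring step.

```python
from dataclasses import dataclass
from typing import Tuple, Union

Span = Tuple[int, int]  # half-open byte range in the source text


@dataclass(frozen=True)
class Node:
    head: str
    args: Tuple[Union["Node", str, int], ...]
    span: Span


UPDATE_OPS = ("+=", "-=", "*=")


def desugar(node):
    """Rewrite `lhs op= rhs` into `lhs = lhs op rhs`, bottom-up.

    Leaves (strings, ints) are returned unchanged. Nodes created by the
    rewrite inherit the span of the original update-assignment.
    """
    if not isinstance(node, Node):
        return node
    args = tuple(desugar(a) for a in node.args)
    if node.head in UPDATE_OPS:
        lhs, rhs = args
        op = node.head[:-1]  # "+=" -> "+"
        call = Node("call", (op, lhs, rhs), node.span)
        return Node("=", (lhs, call), node.span)
    return Node(node.head, args, node.span)


# Hand-built tree for the source text "x += 1".
x = Node("id", ("x",), (0, 1))
one = Node("int", (1,), (5, 6))
tree = Node("+=", (x, one), (0, 6))

lowered = desugar(tree)
```

Here `lowered` is an `=` node whose right-hand side is a `call` node, and both report the span `(0, 6)` of the original `x += 1`, while `x` and `1` keep their own spans.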

@c42f
Member

c42f commented Feb 21, 2022

> It seems that currently JuliaSyntax.jl is focusing on design of surface AST item, like macro expansion/parsing, which is mainly for better IDE support (correct me if I am wrong).

Parsing is what's been implemented so far, but keep in mind that this project is only a few months old, so the current state of the code is naturally immature. The goal is not just to improve IDE support, but to eventually replace the flisp parser and lowering stages in the reference compiler. This is mentioned at the top of the README.

> I personally don't think AST matcher is a must for the lowering pipeline, we can directly manipulate AST just like the old flisp implementation

Certainly the AST can be matched "by hand". I did this in an old PR which rewrote a fair part of the flisp desugaring pass in Julia Base: JuliaLang/julia#32201. This is also how the existing conversion to Expr works within JuliaSyntax.jl right now. (By the way JuliaLang/julia#32201 wasn't merged because it was a big job which was difficult to do incrementally inside Base.)

Knowing that we want a neater way to do matching and having various aspects of parsing still to complete, I've been content for now to leave lowering until later. But matching is not essential; just nice to have.

> Of course, this prototype needs iterations to get the final product but it can provide a platform for further experiments. This is a somehow demand-driven design and I think this is more natural. That's why I ask whether contribution on lowering is needed.

Right, perhaps I wasn't clear about this. To answer more directly:

  • I'd happily accept quality contributions for lowering. This is in scope for JuliaSyntax.jl.
  • I don't want a full and complete implementation of lowering to start with. I'd prefer to start with some experiments of a partial lowering pipeline (for example, handling a subset of surface syntax forms) to understand the design problem. Then build things up from there.

> Emitting of Julia IR needs to modify Julia's IR and will be a huge project.

There's another option here: we write code to emit the existing Julia IR in order to get JuliaSyntax into Base in the medium term. Then consider how the IR needs to change in the longer term. Like you say, it's a big project. So having some intermediate steps like this is a good thing.

@ChenNingCong
Author

OK, thanks for the clarification! Now I have a much clearer picture of the current situation. I will join the discussion on Zulip and follow the development work there.

@c42f
Member

c42f commented Feb 22, 2022

Excellent, I hope to see you there! Zulip may be the best place for these kinds of project-level questions anyway: there are more people joining the conversation there, including @simeonschaub, @BenChung and @thautwarm, who all have some experience and interest in Julia lowering.
