Skip to content

Qi Compiler Sync Aug 26 2022

Siddhartha Kasivajhula edited this page Dec 15, 2022 · 11 revisions

Core Stratification and Bindingspec Hacking

Qi Compiler Sync Aug 26 2022

Adjacent meetings: Previous | Up | Next

Summary

We spent our time (about 4 hours!) almost entirely on pair programming to incorporate the bindingspec meta-DSL into the Qi source (WIP Pull Request). Along the way, we made the distinction between core and non-core forms of the language more clear, introduced some new ways to handle certain core forms, and uncovered some issues that we had to work around which may warrant additional consideration down the line.

Background

As a hosted DSL, Qi consists of a single Racket macro that expands Qi syntax to Racket syntax. This macro contains about 60 forms that together comprise the Qi language. That is, every one of those forms is a rule in a single macro (the flow macro).

With the recent work to add macro-extensibility to the language, it became possible to write Qi macros that extend the syntax of the language in much the same way that Racket macros extend Racket -- that is, the syntactic extensions have the same status as "built-in" forms. Side by side with the ability to allow users to extend the language, this development also allows Qi itself to be reorganized internally in terms of "core forms" and "extensions" (built-in macros). The benefit of doing this is that it allows us to decompose the expansion phase of (the Racket macro) flow into two components -- expansion of Qi macros (not Racket macros) to "core" Qi forms, followed by compilation of the core language to Racket, bringing many benefits including the ability to improve performance.

In the last meeting, we had separated the ~60 forms of the language into about 30 core forms and 30 macros. This laid the foundation to start exploring using the bindingspec meta-DSL which models these two layers concretely and thus provides many benefits including automatically generating a "good" expander.

Since this was almost entirely a pair programming session, these notes will attempt to describe what that involved, including any useful notes for future hacking sessions.

Stratifying the Qi Core

In order to incorporate bindingspec, the Qi core needed to be properly stratified in terms of core and non-core forms.

Qi Source Refactor

The code was refactored before the meeting to explicitly distinguish the core language from the extensions. This is the current layout of the code.

├── flow
│   ├── aux-syntax.rkt
│   ├── core
│   │   ├── compiler.rkt
│   │   ├── impl.rkt
│   │   └── syntax.rkt
│   └── extended
│       ├── expander.rkt
│       ├── forms.rkt
│       ├── impl.rkt
│       └── syntax.rkt
├── flow.rkt
├── info.rkt
├── macro.rkt
├── main.rkt
├── on.rkt
├── private
│   └── util.rkt
├── switch.rkt
└── threading.rkt

The main thing to note here is that core/compiler.rkt expands the core Qi language to Racket, while extended/forms.rkt contains syntactic forms that were formerly lumped together with the core forms in the original flow macro, but which are now implemented as Qi macros (for reference: A more complete description of these modules).

We worked on adding bindingspec to the formerly empty flow/extended/expander.rkt module. As part of that, we needed to make some modifications to the core forms. Those are described below.

Tagging Special Core Forms

Some Core Qi forms (i.e. in core/compiler.rkt) facilitate partial application of Racket functions. To do this in a tractable way that could be properly modeled as a core form, we chose to adopt Racket's approach for this kind of thing -- interposition points. So now:

  • Blanket template patterns (a ... __ b ...) are tagged by the expander as (#%blanket-template (a ... __ b ...).
  • Fine-grained template patterns (a ... _ b ...) are tagged as (#%fine-template (a ... _ b ...).
  • Partial application patterns (a ...) are tagged as (#%partial-application (a ...)).

This allows the compiler to define #%blanket-template, #%fine-template and #%partial-application as core forms.

Additional Delineation of Core Forms

  • The rule to match function identifiers (and interpret them directly as Racket functions) was demoted to a non-core expansion rule by wrapping the expansion using esc (which is a core form), so, a rule f -> (esc f).
  • The rule to match literals in the compiler was moved up to the expander as a rule v -> (gen v).
  • The Qi macro-expansion rule which was formerly in the core was removed since it is now generated by the (bindingspec-produced) expander.

Bindingspec Integration

Specifying the Core Language Grammar

We spent a lot of time initially on declaring a grammar for the core language in the bindingspec meta-DSL, declaring the syntax of core Qi forms one at a time.

Once we were done with that, we were able to generate the required Qi expander simply as:

(define (expand-flow stx)
  ((nonterminal-expander floe) stx))

... where floe was defined as a nonterminal in the grammar specification and entailed the core language.

Issues Encountered

The rest of the time was spent debugging various issues. Some common issues encountered were:

Exclusive Use of Core Language in the Compiler

In the core (compiler) level, only core forms may be used. With the additional delineation of core forms above, there were many core form productions that needed to be modified as they were no longer exclusively in terms of core forms. In particular, any function-valued identifier needed to be changed from func to (esc func) since identifiers when used on their own are now expanded so that they are wrapped by esc in a production rule in the expander.

Likewise, core forms could no longer use literals directly, since these, too, were now expanded to (gen ...) at the expansion stage. All instances of direct use of literals had to be modified from v to (gen v).

Similarly, use of partial application templates had to be wrapped in the relevant "interposition" core form, e.g. (#%fine-template ...).

Name Collision Betweeen Qi and Racket Forms

Bindingspec takes a strict position on such collisions and assumes that the user meant to use the DSL form, raising an error if it doesn't match the expected syntax. In Qi, since syntax that doesn't match one of the forms of the language may still match a partial application of a Racket function which may happen to share the same name, in some cases we may wish to explicitly signal this possibility and allow the use of partial application syntax for these Racket functions.

We wrote a hack to achieve this, by naming the actual core form something different (e.g. appleye instead of apply), and then handling the original name of the form as a production at the expander level, by expanding the Qi syntax (apply used as an identifier) to a use of the core form (appleye). This allows the use of it as partial application to be handled correctly since it is no longer considered a Qi form.

Debugging Strategies

Some strategies employed were:

Bisection to find problematic expansion rules

"Bisect" to a problematic expansion rule in the bindingspec specification by adding a rule to match anything and raise an error (e.g. (~> v:expr (begin (displayln "hello!") (error 'bye)))), and moving this rule around until the source of the original error is identified.

Printing the expander inputs and outputs

(define (expand-flow stx)
  (displayln (~a "input: " stx))
  (let ([result ((nonterminal-expander floe) stx)])
    (displayln (~a "output: " result))
    result))

Printing the compiler inputs and outputs

(define-syntax (qi0->racket stx)
  (let ([result (syntax-parse (cadr (syntax->list stx))
    ... compiler rules (i.e. core form expansions) ...
    (displayln (~a "qi0->racket output" result))
    result))

Getting Detailed Syntax Info on Suspect Patterns

... by adding this to parse the input to expand-flow:

(define (expand-flow stx)
...
  (syntax-parse stx
      [(a:id . _) (displayln (~a "syntax info: "
                                 (syntax-debug-info
                                  ((make-interned-syntax-introducer 'qi) #'a))))]
      [_ (void)])
...)

Getting Source Location Information for Syntax Errors

(raise-syntax-error #f "Error!" this-syntax)

instead of using Qi's report-syntax-error utility. TODO: Modify report-syntax-error to show source location information.

Bindingspec Improvements

Quirks of Binding Spaces

We ran into an issue where the threading interface macro ~> was in some way shadowing the ~> defined in the qi binding space. This has something to do with the way binding spaces use the "set of scopes" model, that is, by simply augmenting the existing set of scopes with a new scope. Thus, in some sense they are not totally independent, and that could lead to problems like this one.

Workaround: We used different names for the Racket-level threading forms R~> and R~>> instead of ~> and ~>> but use rename-out at the module level to restore the correct names.

Syntax Properties and Syntax Parameters

Qi uses a chiral syntax property to map a syntax object to a version that is aware of its "chirality", or the side on which it accepts arguments. This is modulated by the right-threading form (~>>) which makes component forms right-chiral, where otherwise, by default, forms are left-chiral. Bindingspec doesn't currently copy over syntax properties, and adding that would be necessary to support this case.

But a potentially better implementation of the chirality behavior is to use syntax parameters instead, so that, within a threading form, all forms are parameterized as left- or right-chiral, depending on the threading form, and this would be overridden by nested threading forms that would modulate the chirality in either direction. Bindingspec doesn't support syntax parameters yet, either, and it might be a more involved effort. This support may also be added.

Compiler Resources

Danny shared this paper as a good introduction to writing compilers:

An Incremental Approach to Compiler Construction

He's also using bindingspec's predecessor ee-lib in the typed-nanopass project.

Next Steps

  • Support syntax properties and/or syntax parameters in bindingspec (see above), and resume the integration.
  • Review what's necessary to be able to support a define-qi-alias (i.e. binding-space specific) form similar to define-alias (which uses make-rename-transformer).
  • Write down some example optimizations (pseudocode) that we'd like to see in the Qi compiler, without worrying about the implementation yet.
  • Review CI failures resulting from exported Qi macros (e.g. this and this).

Participants

Benjamin, Danny, Figgy, Michael, Sid, Super

Clone this wiki locally