Skip to content

Qi Compiler Sync Sept 30 2022

Siddhartha Kasivajhula edited this page Dec 15, 2022 · 5 revisions

Designing Bindings for Qi and Bindingspec Syntax

Qi Compiler Sync Sept 30 2022

Adjacent meetings: Previous | Up | Next

Summary

We discussed some examples for bindings in Qi, considered the binding model they each implied, and seemed to converge on a design. We also discussed Bindingspec syntax and how the expander is implemented at a high level.

Background

Qi doesn't have bindings and this may lead some to believe that Qi is pointfree and proud to remain so. But that's not the case! Qi is more concerned with clarity, and while having names in some cases obscures the underlying process, in other cases, they reveal it more clearly. This is especially the case for:

  • incidental values in the flow
  • intersecting paths in the flow

The former refers to values that are necessary to the result but don't participate in the main flow. They lead to nonlinearity in the flow structure that we'd prefer to factor out.

The latter refers to cases where an intermediate result in one part of the flow is desired downstream, but where laying it out in "two dimensions" is not trivial. In such cases, without there being a "multidimensional Qi" available, we'd prefer to have a way to avoid modeling the structure.

Bindings are one answer here.

Examples

Basic

In this example, we name an intermediate value computed in the flow using the proposed as form and refer to it later.

((☯ (~> (-< (~> list (as vs))
            +)
        (~a "The sum of " vs " is " _)))
 1 2)
;=> "The sum of (1 2) is 3"

"State Monad"

In this example, we perform a series of operations on an input number, while also accumulating a list representing the sequence of resulting values. We want to do this in a way that doesn't require modifying the original (numerical) functions.

(~> (5)
    (-< _ (~> list (as S))) ; assuming `as` produces no output so that only one value flows
    (-< sqr (~>> list (append S) (as S)))
    (-< add1 (~>> list (append S) (as S))))
;=> 26, '(5, 25, 26)

More examples we didn't talk about are in Add the ability to define bindings.

Initial Design Goals

Threading should bind successors

(same as "basic" example above)

((☯ (~> (-< (~> list (as vs))
            +)
        (~a "The sum of " vs " is " _)))
 1 2)
;=> "The sum of (1 2) is 3"

Since flows may be arbitrarily nested, it seems necessary that an as binding is always scoped to the outermost containing threading form. In the above example, the binding is scoped to very first threading form rather than the nested one, and that's why it is available in the final ~a result-formatting flow.

Tee junction should not bind peers

(~> (1 2)
    (ε (~> list (as vs)))
    (-< (~> (>< sqr) list (as vs))
        (+ (first vs) (second vs))) ; this should refer to the _prior_ value
    (~a "The new list is " vs " and the sum of the original values is " _))
;=> "The new list is (1 4) and the sum of the original values is 3"

Should as propagate the bound values?

(as v) could either produce no output or it could propagate its inputs. In the latter case, we could use it as (effect (as v)) instead of (-< (as v)) to ground the output.

Binding Spec

Tree vs DAG Binding Models

The "not binding peers" design goal implies a DAG binding model instead of a tree binding model. It would introduce some cases of ambiguity including:

  • downstream of the tee junction, would identifiers be bound by one of the tines of the junction, or by binding forms upstream of the tee junction?
  • if more than one tine binds an identifier, which one binds downstream flows? (Note the "diamond" shape).

These could be resolved but they would require a DAG binding model (a DAG is necessary to express a scope that is contained in two other scopes) which is currently not supported in bindingspec. In order to transform this into a tree model we'd need to either say that:

  • bindings remain within a tee junction and don't bind identifiers downstream of the tee junction.
  • earlier tines of the junction bind later tines.

The latter option seems acceptable. If there are cases where we would like to treat the different tines of the tee junction as independent (e.g. in compiler optimizations, or in Qi dialects that use independent processes for each flow in the tee junction, which could be "joined" prior to generating output -- like "process composition" instead of "function composition"), then there could be an extra check in the compiler to see whether such optimizations are possible, i.e. whether any of the tines bind identifiers used in other tines.

Why is it necessary to notate the binding rules in the expander?

The scoping rules of the DSL are not necessarily the same as those of the underlying language. Especially considering that the DSL may compile to arbitrary core language syntax that may naively imply arbitrary scoping rules, it's necessary to preserve the scoping rules of the DSL through those compiler transformations. By notating the binding rules in the expander (using bindingspec), it ensures that the appropriate scope sets are attached to the expanded syntax so that intended scoping rules and hygiene are preserved through compilation.

Notating the Scoping Rules in Bindingspec

This is done by writing out a template of a tree that we want the binding structure to be shaped as.

Nonterminals vs Syntax Classes

Nonterminals declared using bindingspec are compiled to nonterminal expander functions which take care of expansion including applying binding rules. Once the input syntax has been matched to a particular production rule, the corresponding expander function is invoked with the input syntax. Indicating a nonterminal using : doesn't use any form of matching, however (as syntax classes do when used with similar : syntax) -- it just indicates the expander function to use. Once the expander function has been delegated to, the expander does not try to match subsequent production rules if this ends up failing for some reason. As a result, there can't be two rules that have the same form but indicate different nonterminals (but using different syntax classes is OK?).

Next Steps

  • Implement bindings as a distinct pass in the compiler
  • Notate the binding syntax in bindingspec

Attendees

Michael, Sid

Clone this wiki locally