Qi Compiler Sync Dec 15 2022

Bindings Fixes and Syntax Spec Design

Qi Compiler Sync Dec 15 2022

Adjacent meetings: Previous | Up | Next

Summary

We fixed the existing binding scoping rules to match the design for bindings in Qi. We also discussed some design considerations in the syntax-spec language.

Background

Last time, we completed the initial implementation of bindings. We didn't have a lot of tests for it so these were added in the interim and uncovered some problems and some deviations from the initial specification.

Unbinding Bindings

Loosening Bindings in the Threading Form

Since threading forms are the basis for scoping rules in Qi, last time, we over-exuberantly handled any use of the as binding form within non-threading forms as a syntax error:

flow.rkt> ((flow (~> list (-< vs (as vs)))))
; ...qi/qi-test/tests/flow.rkt:320:23: as: Syntax error in (as vs)
; Usage:
;   (as ...) may only be used inside ~>
;   in: (as vs)

In reality, we only want to ensure that there is a containing threading form, not necessarily that the binding form should be directly within (as opposed to nested within) that threading form. We relaxed this requirement, so that now we get a more appropriate (though more generic) error:

flow.rkt> ((flow (~> list (-< vs (as vs)))))
; vs: undefined;
;  cannot reference an identifier before its definition
;   in module: "...qi/qi-test/tests/flow.rkt"

Allowing Bindings to Escape Tee Junctions

Our initial implementation did not allow identifiers bound in a tee junction to be in scope outside of the tee junction, yielding errors like:

flow.rkt> (~> (3) (-< (as v)) (gen v))
; as: undefined;
;  cannot reference an identifier before its definition
;   in module: "...qi/qi-test/tests/flow.rkt"
; Context (plain; to see better errortrace context, re-run with C-u prefix):
;   ...qi/qi-lib/flow/core/compiler.rkt:218:11

We moved the tee junction syntax inside the threading-floe nonterminal, which allowed bindings to be considered to be within the scope of a containing threading form.

flow.rkt> (~> (3) (-< (as v)) (gen v))
3

Allowing Bindings to Escape Relays

Binding a subset of inputs using a relay wasn't recognized.

((☯ (~> (== (as n) _) sqr (+ n)))
 3 5)
; as: undefined;
;  cannot reference an identifier before its definition
;   in module: "...qi/qi-test/tests/flow.rkt"

This was fixed in the same way as for the tee junction, i.e. by listing relay as a production of the threading-floe nonterminal.

((☯ (~> (== (as n) _) sqr (+ n)))
 3 5)
28

Using a Bound Identifier in a Flow Specification

Using a bound value in the specification of a feedback loop wasn't working:

((☯ (~> (as n) 5 (feedback n add1)))
 3)
; make-list: contract violation
;   expected: exact-nonnegative-integer?
;   given: #<undefined>
;   argument position: 1st
;   other arguments...:
;    #<procedure:add1>

This turned out to be because the implementation of feedback in the compiler evaluated the value n at compile-time. We wrapped the implementation in a lambda to allow evaluation to be delayed until the binding is available, that is, this was the same as the "anaphoric references" problem that was addressed earlier in the context of partial application.

((☯ (~> (as n) 5 (feedback n add1)))
 3)
8

Syntax Spec Design Considerations

Convenient Way to Declare Host Language Syntax

Currently, host expressions are declared using the #:binding keyword:

(gen e:expr ...)
#:binding (host e)

We discussed whether it could be inferred from the syntax class / nonterminal, as something like:

(gen e:host-expr ...)

... or even just:

(gen e:expr ...)

This possibility had been considered before but it would mean that we cannot have distinct syntax classes for matching in this case. It may be that matching host expression syntax is a bad idea in general, though -- an overreach on the part of the DSL -- so this may be OK. TBD.

Backtracking and Nonterminals vs Syntactic Categories

Being able to declare the grammar in a mutually hierarchical way (as in traditional grammar specifications) would allow some flexibility in cleanly separating out "sub-languages." For instance, currently, all of the feedback syntactic rules are present in a single nonterminal (simple-floe) alongside most other Qi forms. Another way could be to have it in a distinct nonterminal:

(nonterminal feedback-floe
  (feedback ((~datum while) tilex:floe)
            ((~datum then) thenex:floe)
            onex:floe)
  (feedback ((~datum while) tilex:floe)
            ((~datum then) thenex:floe))
  (feedback n:expr onex:floe)
  #:binding (host n)
  (feedback onex:floe)
  feedback)

Similarly, we could define the threading nonterminal and binding form as something like:

(nonterminal/nesting threading-floe
  (thread f:binding-floe ...)
  #:binding (nest f nested))

(nonterminal/nesting binding-floe
  f:as-floe
  f:floe)

(nonterminal/nesting as-floe (nested)
  (as v:racket-var ...+)
  #:binding {(bind v) nested})

This would allow us to have all of these nonterminals at the top level inside the floe nonterminal, instead of introducing a simple-floe to "terminate" the hierarchy:

(nonterminal floe
  #:description "a flow expression"
  #:allow-extension qi-macro
  #:binding-space qi
  (gen e:expr ...)
  (~> ((~literal _) arg ...) #'(#%fine-template (_ arg ...)))
  _
  ground

  ... etc ...

  f:threading-floe
  #:binding (nest-one f [])
  f:feedback-floe

  ... and all other Qi forms ...

  (#%partial-application (arg:arg-stx ...))
  (~> f:id #'(esc f)))

But this isn't supported at the moment since the syntax-spec implementation doesn't backtrack. That is, once it commits to a particular nonterminal, if that doesn't end up matching, it doesn't attempt subsequent ones. So for now, it may be more accurate to think of these "nonterminals" as syntactic categories since they represent a strict, non-overlapping hierarchy.

Backtracking may be added in the future, but it doesn't appear to add any functionality that cannot be achieved without it (though in an aesthetically less modular way). The reason it isn't currently implemented is that it introduces a large space of problems that haven't yet been addressed. For instance:

How should errors be reported when an error has been encountered deep in the nonterminal tree? Which nonterminal is the most relevant?
Since each nonterminal may be extended using a macroexpansion form, how should syntax errors in such macros be handled, in the case where the extension form is the same as that of a nonterminal at a higher hierarchy level (e.g. floe) vs the present hierarchy level (e.g. feedback-floe)?

Relationship to Syntax-Parse

We could think of syntax-spec as a higher abstraction level on top of syntax-parse, but there are some differences that make this not a neat conception for now. For instance, although syntax-spec uses syntax-parse as much as possible, syntax-spec has a stronger requirement that there should be a deterministic mapping pat : syntax -> environment with pattern variable bindings (defined by the syntax pattern) as well as from such environments back to their source syntax. This inverse mapping is necessary because syntax-spec must reconstruct the original syntax in many cases. Syntax-parse does not need this, and indeed, maintaining this in typical cases would be unnecessary extra work.

Examples of information-destroying patterns are _ which matches anything, and (~or ...) which binds syntax matching any pattern to the same pattern variable.

In order to preserve this information, it may be necessary for syntax-spec to reimplement some syntax-parse functionality in order to maintain such inverse mappings.

There are also cases where syntax-parse currently does not have enough information to provide language-level error messages like the ones that Qi provides. We implement these as ad hoc rules using syntax-spec, for now, just as we formerly did directly using syntax-parse.

(~>/form (amp f0:clause f:clause ...)
         (report-syntax-error this-syntax
           "(>< flo)"
           "amp expects a single flow specification, but it received many."))

But it would be better if syntax-spec were able to provide such errors on its own, out of the box, by inferring such errors from the declared grammar. It cannot rely on syntax-parse for this, so implementing this may require either parallel work in syntax-spec, or, if it is sufficiently generic, additions upstream to syntax-parse.

It will be some time before the ideal relationship to syntax-parse becomes clear.

Next Steps

Add docs for syntax-spec
Review the bindings PR towards merging it into the compiler integration branch.
Review the initial design for bindings and see what other bindings cases we could support next.
Re-generate the benchmarks showing the performance difference of each Qi form with respect to the main branch, so that we can take stock and start restoring parity towards merging back to main.
Discuss how to, and then put syntax-spec on the package index

Notes

This was moved to Thursday from the usual Friday meeting time to accommodate schedules. We return to the usual Friday meeting time next week.

Attendees

Michael, Sid

Provide feedback

Saved searches

Use saved searches to filter your results more quickly