Qi Compiler Sync Nov 24 2023

A fine-grained migraine

Qi Compiler Sync Nov 24 2023

Adjacent meetings: Previous | Up | Next

Summary

We noted that "fine-grained" argument templates are not currently supported by the deforestation implementation, which operates specifically on "right-chiral" expressions arising from use of ~>>. We also discussed some improvements to testing compiler optimization rule application, and some challenges in providing helpful and accurate error messages to users after having applied optimizations.

Background

Last time, we identified some items to focus on for release, which included error messages and testing.

Deforesting fine-grained templates

Chirality vs Templates

Qi allows us to easily lay out nonlinear sequences of functions that may be applied to input data. A typical functional pipeline that operates on an input collection such as a list is one such sequence -- one that we happen to be interested in "deforesting" in our optimizing compiler. But even in this specialized case, as functions syntactically treat different argument positions differently, there isn't a standard way to know which argument position to supply the input collection in.

Qi provides three ways to indicate this:

Chirality
Fine-grained templates
Blanket templates

Here is a typical functional pipeline:

(~>> () (range 10) (filter odd?) (map sqr) (foldl + 0))

The use of the right-threading form ~>> is a way to implicitly indicate that the inputs are to be supplied to the component expressions on the right side, and we say the expressions are right chiral. The left-threading form ~> is likewise a way to implicitly indicate supplying arguments on the left side, and we say the expressions are left chiral. Note that, if no arguments are pre-supplied, left- and right-threading are identical, as chirality does not affect the order of arguments in any way. Note also that, once the expression is expanded, the chirality is an attribute of each component expression (determining how it handles input arguments) and not of the "type" of threading, which, in both cases, is standard function composition in left-to-right order. As illustration, the core Qi form that implements this is simply called thread, rather than left-thread or right-thread.

So far in our implementation of deforestation, due to the standard function signatures of filter and map expecting the input list on the right side, we specifically check that the chirality of expressions is right, before deforesting them. Yet, the above expression could also be written this way:

(~> () (range 10) (filter odd? _) (map sqr _) (foldl + 0 _))

The component expressions here are all left-chiral, but through the use of fine-grained templates (i.e. the second method listed above), this expression is effectively saying the same thing as before. Fine-grained and blanket templates, being more explicit, always override chirality. We should deforest this expression, too, but we currently don't.

Likewise, this expression is equivalent and isn't currently deforested:

(~> () (range 10) (filter odd? __) (map sqr __) (foldl + 0 __))

Chirality as a reducible concept

A main idea guiding the stratification of the language into separate, explicit expansion and compilation phases was that the expansion phase deals with macros, which include many bundled forms (just as many "Racket" forms are actually built-in macros), while the compiler deals only with core forms.

We went to great lengths to ensure that this would be the case in the early days of the Qi compiler, for instance, rewriting, during expansion, an expression containing an argument template to an explicit use of a new #%fine-template core form.

In a way, the concept of chirality is an exception to this rule. As we saw earlier, it is implemented as a syntax property attached by the use of a macro like ~>>, and then, in the compiler, instead of relying exclusively on core forms like thread, we also read any properties on the expression to find out its chirality.

This could potentially be made into an explicit use of the core language by rewriting the implicit chirality of each expression into an explicit use of a fine-grained or blanket template. Another benefit could be that we would have fewer pattern matching forms in deforestation, since we would only need to handle the cases of explicitly declared argument positions. For instance, we could always rewrite:

(~>> () (range 10) (filter odd?) (map sqr) (foldl + 0))

... to:

(~> () (range 10 __) (filter odd? __) (map sqr __) (foldl + 0 __))

... and then handle it via "vindaloo"-style currying that we already use for handling (implicit) chirality as well as in our current code-generation step (i.e. the qi0->racket macro) for the blanket template.

Combinatorial explosion

To handle fine templates in deforestation, it's easy enough to add extra syntax patterns for stream transformers in order to support them, since we can always assume a single argument mid-stream of a functional pipeline. But it turns out to be more problematic for producers.

The only producer we support today is range, which has the following signature:

range
(range n)
(range low high)
(range low high skip)

Consequently, the patterns we would need to match to deforest fine templates include:

(range _)
(range _ _)
(range _ _ _)
(range _ high _)
(range low _ _)
...

etc. And there are only around a couple of dozen of these, give or take.

The good news is, this only affects producers, since for transformers like map and filter, even without reducing chirality to a template, we need only write a couple of extra match patterns.

But for this to be scalable, depending on how many other stream producers there are likely to be, it might make sense to write some kind of compile-time macro that expands to a syntax class containing all of these combinatorial patterns.

Deforesting racket/list?

Although the core deforestation approach of stream fusion is largely unchanged from our initial implementation, there has been a lot of work on categorizing and supporting many input patterns so that they all benefit from this optimization. In further broadening the scope of this optimization, we recently identified a number of interfaces in racket/list that are amenable to it. There are a lot of them!

One concern as we consider extending deforestation to these interfaces is that map, filter and foldl/foldr are part of racket/base rather than racket/list. Including deforestation for racket/list in the Qi compiler effectively caters to users who happen to be doing a lot of specialized list operations. By deforesting these interfaces, we would introduce a lot of code in the compiler (perhaps inflating startup times) that is not going to be relevant to every user.

So, an argument could be made for us to only focus on racket/base in the main Qi distribution, and allow the use of libraries to incorporate additional optimizations for lists or other data structures, using an extension scheme we have yet to define.

On the other hand, we also noted that range is already in racket/list but not in racket/base, so a case can be made that extending deforestation to all of racket/list is a quantitative rather than qualitative change.

There are a few options:

Support all of racket/list in the main Qi compiler (after initial release).
Define an extension mechanism to allow something like (require qi/list) to augment the compiler and implement deforestation for lists using this approach.
Focus on racket/base, and employ the "loophole in Qi space" to define range as a Qi function. This could avoid racket/list in the core deforestation machinery.

There seems to be increasing evidence that being able to extend the compiler at the user level would be a desirable and useful thing (even if done in a way that only "internal" Qi libraries may avail of it). Lists can be seen as the prototype data structure for deforestation, and so they provide a convenient and representative starting point that we could implement in this extensible way (after the initial release). So perhaps a combination of (2) and (3) is the right long term approach.

Errors in the Source Code

We've talked before about how an important concern with writing an optimizing compiler, which rewrites user code to something totally different, is producing good error messages in terms of the "source" code rather than the target code (aside from working on compilers, we perhaps take the term "source code" for granted!).

Indeed, we do see this problem manifest when deforestation is applied.

Formerly:

(~>> ("hello") (filter odd?) (map sqr))

; filter: contract violation
;   expected: list?
;   given: "hello"

After deforestation:

(~>> ("hello") (filter odd?) (map sqr))

; car: contract violation
;   expected: pair?
;   given: "hello"
; Context (plain; to see better errortrace context, re-run with C-u prefix):
;   /Users/siddhartha/.emacs.d/straight/build/racket-mode/racket/repl.rkt:348:0 configure/require/enter

This is obviously totally unhelpful. But with some clever use of the low-level contract API, Dominik was able to improve it to this:

; contract violation
;   expected: list?
;   given: "hello"
;   in: the 1st argument of
;       (-> (listof any/c) any)
;   contract from: list->cstream
;   blaming: (thread (#%partial-application ((#%host-expression filter) (#%host-expression odd?))) (#%partial-application ((#%host-expression map) (#%host-expression sqr))))
;    (assuming the contract is correct)
;   at: /Users/siddhartha/work/qi/qi-lib/flow/extended/forms.rkt:62:2
; Context (plain; to see better errortrace context, re-run with C-u prefix):
;   /Applications/Racket-Latest/collects/racket/contract/private/blame.rkt:346:0 raise-blame-error

This is much better as it is helpful and provides enough context to identify the error, but the expression being blamed is still not a source expression but something apparently derived from the source expression.

To understand what is going on, recall that there are two stages in parsing Qi:

Expansion of Qi to Core Qi
Compilation of Core Qi to Racket

(Followed by, for comparison, the exact same steps for Racket to Core Racket, and Core Racket to Chez Scheme)

The output of Qi's expander implemented in syntax-spec resembles:

(thread
 (esc (#%host-expression sqr))
 (esc (#%host-expression add1)))

In the second version of the error above, what we are seeing is the error in terms of the target code, that is, the error in terms of Racket.

In the improved (third) version above, what we are seeing is the error in terms of Core Qi.

Could we take another step to provide the error in terms of the source code?

The right way

In order to do that, we would need to have some way to access the source expression after the expansion stage is complete. Yet, the output of the expander, which is the input to the compiler, does not appear to include that full source context.

One way that might work for our purposes is if the syntax object that is the output of syntax-spec were to retain a syntax property that contained the source expression. Then, we would be able to include that in our compiled templates for the error messages.

A hacky way

Another way that we could do today is to "de-expand" the core Qi expression prior to generating the error message. For instance, in the above "improved" error message, we see this expression being blamed:

(thread (#%partial-application ((#%host-expression filter) (#%host-expression odd?))) (#%partial-application ((#%host-expression map) (#%host-expression sqr))))

Many of these are simply "tags" or "interposition points" that were introduced for the sake of defining an explicit core language, and these may simply be elided. It would be fairly trivial to walk this tree and do that:

(thread (filter odd?) (map sqr))

Of course, it would take more work to recover the information regarding left or right threading, so in general, this is of course not an ideal fix.

We agreed that we don't want to overcommit to our current approach to deforestation by adding this kind of fix, since it may be that a better way would be available in the future, and we don't want to risk "bugs becoming features."

At the same time, there is a broad sentiment that providing good error messages is a requirement, and we don't want to regress too far on that. So, something to think about further!

Simpler compiler tests

We've talked about ways to test the compiler in the past, and settled on four types of "tests" as being necessary:

Normal unit tests validating semantics
Benchmarks validating performance improvements
Compiler unit tests to validate that rewrite rules are applied
"Functorial" compiler tests to validate normalization-like rules

We haven't been especially satisfied with (3), since for one thing, any minor changes to the implementation cause changes to the tests since they are so closely coupled to the exact expansions and compilations. And for another, while it's possible to validate using this approach that certain rules are applied (by pedantically modifying the test to match the actual observed compiled expression!), it's not easy to detect that certain rules were not applied (since those compiled versions are not available if we haven't implemented it!). Put another way, it isn't easy to write a legitimate failing unit test (e.g. when employing a test-driven style).

To address this, we agreed that we could modify the tests in (3) to instead simply assert whether a rule has been applied, using whatever abstract predicate we define, as (1) and (2) already validate correctness. This also feels a bit hacky, since the predicate must be fairly specialized to the expected target expression (and indeed, in its current implementation, it converts the expression to a string and checks for the presence of the word "cstream" to validate that deforestation was applied), but it at least addresses both of the issues we identified with the approach, and feels no more hacky than the approach already did.

More encouragement to inline

Last week we noted that function definitions needed some extra encouragement to be properly inlined by the Chez Scheme compiler, but we didn't know why. Reader Wing Hei Chan (alias usao) suggested an explanation on Discourse, that it may have something to do with the fact that we are using higher-order functions in our implementation, as those variable references may not get inlined. They also suggested a possible addition to Racket to provide a stronger inlining form like define-inline that does inline such references and in addition also supports inlining case-lambda.

Altering the flow of the Sun with Qi

It's starting to get cold up here in the Northern hemisphere, and we felt that we should appease the Sun by releasing the compiler work before December 22 (to pick a date at random[1]), to convince it to flow back in our direction. We identified these items remaining to be addressed for release:

Documentation
More tests, lots of tests
Review of normalization for correctness
Review of our current "require latency" to see if deforestation has had any impact there

[1] Dec 22 is the solstice, and also happens to be a Friday, i.e. a day on which we're likely to have a Qi meeting 😉

Next Steps

(Some of these are carried over from last time)

Update the main documentation to reflect the compiler work
Investigate require latency
Review compiler normalization pass for correctness
Add additional nonlocal benchmarks to validate all cases optimized by stream fusion (e.g. for range).
Add test cases for the backwards incompatibility on datum vs literal matching, and review whether there is a way to avoid errors here by favoring the Qi binding space in the expander.
Prototype and then propose map1 for addition to Racket.
Add a "long functional pipeline" benchmark.
Continue to integrate macroexpansion events into compiler rewrite rules to afford visibility in e.g. the Macro Stepper.
Provide good error messages in optimized (e.g. deforested) expressions in terms of source expressions.
Validate that optimization passes can prematurely terminate if they don't return false when they don't transform the input (and then fix it).
Implement streams for the other list operations described in St-Amour's writeup (i.e. unfoldr zip average). Note: implementing average might benefit from having "multi-valued CPS" for deforestation.
Write unit tests for more compiler rewrite rules, as well as end-to-end tests for pairs of input expressions, and sanity tests for the expander.
Find a way to reliably detect list-like operations on values.
Rewrite these list-like operations to list-based functional pipelines, e.g. (~> (-< (>< f) (>< g)) ...)
Generalize the stream implementation to support multiple values at each step (in addition to zero or one) and recheck the performance.
Review other core Qi combinators besides ~> (e.g. -<, == and ><, maybe if and pass) and see whether it would make sense to extend fusion to them.
Should Qi provide a map utility (overriding the default one) that supports "0 or more" values (as was discussed in the RacketCon talk)?
Design a core continuation form(s?) for Qi, and show how try can be implemented in terms of it.
Describe the sequence of compilation for input expressions with the optimizing compiler against the reference compiler, showing cases in which these do not terminate.
Come up with a methodology to formalize and theoretically validate compiler behavior.
Review other bindings cases that we could support next (e.g. selective binding, packing values, and binding exceptions in try).

Attendees

Dominik, Sid

Provide feedback

Saved searches

Use saved searches to filter your results more quickly