
Schrödinger's Probe

Qi Meeting Feb 2 2024

Adjacent meetings: Previous | Up | Next

Summary

We discussed the in-progress work on the modular compiler architecture, the possibility of language composition as a long term research goal, and reviewed Qi's guarantees on effects that we are calling "effect locality." We discussed the practical consequences of effect locality, including some curious implications for the behavior of the probe debugger.

Background

Last time, we hashed out the nature of effect locality in more detail. This time, Sid and Dominik were both in Brussels for FOSDEM and were hoping to surprise Michael and other attendees.

A Thousand Miles

Like Vanessa Carlton, Dominik traveled almost a thousand kilometers (we are in Europe and on the metric system) to be in Brussels in time for the Qi meeting. But after driving all day, he reached his hotel, about 6 km away from Sid's Airbnb, right around when the meeting was starting. In the meantime, Sid had scoped out possible local venues for the Qi meeting (including a really neat local coworking space used by some friendly artsy folks), but unfortunately they were all going to be closed and locked at the time of the meeting. So in the end, it made the most sense for everyone to just attend remotely from our hotel rooms and dorm rooms, as usual 😂.

Compiler extension plans

Modular compiler review

We discussed some recent code review comments by Sam P and by Sid. We agreed that the work is overall looking good and will provide a reasonable starting point to begin racket/list deforestation while keeping it decoupled from the Qi core.

Language composition

Looking beyond the immediate horizon, we've talked about language composition before as a potentially more robust and innovative way to achieve compiler extensibility than simple compile-time mutation of a registry of passes.

It's intriguing, but we felt that before we explore it in earnest, it would be good to have some examples showing how the approach would be beneficial. We'd like to convince ourselves that it has practical advantages.

We've already talked about some of the benefits, but it's worth summarizing them:

  1. The ability to extend the compiler without exposing internal implementation details of the compiler (such as internally employed intermediate representations or IRs).
  2. The ability to expose this means of extension to users (rather than only "internally"). Since the composition is performed explicitly by the user, we do not need to guess at a desired composition semantics and thereby risk unmodeled behavior.
  3. The composition is formally well defined and thus theoretically tractable.

In addition to these, there's another:

  4. It avoids violating the esc boundary.

Currently, we literally match uses of Racket list APIs and then deforest the matched patterns. The problem is, we promise that we do not optimize across the esc boundary, and esc designates a departure from Qi into another language, Racket. Optimizing such patterns thus oversteps, and our doing so for list API deforestation was intended as a one-off until a better approach was found.
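
For illustration, consider the kind of pipeline involved (a runnable sketch; whether this exact pipeline is deforested may vary by compiler version):

#lang racket
(require qi racket/math)

;; filter and map below are the ordinary Racket bindings, used in
;; implicitly escaped function positions, yet the compiler matches
;; the pattern literally and deforests it.
(define-flow odd-squares
  (~>> (filter odd?) (map sqr)))

(odd-squares (list 1 2 3))  ; => '(1 9)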

With the modular compiler approach, we could address this in a few different ways:

  (a) Continue doing the same thing, except in a separate library (this would be no different than our current approach).
  (b) Provide identically-named APIs in qi/list to those in racket/list, which we then match literally (still effectively the same approach).
  (c) Provide identically-named APIs in qi/list, but in the Qi binding space (marginally better, as the bindings at least do not collide in the same space).

It's unclear whether any of these approaches would cleanly avoid binding collisions with Racket, but in any case, they all transgress the esc boundary.

On the other hand, with the composition approach, the list transforming operations would be syntactic forms of the qi/list language. Avoiding binding collisions would still present some (perhaps more minor) challenges, but most importantly, this approach honors the esc boundary.

But we could also ask: are there any drawbacks to this approach? Superficial drawbacks are one thing, but could the approach be theoretically more limited in some way?

Drawbacks?

One argument could be that since each pass is modeled as a distinct language, compiler optimizations that would benefit from previous passes could no longer be implemented. For instance, there could be a pass that annotates the intermediate language with arity information, which many other passes could take advantage of: to replace variadic functions with equivalent fixed-arity ones, say, or to perform type analysis and inference.

In such a case, we might wish to have these two optimization passes inserted after the arity pass. Our proposed language composition scheme requires that the output of each pass be a shared core language, which would seem to imply that any such arity information would need to be stripped away, becoming unavailable to subsequent passes.

Yet, we would argue that in this case, arity annotation should not be considered a pass on its own but a component of both of these passes. It should be implemented as a parser that is used as an initial step in each of these passes (see the sketch below). This could inflate compile times by duplicating the work of annotating with arity, but we feel it's necessary to keep these transformations well-defined, self-contained, and tractable. After all, there is no formal contract ensuring that the passes subsequent to the arity pass in the naive approach would produce output that preserves the original arity information in all cases for the benefit of later passes. If there were, it would imply that these passes encode nonlocal information within them, that is, they would not truly be self-contained.
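
To make the structure concrete, here is a toy sketch (all names hypothetical, with a plain S-expression standing in for the core IR, not the actual Qi compiler's representation). The pass performs its own arity annotation as an initial step rather than relying on a prior arity pass:

#lang racket

;; The pass's internal "annotation parser": a local source of arity
;; information, recomputed by any pass that needs it.
(define (arity-of op)
  (case op
    [(add1 sub1 sqr) 1]
    [(cons expt) 2]
    [else #f]))

;; The pass proper: rewrite unary applications (f x) to a hypothetical
;; fixed-arity core form (app1 f x), using the local arity information.
(define (fix-arity-pass expr)
  (match expr
    [(list op arg)
     #:when (eqv? 1 (arity-of op))
     (list 'app1 op (fix-arity-pass arg))]
    [(list parts ...) (map fix-arity-pass parts)]
    [_ expr]))

(fix-arity-pass '(sqr (add1 5)))  ; => '(app1 sqr (app1 add1 5))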

Indeed, it would seem that this is an argument in favor of the composition approach, and it shows us how to construct passes so that they are theoretically and practically sound. Implemented in the suggested way, each pass is fully responsible for annotating the syntax with any metadata it needs to do its job. This leaves no scope for hard-to-trace nonlocal bugs, making the passes more robust.

Commutativity of Passes

In the current version of the proposed composition scheme, it's worth noting that compiler passes are commutative and may be composed in any order by virtue of the shared intermediate representation that forms their input and output. This gives us a lot of flexibility, but we could consider weakening this requirement if there would be any practical benefit from doing so.
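
As a toy illustration (hypothetical passes, with S-expressions standing in for the shared IR): each pass maps the core language to itself, so composition in either order is well-formed, and in this case the two orders also agree:

#lang racket

;; Two toy passes over a shared S-expression core.
(define (inline-add1 e)  ; rewrite (add1 x) => (+ 1 x)
  (match e
    [(list 'add1 x) (list '+ 1 (inline-add1 x))]
    [(list parts ...) (map inline-add1 parts)]
    [_ e]))

(define (inline-sub1 e)  ; rewrite (sub1 x) => (- x 1)
  (match e
    [(list 'sub1 x) (list '- (inline-sub1 x) 1)]
    [(list parts ...) (map inline-sub1 parts)]
    [_ e]))

(equal? ((compose inline-add1 inline-sub1) '(add1 (sub1 5)))
        ((compose inline-sub1 inline-add1) '(add1 (sub1 5))))  ; => #t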

Effect Locality

We discussed effect locality again to understand it more precisely, and whether it is really saying anything useful beyond the obvious.

On the face of it, effect locality says that effects will always occur at the time the corresponding functions are invoked. This seems almost to be a tautology, something akin to "effects will occur when they happen." But we are really asserting an identification of effects with function invocations, admitting no finer granularity than that (the order of effects within a function definition is left as a concern of the host language). This was perhaps best described in last week's notes. At the same time, we are also making certain promises regarding the effect form, which we discussed this time.

Specifically, when an effect form is used, we will treat that as if the declared effect is part of the accompanying function itself. Thus, we encourage separating effects from pure functions (with all the benefits that brings), while still preserving when effects will be executed.

We looked at an example:

(~> (pass f) (>< g))

We could (and indeed do) optimize this using a values-based deforestation to:

(>< (if f g ⏚))

But what happens when effects are present?

(~> (pass (effect displayln odd?)) (>< (effect displayln sqr)))

Here, naively, the order of effects for an input list (1 2 3) is:

(odd? 1), (odd? 2), (odd? 3), (sqr 1), (sqr 3)

… where we designate an effect by the corresponding function invocation with which it is identified, as we did last time.

Once deforested, it becomes:

(odd? 1), (sqr 1), (odd? 2), (odd? 3), (sqr 3)

This is acceptable as the effects behave as if they were included in the definition of the odd? and sqr functions. Specifically, the effects are invoked with the same original arguments, the same number of times, as if the effects had been implicit in the functions.
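
To make this concrete, here is a runnable version of the example (both orders satisfy locality; which one you observe depends on whether the compiler actually deforests the pipeline):

#lang racket
(require qi racket/math)

;; The pipeline from above, with displayln as the declared effect.
(define-flow demo
  (~> (pass (effect displayln odd?))
      (>< (effect displayln sqr))))

(demo 1 2 3)  ; displays 1 2 3 1 3 (naive) or 1 1 2 3 3 (deforested),
              ; and returns the values 1 and 9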

On the other hand, this:

(~> (effect displayln (pass odd?)) (effect displayln (>< sqr)))

… exhibits the effects:

(pass-odd? (1 2 3)), (><-sqr (1 3))

(where, once again, we designate each effect by the accompanying function invoked and the arguments provided to the invocation, which in this case are lists)

This would not be deforested, since there is no way to do so that would preserve effect locality. For instance:

(>< (if (effect displayln odd?) (effect displayln sqr) ⏚))

would result in an order of effects:

(odd? 1), (sqr 1), (odd? 2), (odd? 3), (sqr 3)

and

(effect displayln (>< (if odd? sqr ⏚)))

would result in:

(><-if-odd-sqr (1 2 3))

In each case, the effect is invoked a different number of times, and with different arguments, than if they had been implicit in the original function definitions.

So as far as effect goes, we promise that declared effects will always happen at the same time as the calls to the annotated functions for a given set of arguments (that is, either right before or right after, whichever we decide). The annotated function may be a simple Racket function like odd? or sqr, or it may be a composed flow like (pass odd?). But whatever it is, the effect will be treated as if it were "located within" the definition of that function. Most simply, our promise regarding effect could be expressed as:

A function f containing an effect e is indistinguishable from (effect e f') where f' is f without e.

Without this promise, we would be free to juggle declared effects without changing the meaning of the underlying pure flow (we could call a flow with all effects removed the "pure projection" of the flow), and we could, say, move all the effects so that they happen at the end. The above guarantee prevents such transformations.
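
Concretely, the promise says that the following two flows should be observationally identical, in outputs as well as in effect order and arguments (a minimal sketch, writing the version with the effect baked in using esc):

#lang racket
(require qi racket/math)

;; f containing the effect e, written directly in Racket:
(define-flow with-implicit-effect
  (esc (λ (x) (displayln x) (sqr x))))

;; (effect e f'), where f' is the pure projection of f:
(define-flow with-declared-effect
  (effect displayln sqr))

(with-implicit-effect 3)  ; displays 3, returns 9
(with-declared-effect 3)  ; displays 3, returns 9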

The Probe Problem

Effect locality is in some sense the simplest and minimum guarantee that we can provide that is still formally well defined and which feels "natural" for functional programming. But, as we're learning, it has some surprising implications, including for tools like the probe debugger.

probe allows us to place readouts anywhere in a flow expression and see the values flowing at that point, thus providing a means to trace execution. Hendrik pointed out that adding such readouts could affect the semantics of the program by changing the order of effects. For example:

(define-flow foo (~> (pass (effect E₁ odd?)) readout (>< (effect E₂ sqr))))

(probe (foo 1 2 3))

What should happen here? With the readout, all the effects E₁ would occur first. Without the readout, the flow would be deforested and the effects E₁ and E₂ would be interleaved, as we saw earlier. Either order is consistent with locality but they are different. Indeed, the readout would not even represent a valid point in the actual execution of the program, which for reference would be:

(>< (if odd? sqr ⏚))

We have a couple of options on how to handle this:

  1. Should it be illegal to place the readout there? That is, we could raise a syntax error.
  2. We could issue a warning in the docs that readout may block optimizations and thus not always represent the true state of the running program.

The former seems the most conservative but also possibly tricky to implement, as it may require knowledge of the compiler's optimizations. The latter is easiest but allows readouts to possibly mislead users. Yet, it could be said that Nature itself works this way, as we see in Young's double slit experiment. The act of measurement changes the program, producing different results. So maybe this is a caveat we can include with probe: it is a measurement tool that works a certain way, and like any measurement, its use changes the program and may in certain cases cause its behavior to differ, in ways that we can understand.

Next Steps

(Some of these are carried over from last time)

  • Review and merge the modular compiler architecture
  • Incorporate the new benchmarking suite into the SDK (including Makefile targets) and CI workflows
  • Review whether we can continue optimizing a single syntax node at many levels
  • Preserve the source syntax through expansion in Syntax Spec.

Attendees

Dominik, Hendrik, Michael, Sid
