-
Notifications
You must be signed in to change notification settings - Fork 63
Design Wiki
These are design notes roughly in the order that we would like to deal with the challenges.
Patterns can compile themselves into the required test.
(LiteralPattern. true) ; -> (= y true)
(SeqPattern. foo) ; -> (seq? x)
(VectorPattern. foo) ; -> (vector? x)
(PredicatePattern. even?) ; -> (even? x)
Quoted symbols are simply symbols, unquoted symbols are scoped.
(let [y "a"
x "a"]
(match [y]
[x] 1)) ;;; x is scoped
;=> 1
(let [y 'x]
(match [y]
['x] 1)) ;; 'x is 'x
;=> 1
(defn match-seq [x]
(match [x]
[()] 1
[(1)] 2
[(1 _)] 3
[(1 & _)] 4))
(match-seq [])
;=> 1
(match-seq [1])
;=> 2
(match-seq [1 2])
;=> 3
(match-seq [1 2 3])
;=> 4
(def matrix (build-matrix [x y z]
[(1 2) 3 4] 1))
(specialize matrix (seq-pattern '(1 2)))
;; should produce a new matrix with a row that looks like
;; [1 2 <crash> 3 4] :occurrences [x0 x1 xr y z]
This new matrix should have a constraint that enforces left right order until the seq occurrences are cleared. Simple way, we could put metadata on the seq occurence symbols ^:seq-occurrence, if the first occurrence has this metadata we always specialize the first column.
Perhaps we should also add metadata on the occurrence so we know how to bind this occurrence in the switch node.
^{:bind-expr '(if (nil? x)
(fail-node)
(let [x0 (first x)
x (rest x)]))}
Vectors support random access, there's no reason to not use the full power of lazy pattern matching semantics on them.
(defn match-seq [x]
(match [x]
[[]] 1
[[1]] 2
[[1 _]] 3
[[1 & _]] 4))
Note that seq patterns and vector pattern can work with either datatype, but using the correct datatype will deliver considerable performance advantages.
I think we need to solve the maximal sharing issue sooner rather than later as this effects compilation to SwitchNodes. At the moment we naively compute the decision tree of SwitchNodes. This means that when we compile to Clojure source we'll produce a very large amount of code.
What's our strategy to "collapse" tests together? Maranget's paper suggests implementing maximal sharing via hash-consing. Type-Safe Modular Hash-Consing
As far dealing with recur in the presence of maximal sharing I'm leaning towards assigning values into atoms. The performance hit seems acceptable. Definitely open to better ideas.
For example of where maximal sharing comes into play consider:
(match [x y]
[[1 2 3] 4] :a0
[[1 2 3] 5] :a1)
Imagine that the algorithm decides that y is the first column to test. We then have a switch node on y with multiway test for its value - 4 or 5.
(cond
(= y 4) ...
(= y 5) ...)
Now we'll see code duplication in both branches for testing x. Maximal sharing would avoid this.
(defmatch simple-binding [x]
[b] b))
(simple-binding 1)
;=> 1
(defmatch seq-binding [x]
[[1 _ b]] b))
[[b]] b))
(seq-binding [1 2 3])
;=> 3
(seq-binding [4])
;=> 4
You should be allowed to write the above pattern match and similar patterns like so:
(defmatch seq-binding [x]
[1 _ b] b))
[b] b))
(defmatch map-match [x]
[{:a 'foo :only [:a]}] :a0
[{:a 'foo}] :a1
[{:a 'foo :b 'bar}] :a2
[{:a 'bar :b 'foo}] :a3)
(map-match {:a 'foo} 1)
;=> :a0
(map-match {:a 'foo :bar 'wob})
;=> :a1
(map-match {:a 'foo :b 'bar})
;=> :a2
(map-match {:a 'bar :b 'foo})
;=> :a3
Will come later...