-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pattern matching factorization optimization #13
Comments
@tizoc Your PR for shen-scheme doesn't mention any performance testing? What was the impact of this optimization? (startup time, test suite time, custom benchmarks...) |
Startup, tests, etc didn't get affected much, the big impact is on specific individual functions that are heavy on pattern matching of deep structures. Most programs will not benefit much from this optimization, but if a program happens to have some loop with one of such functions being called a lot, it is going to benefit for sure. I didn't do much performance testing other than what I mentioned on the mailing list: Pasting the results here: Functions: (define simplify
[Op A B] -> (s Op (simplify A) (simplify B))
A -> A)
(define s
+ M N -> (+ M N) where (and (number? M) (number? N))
+ 0 F -> F
+ F 0 -> F
+ A [+ B C] -> (simplify [+ [+ A B] C])
* M N -> (* M N) where (and (number? M) (number? N))
* 0 F -> 0
* F 0 -> 0
* F 1 -> F
* 1 F -> F
* A [* B C] -> (simplify [* [* A B] C])
Op A B -> [Op A B])
(define neals-function
[1 2 3 1 2 3 1 2 3 1 2 3 | B] [[[1 | 2] | 3] [1 | 2] [[1 | 2] | 3] [1 | 2] | B] -> 0
[X|X] [X|X] -> (+ X 1000))
I wouldn't spend time on this now if I were you, wait at least to until after I make a generic version as an extension (if I'm able to), will save you a lot of work. |
@rkoeninger any pointers to where I should start looking at if I wanted to give this a try? I think labelled blocks make this possible, but I want to be sure. If it is not possible I may have to try implementing an alternative factorization strategy for platforms that cannot use the current one (would probably not perform the full optimization, just part of it). |
I never had a working example of goto-like code in JS so that's the first hurdle. You can Here's the root Lines 344 to 350 in 0e99050
There are functions in |
Fair enough. I will try to produce a code example in JS that does what I mean and come back to this issue to post it. But let me give you a rough idea of how in my mind it would work. No
should produce something like <label>: {
<body>
}
<label-body> Which is similar to what the Common Lisp port generates but inverted (the because the label body is outside of the body and the non-label body outside, a "goto" here gets out of the block). In Common Lisp the result is Then should produce something like break <label>; and should produce: return <expr>; Like in the case Common Lisp, labelled blocks will be nested, there is always a single immediate outer label, which can be represented easily with labelled blocks in Javascript. |
@rkoeninger when inside the ShenScript REPL, how can I produce Javascript code from a Kl input? |
Found it, |
Ouch, I wanted to use the generated code as a starting point and adjust it to use labelled breaks, but the output code doesn't look at all like what I was expecting. Does every expression have to be awaited? or is there something I'm doing wrong when generating the JS code? I'm doing: |
Ok, I think I understand what is going on. Have to go now, will continue working on this later. |
The generated code is very expression-oriented (including converting I don't want to have to bring back the statement-composing fabr concepts.
Every function call that's not certain to not return a promise must be awaited. So that's most function calls. And if a there are any
Yeah, that's right. The generated code is just highly idiomatic in style. |
Here is the code I'm trying to compile, what I don't understand is why it is generating all those awaits, nothing here is async: [defun example [V7 V8]
[%%let-label %%label15 [%%return [shen.f_error example]]
[if [cons? V7]
[let V7/hd [hd V7]
[let V7/tl [tl V7]
[%%let-label %%label16 [if [and [= 2 V7/hd]
[cons? V7/tl]]
[%%return [hd V7/tl]]
[%%goto-label %%label15]]
[if [and [= 1 V7/hd] [cons? V7/tl]]
[if [= 1 V8]
[%%return [hd V7/tl]]
[if [= 2 V8]
[%%return [tl V7/tl]]
[%%goto-label %%label16]]]
[%%goto-label %%label16]]]]]
[%%goto-label %%label15]]]] I defined dummy functions for Now that I saw the generated code I'm not sure if labelled breaks are viable, because new functions are being introduced, and I don't know if jumps from inside the function to outer blocks or if it is efficient. Doesn't this add a lot of overhead in general? |
Can you post the generated JS here? Have you made any changes to the
Oh, you mean the IIFE (Immediately Invoked Function Expression) pattern like how I think you're right that you can't |
Here is the generated code ( await (async ($25$25let$2dlabel$c, $25$25label15$s, $25$25return$c, error$c, example$s, $25$25label16$s, $25$25goto$2dlabel$c) =>
$.d("example", $.l(async (V7, V8) =>
$.b($25$25let$2dlabel$c.f, $25$25label15$s, await $.t($25$25return$c.f(await $.t(error$c.f(example$s)))), $.isCons(V7) ?
await (async V7$2fhd =>
await (async V7$2ftl =>
await $.t($25$25let$2dlabel$c.f($25$25label16$s, 2 === V7$2fhd && $.isCons(V7$2ftl) ?
await $.t($25$25return$c.f($.asCons(V7$2ftl).head)) :
await $.t($25$25goto$2dlabel$c.f($25$25label15$s)), 1 === V7$2fhd && $.isCons(V7$2ftl) ?
1 === V8 ?
await $.t($25$25return$c.f($.asCons(V7$2ftl).head)) :
2 === V8 ?
await $.t($25$25return$c.f($.asCons(V7$2ftl).tail)) :
await $.t($25$25goto$2dlabel$c.f($25$25label16$s)) :
await $.t($25$25goto$2dlabel$c.f($25$25label16$s)))))(V7.tail))(V7.head) :
await $.t($25$25goto$2dlabel$c.f($25$25label15$s)))))
)($.c("%%let-label"), $.s `%%label15`, $.c("%%return"), $.c("error"), $.s `example`, $.s `%%label16`, $.c("%%goto-label")) Edit: roughly, this is what I had in my mind: Shen function: (define example
[1 X | Xs] 1 -> X
[1 X | Xs] 2 -> Xs
[2 X | Xs] _ -> X) KLambda for that code (defun example (V1345 V1346)
(cond
((and (cons? V1345)
(and (= 1 (hd V1345)) (and (cons? (tl V1345)) (= 1 V1346))))
(hd (tl V1345)))
((and (cons? V1345)
(and (= 1 (hd V1345)) (and (cons? (tl V1345)) (= 2 V1346))))
(tl (tl V1345)))
((and (cons? V1345) (and (= 2 (hd V1345)) (cons? (tl V1345))))
(hd (tl V1345)))
(true (shen.f_error example)))) After factorization (defun example (V1345 V1346)
(%%let-label (%%label1347) (%%return (shen.f_error example))
(if (cons? V1345)
(let V1345/hd (hd V1345)
(let V1345/tl (tl V1345)
(%%let-label (%%label1348 V1345/hd V1345/tl)
(if (and (= 2 V1345/hd) (cons? V1345/tl))
(%%return (hd V1345/tl))
(%%goto-label %%label1347))
(if (and (= 1 V1345/hd) (cons? V1345/tl))
(if (= 1 V1346)
(%%return (hd V1345/tl))
(if (= 2 V1346)
(%%return (tl V1345/tl))
(%%goto-label %%label1348 V1345/hd V1345/tl)))
(%%goto-label %%label1348 V1345/hd V1345/tl)))))
(%%goto-label %%label1347)))) Common Lisp port generates: (DEFUN example (V1345 V1346)
(BLOCK NIL
(TAGBODY
(IF (CONSP V1345)
(LET ((V1345/hd (CAR V1345)))
(LET ((V1345/tl (CDR V1345)))
(TAGBODY
(IF (AND (EQL V1345/hd 1) (CONSP V1345/tl))
(IF (EQL V1346 1)
(RETURN (CAR V1345/tl))
(IF (EQL V1346 2)
(RETURN (CDR V1345/tl))
(GO %%label1348)))
(GO %%label1348))
%%label1348
(IF (AND (EQL V1345/hd 2) (CONSP V1345/tl))
(RETURN (CAR V1345/tl))
(GO %%label1347)))))
(GO %%label1347))
%%label1347
(RETURN (shen.f_error 'example))))) Equivalent Javascript would be something like: function example(V1345, V1346) {
__label1347: {
if (consp(V1345)) {
const V1345_hd = hd(V1345);
const V1345_tl = tl(V1345);
__label1348: {
if (V1345_hd === 1 && consp(V1345_tl)) {
if (V1346 === 1) {
return hd(V1345_tl);
} else {
if (V1346 === 2) {
return tl(V1345_tl);
} else {
break __label1348;
}
}
} else {
break __label1348;
}
}
if (V1345_hd == 2 && consp(V1345_tl)) {
return hd(V1345_tl);
} else {
break __label1347;
}
} else {
break __label1347;
}
}
return shen.f_error(symbol`example`);
} |
Ah, and no changes to the backend, I'm using ShenScript as-is |
Formatted with some inlining: $.d("example", $.l(async (V7, V8) =>
$.b($.f("%%let-label"),
$.s`%%label15`,
await $.t($.f("%%return")(await $.t($.f("error")($.s`example`)))),
$.isCons(V7) ?
await $.t($.f("%%let-label")(
$.s`%%label16`,
2 === V7.head && $.isCons(V7.tail) ?
await $.t($.f("%%return")(V7.tail.head)) :
await $.t($.f("%%goto-label")($.s`%%label15`)),
1 === V7.head && $.isCons(V7.tail) ?
1 === V8 ? await $.t($.f("%%return")(V7.tail.head)) :
2 === V8 ? await $.t($.f("%%return")(V7.tail.tail)) :
await $.t($.f("%%goto-label")($.s`%%label16`)) :
await $.t($.f("%%goto-label")($.s`%%label16`)))) :
await $.t($.f("%%goto-label")($.s`%%label15`))))) You're not going to be able to do any meaningful manipulation of the estree (js ast). But you'd also have to bring back statements to the |
Yes, and it is not just about adding statement. The code generation I had in mind is very different to what ShenScript does now, stuff like compiling lets to this instead using IIFE to handle bindings: { const NewVar = Expression;
{ const OtherVar = OtherExpression;
return SomeResult; } } which is a big change, because this is not expression-oriented anymore and some analysis would have to be done to decide where to put Regarding IIFE, I don't know how good Javascript runtimes are at optimizing them, I guess they do a good job because it doesn't seem to be particularly hard, and as you mention, it is a common pattern. What I'm not that sure about is all those async/awaits that get introduced because of it when the code cannot be proven to be sync. Not only it seems harder to optimize for, regular Javascript code itself is not that heavy on async/await, and because of that I don't know how much JS runtimes focus on it. Anyway, thats a whole different discussion, and I don't know if you are interested on it, so going back to the original topic: given how code is generated now, the labelled breaks approach is not possible, what would work instead is using continuations, so that each labelled block is a function that gets either passed to the scope (bound to name of the label) or defined at the root of the function definition, |
I really don't feel like re-writing the compiler right now, which this would require. I don't think factorization is going to be supported by ShenScript as previous experience with attempting such a re-write was so bad and ultimately counterproductive (slower, more code volume). |
You don't have to! everything I mentioned about the compilation strategy is about a specific way of implementing factorization (the one I mentioned on my first post), and that is what isn't doable with the current compilation strategy. The alternative one (that fits the current compilation strategy) is to use functions for jumps, like I do in Shen/Scheme. I don't think Javascript runtimes will optimize it into gotos like Chez does, but whatever overhead that may be added by that strategy (compared to labelled break) exists already, so it is not an issue. The way this will be compiled is:
I only need to figure out how to do a few things, which now I kind of know how to do after going through the
|
Edit: ignore this, I figure it out I added this: isForm(expr, 0, '%%return') ? buildReturn (context, expr) :
isForm(expr, 0, '%%goto-label') ? buildGotoLabel(context, expr) :
isForm(expr, 0, '%%let-label') ? buildLetLabel (context, expr) : and const buildReturn = (context, [_return, expr]) => {
return build(context, expr);
};
const buildGotoLabel = (context, [_goto, ...call]) => {
return buildApp(context, call);
};
const buildLetLabel = (context, [_letLabel, labelForm, labelBody, body]) => {
const [label, ...labelParams] = labelForm;
const labelFunc = buildFunction(context, labelParams, labelBody);
return buildLet(context, ['let', label, labelFunc, body]);
}; but somewhere it is failing to compile the following function: (set *code* (read-from-string "(defun example (V1345 V1346)
(%%let-label (%%label1347) (%%return (shen.f_error example))
(if (cons? V1345)
(let V1345/hd (hd V1345)
(let V1345/tl (tl V1345)
(%%let-label (%%label1348 V1345/hd V1345/tl)
(if (and (= 2 V1345/hd) (cons? V1345/tl))
(%%return (hd V1345/tl))
(%%goto-label %%label1347))
(if (and (= 1 V1345/hd) (cons? V1345/tl))
(if (= 1 V1346)
(%%return (hd V1345/tl))
(if (= 2 V1346)
(%%return (tl V1345/tl))
(%%goto-label %%label1348 V1345/hd V1345/tl)))
(%%goto-label %%label1348 V1345/hd V1345/tl)))))
(%%goto-label %%label1347))))"))
(define show-js -> (js.ast.render (js.ast.compile (hd (value *code*)))))
(show-js) The error I get is (modified to include the expr object):
@rkoeninger any idea of what am I missing? |
Ignore my question, its fixed. |
What I have right now is this: const buildReturn = (context, [_return, expr]) => {
return build(context, expr);
};
// (%%goto-label f ...args) -> f(...args), with no lookup being done
// for f, has to be a direct javascript call
const buildGotoLabel = (context, [_goto, f, ...args]) => {
return Call(SafeId(nameOf(f)), args, context.async);
};
const buildLetLabel = (context, [_letLabel, labelForm, labelBody, body]) => {
const [label, ...labelParams] = labelForm;
const labelFunc = buildFunction(context, labelParams, labelBody);
return assemble(
(y, z) => Call(Arrow([SafeId(nameOf(label))], z, context.async), [y], context.async),
uncast(labelFunc),
uncast(build(context, body)));
}; but it fails with this error, and I don't know from where it is coming from:
|
Ok, fixed, not having a typechecker makes me dizzy: const buildReturn = (context, [_return, expr]) => {
return build(context, expr);
};
const buildGotoLabel = (context, [_goto, f, ...args]) => {
return Call(SafeId(nameOf(f)), args.map(s => SafeId(nameOf(s))), context.async);
};
const buildLetLabel = (context, [_letLabel, labelForm, labelBody, body]) => {
const [label, ...labelParams] = labelForm;
const labelFunc = buildFunction(context, labelParams, labelBody);
return assemble(
(y, z) => Call(Arrow([SafeId(nameOf(label))], z, context.async), [y], context.async),
uncast(labelFunc),
uncast(build(context, body)));
}; The code generated now looks like this: await (async (shen$2ef_error$c, example$s) =>
$.d("example", $.l(async (V1345, V1346) =>
await (async $25$25label1347 =>
$.isCons(V1345) ?
await (async V1345$2fhd =>
await (async V1345$2ftl =>
await (async $25$25label1348 =>
1 === V1345$2fhd && $.isCons(V1345$2ftl) ?
1 === V1346 ?
$.asCons(V1345$2ftl).head :
2 === V1346 ?
$.asCons(V1345$2ftl).tail :
await $25$25label1348(V1345$2fhd, V1345$2ftl) :
await $25$25label1348(V1345$2fhd, V1345$2ftl))($.l(async (V1345$2fhd, V1345$2ftl) =>
2 === V1345$2fhd && $.isCons(V1345$2ftl) ?
$.asCons(V1345$2ftl).head :
await $25$25label1347())))(V1345.tail))(V1345.head) :
await $25$25label1347())($.l(async () =>
$.b(shen$2ef_error$c.f, example$s))))))($.c("shen.f_error"), $.s `example`) I still have to enable the extension and give it a try to see that all tests still pass, but I think the hardest part is solved already. |
Ok, bad news, the overhead of the introduced intermediary function definitions and calls is bigger than the savings (V8 isn't smart enough to optimize it away) for normal cases, only very extreme cases would benefit from this (but couldn't try those). It seems the Kl->Js compiler has a hard time compiling the following function: (define neals-function
[1 2 3 1 2 3 1 2 3 1 2 3 | B] [[[1 | 2] | 3] [1 | 2] [[1 | 2] | 3] [1 | 2] | B] -> 0
[X|X] [X|X] -> (+ X 1000)) It uses 100% of my cpu and never finishes (either with or without my modifications). |
Made a few changes so that labels aren't counted as external calls, and neither is await (async (shen$2ef_error$c, example$s) =>
$.d("example", $.l((V1345, V1346) =>
($25$25label1347 =>
$.isCons(V1345) ?
(V1345$2fhd =>
(V1345$2ftl =>
($25$25label1348 =>
1 === V1345$2fhd && $.isCons(V1345$2ftl) ?
1 === V1346 ?
$.asCons(V1345$2ftl).head :
2 === V1346 ?
$.asCons(V1345$2ftl).tail :
$25$25label1348(V1345$2fhd, V1345$2ftl) :
$25$25label1348(V1345$2fhd, V1345$2ftl))(
$.l((V1345$2fhd, V1345$2ftl) =>
2 === V1345$2fhd && $.isCons(V1345$2ftl) ?
$.asCons(V1345$2ftl).head :
$25$25label1347())))(V1345.tail))(V1345.head) :
$25$25label1347())(
$.l(() =>
$.b(shen$2ef_error$c.f, example$s))))))($.c("shen.f_error"), $.s `example`) Another thing to try would be to hoist those label functions, but I don't think it will make a difference. Edit: Here is the test btw, unoptimized version takes about 1.4s and optimized 2.7s: (define factorize Name -> (eval-kl (shen.x.factorise-defun.factorise-defun (ps Name))))
(define do-times
0 F -> done
N F -> (do (F void)
(do-times (- N 1) F)))
(define simplify
[Op A B] -> (s Op (simplify A) (simplify B))
A -> A)
(define s
+ M N -> (+ M N) where (and (number? M) (number? N))
+ 0 F -> F
+ F 0 -> F
+ A [+ B C] -> (simplify [+ [+ A B] C])
* M N -> (* M N) where (and (number? M) (number? N))
* 0 F -> 0
* F 0 -> 0
* F 1 -> F
* 1 F -> F
* A [* B C] -> (simplify [* [* A B] C])
Op A B -> [Op A B])
(output "simplify regular ~%")
(let Exp [* x [+ [+ [* 12 0] [+ 23 8]] y]]
(time (do-times 200000 (/. _ (simplify Exp)))))
(factorize simplify)
(factorize s)
(output "simplify optimized ~%")
(let Exp [* x [+ [+ [* 12 0] [+ 23 8]] y]]
(time (do-times 200000 (/. _ (simplify Exp))))) |
Thanks for figuring all that out. I see you got familiar with the codebase very quickly - I didn't have time to answer your questions before you found them yourself. Your approach is nice and neat also. What's the final verdict here? It's slower with this optimization added? Maybe it would be faster with statement-oriented syntax, but that would be the significant re-work ("re-write") I referred to. Factorization seems like a very specific fine-tuning at this time. ShenScript has bigger issues, like an over-abundance of async/await. The introduction of async/await slows it down by a factor of 4-8. Improving that is really the top performance priority. |
It was quite easy once a few things about the design and implementation "clicked", the code is clear and very well organized. The verdict is that, at this time it is not worth it. I have a few ideas I may try later regarding expressions vs statements -- the short version is that the context includes a flag that lets the inner expression know how its result will be used, along with a name to which it will be bound to if any, but it is still very fuzzy in my mind, so don't worry about it, I'm not even sure yet if it is doable. Regarding async/await, I noticed that other than a static list, you don't keep a record of which functions are known to not be async. Keeping a record could help by removing a lot of asyncs, but the order on which functions are defined will affect what you recognize as sync. I'm not entirely sure yet about how much overhead the async/await stuff has (in addition to the intermediary functions already there) based on the tests I pasted on that comment above, because the same function (either factorized or regular), took mostly the same amount of time to run in both sync and async versions, which I found quite surprising. You can consider this issue obsolete and close it, the topic can be revisited later. |
Thanks for trying this out. It was worth a shot.
It's limited to a static list because user-defined functions can be redefined. I had ideas about how to improve this - if you want to talk about async/await optimization, we can do that in another issue. |
One of the things I have on my queue is a program build tool for Shen/Scheme that assume that the program being compiled is not dynamic (not running on the REPL and not using eval), and take advantage of that (both in the Klambda->Scheme compiler, and then Chez itself). Once it is working it should be generalizable to other ports too. Function redefinition is something the compiler would not have to worry about when being compiled that way. |
Update on this one, I made a translation of the factorized example above: (define example
[1 X | Xs] 1 -> X
[1 X | Xs] 2 -> Xs
[2 X | Xs] _ -> X) to a version that uses labelled breaks: (((shen$2ef_error$c, example$s) =>
$.d('example', $.l((V1345, V1346) => {
$25$25label1347: {
if ($.isCons(V1345)) {
const V1345$2fhd = V1345.head;
const V1345$2ftl = V1345.tail;
$25$25label1348: {
if (1 === V1345$2fhd && $.isCons(V1345$2ftl)) {
if (1 === V1346) {
return $.asCons(V1345$2ftl).head;
} else if (2 === V1346) {
return $.asCons(V1345$2ftl).tail;
} else {
break $25$25label1348;
}
} else {
break $25$25label1348;
}
}
if (2 === V1345$2fhd && $.isCons(V1345$2ftl)) {
return $.asCons(V1345$2ftl).head;
} else {
break $25$25label1347;
}
}
}
return $.b(shen$2ef_error$c.f, example$s);
})))($.c('shen.f_error'), $.s `example`)) Using this test: (let L [2 3 5]
(time (do-times 2000000 (/. _ (example L 3))))) Times:
|
I'm thinking that it may be doable in Javascript, turns out there is a feature that lets you label blocks of code (and loops) and use
break
with those labels: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/label#Using_a_labeled_block_with_breakProbably not a priority at the moment, but something to consider in the future.
The text was updated successfully, but these errors were encountered: