Skip to content
Bastiaan Veelo edited this page Jun 19, 2020 · 7 revisions

Memoization

Pegged uses memoization to remember previous parse attempts: every time a rule is called at a particular input position, the parser looks if it was already called here. If yes, the previous result is used. If not, the parser uses the rule and then stores the resulting parse tree (be it successful or not: it's useful to remember past failures too!).

As Bryan Ford showed in the original article, this makes a PEG linear in input length, at the cost of a higher demand on memory. Tests carried out during the memoization implementation for Pegged showed a speed increase up to 40% while using memoization and never showed a speed decrease (as could be feared from the frequent access to the underlying associative array). No test was carried out on huge inputs (10.000 lines-of-text files, for example).

Memoization is managed through the Memoization compile-time option in grammar. Use Memoization.yes or Memoization.no to respectively enable or disable memoization. The default is Memoization.yes.

// No memoization, be it at compile-time or runtime.
mixin(grammar!(Memoization.no)(myGrammar));
// Default:
mixin(grammar(myGrammar));
// Equivalent to:
mixin(grammar!(Memoization.yes)(myGrammar));

This option also exists for asModule:

asModule!(Memoization.no)("mymod", myGrammar);

There is a catch, though: testing different memoization implementations, the most efficient one was found to be D's built-in associative arrays, which is what is used by Pegged right now. But AAs do not work at compile-time (this may have changed, they did not at the time of implementation). So right now, memoization is disabled at compile-time even with the Memoization.yes option to allow Pegged parsers to function at compile-time.

Next step: Grammars as D Modules


Pegged Tutorial

Clone this wiki locally