PyOMeta is an implementation of OMeta2, an object-oriented pattern-matching language developed by Alessandro Warth. PyOMeta provides a compact syntax based on Parsing Expression Grammars for common lexing, parsing and tree-transforming activities in a way that's easy to reason about for Python programmers.
PyOMeta compiles a grammar to a Python class, with the rules as methods. The rules specify parsing expressions, which consume input and return values if they succeed in matching.
`foo ::= ....`
: Define a rule named `foo`.

`expr1 expr2`
: Match `expr1` and then match `expr2` if it succeeds, returning the value of `expr2`. Like Python's `and`.

`expr1 | expr2`
: Try to match `expr1`; if it fails, match `expr2` instead. Like Python's `or`.

`expr*`
: Match `expr` zero or more times, returning a list of matches.

`expr+`
: Match `expr` one or more times, returning a list of matches.

`expr?`
: Try to match `expr`. Returns `None` if it fails to match.

`~expr`
: Fail if the next item in the input matches `expr`.

`ruleName`
: Call the rule `ruleName`.

`'x'`
: Match the literal character `'x'`.

`expr:name`
: Bind the result of `expr` to the local variable `name`.

`-> pythonExpression`
: Evaluate the given Python expression and return its result.

`!(pythonExpression)`
: Evaluate the given Python expression and return its result (this is used in the rule definition part).

`# this is a comment`
: Comments are like Python comments: they start with `#` and extend to the end of the line.

`<expr>`
: The consumed-by operator returns a sub-sequence of the input that contains the elements matched by the enclosed expression `expr`.

`@<expr>`
: The index-consumed-by operator returns an array with the start and end indices of the elements consumed by the enclosed expression `expr`.
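The operators above can be illustrated with a small hand-written matcher. This is only a sketch of the semantics described in the summary, not PyOMeta's implementation: the `MiniMatcher` class, the `MatchFailed` exception and all method names are hypothetical and chosen for this example.

```python
class MatchFailed(Exception):
    """Raised when a parsing expression does not match."""


class MiniMatcher:
    """Hypothetical stand-in for a compiled grammar; not PyOMeta code."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def exactly(self, ch):
        # 'x': match one literal item and consume it.
        if self.pos < len(self.data) and self.data[self.pos] == ch:
            self.pos += 1
            return ch
        raise MatchFailed("expected %r at %d" % (ch, self.pos))

    def seq(self, *exprs):
        # expr1 expr2: match each in order and return the last value.
        value = None
        for expr in exprs:
            value = expr()
        return value

    def choice(self, *exprs):
        # expr1 | expr2: try alternatives in order, rewinding on failure.
        start = self.pos
        for expr in exprs:
            try:
                return expr()
            except MatchFailed:
                self.pos = start
        raise MatchFailed("no alternative matched at %d" % start)

    def many(self, expr, at_least=0):
        # expr* (at_least=0) and expr+ (at_least=1): collect matches in a list.
        results = []
        while True:
            start = self.pos
            try:
                results.append(expr())
            except MatchFailed:
                self.pos = start
                break
            if self.pos == start:
                break  # an empty match would otherwise loop forever
        if len(results) < at_least:
            raise MatchFailed("too few matches")
        return results

    def optional(self, expr):
        # expr?: return the match, or None without consuming input.
        start = self.pos
        try:
            return expr()
        except MatchFailed:
            self.pos = start
            return None

    def negate(self, expr):
        # ~expr: succeed only if expr fails; never consumes input.
        start = self.pos
        try:
            expr()
        except MatchFailed:
            self.pos = start
            return True
        self.pos = start
        raise MatchFailed("negated expression matched at %d" % start)

    def consumed_by(self, expr):
        # <expr>: return the slice of input that expr consumed.
        start = self.pos
        expr()
        return self.data[start:self.pos]

    def index_consumed_by(self, expr):
        # @<expr>: return the start and end indices instead of the slice.
        start = self.pos
        expr()
        return [start, self.pos]


m = MiniMatcher("aab")
letters = m.many(lambda: m.exactly("a"), at_least=1)      # 'a'+
last = m.seq(lambda: m.negate(lambda: m.exactly("a")),    # ~'a' 'b'
             lambda: m.exactly("b"))

m2 = MiniMatcher("aab")
span = m2.consumed_by(lambda: m2.many(lambda: m2.exactly("a")))  # <'a'*>
```

Each expression is passed as a zero-argument callable so that the combinator decides when, and how often, to run it. Failure is signalled with an exception, and every combinator that may retry restores `pos` first, which is what makes ordered choice and negation safe to nest.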
| "kind of thing" | PyOMeta | Note |
|---|---|---|
| boolean | `true` | |
| number | `123` | |
| character | `'x'` | |
| string | `"foo"` | |
| rule application | `expr` | |
| | `r(x, y)` | 1 |
| | `^digit` | 4 |
| list | `['x' 1]` | |
| grouping | `(foo bar)` | |
| negation | `~'x'` | |
| look-ahead | `~~'x'` | |
| semantic predicate | `?(x > y)` | 3 |
| semantic action | `-> (x + y)` | 3 |
| | `!(x + y)` | 3 |
| binding | `expr:x` | |
| | `:x` | |
Note 1: the arguments do not necessarily have to be statement expressions; they can be any Python expression.

Note 2: not yet in the grammar, only via Python subclassing.

Note 3: semantic predicates and actions are written in Python. More specifically, they are either primary expressions, e.g., `123`, `x`, `foo.bar()`, or something called "statement expressions", which have the form `"{" <statement>* <expr> "}"`. For example, `{ x += 2; y = "foo"; f(x) }`. The value of a statement expression is equal to that of its last expression.

Note 4: "super" is just like any other rule (not a special form), so you have to quote the rule name that you pass in as an argument, e.g., both `^r(1, 2)` and `super("r", 1, 2)` are valid super-sends.
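To make bindings, semantic predicates, semantic actions and look-ahead more concrete, here is a plain-Python sketch of what a rule such as `digit:x digit:y ?(x > y) -> (x + y)` is meant to do. The rule itself, the helper functions and the `MatchFailed` exception are hypothetical, written for this example only; they are not part of PyOMeta.

```python
class MatchFailed(Exception):
    """Raised when an expression or predicate does not hold."""


def digit(data, pos):
    # Stand-in for a digit rule: consume one decimal digit, return its value.
    if pos < len(data) and data[pos] in "0123456789":
        return int(data[pos]), pos + 1
    raise MatchFailed("expected a digit at %d" % pos)


def lookahead(data, pos, ch):
    # ~~'x': succeed only if the next item is ch, but leave pos unchanged.
    if pos < len(data) and data[pos] == ch:
        return ch, pos
    raise MatchFailed("expected %r at %d" % (ch, pos))


def descending_sum(data, pos=0):
    # Hypothetical rule: digit:x digit:y ?(x > y) -> (x + y)
    x, pos = digit(data, pos)          # digit:x binds the first result to x
    y, pos = digit(data, pos)          # digit:y binds the second result to y
    if not (x > y):                    # ?(x > y) fails the rule when false
        raise MatchFailed("predicate x > y is false")
    return x + y, pos                  # -> (x + y) is the rule's value


value, end = descending_sum("73")      # 7 > 3, so the rule succeeds
_, peek_pos = lookahead("73", 0, "7")  # matches, yet consumes nothing

try:
    descending_sum("37")
    rejected = False
except MatchFailed:
    rejected = True                    # 3 > 7 is false, so the rule fails
```

The predicate behaves like any other parsing expression: when it is false the whole rule fails, so an enclosing choice can move on to its next alternative.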
The starting point for defining a new grammar is `pyometa.grammar.OMeta.makeGrammar`, which takes a grammar definition and a dict of variable bindings for its embedded expressions and produces a Python class.
Grammars can be subclassed as usual, and `makeGrammar` can be called on these classes to override rules and provide new ones. To invoke a grammar rule, call `grammarObject.apply()` with its name; the call returns a result together with error information.

    >>> from pyometa.grammar import OMeta
    >>> exampleGrammar = """
    ones = '1' '1' -> 1 # comment
    twos = '2' '2' -> 2
    stuff = (ones | twos)+
    """
    >>> Example = OMeta.makeGrammar(exampleGrammar, {})
    >>> g = Example("11221111")
    >>> result, error = g.apply("stuff")
    >>> result
    [1, 2, 1, 1]

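To show what "a Python class, with the rules as methods" can look like, the following is a hand-written approximation of the example grammar. It is a sketch, not the code PyOMeta generates: the class names, the `rule_` prefix and the `_pair` helper are assumptions made for this illustration, and `apply()` here returns only the result rather than a result and an error.

```python
class MatchFailed(Exception):
    """Signals that a rule did not match at the current position."""


class HandWrittenExample:
    """Plain-Python approximation of the Example grammar; not generated code."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def apply(self, rule_name):
        # Look the rule up by name and run it from the current position.
        return getattr(self, "rule_" + rule_name)()

    def _pair(self, ch, value):
        # Match the same literal twice, then return the action's value.
        if self.data[self.pos:self.pos + 2] == ch * 2:
            self.pos += 2
            return value
        raise MatchFailed("expected %r twice at %d" % (ch, self.pos))

    def rule_ones(self):
        # ones = '1' '1' -> 1
        return self._pair("1", 1)

    def rule_twos(self):
        # twos = '2' '2' -> 2
        return self._pair("2", 2)

    def rule_stuff(self):
        # stuff = (ones | twos)+
        results = []
        while True:
            try:
                results.append(self.rule_ones())
            except MatchFailed:
                try:
                    results.append(self.rule_twos())
                except MatchFailed:
                    break
        if not results:
            raise MatchFailed("stuff needs at least one match")
        return results


class Tens(HandWrittenExample):
    """Subclass that overrides one rule; stuff picks up the change."""

    def rule_ones(self):
        return self._pair("1", 10)


base_result = HandWrittenExample("11221111").apply("stuff")
sub_result = Tens("1122").apply("stuff")
```

Because `rule_stuff` calls `self.rule_ones()`, overriding that one method in `Tens` changes the outcome of `stuff` without touching it. Rules being ordinary methods is what makes this kind of override by subclassing possible.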
Say you want to add the consumed-by operator to the basic OMeta grammar (it is already in the grammar, by the way).
The steps you would need to follow are:

- change `grammar.py` and add

      | token('<') expr:e token('>') -> self.builder.consumed_by(e)

  to the `expr1` definition
- update `nullOptimizationGrammar` in `grammar.py` with the new node:

      | ['ConsumedBy' opt:expr] -> self.builder.consumed_by(expr)

- add a new method to the `TreeBuilder` class in `builder.py`:

      def consumed_by(self, exprs):
          return ["ConsumedBy", exprs]

- add a `generate_ConsumedBy` method to the `PythonWriter` class in `builder.py`:

      def generate_ConsumedBy(self, expr):
          fname = self._newThunkFor("consumed_by", expr)
          return self._expr("consumed_by", "self.consumed_by(%s)" % (fname,))

- write a test for the new extension
- generate a boot grammar:

      $ export PYTHONPATH=$PWD/src:$PYTHONPATH
      $ mv src/pyometa/boot.py src/pyometa/boot.orig.py
      $ python src/pyometa/bootgenerator.py
      $ mv src/pyometa/boot_generated.py src/pyometa/boot.py

- run the tests and make sure that everything runs successfully
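The builder and writer steps above fit together roughly as follows. This is a simplified sketch under the assumption that the writer picks a `generate_<NodeName>` method from the first element of each tree node, as the `generate_ConsumedBy` name suggests. `MiniBuilder` and `MiniWriter` are hypothetical stand-ins, not the real `TreeBuilder` and `PythonWriter`, and the generated string is illustrative only.

```python
class MiniBuilder:
    """Hypothetical stand-in for TreeBuilder: turns parsed pieces into nodes."""

    def exactly(self, literal):
        return ["Exactly", literal]

    def consumed_by(self, exprs):
        # Same shape as the method added to TreeBuilder in the steps above.
        return ["ConsumedBy", exprs]


class MiniWriter:
    """Hypothetical stand-in for PythonWriter: one generate_* method per node."""

    def generate(self, node):
        # Dispatch on the node name: ["ConsumedBy", ...] -> generate_ConsumedBy.
        name, args = node[0], node[1:]
        method = getattr(self, "generate_" + name, None)
        if method is None:
            raise NotImplementedError("no generator for node %r" % name)
        return method(*args)

    def generate_Exactly(self, literal):
        return "self.exactly(%r)" % (literal,)

    def generate_ConsumedBy(self, expr):
        # Wrap the inner expression in a thunk so it runs at match time.
        return "self.consumed_by(lambda: %s)" % (self.generate(expr),)


builder = MiniBuilder()
tree = builder.consumed_by(builder.exactly("x"))
code = MiniWriter().generate(tree)
```

In this sketch a node type without a matching `generate_*` method raises `NotImplementedError`, which is why a new operator needs both the builder method and the writer method before the boot grammar is regenerated.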
This fork would not have been possible without the (real hard) work of Allen Short who first implemented a Python version of OMeta. The work of Waldemar Kornewald has further pushed Allen's implementation towards OMeta2 syntax and behaviour.
I, Enrico Spinielli, have

- improved and updated the README
- included `consumed-by` and `index-consumed-by` from Benjamin Dauvergne's code
- added tests for `consumed-by` and `index-consumed-by`
- added a `Makefile`
- document grammar extension
- add more examples/tests
- improve debugging/error reporting