Grammar extensions #208
I don't have anything to add to Erik's original ideas for statement reference and override in #30. All I can really offer is that the AND/OR precedence issue remains a sticky one, and it seems to actually be about how to deal with backwards incompatibility in general. The extend syntax is going to expand the de facto public API, so some policy or stated expectations might help here. Erik added "I don't plan on making any backward-incompatible changes to the rule syntax in the future, so you can write grammars with confidence" in version 0.2, ten years ago, and then promptly shipped two breaking versions (0.5, 0.6). It's sometimes necessary. You have the most skin in the game here, and I'm inclined to follow your recommendations, but exposing the internals is going to add some tension with respect to backwards compatibility.
@erikrose @lonnen This PR proposes a method of extending grammars, including Parsimonious' own rule syntax. Implements #30.
Changes
Syntax for referencing/overriding previously-defined rules.
Erik suggested a syntax like this in #30. It seems reasonable, if a little terse. Very open to suggestions here.
The key point here is that to truly extend functionality of other grammars, references cannot be resolved until after ^super expressions have been resolved. This allows e.g. defining a new kind of `expression`, and having it included anywhere `expression` was used in the original grammar.
Example:
This is equivalent to the following grammar:
Syntax for dividing up rule sections
Two or more `=` or `-` characters make a new kind of comment. It has no semantic content, though it could be used for refining the inheritance semantics, e.g. around `**more_rules` custom rules.
`Grammar.extend` instance method
Takes the same arguments as the `Grammar` constructor, but instead extends the existing grammar by concatenating the original grammar definition and the new one. To achieve this, the original arguments passed to the constructor are retained.
Class variables on `Grammar` to define how a grammar is parsed and visited
Each `Grammar` subclass defines a grammar that parses rules, and a visitor class that visits them.
This allows extensions to parsimonious's syntax without needing to reach consensus on what those extensions should be. Individual users can update the syntax to make a DSL useful for their own purposes.
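As a rough illustration of the class-variable idea above, here is a toy sketch in plain Python. It does not use Parsimonious, and `MiniGrammar`, `RuleVisitor`, `rule_pattern`, and `visitor_class` are hypothetical names chosen for this example, not the attribute names used in this PR. The point is only that a subclass can swap out both the rule syntax and the visitor without touching the base class.

```python
import re


class RuleVisitor:
    """Turns one recognised rule line into a (name, body) pair."""

    def visit_rule(self, name, body):
        return name, body


class MiniGrammar:
    """Toy stand-in for a grammar class (not Parsimonious itself)."""

    # Hypothetical class-level hooks: the syntax of a rule line, and the
    # visitor class that post-processes each recognised rule.
    rule_pattern = re.compile(r"^\s*(?P<name>\w+)\s*=\s*(?P<body>.+?)\s*$")
    visitor_class = RuleVisitor

    def __init__(self, rules_text):
        visitor = self.visitor_class()
        self.rules = {}
        for line in rules_text.splitlines():
            match = self.rule_pattern.match(line)
            if match:  # lines that do not look like rules are skipped
                name, body = visitor.visit_rule(match["name"], match["body"])
                self.rules[name] = body


class LowercasingVisitor(RuleVisitor):
    """Visitor for the dialect below: normalises rule names to lower case."""

    def visit_rule(self, name, body):
        return name.lower(), body


class ArrowGrammar(MiniGrammar):
    """A user-defined dialect: rules are written `name -> body`."""

    rule_pattern = re.compile(r"^\s*(?P<name>\w+)\s*->\s*(?P<body>.+?)\s*$")
    visitor_class = LowercasingVisitor


base = MiniGrammar('greeting = "hi"')
dialect = ArrowGrammar('Greeting -> "hi"')
print(base.rules)
print(dialect.rules)
```

Both objects end up with the same rule table even though they were written in different surface syntaxes, which is the kind of per-user DSL tweak the class variables are meant to allow.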
I included an example of a different approach to token grammars that is useful for a particular problem I'm trying to solve. Here, `CAPITAL_REFERENCES` refer to token types, while lowercase references refer to rules. Attributes of tokens can themselves be matched or parsed with a language similar to XPath.
Limitations
The `**more_rules` construct is a bit wonky or buggy. Consider the following:
Here the extension doesn't do anything since the extra "custom" overrides the extension. I think there are solutions to this, but they're a bit finicky to implement, so I figured I'd put this up for discussion before continuing.
That said, it doesn't break any existing use of the `**more_rules` feature, which is a bit of an advanced/experimental feature anyway.
Still TODO
Fix `**more_rules`, or at least document the limitation.
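To make the `**more_rules` limitation concrete, here is a toy model in plain Python of "extend by concatenation with retained constructor arguments". It is a sketch under assumptions, not the code in this PR: `ToyGrammar` is a hypothetical class, rules are stored as plain strings, and it assumes that keyword rules are applied after the rule text and that a later textual definition replaces an earlier one with the same name.

```python
class ToyGrammar:
    """Toy model of a grammar built from rule text plus keyword overrides."""

    def __init__(self, rules_text="", **more_rules):
        # Retain the original arguments so extend() can reuse them.
        self._rules_text = rules_text
        self._more_rules = dict(more_rules)
        self.rules = {}
        for line in rules_text.splitlines():
            name, sep, body = line.partition("=")
            if sep:
                # Assumption: a later definition replaces an earlier one.
                self.rules[name.strip()] = body.strip()
        # Assumption: keyword rules are applied last, so they beat the text.
        self.rules.update(more_rules)

    def extend(self, rules_text="", **more_rules):
        # Concatenate old and new text, merge old and new keyword rules,
        # then build a fresh grammar from the combined arguments.
        combined_text = self._rules_text + "\n" + rules_text
        combined_custom = {**self._more_rules, **more_rules}
        return type(self)(combined_text, **combined_custom)


base = ToyGrammar('greeting = "hi"\ncustom = "from text"', custom="from kwargs")
extended = base.extend('greeting = "hello"\ncustom = "from extension"')
print(extended.rules)
```

In this model the extension successfully redefines `greeting`, but its redefinition of `custom` is silently ignored because the retained keyword rule is re-applied on top of the concatenated text. Fixing that would need some notion of ordering between textual and keyword rules, which may be part of why a fix is finicky.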