[Extended] PEG Grammar

Definitions

A terminal is a literal symbol.

A non-terminal is a rule name that gets replaced by the parsing expression (denoted as `e` in the following table) defined after the `=` or `<-`¹ symbol in the rule.

A parsing expression defines how input should be consumed to form an abstract syntax tree.

| Operator | Precedence | Description |
| --- | --- | --- |
| `' '` or `" "` | 5 | Literal string |
| `[ ]` | 5 | Character class - matches a single character from the set. Can use a range² |
| `.` | 5 | Any character - matches any character except a line break |
| `(e)` | 5 | Grouping - groups tokens into one |
| `e?` | 4 | Optional - matches the previous token zero or one time, as many times as possible |
| `e*` | 4 | Zero-or-more - matches the previous token between zero and unlimited times, as many times as possible |
| `e+` | 4 | One-or-more - matches the previous token between one and unlimited times, as many times as possible |
| `e{n}` | 4 | [Extended] Exactly n - matches the previous token exactly `n` times |
| `&e` | 3 | And-predicate - requires the following token to be present, does not consume it |
| `!e` | 3 | Not-predicate - requires the following token to be absent, does not consume it |
| `e1 e2` | 2 | Sequence - matches `e1` and then `e2`, in that order |
| `e1 / e2` or `e1 \| e2` | 1 | Prioritized choice - tries `e1` first and tries `e2` only if `e1` fails |
| `# ...` | - | [Extended] Comment - line comment |
| `(* ... *)` | - | Multiline comment |
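The operator semantics in the table can be illustrated with a minimal Python sketch of parser combinators. This is not the implementation used by this project: the names `lit`, `any_char`, `seq`, `choice`, `star`, `not_`, and `matches` are made up for illustration, and only a subset of the operators is covered. The sketch shows two properties of PEG that are easy to miss: prioritized choice commits to the first alternative that matches, and repetition never gives back input it has consumed.

```python
from typing import Callable, Optional

# A parser takes (text, position) and returns the new position on success,
# or None on failure. All names below are illustrative assumptions.
Parser = Callable[[str, int], Optional[int]]

def lit(s: str) -> Parser:
    """Literal string, e.g. "ab"."""
    return lambda text, i: i + len(s) if text.startswith(s, i) else None

def any_char(text: str, i: int) -> Optional[int]:
    """The `.` operator: any single character except a line break."""
    return i + 1 if i < len(text) and text[i] != "\n" else None

def seq(*parts: Parser) -> Parser:
    """e1 e2: each part must match where the previous one stopped."""
    def parse(text: str, i: int) -> Optional[int]:
        for p in parts:
            i = p(text, i)
            if i is None:
                return None
        return i
    return parse

def choice(*alts: Parser) -> Parser:
    """e1 / e2: the first alternative that matches wins."""
    def parse(text: str, i: int) -> Optional[int]:
        for p in alts:
            j = p(text, i)
            if j is not None:
                return j
        return None
    return parse

def star(p: Parser) -> Parser:
    """e*: greedy repetition that never gives input back."""
    def parse(text: str, i: int) -> Optional[int]:
        while True:
            j = p(text, i)
            if j is None or j == i:  # stop on failure or on an empty match
                return i
            i = j
    return parse

def not_(p: Parser) -> Parser:
    """!e: succeeds, consuming nothing, only if e fails here."""
    return lambda text, i: i if p(text, i) is None else None

def matches(p: Parser, text: str) -> bool:
    return p(text, 0) is not None

end = not_(any_char)  # the `!.` idiom used at the end of a root rule

print(matches(seq(choice(lit("ab"), lit("a")), end), "ab"))  # True
print(matches(seq(choice(lit("a"), lit("ab")), end), "ab"))  # False
print(matches(seq(star(lit("a")), lit("a")), "aaa"))         # False
```

The second call fails because the choice commits to `"a"` and the sequence does not go back to try `"ab"`. The third fails because `"a"*` consumes every `a` and leaves nothing for the trailing literal.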

Grammar tasks

Grammar tasks are inspired by the Caterpillar logic game.

In each task, the user has to construct a grammar similar to a hidden task grammar by testing different inputs. For each input, the program says whether it is valid for the task grammar or not. To make this easier, each task already comes with 5 valid and invalid inputs.
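The idea of probing a hidden grammar with inputs can be sketched as a small brute-force search. This is only an illustration under assumptions: `first_disagreement` is a hypothetical helper, and `stand_in_oracle` is a made-up rule standing in for the program's valid/invalid answer, not an actual task grammar.

```python
from itertools import product
from typing import Callable, Optional

def first_disagreement(
    candidate: Callable[[str], bool],
    oracle: Callable[[str], bool],
    alphabet: str = "01",
    max_len: int = 6,
) -> Optional[str]:
    """Return the shortest input on which the two validators disagree, or None."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if candidate(s) != oracle(s):
                return s
    return None

# Hypothetical stand-in for the hidden grammar: an even number of ones.
def stand_in_oracle(s: str) -> bool:
    return s.count("1") % 2 == 0

# A deliberately wrong first guess: no ones at all.
def candidate(s: str) -> bool:
    return "1" not in s

print(first_disagreement(candidate, stand_in_oracle))  # 11
```

In the actual tasks the hidden grammar can only be queried one input at a time, so a search like this trades many queries for a concrete counterexample.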

Example

Suppose we are given a task, and we know that the following inputs are valid:

And the following are invalid:

We can assume that an input is valid if it contains both 1 and 0. Let's create a grammar and test it:

```
root = (zeroStart / oneStart) !.
zeroStart = "0"+ "1"+ any
oneStart = "1"+ "0"+ any
any = [01]*
```
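To see which inputs this first grammar accepts, here is a hand translation into Python, with one function per rule. It is a sketch rather than the project's parser: the helper `run` and the function names are assumptions, and `!.` is treated as "end of input", which assumes the inputs contain no line breaks.

```python
from typing import Optional

def run(text: str, i: int, chars: str, at_least: int) -> Optional[int]:
    """Greedily consume characters from `chars`; fail if fewer than `at_least`."""
    j = i
    while j < len(text) and text[j] in chars:
        j += 1
    return j if j - i >= at_least else None

def any_(text: str, i: int) -> Optional[int]:
    # any = [01]*
    return run(text, i, "01", 0)

def zero_start(text: str, i: int) -> Optional[int]:
    # zeroStart = "0"+ "1"+ any
    i = run(text, i, "0", 1)
    if i is None:
        return None
    i = run(text, i, "1", 1)
    if i is None:
        return None
    return any_(text, i)

def one_start(text: str, i: int) -> Optional[int]:
    # oneStart = "1"+ "0"+ any
    i = run(text, i, "1", 1)
    if i is None:
        return None
    i = run(text, i, "0", 1)
    if i is None:
        return None
    return any_(text, i)

def root(text: str) -> bool:
    # root = (zeroStart / oneStart) !.
    i = zero_start(text, 0)
    if i is None:  # prioritized choice: try oneStart only if zeroStart fails
        i = one_start(text, 0)
    return i is not None and i == len(text)

for s in ["0011", "101", "000", "1"]:
    print(s, root(s))
# 0011 True
# 101 True
# 000 False
# 1 False
```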

We get: `WA: 101`. The grammar is wrong because a valid input needs an even number of ones. Therefore, this grammar should work:

```
root = (zeroFirst / zeroMiddle / zeroLast) !.
ones = ("1" "0"* "1")
zeroFirst = "0"+ ones ones* "0"*
zeroLast = "0"* ones ones* "0"+
zeroMiddle = "11"* ("1" "0"+ "1")+ "11"*
```

Footnotes

¹ `=` and `<-` in the rule definition are interchangeable.

² Range: `X-Y` matches a single character in the range between X and Y. E.g. `a-z` matches any lowercase English letter.
