Why does my grammar compile but crash at run time? #154

Open
Computermatronic opened this issue Mar 29, 2015 · 2 comments


@Computermatronic

I wrote the following grammar and generated the parser with asModule:
Metacode:
Script < Statement*
Operator < "+" / "-" / "" / ""
Unary < "++" / "--" / "+" / "-"
UnaryPostFix < "++" /"--"
Identifier < ~([a-z_A-Z]) ~([a-z_A-Z0-9])

Variable < Identifier / "(" Expression ")"
Tinary < Expression "?" Expression ":" Expression
ArrayInitializer < "[" (Expression ",")* "]"
Expression < Unary Variable
/ Variable UnaryPostFix
/ Variable
/ Expression Operator Expression
/ Tinary
/ Lambda
/ Expression "." Identifier
/ Expression "[" Expression "]"
/ ArrayInitializer
Def < "def" Type? Identifier ("=" Expression)? ";"
Statement < (Expression ";"
/ If
/ For
/ While
/ Do
/ Foreach
/ Class
/ Function
/ Def ";"
/ Import ";"
/ Module ";")
Block < "{" Statement* "}" / Statement
If < "if" "(" Expression ")" Block
For < "for" "(" Def ";" Expression ";" Expression ")" Block
While < "while" "(" Expression ")" Block
Do < "do" Block "until" "(" Expression ")"
Foreach < "foreach" "(" (Variable ",")* Variable ";" Expression ")" Block
Lambda < "function" Type? "(" (Identifier ",")* Identifier? ")" Block
Function < "function" Type? Identifier "(" (Identifier ",")* Identifier? ")" Block
Import < "import" Identifier ("." Identifier)*
Module < "module" Identifier ("." Identifier)*
Type < "<" Identifier ">"

# oop stuff (currently unused)

ProtectionModifiers < "public"
                    / "package"
                    / "protected"
                    / "private" 
Class <  ProtectionModifiers? "class" Identifier ("using" (Identifier ",")* Identifier)? ClassBody 
ClassBody < "{" ClassMember* "}" 
ClassMember < ProtectionModifiers? ClassModifiers? (Def / Function) 
ClassModifiers < "static" 
AbstractClass <  ProtectionModifiers? "abstract" "class" Identifier ("using" Identifier)* AbstractClassBody 
AbstractClassBody < "{" (AbstractClassMember / ClassMember)* "}" 
AbstractClassMember < ProtectionModifiers? ClassModifiers? "abstract" "function" Type? Identifier "(" (Identifier ",")* Identifier? ")" ";" 
Interface < ProtectionModifiers "interface" Identifier InterfaceBody 
InterfaceBody < "{" InterfaceMember* "}" 
InterfaceMember < ProtectionModifiers? InterfaceModifiers? "function" Type? Identifier "(" (Identifier ",")* Identifier? ")" ";" 
InterfaceModifiers < "static" 

It compiles, but when I try to use it with
module main;
import metacode.parser;
import std.stdio;

string test = `

class test
{
function f()
{
}
}
`;

int main(string[] args)
{
    ParseTree p = Metacode(test);
    foreach (c; p.children)
    {
        writeln(c.name, " ", c.children);
        foreach (d; c.children)
        {
            writeln(d.name);
        }
    }
    return 0;
}
it crashes. Why?

@PhilippeSigaud
Owner

Hi,

The crash is mainly due to left-recursive rules, for example
Expression < Expression Operator Expression.
Parsing Expression Grammars do not handle left-recursive rules, as
explained in the docs. You should rewrite your rules for Expression and
Tinary so that no subrule begins with Expression. Otherwise the parser
keeps calling Expression without consuming any input, and this infinite
recursion overflows the stack (which causes the segmentation fault).
As a first approach, using # to comment out the left-recursive subrules in
Expression (including Tinary) makes the example work.
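
One possible way to restructure the expression rules so that no alternative begins with Expression is sketched below. This is a reduced, hypothetical grammar, not the full Metacode grammar: the rule names Binary, Primary and Postfix are invented for illustration, Lambda and ArrayInitializer are left out, Pegged's predefined identifier rule stands in for Identifier, and the Operator rule assumes "*" and "/" were intended. All operators share one precedence level here, as in the original grammar.

```d
import pegged.grammar;
import std.stdio;

// Sketch only: every rule consumes at least one token before it can
// reach Expression again, so there is no left recursion.
mixin(grammar(`
Expr:
    Expression < Binary ("?" Expression ":" Expression)?
    Binary     < Primary (Operator Primary)*
    Primary    < Unary? Variable Postfix*
    Postfix    < "++" / "--" / "." identifier / "[" Expression "]"
    Variable   < identifier / "(" Expression ")"
    Unary      < "++" / "--" / "+" / "-"
    Operator   < "+" / "-" / "*" / "/"
`));

void main()
{
    // Print whether each sample input is accepted by the sketch grammar.
    foreach (input; ["a + b * c", "x[i].y++", "(a - b) ? c : d"])
    {
        auto tree = Expr(input);
        writeln(input, " -> ", tree.successful);
    }
}
```

The binary, member-access and index forms that used to start with Expression are expressed as repetition after a Primary instead, which is the usual way to remove left recursion from a PEG.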

There are two other small mistakes:

  • your definition of Identifier only recognizes two-character
    identifiers: Identifier < ~([a-z_A-Z]) ~([a-z_A-Z0-9]). You should use Identifier < [a-z_A-Z][a-z_A-Z0-9]* to recognize any C-style identifier.
  • your Operator definition has two empty alternatives (""), presumably
    where "*" and "/" were meant. You should use Operator < "+" / "-"
    / "*" / "/"

Also, be aware that defining Script as Statement* accepts a match of 0
characters as a valid parse. Hence a malformed module will still parse
successfully, as "" (the 0-Statement option). You should probably define
Script as Statement* endOfInput to catch such mistakes.
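
The difference can be seen with a small sketch. The two grammars below, Loose and Strict, are hypothetical and use a toy Statement rule (an identifier followed by ";") rather than the Metacode one; only the endOfInput at the end of Script differs.

```d
import pegged.grammar;
import std.stdio;

// Script may stop anywhere, including after zero statements.
mixin(grammar(`
Loose:
    Script    < Statement*
    Statement < identifier ";"
`));

// Script must consume the whole input.
mixin(grammar(`
Strict:
    Script    < Statement* endOfInput
    Statement < identifier ";"
`));

void main()
{
    string bad = "a; b; ???";   // trailing garbage after two statements

    // Loose is expected to report success, silently ignoring "???".
    writeln("Loose:  ", Loose(bad).successful);
    // Strict is expected to fail, because "???" is left unparsed.
    writeln("Strict: ", Strict(bad).successful);
}
```

With the strict variant, a failed parse points at the first place the input stops being a sequence of statements, instead of returning an apparently valid but truncated tree.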

Cheers,

Philippe

@PhilippeSigaud
Owner

Hi machinistprogrammer,

Bastiaan Veelo recently added left-recursion support to Pegged.
If you're still interested in it, you could try your grammar anew with the current HEAD.
