Olivier Duhart edited this page Jun 27, 2022 · 1 revision

Generic Lexer Modes

Generic lexer modes let you modify the lexer's behavior. Each mode defines a sub-lexer, and lexemes can be configured to trigger a particular mode.

Each lexeme can be assigned to one or more modes with the [Mode] attribute:

[AlphaIdentifier]
[Mode("main","sub")] 
ATTRIBUTE_ID

Here the ATTRIBUTE_ID lexeme is assigned to the modes "main" and "sub".

Modes are managed with a stack. A lexeme can:

- push the lexer into a new mode
- or pop the lexer back to the previous mode

Push and pop actions are set with the [Push("mode")] and [Pop] attributes, respectively.

Here is a minimal lexer for simplified XML (only tags and attributes; no comments, no CDATA, no processing instructions). We have 2 modes:

- "default": catches all data between tags (i.e. text).
    - here we use a special lexeme, [AllExcept("<")], that matches everything until it encounters a "<"
- "tag": lexes a tag and its attributes.

[AllExcept("<")]
[Mode("default")]
TEXT,

[Sugar("<")]
[Push("tag")]
[Mode("default")]
OPEN_TAG,

[Sugar(">")]
[Mode("tag")]
[Pop]
CLOSE_TAG,

[AlphaIdentifier]
[Mode("tag")]
ID,

[Sugar("=")]
[Mode("tag")]
EQUALS,

[String("\"","\\")]
[Mode("tag")]
ATTRIBUTE_VALUE
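
To make the stack behavior concrete, here is a small self-contained sketch that traces which mode is active when each token of the XML lexer above is matched. It is only an illustration and not csly's implementation: the `ModeStackDemo` class, the `ModeAction` enum and the sample input `hello<a href="x">world` are hypothetical names and data introduced for this example. The sketch also assumes that a [Push] or [Pop] takes effect after the lexeme carrying it has been matched, which is consistent with OPEN_TAG belonging to "default" and CLOSE_TAG belonging to "tag".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only: this does not use csly.
// It replays the token sequence expected for: hello<a href="x">world
public static class ModeStackDemo
{
    // Action a lexeme triggers once it has been matched.
    private enum ModeAction { None, Push, Pop }

    public static List<string> Trace()
    {
        // (token name, action, mode to push) in matching order.
        var tokens = new (string Name, ModeAction Action, string Target)[]
        {
            ("TEXT", ModeAction.None, ""),            // hello
            ("OPEN_TAG", ModeAction.Push, "tag"),     // <
            ("ID", ModeAction.None, ""),              // a
            ("ID", ModeAction.None, ""),              // href
            ("EQUALS", ModeAction.None, ""),          // =
            ("ATTRIBUTE_VALUE", ModeAction.None, ""), // "x"
            ("CLOSE_TAG", ModeAction.Pop, ""),        // >
            ("TEXT", ModeAction.None, ""),            // world
        };

        // The lexer starts in the "default" mode.
        var modes = new Stack<string>();
        modes.Push("default");

        var log = new List<string>();
        foreach (var token in tokens)
        {
            // A token is matched in the mode on top of the stack.
            log.Add($"{token.Name}@{modes.Peek()}");

            if (token.Action == ModeAction.Push)
            {
                modes.Push(token.Target);
            }
            else if (token.Action == ModeAction.Pop && modes.Count > 1)
            {
                // Never pop the initial mode in this sketch.
                modes.Pop();
            }
        }
        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", Trace()));
    }
}
```

Each entry in the trace pairs a token with the mode it was matched in: the first TEXT and OPEN_TAG are matched in "default", everything up to and including CLOSE_TAG is matched in "tag", and the final TEXT is matched in "default" again once the pop has been applied.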