Commit
feat: CDDL grammar correction for RFC8610 (#61)
* feat: example files

* refactor: rename test files

* refactor: rename from rfc9615 to rfc9165

* feat: error display for testing multiple files

* feat: sort file reading

* fix: whitespace problem for ',', braces, and //

* feat: group elements initial unit test

* fix: comments

* fix: grammar correction for comments

* fix: grammar correction according to abnf

* fix: remove genericarg from typename and groupname

* fix: unit tests

* feat: initial group_elements test cases

* fix: put atomic test back

* revert: deleted grammar and tests

* fix: correct test cases

* feat: type decl tests

* feat: type decl tests

* feat: setup type decl test

* feat: type1 test cases

* feat: composition testing

* feat: rules test cases

* feat: rules test cases

* feat: all examples from rfc8610

* feat: add rule level tests

* refactor: use general passes and fails function in unit test

* fix: error msg for cddl

* fix: cddl test file name reading

* fix: cspell

* chore: lintfix

* refactor: cddl filter fn

* refactor: reset

* fix: pub(crate) for unit tests

* chore: fmtfix

* refactor: don't dry util functions

* fix: disable lint warning

* chore: fmtfix

* refactor: change common path

* feat: common

* refactor: remove tmp vars

* feat: common consts

* fix: add tmp allow dead code

* fix: cspell
apskhem authored Jan 30, 2024
1 parent 99608dc commit efe780a
Showing 71 changed files with 1,570 additions and 491 deletions.
83 changes: 71 additions & 12 deletions hermes/crates/cbork/cddl-parser/src/grammar/cddl_test.pest
@@ -3,40 +3,99 @@
// Test Expressions ONLY TO Be USED by Unit Tests.
// Extends `cddl.pest` with rules needed to properly check sub-rules.

// cspell: words intfloat hexfloat
// cspell: words intfloat hexfloat groupname assignt
// cspell: words assigng genericparm genericarg rangeop ctlop
// cspell: words grpchoice grpent memberkey bareword optcom

/// Test Expression for the S Rule.
/// Test Expression for the `rule` Rule.
rule_TEST = ${ SOI ~ rule ~ EOI }

/// Test Expression for the `typename` Rule.
typename_TEST = ${ SOI ~ typename ~ EOI }

/// Test Expression for the `groupname` Rule.
groupname_TEST = ${ SOI ~ groupname ~ EOI }

/// Test Expression for the `assignt` Rule.
assignt_TEST = ${ SOI ~ assignt ~ EOI }

/// Test Expression for the `assigng` Rule.
assigng_TEST = ${ SOI ~ assigng ~ EOI }

/// Test Expression for the `genericparm` Rule.
genericparm_TEST = ${ SOI ~ genericparm ~ EOI }

/// Test Expression for the `genericarg` Rule.
genericarg_TEST = ${ SOI ~ genericarg ~ EOI }

/// Test Expression for the `type` Rule.
type_TEST = ${ SOI ~ type ~ EOI }

/// Test Expression for the `type1` Rule.
type1_TEST = ${ SOI ~ type1 ~ EOI }

/// Test Expression for the `type2` Rule.
type2_TEST = ${ SOI ~ type2 ~ EOI }

/// Test Expression for the `rangeop` Rule.
rangeop_TEST = ${ SOI ~ rangeop ~ EOI }

/// Test Expression for the `ctlop` Rule.
ctlop_TEST = ${ SOI ~ ctlop ~ EOI }

/// Test Expression for the `group` Rule.
group_TEST = ${ SOI ~ group ~ EOI }

/// Test Expression for the `grpchoice` Rule.
grpchoice_TEST = ${ SOI ~ grpchoice ~ EOI }

/// Test Expression for the `grpent` Rule.
grpent_TEST = ${ SOI ~ grpent ~ EOI }

/// Test Expression for the `memberkey` Rule.
memberkey_TEST = ${ SOI ~ memberkey ~ EOI }

/// Test Expression for the `bareword` Rule.
bareword_TEST = ${ SOI ~ bareword ~ EOI }

/// Test Expression for the `optcom` Rule.
optcom_TEST = ${ SOI ~ optcom ~ EOI }

/// Test Expression for the `occur` Rule.
occur_TEST = ${ SOI ~ occur ~ EOI }

/// Test Expression for the `S` Rule.
S_TEST = ${ SOI ~ S ~ EOI }

/// Test Expression for the COMMENT Rule.
/// Test Expression for the `COMMENT` Rule.
COMMENT_TEST = { SOI ~ COMMENT* ~ EOI }

/// Test expression for the URL_BASE64 Rule.
URL_BASE64_TEST = { SOI ~ URL_BASE64 ~ EOI }

/// Test expression to the id Rule.
/// Test expression to the `id` Rule.
id_TEST = ${ SOI ~ id ~ EOI}

/// Test expression to the bytes Rule.
/// Test expression to the `bytes` Rule.
bytes_TEST = ${ SOI ~ bytes ~ EOI}

/// Test expression to the text Rule.
/// Test expression to the `text` Rule.
text_TEST = ${ SOI ~ text ~ EOI}

/// Test expression to the uint Rule.
/// Test expression to the `uint` Rule.
uint_TEST = ${ SOI ~ uint ~ EOI}

/// Test expression to the int Rule.
/// Test expression to the `int` Rule.
int_TEST = ${ SOI ~ int ~ EOI}

/// Test expression to the intfloat Rule.
/// Test expression to the `intfloat` Rule.
intfloat_TEST = ${ SOI ~ intfloat ~ EOI}

/// Test expression to the hexfloat Rule.
/// Test expression to the `hexfloat` Rule.
hexfloat_TEST = ${ SOI ~ hexfloat ~ EOI}

/// Test expression to the number Rule.
/// Test expression to the `number` Rule.
number_TEST = ${ SOI ~ number ~ EOI}

/// Test expression to the value Rule.
/// Test expression to the `value` Rule.
value_TEST = ${ SOI ~ value ~ EOI}
106 changes: 51 additions & 55 deletions hermes/crates/cbork/cddl-parser/src/grammar/rfc_8610.pest
@@ -1,77 +1,79 @@
//! CDDL Grammar adapted from RFC8610 Appendix B
//! https://www.rfc-editor.org/rfc/rfc8610#appendix-B

// cspell: words assignt groupname grpent genericparm assigng
// cspell: words assignt groupname grpent genericparm assigng optcom
// cspell: words genericarg rangeop ctlop grpchoice memberkey bareword hexfloat intfloat
// cspell: words SCHAR BCHAR PCHAR SESC FFFD Characterset Visiable

cddl = {
cddl = ${
SOI
~ S ~ rule+
~ S ~ (rule ~ S)+
~ EOI
}

rule = {
( typename ~ assignt ~ type)
| ( groupname ~ assigng ~ grpent)
// -----------------------------------------------------------------------------
// Rules
rule = ${
(typename ~ genericparm? ~ S ~ assignt ~ S ~ type)
| (groupname ~ genericparm? ~ S ~ assigng ~ S ~ grpent)
}

typename = ${ id ~ genericparm? }
groupname = ${ id ~ genericparm? }
typename = { id }
groupname = { id }

assignt = { "=" | "/=" }
assigng = { "=" | "//=" }

genericparm = { "<" ~ id ~ ( "," ~ id )* ~ ">" }
genericarg = { "<" ~ type1 ~ ( "," ~ type1)* ~ ">" }

type = { type1 ~ ( S ~ "/" ~ type1)* }
genericparm = ${ "<" ~ S ~ id ~ S ~ ("," ~ S ~ id ~ S)* ~ ">" }
genericarg = ${ "<" ~ S ~ type1 ~ S ~ ("," ~ S ~ type1 ~ S)* ~ ">" }

type1 = { type2 ~ ( S ~ ( rangeop | ctlop ) ~ type2)? }

typename_arg = ${ typename ~ genericarg? }
groupname_arg = ${ groupname ~ genericarg? }
// -----------------------------------------------------------------------------
// Type Declaration
type = ${ type1 ~ (S ~ "/" ~ S ~ type1)* }

tag6 = ${ "#" ~ "6" ~ ("." ~ uint)? ~ "(" ~ S ~ type ~ S ~ ")" }
tag_generic = ${ "#" ~ ASCII_DIGIT ~ ("." ~ uint)? }
type1 = ${ type2 ~ (S ~ (rangeop | ctlop) ~ S ~ type2)? }

type2 = {
type2 = ${
value
| typename_arg
| ( "(" ~ type ~ ")" )
| ( "{" ~ group ~ "}" )
| ( "[" ~ group ~ "]" )
| ( "~" ~ typename_arg )
| ( "&" ~ "(" ~ group ~ ")" )
| ( "&" ~ groupname_arg )
| tag6
| tag_generic
| typename ~ genericarg?
| ("(" ~ S ~ type ~ S ~ ")")
| ("{" ~ S ~ group ~ S ~ "}")
| ("[" ~ S ~ group ~ S ~ "]")
| ("~" ~ S ~ typename ~ genericarg?)
| ("&" ~ S ~ "(" ~ S ~ group ~ S ~ ")")
| ("&" ~ S ~ groupname ~ genericarg?)
| ("#" ~ "6" ~ ("." ~ uint)? ~ "(" ~ S ~ type ~ S ~ ")")
| ("#" ~ ASCII_DIGIT ~ ("." ~ uint)?)
| "#"
}

rangeop = { "..." | ".." }
ctlop = ${ "." ~ id }

group = { grpchoice ~ ( S ~ "//" ~ grpchoice)* }
// -----------------------------------------------------------------------------
// Group Elements
group = ${ grpchoice ~ (S ~ "//" ~ S ~ grpchoice)* }

grpchoice = { ( grpent ~ ","? )* }
grpchoice = ${ (grpent ~ optcom)* }

grpent = ${
( (occur ~ S)? ~ (memberkey ~ S)? ~ type )
| ( (occur ~ S)? ~ groupname ~ genericarg? )
| ( (occur ~ S)? ~ "(" ~ S ~ group ~ S ~ ")" )
((occur ~ S)? ~ (memberkey ~ S)? ~ type)
| ((occur ~ S)? ~ groupname ~ genericarg?)
| ((occur ~ S)? ~ "(" ~ S ~ group ~ S ~ ")")
}

memberkey = {
( type1 ~ "^"? ~ "=>" )
| ( bareword ~ ":" )
| ( value ~ ":" )
memberkey = ${
(type1 ~ S ~ ("^" ~ S)? ~ "=>")
| ((value | bareword) ~ S ~ ":")
}

bareword = { id }

/// Optional Comma - Note eligible for producing pairs as this might be useful for linting
optcom = { S ~ ("," ~ S)? }

occur = {
( uint? ~ "*" ~ uint? )
(uint? ~ "*" ~ uint?)
| "+"
| "?"
}
@@ -82,7 +84,7 @@ occur = {
/// All Literal Values
value = { number | text | bytes }

/// Literal Numbers - A float if it has fraction or exponent; int otherwise
/// Literal Numbers - A float if it has fraction or exponent; int otherwise
number = { hexfloat | intfloat }

/// Hex floats of the form -0x123.abc0p+12
@@ -103,16 +105,16 @@ int = ${ "-"? ~ uint }

/// Unsigned Integers
uint = ${
( ASCII_NONZERO_DIGIT ~ ASCII_DIGIT* )
| ( "0x" ~ ASCII_HEX_DIGIT+ )
| ( "0b" ~ ASCII_BIN_DIGIT+ )
(ASCII_NONZERO_DIGIT ~ ASCII_DIGIT*)
| ("0x" ~ ASCII_HEX_DIGIT+)
| ("0b" ~ ASCII_BIN_DIGIT+)
| "0"
}

/// Literal Text
text = ${ "\"" ~ SCHAR* ~ "\"" }

/// Literal Bytes - Note CDDL Spec incorrectly defines b64''.
/// Literal Bytes.
bytes = ${ bytes_hex | bytes_b64 | bytes_text }
bytes_hex = ${ "h" ~ "'" ~ HEX_PAIR* ~ "'" }
bytes_b64 = ${ "b64" ~ "'" ~ URL_BASE64 ~ "'" }
@@ -121,13 +123,8 @@ bytes_text = ${ "'" ~ BCHAR* ~ "'" }
// -----------------------------------------------------------------------------
// Simple multiple character sequences

/// identifier, called the `name` in the CDDL spec.
id = ${
group_socket |
type_socket |
name
}

/// identifier, called the `name` in the CDDL spec.
id = ${ group_socket | type_socket | name }
/// Special form of a name that represents a Group Socket.
group_socket = ${ "$$" ~ ( ( "-" | "." )* ~ NAME_END )* }
/// Special form of a name that represents a Type Socket.
@@ -146,11 +143,10 @@ URL_BASE64 = _{ S ~ ( URL_BASE64_ALPHA ~ S)* ~ URL_BASE64_PAD? }
// -----------------------------------------------------------------------------
// Characters, Whitespace and Comments

S = _{ WHITESPACE* }
S = _{ (COMMENT | WHITESPACE)* }
WHITESPACE = _{ " " | "\t" | NEWLINE }
COMMENT = _{ ";" ~ (PCHAR | "\t")* ~ NEWLINE }
COMMENT = { ";" ~ (PCHAR | "\t")* ~ NEWLINE }

// URL Base64 Characterset.
URL_BASE64_ALPHA = _{ ASCII_ALPHA | ASCII_DIGIT | "-" | "_" }
// Optional Padding that goes at the end of Base64.
URL_BASE64_PAD = _{ "~" }
@@ -177,7 +173,7 @@ BCHAR = _{ BCHAR_ASCII_VISIBLE | UNICODE_CHAR | SESC | NEWLINE }
/// Escaping code to allow invalid characters to be used in text or byte strings.
SESC = ${ "\\" ~ (ASCII_VISIBLE | UNICODE_CHAR) }

/// All Visiable Ascii characters.
/// All Visible Ascii characters.
ASCII_VISIBLE = _{ ' '..'~' }

/// Ascii subset valid for text strings.
@@ -187,4 +183,4 @@ SCHAR_ASCII_VISIBLE = _{ ' '..'!' | '#'..'[' | ']'..'~' }
BCHAR_ASCII_VISIBLE = _{ ' '..'&' | '('..'[' | ']'..'~' }

/// Valid non ascii unicode Characters
UNICODE_CHAR = _{ '\u{80}'..'\u{10FFFD}' }
UNICODE_CHAR = _{ '\u{80}'..'\u{10FFFD}' }
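
For readers unfamiliar with pest, the revised `S = _{ (COMMENT | WHITESPACE)* }` rule in `rfc_8610.pest` can be approximated by the hand-written Rust sketch below. It is not part of this commit and does not use the pest crate. The function names (`skip_s`, `strip_whitespace`, `strip_comment`, `strip_newline`) are hypothetical, and because the definition of `PCHAR` is not shown in this diff, the sketch assumes a comment body may contain any character other than a line break, which is likely broader than the real rule.

```rust
/// Consumes one NEWLINE, mirroring pest's built-in: "\n" | "\r\n" | "\r".
fn strip_newline(s: &str) -> Option<&str> {
    s.strip_prefix("\r\n")
        .or_else(|| s.strip_prefix('\n'))
        .or_else(|| s.strip_prefix('\r'))
}

/// Consumes one WHITESPACE token: " " | "\t" | NEWLINE.
fn strip_whitespace(s: &str) -> Option<&str> {
    s.strip_prefix(' ')
        .or_else(|| s.strip_prefix('\t'))
        .or_else(|| strip_newline(s))
}

/// Consumes one COMMENT: ";" followed by the comment body and a NEWLINE.
/// ASSUMPTION: the body is any run of non-line-break characters; the
/// grammar's PCHAR rule (not shown in the diff) is probably stricter.
fn strip_comment(s: &str) -> Option<&str> {
    let body = s.strip_prefix(';')?;
    // A comment without a terminating NEWLINE does not match at all.
    let end = body.find(|c: char| c == '\n' || c == '\r')?;
    strip_newline(&body[end..])
}

/// Returns what is left of `input` after consuming `(COMMENT | WHITESPACE)*`.
fn skip_s(input: &str) -> &str {
    let mut rest = input;
    loop {
        if let Some(after) = strip_whitespace(rest) {
            rest = after;
        } else if let Some(after) = strip_comment(rest) {
            rest = after;
        } else {
            return rest;
        }
    }
}
```

Under these assumptions, `skip_s("  ; note\n\tfoo = int")` returns `"foo = int"`, while `skip_s("; unterminated")` returns its input unchanged because `COMMENT` requires a closing `NEWLINE`. This is why the compound-atomic (`$`) rules in the new grammar place `S` explicitly wherever whitespace or comments are allowed.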