Normand is a text-to-binary processor with its own language.
This package offers both a portable Python 3 module and a command-line tool.
Warning: This version of Normand is 0.23, meaning that neither the Normand language nor the module/CLI interface is stable yet.
- Introduction
- Install Normand
- Design goals
- Learn Normand
- Byte constant
- Literal string
- Current byte order setting
- Fixed-length number
- LEB128 integer
- String
- Current offset setting
- Current offset alignment
- Filling
- Label
- Variable assignment
- Group
- Conditional block
- Repetition block
- Transformation block
- Macro definition block
- Macro expansion
- Post-item repetition
- Command-line tool
- Python 3 API
- Development
The purpose of Normand is to consume human-readable text representing bytes and to produce the corresponding binary data.
Consider the following Normand input:
4f 55 32 bb $167 fe %10100111 a9 $-32
The generated nine bytes are:
4f 55 32 bb a7 fe a7 a9 e0
As you can see in the previous example, the fundamental unit of the Normand language is the byte. The order in which you list bytes will be the order of the generated data.
The Normand language is more than simple lists of bytes, though. Its main features are:
- Comments, including a bunch of insignificant symbols which may improve readability
-
Input:
ff bb %1101:0010 # This is a comment
78 29 af $192 # This too # 99 $-80
fe80::6257:18ff:fea3:4229
60:57:18:a3:42:29
10839636-5d65-4a68-8e6a-21608ddf7258
Output:
ff bb d2 78 29 af c0 99 b0 fe 80 62 57 18 ff fe a3 42 29 60 57 18 a3 42 29 10 83 96 36 5d 65 4a 68 8e 6a 21 60 8d df 72 58
- Hexadecimal, decimal, and binary byte constants
-
Input:
aa bb $247 $-89 %0011_0010 %11.01= 10/10
Output:
aa bb f7 a7 32 da
- Strings
-
Input:
"hello world!" 00 u16le"stress\nverdict 🤣" s:latin3{hex(ICITTE)}
Output:
68 65 6c 6c 6f 20 77 6f 72 6c 64 21 00 73 00 74 ┆ hello world!•s•t 00 72 00 65 00 73 00 73 00 0a 00 76 00 65 00 72 ┆ •r•e•s•s•••v•e•r 00 64 00 69 00 63 00 74 00 20 00 3e d8 23 dd 30 ┆ •d•i•c•t• •>•#•0 78 32 66 ┆ x2f
- Labels: special variables holding the offset where they’re defined
-
<beg> b2 52 e3 bc 91 05 $100 $50 <chair> 33 9f fe 25 e9 89 8a <end>
- Variables
-
5e 65 {tower = 47} c6 7f f2 c4 44 {hurl = tower - 14} b5 {tower = hurl} 26 2d
The value of a variable assignment is the evaluation of a valid Python 3 expression which may include label and variable names.
- Fixed-length number with a given length (8 bits to 64 bits) and byte order
-
Input:
{strength = 4} !be 67 <lbl> 44 $178 [(end - lbl) * 8 + strength : 16] $99 <end> !le [-1993 : 32] [-3.141593 : 64be]
Output:
67 44 b2 00 2c 63 37 f8 ff ff c0 09 21 fb 82 c2 bd 7f
The encoded number is the evaluation of a valid Python 3 expression which may include label and variable names.
- LEB128 integer
-
Input:
aa bb cc [-1993 : sleb128] <meow> dd ee ff [meow * 199 : uleb128]
Output:
aa bb cc b7 70 dd ee ff e3 07
The encoded integer is the evaluation of a valid Python 3 expression which may include label and variable names.
- Conditional
-
Input:
aa bb cc ( "foo" !if {ICITTE > 10} "bar" !else "fight" !end ) * 4
Output:
aa bb cc 66 6f 6f 66 69 67 68 74 66 6f 6f 66 69 ┆ •••foofightfoofi 67 68 74 66 6f 6f 62 61 72 66 6f 6f 62 61 72 ┆ ghtfoobarfoobar
- Repetition
-
Input:
aa bb * 5 cc <zoom> "yeah\0" * {zoom * 3} !repeat 3 ff ee "juice" !end
Output:
aa bb bb bb bb bb cc 79 65 61 68 00 79 65 61 68 ┆ •••••••yeah•yeah 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah• 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 ┆ yeah•yeah•yeah•y 65 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 ┆ eah•yeah•yeah•ye 61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 ┆ ah•yeah•yeah•yea 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 ┆ h•yeah•yeah•yeah 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah• ff ee 6a 75 69 63 65 ff ee 6a 75 69 63 65 ff ee ┆ ••juice••juice•• 6a 75 69 63 65 ┆ juice
- Alignment
-
Input:
!be [199:32] @64 [43:64] @16 [-123:16] @32~255 [5584:32]
Output:
00 00 00 c7 00 00 00 00 00 00 00 00 00 00 00 2b ff 85 ff ff 00 00 15 d0
- Filling
-
Input:
!le [0xdeadbeef:32] [-1993:16] [9:16] +0x40 [ICITTE:8] "meow mix" +200~FFh [ICITTE:8]
Output:
ef be ad de 37 f8 09 00 00 00 00 00 00 00 00 00 ┆ ••••7••••••••••• 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ •••••••••••••••• 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ •••••••••••••••• 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ •••••••••••••••• 40 6d 65 6f 77 20 6d 69 78 ff ff ff ff ff ff ff ┆ @meow mix••••••• ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ •••••••••••••••• ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ •••••••••••••••• ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ •••••••••••••••• ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ •••••••••••••••• ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ •••••••••••••••• ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ •••••••••••••••• ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ •••••••••••••••• ff ff ff ff ff ff ff ff c8 ┆ •••••••••
- Transformation
-
Input:
"end of file @ " [end:8] !transform gzip "this part will be gzipped" !end <end>
Output:
65 6e 64 20 6f 66 20 66 69 6c 65 20 40 20 3c 1f ┆ end of file @ <• 8b 08 00 7b 7b 26 65 02 ff 2b c9 c8 2c 56 28 48 ┆ •••{{&e••+••,V(H 2c 2a 51 28 cf cc c9 51 48 4a 55 48 af ca 2c 28 ┆ ,*Q(•••QHJUH••,( 48 4d 01 00 d4 cc 5b 8a 19 00 00 00 ┆ HM••••[•••••
- Multilevel grouping
-
Input:
ff ((aa bb "zoom" cc) * 5) * 3 $-34 * 4
Output:
ff aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa ┆ •••zoom•••zoom•• bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a ┆ •zoom•••zoom•••z 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f ┆ oom•••zoom•••zoo 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc ┆ m•••zoom•••zoom• aa bb 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb ┆ ••zoom•••zoom••• 7a 6f 6f 6d cc aa bb 7a 6f 6f 6d cc aa bb 7a 6f ┆ zoom•••zoom•••zo 6f 6d cc aa bb 7a 6f 6f 6d cc de de de de ┆ om•••zoom•••••
- Macros
-
Input:
!macro hello(world) "hello" !if world " world" !end !end !repeat 17 ff ff ff ff m:hello({ICITTE > 15 and ICITTE < 60}) !end
Output:
ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel 6c 6f ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c ┆ lo••••hello worl 64 ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ d••••hello world ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ff ┆ ••••hello world• ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c ┆ •••hello••••hell 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 ┆ o••••hello••••he 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff ┆ llo••••hello•••• 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ┆ hello••••hello•• ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ ••hello••••hello ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ lo••••hello
- Precise error reporting
-
/tmp/meow.normand:10:24 - Expecting a bit (`0` or `1`).
/tmp/meow.normand:32:6 - Unexpected character `k`.
/tmp/meow.normand:24:19 - Illegal (unknown or unreachable) variable/label name `meow` in expression `(meow - 45) // 8`; the legal names are {`ICITTE`, `mix`, `zoom`}.
/tmp/meow.normand:32:19 - While expanding the macro `meow`: /tmp/meow.normand:35:5 - While expanding the macro `zzz`: /tmp/meow.normand:18:9 - Value 315 is outside the 8-bit range when evaluating expression `end - ICITTE`.
You can use Normand to track data source files in your favorite VCS instead of raw binary files. You can also use the binary files that Normand generates to test file format decoders, including with malformed data, as well as for education.
See Learn Normand to explore all the Normand features.
Normand requires Python ≥ 3.4.
To install Normand:
$ python3 -m pip install --user normand
See Installing to the User Site to learn more about a user site installation.
Note: Normand has a single module file, normand.py.
The design goals of Normand are:
- Portability
-
We’re making sure normand.py works with Python ≥ 3.4 and doesn’t have any external dependencies so that you may just copy the module as is to your own project.
- Ease of use
-
The most basic Normand input is a sequence of hexadecimal constants (for example, 4e6f726d616e64) which produce exactly what you’d expect.
Most Normand features map to programming language concepts you already know and understand: constant integers, literal strings, variables, conditionals, repetitions/loops, and the rest.
- Concise and readable input
-
We could have chosen XML or YAML as the input format, but having a DSL here makes a Normand input compact and easy to read, two important traits when using Normand to write tests, for example.
Compare the following Normand input and some hypothetical XML equivalent, for example:
Actual Normand input:
ff dd 01 ab $192 $-128 %1101:0011
[end:8]
{iter = 1}
!if {not something}
  # five times because xyz
  !repeat 5
    "hello world " [iter:8]
    {iter = iter + 1}
  !end
!end
<end>
Hypothetical Normand XML input:
<?xml version="1.0" encoding="utf-8" ?>
<group>
  <byte base="x" val="ff" />
  <byte base="x" val="dd" />
  <byte base="x" val="1" />
  <byte base="x" val="ab" />
  <byte base="d" val="192" />
  <byte base="d" val="-128" />
  <byte base="b" val="11010011" />
  <fixed-len-num expr="end" len="8" />
  <var-assign name="iter" expr="1" />
  <cond expr="not something">
    <!-- five times because xyz -->
    <repeat expr="5">
      <str>hello world </str>
      <fixed-len-num expr="iter" len="8" />
      <var-assign name="iter" expr="iter + 1" />
    </repeat>
  </cond>
  <label name="end" />
</group>
A Normand text input is a sequence of items which represent a sequence of raw bytes.
State variable | Description | Initial value: Python 3 API | Initial value: CLI |
---|---|---|---|
Current offset | The current offset has an effect on the value of labels and of the special ICITTE name. Each generated byte increments the current offset. A current offset setting may change the current offset without generating data. A current offset alignment generates padding bytes to make the current offset satisfy a given alignment. | | |
Current byte order | The current byte order can have an effect on the encoding of fixed-length numbers. A current byte order setting may change the current byte order. | | |
Labels | Mapping of label names to integral values. | | One or more |
Variables | Mapping of variable names to integral or floating point number values. | | One or more |
The available items are:
-
A constant integer representing one or more constant bytes.
-
A literal string representing a constant sequence of bytes encoding UTF-8, UTF-16, UTF-32, or Latin-1 to Latin-10 data.
-
A current byte order setting (big or little endian).
-
A fixed-length number (integer or floating point), possibly using the current byte order, and of which the value is the result of a Python 3 expression.
-
An LEB128 integer of which the value is the result of a Python 3 expression.
-
A string representing a sequence of bytes encoding UTF-8, UTF-16, UTF-32, or Latin-1 to Latin-10 data, and of which the value is the result of a Python 3 expression.
-
A filling.
-
A label, that is, a named constant holding the current offset.
This is similar to an assembly label.
-
A variable assignment associating a name to the result of an evaluated Python 3 expression.
-
A group, that is, a scoped sequence of items.
Moreover, you can repeat many items above a constant or variable number of times with the * operator after the item to repeat. This is called a post-item repetition.
A Normand comment may exist pretty much anywhere between tokens.
A comment is anything between two # characters on the same line, or from # until the end of the line. Whitespaces are also considered comments. The following symbols are also considered comments around and between items, as well as between hexadecimal nibbles and binary bits of byte constants:
& , - . / : ; = ? \ _ |
The latter serve to improve readability so that you may write, for example, a MAC address or a UUID as is.
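To illustrate why these symbols are convenient, here is a minimal Python sketch that strips whitespace and the symbols listed above from a run of hexadecimal byte constants before decoding it. This is not Normand's parser: hex_bytes is a hypothetical helper, and it ignores `#` comments and every other kind of item.

```python
# Symbols that Normand treats as comments between nibbles (from the list above).
INSIGNIFICANT = set("&,-./:;=?\\_|")


def hex_bytes(text: str) -> bytes:
    """Decode hexadecimal byte constants, ignoring whitespace and symbols."""
    nibbles = "".join(
        c for c in text if not c.isspace() and c not in INSIGNIFICANT
    )
    return bytes.fromhex(nibbles)


mac = hex_bytes("60:57:18:a3:42:29")
uuid_bytes = hex_bytes("58f64689-6316-4d55-8a1a-04cada366172")
print(" ".join("%02x" % b for b in mac))
```

A MAC address or a UUID written "as is" therefore decodes to the same bytes as the equivalent plain list of hexadecimal constants.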
Many items require a constant integer. Where a negative value is allowed, the constant integer may start with -. A positive constant integer is any of:
- Decimal
-
One or more digits (0 to 9).
- Hexadecimal
-
One of:
-
The 0x or 0X prefix followed with one or more hexadecimal digits (0 to 9, a to f, or A to F).
-
One or more hexadecimal digits followed with the h or H suffix.
- Octal
-
One of:
-
The 0o or 0O prefix followed with one or more octal digits (0 to 7).
-
One or more octal digits followed with the o, O, q, or Q suffix.
- Binary
-
One of:
-
The 0b or 0B prefix followed with one or more bits (0 or 1).
-
One or more bits followed with the b or B suffix.
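The forms above can be sketched as a small Python parser. This is only an illustration, not Normand's implementation: parse_const_int is a hypothetical name, and the order in which the forms are tried (which matters for inputs matching more than one form) is an assumption.

```python
import re

# (pattern, base) pairs for the positive constant integer forms listed above.
# The order of the alternatives is an assumption of this sketch.
_FORMS = [
    (re.compile(r"0[xX]([0-9a-fA-F]+)"), 16),
    (re.compile(r"([0-9a-fA-F]+)[hH]"), 16),
    (re.compile(r"0[oO]([0-7]+)"), 8),
    (re.compile(r"([0-7]+)[oOqQ]"), 8),
    (re.compile(r"0[bB]([01]+)"), 2),
    (re.compile(r"([01]+)[bB]"), 2),
    (re.compile(r"([0-9]+)"), 10),
]


def parse_const_int(text: str) -> int:
    """Parse a constant integer, with an optional leading `-`."""
    negative = text.startswith("-")
    body = text[1:] if negative else text
    for pattern, base in _FORMS:
        match = pattern.fullmatch(body)
        if match:
            value = int(match.group(1), base)
            return -value if negative else value
    raise ValueError("not a constant integer: " + repr(text))
```

For example, `parse_const_int("0x40")`, `parse_const_int("40h")`, and `parse_const_int("64")` all denote the same value.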
In general, anything between { and } is a Python 3 expression.
You can test the examples of this section with the normand command-line tool as such:
$ normand file | hexdump -C
where file is the name of a file containing the Normand input.
A byte constant represents one or more constant bytes.
A byte constant is:
- Hexadecimal form
-
Two consecutive hexadecimal digits representing a single byte.
- Decimal form
-
One or more digits after the $ prefix representing a single byte.
- Binary form
-
-
N % prefixes (at least one).
The number of % characters is the number of subsequent expected bytes.
-
N × 8 bits (0 or 1).
Input:
ab cd (3d 8F) CC
Output:
ab cd 3d 8f cc
Input:
$192 %1100/0011 $ -77
Output:
c0 c3 b3
Input:
58f64689-6316-4d55-8a1a-04cada366172 fe80::6257:18ff:fea3:4229
Output:
58 f6 46 89 63 16 4d 55 8a 1a 04 ca da 36 61 72 ┆ X•F•c•MU•••••6ar fe 80 62 57 18 ff fe a3 42 29 ┆ ••bW••••B)
Input:
%01110011 %01100001 %01101100 %01110101 %01110100 %%%1101:0010 11111111 #A#11 #B#00 #C#011 #D#1
Output:
73 61 6c 75 74 d2 ff c7 ┆ salut•••
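The decimal and binary forms can be sketched in Python as follows. The helper names are hypothetical, the accepted decimal range (-128 to 255) is an assumption inferred from the examples, and the insignificant symbols between bits are assumed to have been removed already.

```python
def decimal_byte(value: int) -> bytes:
    """Encode a `$` decimal byte constant (negative values: two's complement)."""
    if not -128 <= value <= 255:
        raise ValueError("value %d doesn't fit in one byte" % value)
    return bytes([value & 0xFF])


def binary_bytes(bits: str) -> bytes:
    """Encode the N x 8 bits which follow N `%` prefixes."""
    if not bits or len(bits) % 8 != 0 or set(bits) - {"0", "1"}:
        raise ValueError("expecting a non-empty multiple of eight bits")
    return int(bits, 2).to_bytes(len(bits) // 8, "big")


print(decimal_byte(-77).hex())         # $-77
print(binary_bytes("11000011").hex())  # %1100/0011
```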
A literal string represents the encoded bytes of a literal string using the UTF-8, UTF-16, UTF-32, or Latin-1 to Latin-10 encoding.
The string to encode isn’t implicitly null-terminated: use \0 at the end of the string to add a null character.
A literal string is:
-
Optional: one of the following encodings instead of the default UTF-8:
s:u8
u8
UTF-8.
s:u16be
u16be
UTF-16BE.
s:u16le
u16le
UTF-16LE.
s:u32be
u32be
UTF-32BE.
s:u32le
u32le
UTF-32LE.
s:latin1
ISO/IEC 8859-1.
s:latin2
ISO/IEC 8859-2.
s:latin3
ISO/IEC 8859-3.
s:latin4
ISO/IEC 8859-4.
s:latin5
ISO/IEC 8859-9.
s:latin6
ISO/IEC 8859-10.
s:latin7
ISO/IEC 8859-13.
s:latin8
ISO/IEC 8859-14.
s:latin9
ISO/IEC 8859-15.
s:latin10
ISO/IEC 8859-16.
-
The " prefix.
-
A sequence of zero or more characters, possibly containing escape sequences.
An escape sequence is the \ character followed by one of:
0
Null (U+0000)
a
Alert (U+0007)
b
Backspace (U+0008)
e
Escape (U+001B)
f
Form feed (U+000C)
n
End of line (U+000A)
r
Carriage return (U+000D)
t
Character tabulation (U+0009)
v
Line tabulation (U+000B)
\
Reverse solidus (U+005C)
"
Quotation mark (U+0022)
-
The " suffix.
Input:
"coucou tout le monde!"
Output:
63 6f 75 63 6f 75 20 74 6f 75 74 20 6c 65 20 6d ┆ coucou tout le m 6f 6e 64 65 21 ┆ onde!
Input:
u16le"I am not young enough to know everything."
Output:
49 00 20 00 61 00 6d 00 20 00 6e 00 6f 00 74 00 ┆ I• •a•m• •n•o•t• 20 00 79 00 6f 00 75 00 6e 00 67 00 20 00 65 00 ┆ •y•o•u•n•g• •e• 6e 00 6f 00 75 00 67 00 68 00 20 00 74 00 6f 00 ┆ n•o•u•g•h• •t•o• 20 00 6b 00 6e 00 6f 00 77 00 20 00 65 00 76 00 ┆ •k•n•o•w• •e•v• 65 00 72 00 79 00 74 00 68 00 69 00 6e 00 67 00 ┆ e•r•y•t•h•i•n•g• 2e 00 ┆ .•
Input:
s:u32be "\"illusion is the first\nof all pleasures\" 🦉"
Output:
00 00 00 22 00 00 00 69 00 00 00 6c 00 00 00 6c ┆ •••"•••i•••l•••l 00 00 00 75 00 00 00 73 00 00 00 69 00 00 00 6f ┆ •••u•••s•••i•••o 00 00 00 6e 00 00 00 20 00 00 00 69 00 00 00 73 ┆ •••n••• •••i•••s 00 00 00 20 00 00 00 74 00 00 00 68 00 00 00 65 ┆ ••• •••t•••h•••e 00 00 00 20 00 00 00 66 00 00 00 69 00 00 00 72 ┆ ••• •••f•••i•••r 00 00 00 73 00 00 00 74 00 00 00 0a 00 00 00 6f ┆ •••s•••t•••••••o 00 00 00 66 00 00 00 20 00 00 00 61 00 00 00 6c ┆ •••f••• •••a•••l 00 00 00 6c 00 00 00 20 00 00 00 70 00 00 00 6c ┆ •••l••• •••p•••l 00 00 00 65 00 00 00 61 00 00 00 73 00 00 00 75 ┆ •••e•••a•••s•••u 00 00 00 72 00 00 00 65 00 00 00 73 00 00 00 22 ┆ •••r•••e•••s•••" 00 00 00 20 00 01 f9 89 ┆ ••• ••••
Input:
s:latin1 "Paul Piché"
Output:
50 61 75 6c 20 50 69 63 68 e9 ┆ Paul Pich•
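As a rough Python equivalent of what a literal string produces, the sketch below maps each Normand encoding name to a Python codec name and encodes the text. The codec names and the encode_literal helper are assumptions of this sketch; only the Normand names and the ISO/IEC 8859 part numbers come from the list above.

```python
# Normand encoding name -> Python codec name (codec names are assumptions).
CODECS = {
    "u8": "utf-8",
    "u16be": "utf-16-be",
    "u16le": "utf-16-le",
    "u32be": "utf-32-be",
    "u32le": "utf-32-le",
    "latin1": "iso8859-1",
    "latin2": "iso8859-2",
    "latin3": "iso8859-3",
    "latin4": "iso8859-4",
    "latin5": "iso8859-9",
    "latin6": "iso8859-10",
    "latin7": "iso8859-13",
    "latin8": "iso8859-14",
    "latin9": "iso8859-15",
    "latin10": "iso8859-16",
}


def encode_literal(text: str, encoding: str = "u8") -> bytes:
    """Encode `text` without adding any null terminator."""
    return text.encode(CODECS[encoding])


print(encode_literal("Paul Pich\u00e9", "latin1").hex())
```

Note how latin5 to latin10 don't map to consecutive ISO/IEC 8859 part numbers, which is why an explicit table is safer than deriving the codec name from the Latin number.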
This special item sets the current byte order.
The two accepted forms are:
!be
Set the current byte order to big endian.
!le
Set the current byte order to little endian.
A fixed-length number represents a fixed number of bytes encoding either:
-
An unsigned or signed integer (two’s complement).
The available lengths are 8, 16, 24, 32, 40, 48, 56, and 64.
-
A floating point number (IEEE 754-2008).
The available lengths are 32 (binary32) and 64 (binary64).
The value is the result of evaluating a Python 3 expression.
The byte order to use to encode the value is either directly specified or is the current byte order.
A fixed-length number is:
-
The [ prefix.
-
A valid Python 3 expression.
For a fixed-length number at some source location L, this expression may contain the name of any accessible label (not within a nested group), including the name of a label defined after L (except within a transformation block), as well as the name of any variable known at L.
The value of the special name ICITTE (int type) in this expression is the current offset (before encoding the number).
-
The : character.
-
An encoding length in bits amongst:
- The expression evaluates to an int or bool value
-
8, 16, 24, 32, 40, 48, 56, and 64.
Note: Normand automatically converts a bool value to int.
- The expression evaluates to a float value
-
32 and 64.
-
Optional: a suffix of the previous encoding length, without any whitespace, amongst:
be
Encode in big endian.
le
Encode in little endian.
Without this suffix, the encoding byte order is the current byte order which must be defined if the encoding length is greater than eight.
-
The ] suffix.
Input:
[345:16le] [-0xabcd:32be]
Output:
59 01 ff ff 54 33
Input:
!be
# String length in bits
[8 * (str_end - str_beg) : 16]
# String
<str_beg>
  "hello world!"
<str_end>
Output:
00 60 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 ┆ •`hello world!
Input:
[20 - ICITTE : 8] * 10
Output:
14 13 12 11 10 0f 0e 0d 0c 0b
Input:
[2 * 0.0529 : 32le]
Output:
ac ad d8 3d
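The encoding rules above can be approximated in Python with int.to_bytes() and struct.pack(). This is a sketch under stated assumptions, not Normand's code: encode_fixed is a hypothetical helper, and the accepted integer range (from the smallest signed value to the largest unsigned value of the given length) is an assumption.

```python
import struct

INT_LENGTHS = (8, 16, 24, 32, 40, 48, 56, 64)
FLOAT_FORMATS = {32: "f", 64: "d"}  # binary32 and binary64


def encode_fixed(value, length_bits: int, byte_order: str) -> bytes:
    """Encode an int, bool, or float value; `byte_order` is "be" or "le"."""
    if byte_order not in ("be", "le"):
        raise ValueError("byte order must be 'be' or 'le'")
    if isinstance(value, float):
        if length_bits not in FLOAT_FORMATS:
            raise ValueError("float lengths are 32 and 64 bits")
        prefix = ">" if byte_order == "be" else "<"
        return struct.pack(prefix + FLOAT_FORMATS[length_bits], value)
    value = int(value)  # also converts a bool value to int
    if length_bits not in INT_LENGTHS:
        raise ValueError("unsupported integer length")
    if not -(1 << (length_bits - 1)) <= value < (1 << length_bits):
        raise ValueError(
            "value %d is outside the %d-bit range" % (value, length_bits)
        )
    order = "big" if byte_order == "be" else "little"
    # The modulo maps a negative value to its two's complement form.
    return (value % (1 << length_bits)).to_bytes(length_bits // 8, order)


print(encode_fixed(345, 16, "le").hex())
print(encode_fixed(-0xABCD, 32, "be").hex())
```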
An LEB128 integer represents a variable number of bytes encoding an unsigned or signed integer which is the result of evaluating a Python 3 expression following the LEB128 format.
An LEB128 integer is:
-
The [ prefix.
-
A valid Python 3 expression of which the evaluation result type is int or bool (automatically converted to int).
For an LEB128 integer at some source location L, this expression may contain label and variable names.
The value of the special name ICITTE (int type) in this expression is the current offset (before encoding the integer).
-
The : character.
-
One of:
uleb128
Use the unsigned LEB128 format.
sleb128
Use the signed LEB128 format.
-
The ] suffix.
Input:
[624485 : uleb128]
Output:
e5 8e 26
Input:
aa bb cc dd <meow> ee ff [-981238311 + (meow * -23) : sleb128] "hello"
Output:
aa bb cc dd ee ff fd fa 8d ac 7c 68 65 6c 6c 6f ┆ ••••••••••|hello
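For reference, the LEB128 format itself can be sketched in a few lines of Python. The function names are hypothetical and this isn't Normand's code; it only shows how the variable number of bytes arises, seven value bits per byte with the high bit as a continuation flag.

```python
def uleb128(value: int) -> bytes:
    """Encode a non-negative integer using unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 requires a non-negative value")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def sleb128(value: int) -> bytes:
    """Encode an integer using signed LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift: keeps the sign
        # Done when the remaining value is pure sign extension of bit 6.
        done = (value == 0 and not byte & 0x40) or (value == -1 and byte & 0x40)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


print(uleb128(624485).hex())
print(sleb128(-1993).hex())
```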
A string represents a variable number of bytes encoding a string which is the result of evaluating a Python 3 expression using the UTF-8, UTF-16, UTF-32, or Latin-1 to Latin-10 encoding.
A string has two possible forms:
- Encoding prefix form
-
-
An encoding amongst:
s:u8
u8
UTF-8.
s:u16be
u16be
UTF-16BE.
s:u16le
u16le
UTF-16LE.
s:u32be
u32be
UTF-32BE.
s:u32le
u32le
UTF-32LE.
s:latin1
ISO/IEC 8859-1.
s:latin2
ISO/IEC 8859-2.
s:latin3
ISO/IEC 8859-3.
s:latin4
ISO/IEC 8859-4.
s:latin5
ISO/IEC 8859-9.
s:latin6
ISO/IEC 8859-10.
s:latin7
ISO/IEC 8859-13.
s:latin8
ISO/IEC 8859-14.
s:latin9
ISO/IEC 8859-15.
s:latin10
ISO/IEC 8859-16.
-
The { prefix.
-
A valid Python 3 expression of which the evaluation result type is bool, int, float, or str (the first three automatically converted to str).
For a string at some source location L, this expression may contain label and variable names.
The value of the special name ICITTE (int type) in this expression is the current offset (before encoding the string).
-
The } suffix.
-
- Encoding suffix form
-
-
The [ prefix.
-
A valid Python 3 expression of which the evaluation result type is bool, int, float, or str (the first three automatically converted to str).
For a string at some source location L, this expression may contain label and variable names.
The value of the special name ICITTE (int type) in this expression is the current offset (before encoding the string).
-
The : character.
-
A string encoding amongst:
s:u8
UTF-8.
s:u16be
UTF-16BE.
s:u16le
UTF-16LE.
s:u32be
UTF-32BE.
s:u32le
UTF-32LE.
s:latin1
ISO/IEC 8859-1.
s:latin2
ISO/IEC 8859-2.
s:latin3
ISO/IEC 8859-3.
s:latin4
ISO/IEC 8859-4.
s:latin5
ISO/IEC 8859-9.
s:latin6
ISO/IEC 8859-10.
s:latin7
ISO/IEC 8859-13.
s:latin8
ISO/IEC 8859-14.
s:latin9
ISO/IEC 8859-15.
s:latin10
ISO/IEC 8859-16.
-
The ] suffix.
-
Input:
{iter = 1} !repeat 10 u8{iter} " " {iter = iter + 1} !end
Output:
31 20 32 20 33 20 34 20 35 20 36 20 37 20 38 20 ┆ 1 2 3 4 5 6 7 8 39 20 31 30 20 ┆ 9 10
Input:
{meow = 'salut jérémie'} [meow.upper() : s:latin1]
Output:
53 41 4c 55 54 20 4a c9 52 c9 4d 49 45 ┆ SALUT J•R•MIE
This special item sets the current offset.
A current offset setting is:
-
The < prefix.
-
A positive constant integer which is the new current offset.
-
The > suffix.
Input:
[ICITTE : 8] * 8 <0x61> [ICITTE : 8] * 8
Output:
00 01 02 03 04 05 06 07 61 62 63 64 65 66 67 68 ┆ ••••••••abcdefgh
Input:
aa bb cc dd <meow> ee ff <12> 11 22 33 <mix> 44 55 [meow : 8] [mix : 8]
Output:
aa bb cc dd ee ff 11 22 33 44 55 04 0f ┆ •••••••"3DU••
A current offset alignment represents zero or more padding bytes to make the current offset meet a given alignment value.
More specifically, for an alignment value of N bits, a current offset alignment represents the required padding bytes until the current offset is a multiple of N / 8.
A current offset alignment is:
-
The @ prefix.
-
A positive constant integer which is the alignment value in bits.
This value must be greater than zero and a multiple of 8.
-
Optional:
-
The ~ prefix.
-
A positive constant integer which is the value of the byte to use as padding to align the current offset.
Without this section, the padding byte value is zero.
Input:
11 22 (@32 aa bb cc) * 3
Output:
11 22 00 00 aa bb cc 00 aa bb cc 00 aa bb cc
Input:
!le 77 88 @32~0xcc [-893.5:32] @128~0x55 "meow"
Output:
77 88 cc cc 00 60 5f c4 55 55 55 55 55 55 55 55 ┆ w••••`_•UUUUUUUU 6d 65 6f 77 ┆ meow
Input:
aa bb cc <29> @64~255 "zoom"
Output:
aa bb cc ff ff ff 7a 6f 6f 6d ┆ ••••••zoom
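The number of padding bytes follows directly from the rule above (pad until the current offset is a multiple of N / 8). A minimal Python sketch, where alignment_padding is a hypothetical helper rather than part of Normand:

```python
def alignment_padding(offset: int, align_bits: int, pad_byte: int = 0) -> bytes:
    """Return the padding bytes needed to align `offset` to `align_bits`."""
    if align_bits <= 0 or align_bits % 8 != 0:
        raise ValueError("alignment must be a positive multiple of 8 bits")
    align_bytes = align_bits // 8
    # Distance to the next multiple of `align_bytes` (zero if already aligned).
    count = -offset % align_bytes
    return bytes([pad_byte]) * count


# `aa bb cc <29> @64~255`: the current offset is 29, the next multiple of 8 is 32.
print(alignment_padding(29, 64, 255).hex())
```

The computation uses the current offset, not the number of bytes generated so far, which is why the last example above produces only three padding bytes.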
A filling represents zero or more padding bytes to make the current offset reach a given value.
A filling is:
-
The + prefix.
-
One of:
-
A positive constant integer which is the current offset target.
-
The { prefix, a valid Python 3 expression of which the evaluation result type is int or bool (automatically converted to int), and the } suffix.
For a filling at some source location L, this expression may contain label and variable names.
The value of the special name ICITTE (int type) in this expression is the current offset (before generating the padding bytes).
-
A valid Python 3 name.
For the name NAME, this is equivalent to the {NAME} form above.
This value must be greater than or equal to the current offset where it’s used.
-
Optional:
-
The ~ prefix.
-
A positive constant integer which is the value of the byte to use as padding to reach the current offset target.
Without this section, the padding byte value is zero.
Input:
aa bb cc dd +0x40 "hello world"
Output:
aa bb cc dd 00 00 00 00 00 00 00 00 00 00 00 00 ┆ •••••••••••••••• 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ •••••••••••••••• 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ •••••••••••••••• 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ •••••••••••••••• 68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ hello world
Input:
!macro part(iter, fill) <0> "particular security " [ord('0') + iter : 8] +fill~0x80 !end {iter = 1} !repeat 5 m:part(iter, {32 + 4 * iter}) {iter = iter + 1} !end
Output:
70 61 72 74 69 63 75 6c 61 72 20 73 65 63 75 72 ┆ particular secur 69 74 79 20 31 80 80 80 80 80 80 80 80 80 80 80 ┆ ity 1••••••••••• 80 80 80 80 70 61 72 74 69 63 75 6c 61 72 20 73 ┆ ••••particular s 65 63 75 72 69 74 79 20 32 80 80 80 80 80 80 80 ┆ ecurity 2••••••• 80 80 80 80 80 80 80 80 80 80 80 80 70 61 72 74 ┆ ••••••••••••part 69 63 75 6c 61 72 20 73 65 63 75 72 69 74 79 20 ┆ icular security 33 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ 3••••••••••••••• 80 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul 61 72 20 73 65 63 75 72 69 74 79 20 34 80 80 80 ┆ ar security 4••• 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ •••••••••••••••• 80 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul 61 72 20 73 65 63 75 72 69 74 79 20 35 80 80 80 ┆ ar security 5••• 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ •••••••••••••••• 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••
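Unlike an alignment, a filling pads up to an absolute offset target. A minimal Python sketch of that rule, with a hypothetical filling helper which isn't part of Normand:

```python
def filling(offset: int, target: int, pad_byte: int = 0) -> bytes:
    """Return the padding bytes needed to move `offset` up to `target`."""
    if target < offset:
        raise ValueError(
            "target %d is less than the current offset %d" % (target, offset)
        )
    return bytes([pad_byte]) * (target - offset)


# `aa bb cc dd +0x40`: four bytes generated, so 60 zero bytes reach offset 0x40.
print(len(filling(4, 0x40)))
```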
A label associates a name to the current offset.
All the labels of a whole Normand input must have unique names.
A label must not share the name of a variable name.
A label is:
-
The < prefix.
-
A valid Python 3 name which is not ICITTE.
-
The > suffix.
A variable assignment associates a name to the result of an evaluated Python 3 expression.
A variable assignment is:
-
The { prefix.
-
A valid Python 3 name which is not ICITTE.
-
The = character.
-
A valid Python 3 expression of which the evaluation result type is int, float, or bool (automatically converted to int), or str.
For a variable assignment at some source location L, this expression may contain label and variable names.
The value of the special name ICITTE (int type) in this expression is the current offset.
-
The } suffix.
Input:
{mix = 101} !le {meow = 42} 11 22 [meow:8] 33 {meow = ICITTE + 17} "yooo" [meow + mix : 16]
Output:
11 22 2a 33 79 6f 6f 6f 7a 00 ┆ •"*3yoooz•
A group is a scoped sequence of items.
The labels within a group aren’t visible outside of it.
The main purpose of a group is to repeat more than a single item and to isolate labels.
A group is:
-
The (, !group, or !g opening.
-
Zero or more items except, recursively, a macro definition block.
-
Depending on the group opening:
(
The ) closing.
!group
!g
The !end closing.
Input:
((aa bb cc) dd () ee) "leclerc"
Output:
aa bb cc dd ee 6c 65 63 6c 65 72 63 ┆ •••••leclerc
Input:
!group (aa bb cc) * 3 dd ee !end * 5
Output:
aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd ee
Input:
!be ( <str_beg> u16le"sébastien diaz" <str_end> [ICITTE - str_beg : 8] [(end - str_beg) * 5 : 24] ) * 3 <end>
Output:
73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e• 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 e0 ┆ n• •d•i•a•z••••• 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e• 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 40 ┆ n• •d•i•a•z••••@ 73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e• 6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 00 a0 ┆ n• •d•i•a•z•••••
A conditional block represents either the bytes of zero or more items if some expression is true, or the bytes of zero or more other items if it’s false.
A conditional block is:
-
The !if opening.
-
One of:
-
The { prefix, a valid Python 3 expression of which the evaluation result type is int or bool (automatically converted to int), and the } suffix.
For a conditional block at some source location L, this expression may contain label and variable names.
The value of the special name ICITTE (int type) in this expression is the current offset (before handling the contained items).
-
A valid Python 3 name.
For the name NAME, this is equivalent to the {NAME} form above.
-
Zero or more items to be handled when the condition is true except, recursively, a macro definition block.
-
Optional:
-
The !else opening.
-
Zero or more items to be handled when the condition is false except, recursively, a macro definition block.
-
The !end closing.
Input:
{at = 1} {rep_count = 9} !repeat rep_count "meow " !if {ICITTE > 25} "mix" !else "zoom" !end !if {at < rep_count} 20 !end {at = at + 1} !end
Output:
6d 65 6f 77 20 7a 6f 6f 6d 20 6d 65 6f 77 20 7a ┆ meow zoom meow z 6f 6f 6d 20 6d 65 6f 77 20 7a 6f 6f 6d 20 6d 65 ┆ oom meow zoom me 6f 77 20 6d 69 78 20 6d 65 6f 77 20 6d 69 78 20 ┆ ow mix meow mix 6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 6d 69 ┆ meow mix meow mi 78 20 6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 ┆ x meow mix meow 6d 69 78 ┆ mix
Input:
<str_beg> u16le"meow mix!" <str_end> !if {str_end - str_beg > 10} " BIG" !end
Output:
6d 00 65 00 6f 00 77 00 20 00 6d 00 69 00 78 00 ┆ m•e•o•w• •m•i•x• 21 00 20 42 49 47 ┆ !• BIG
A repetition block represents the bytes of one or more items repeated a given number of times.
A repetition block is:
-
The !repeat or !r opening.
-
One of:
-
A positive constant integer which is the number of times to repeat the contained items.
-
The { prefix, a valid Python 3 expression of which the evaluation result type is int or bool (automatically converted to int), and the } suffix.
For a repetition block at some source location L, this expression may contain label and variable names.
The value of the special name ICITTE (int type) in this expression is the current offset (before handling the items to repeat).
-
A valid Python 3 name.
For the name NAME, this is equivalent to the {NAME} form above.
-
Zero or more items except, recursively, a macro definition block.
-
The !end closing.
You may also use a post-item repetition after some items. The form !repeat X ITEMS !end is equivalent to (ITEMS) * X.
Input:
!repeat 0o400 [end - ICITTE - 1 : 8] !end <end>
Output:
ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ •••••••••••••••• ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ •••••••••••••••• df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ •••••••••••••••• cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ •••••••••••••••• bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ •••••••••••••••• af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ •••••••••••••••• 9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ •••••••••••••••• 8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ •••••••••••••••• 7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp 6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba` 5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP 4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@ 3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210 2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"! 1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ •••••••••••••••• 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
Input:
{times = 1} aa bb cc dd !repeat 3 <here> !repeat {here + 1} ee ff !end 11 22 !repeat times 33 !end {times = times + 1} !end "coucou!"
Output:
aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••" 33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3••••••••••••••• ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ •••••••••••••••• ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33••••••• ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ •••••••••••••••• ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ •••••••••••••••• ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ •••••••••••••••• ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ •••••••••••••••• ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ •••••••••••••••• ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ •••••••••••••••• ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3 33 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
A transformation block represents the bytes of one or more items transformed into other bytes by a function.
As of this version, Normand only offers a predetermined set of transformation functions.
A transformation block is:
-
The !transform or !t opening.
-
A transformation function name amongst:
base64
b64
Standard Base64.
base64u
b64u
URL-safe Base64, using - instead of + and _ instead of /.
base32
b32
Standard Base32.
base16
b16
Standard Base16.
ascii85
a85
Ascii85 without padding.
ascii85p
a85p
Ascii85 with padding.
base85
b85
Base85 (like Git-style binary diffs) without padding.
base85p
b85p
Base85 with padding.
quopri
qp
MIME quoted-printable without quoted whitespaces.
quoprit
qpt
MIME quoted-printable with quoted whitespaces.
gzip
gz
gzip.
bzip2
bz2
bzip2.
-
Zero or more items except, recursively, a macro definition block.
Any Python 3 expression within any of those items may not refer to a future label.
The value of the special name ICITTE in any Python 3 expression within any of those items is the current offset before Normand applies the transformation function. Therefore, labels defined within those items also have the current offset value before Normand applies the transformation function.
-
The !end closing.
The current offset after having handled the last item of a transformation block is the value of the current offset before handling the first item plus the size of the generated (transformed) bytes. In other words, current offset settings within the items of the block have no impact outside said block.
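As a rough illustration of the offset rule above, the following Python sketch models a transformation block with standard library encoders. The `TRANSFORMS` table and the `transform_block()` helper are hypothetical names made up for this example, and mapping the function names to these particular standard library calls is an assumption, not a description of how Normand is implemented.

```python
import base64
import bz2
import gzip
import quopri

# Hypothetical mapping from transformation function names to standard
# library encoders (an assumption; Normand itself may do this differently).
TRANSFORMS = {
    "base64": base64.b64encode,
    "base64u": base64.urlsafe_b64encode,
    "base32": base64.b32encode,
    "base16": base64.b16encode,
    "ascii85": lambda data: base64.a85encode(data, pad=False),
    "ascii85p": lambda data: base64.a85encode(data, pad=True),
    "base85": lambda data: base64.b85encode(data, pad=False),
    "base85p": lambda data: base64.b85encode(data, pad=True),
    "quopri": lambda data: quopri.encodestring(data, quotetabs=False),
    "quoprit": lambda data: quopri.encodestring(data, quotetabs=True),
    "gzip": gzip.compress,
    "bzip2": bz2.compress,
}


def transform_block(name, inner, offset_before):
    """Return the transformed bytes and the current offset after the block."""
    out = TRANSFORMS[name](inner)
    # Only the size of the transformed bytes moves the current offset;
    # offsets reached while generating `inner` are discarded.
    return out, offset_before + len(out)


# Bytes of the items within the block, before transformation.
inner = b"this will be compressed!" + b"\x89" * 100 + b"\x00" * 5000

# 33 stands for the offset reached before the block begins.
data, offset_after = transform_block("bzip2", inner, 33)
print(len(inner), len(data), offset_after)
```

Although the inner items generate 5124 bytes, the offset after the block only grows by the size of the compressed data.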
Input:
aa bb cc dd "size of compressed section: " [end - start : 8] <start> !transform bzip2 "this will be compressed!" 89*100 00*5000 !end <end> "yes!"
Output:
aa bb cc dd 73 69 7a 65 20 6f 66 20 63 6f 6d 70 ┆ ••••size of comp 72 65 73 73 65 64 20 73 65 63 74 69 6f 6e 3a 20 ┆ ressed section: 52 42 5a 68 39 31 41 59 26 53 59 68 e1 8c fc 00 ┆ RBZh91AY&SYh•••• 00 33 d1 e0 c0 00 60 00 5e 66 dc 80 00 20 00 80 ┆ •3••••`•^f••• •• 00 08 20 00 31 40 d3 43 23 26 20 ca 87 a9 a1 e8 ┆ •• •1@•C#& ••••• 18 29 44 80 9c 80 49 bf cc b3 e8 45 ed e2 76 ad ┆ •)D•••I••••E••v• 0f 12 8b 8a d6 cd 40 04 7e 2e e4 8a 70 a1 20 d1 ┆ ••••••@•~.••p• • c3 19 f8 79 65 73 21 ┆ •••yes!
Input:
88*16 !t a85 "I am determined to be cheerful and happy in whatever situation " "I may find myself. For I have learned that the greater part of " "our misery or unhappiness is determined not by our circumstance " "but by our disposition." !end @128~99h !t qp <beg> [ICITTE - beg : 8] * 50 !end
Output:
88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 ┆ •••••••••••••••• 38 4b 5f 47 59 2b 43 6f 26 2a 41 54 44 58 25 44 ┆ 8K_GY+Co&*ATDX%D 49 6d 3f 24 46 44 69 3a 32 41 4b 59 4a 72 41 53 ┆ Im?$FDi:2AKYJrAS 23 6d 6f 46 5f 69 31 2f 44 49 61 6c 27 40 3b 70 ┆ #moF_i1/DIal'@;p 31 32 2b 44 47 5e 39 47 41 28 45 2c 41 54 68 58 ┆ 12+DG^9GA(E,AThX 2a 2b 45 4d 37 3d 46 5e 5d 42 2b 44 66 2d 5b 68 ┆ *+EM7=F^]B+Df-[h 2b 44 6b 50 34 2b 44 2c 3e 2a 41 30 3e 60 37 46 ┆ +DkP4+D,>*A0>`7F 28 4b 30 22 2f 67 2a 57 25 45 5a 64 70 72 42 4f ┆ (K0"/g*W%EZdprBO 51 27 71 2b 44 62 55 74 45 63 2c 48 21 2b 45 56 ┆ Q'q+DbUtEc,H!+EV 3a 2a 46 3c 47 5b 3d 41 4b 59 57 2b 41 52 54 5b ┆ :*F<G[=AKYW+ART[ 6c 45 5a 66 3d 30 45 63 60 46 42 41 66 75 23 37 ┆ lEZf=0Ec`FBAfu#7 45 5a 66 34 35 46 28 4b 42 3b 2b 45 29 39 43 46 ┆ EZf45F(KB;+E)9CF 60 28 6c 24 45 2c 5d 4e 2f 41 54 4d 6f 38 42 6c ┆ `(l$E,]N/ATMo8Bl 62 44 2d 41 54 56 4c 28 44 2f 21 6d 21 41 30 3e ┆ bD-ATVL(D/!m!A0> 63 2e 46 3c 47 25 3c 2b 45 29 43 43 2b 43 66 2c ┆ c.F<G%<+E)CC+Cf, 2b 40 73 29 58 30 46 43 42 26 73 41 4b 59 48 29 ┆ +@s)X0FCB&sAKYH) 46 3c 47 25 3c 2b 45 29 43 43 2b 43 6f 32 2d 45 ┆ F<G%<+E)CC+Co2-E 2c 54 66 33 46 44 35 5a 32 2f 63 99 99 99 99 99 ┆ ,Tf3FD5Z2/c••••• 3d 30 30 3d 30 31 3d 30 32 3d 30 33 3d 30 34 3d ┆ =00=01=02=03=04= 30 35 3d 30 36 3d 30 37 3d 30 38 3d 30 39 0a 3d ┆ 05=06=07=08=09•= 30 42 3d 30 43 0d 3d 30 45 3d 30 46 3d 31 30 3d ┆ 0B=0C•=0E=0F=10= 31 31 3d 31 32 3d 31 33 3d 31 34 3d 31 35 3d 31 ┆ 11=12=13=14=15=1 36 3d 31 37 3d 31 38 3d 31 39 3d 31 41 3d 31 42 ┆ 6=17=18=19=1A=1B 3d 31 43 3d 31 44 3d 31 45 3d 31 46 20 21 22 23 ┆ =1C=1D=1E=1F !"# 24 25 26 27 28 29 2a 2b 2c 2d 3d 0a 2e 2f 30 31 ┆ $%&'()*+,-=•./01
A macro definition block associates a name and parameter names to a group of items.
A macro definition block doesn’t lead to generated bytes itself: a macro expansion does so.
A macro definition may only exist at the root level, that is, not within a group, a repetition block, a conditional block, or another macro definition block.
All macro definitions must have unique names.
A macro definition is:
- The !macro or !m opening.
- A valid Python 3 name (the macro name).
- The ( parameter name list prefix.
- A comma-separated list of zero or more unique parameter names, each one being a valid Python 3 name.
- The ) parameter name list suffix.
- Zero or more items except, recursively, a macro definition block.
- The !end closing.
!macro bake() !le [ICITTE * 8 : 16] u16le"predict explode" !end
!macro nail(rep, with_extra, val) {iter = 1} !repeat rep [val + iter : uleb128] [0xdeadbeef : 32] {iter = iter + 1} !end !if with_extra "meow mix\0" !end !end
A macro expansion expands the items of a defined macro.
The macro to expand must be defined before the expansion.
The state before handling the first item of the chosen macro is:
- Current offset
-
Unchanged.
- Current byte order
-
Unchanged.
- Variables
-
The only available variables initially are the macro parameters.
- Labels
-
None.
The state after having handled the last item of the chosen macro is:
- Current offset
-
The one before handling the first item of the macro plus the size of the generated data of the macro expansion.
Important: This means current offset setting items within the expanded macro don’t impact the final current offset.
- Current byte order
-
The one before handling the first item of the macro.
- Variables
-
The ones before handling the first item of the macro.
- Labels
-
The ones before handling the first item of the macro.
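The two state descriptions above can be modelled with a small Python sketch. The `State` class and the `expand_macro()` and `body()` functions are hypothetical and only mirror the documented rules; they are not part of the Normand API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Union


@dataclass
class State:
    offset: int = 0
    byte_order: Optional[str] = None
    variables: Dict[str, Union[int, float]] = field(default_factory=dict)
    labels: Dict[str, int] = field(default_factory=dict)


def expand_macro(state: State, params: Dict[str, Union[int, float]],
                 body: Callable[[State], bytes]) -> bytes:
    # State seen by the macro items: same offset and byte order, only the
    # parameters as variables, and no labels.
    inner = State(state.offset, state.byte_order, dict(params), {})
    data = body(inner)

    # State after the expansion: only the offset moves, by the size of the
    # generated data; everything else is what it was before.
    state.offset += len(data)
    return data


def body(inner: State) -> bytes:
    inner.byte_order = "le"               # like a `!le` item
    inner.labels["here"] = inner.offset   # like a `<here>` label
    inner.offset = 1000                   # like a current offset setting
    return int(inner.variables["val"]).to_bytes(2, "little")


outer = State(offset=7, byte_order="be", variables={"x": 1}, labels={"a": 0})
data = expand_macro(outer, {"val": 0x1234}, body)
print(data, outer)
```

The byte order change, the label, and the offset setting made by `body()` all stay local to the expansion; the outer offset goes from 7 to 9 because two bytes were generated.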
A macro expansion is:
- The m: prefix.
- A valid Python 3 name (the name of the macro to expand).
- The ( parameter value list prefix.
- A comma-separated list of zero or more unique parameter values.
  The number of parameter values must match the number of parameter names of the definition of the chosen macro.
  A parameter value is one of:
  - A constant integer, possibly negative.
  - A constant floating point number.
  - The { prefix, a valid Python 3 expression of which the evaluation result type is int or bool (automatically converted to int), and the } suffix.
    For a macro expansion at some source location L, this expression may contain:
    The value of the special name ICITTE (int type) in this expression is the current offset (before handling the items of the chosen macro).
  - A valid Python 3 name.
    For the name NAME, this is equivalent to the {NAME} form above.
- The ) parameter value list suffix.
Input:
!macro bake() !le [ICITTE * 8 : 16] u16le"predict explode" !end "hello [" m:bake() "] world" m:bake() * 5
Output:
68 65 6c 6c 6f 20 5b 38 00 70 00 72 00 65 00 64 ┆ hello [8•p•r•e•d 00 69 00 63 00 74 00 20 00 65 00 78 00 70 00 6c ┆ •i•c•t• •e•x•p•l 00 6f 00 64 00 65 00 5d 20 77 6f 72 6c 64 70 01 ┆ •o•d•e•] worldp• 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• • 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 02 ┆ e•x•p•l•o•d•e•p• 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• • 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 03 ┆ e•x•p•l•o•d•e•p• 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• • 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 04 ┆ e•x•p•l•o•d•e•p• 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• • 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 05 ┆ e•x•p•l•o•d•e•p• 70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• • 65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 ┆ e•x•p•l•o•d•e•
Input:
!macro A(val, is_be) !le !if is_be !be !end [val : 16] !end !macro B(rep, is_be) {iter = 1} !repeat rep m:A({iter * 3}, is_be) {iter = iter + 1} !end !end m:B(5, 1) m:B(3, 0)
Output:
00 03 00 06 00 09 00 0c 00 0f 03 00 06 00 09 00
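To check the output above by hand, here is a minimal Python sketch of what the two macros compute. The `macro_a()` and `macro_b()` functions are hypothetical stand-ins written with the standard `struct` module; they only imitate the macros of this example.

```python
import struct


def macro_a(val: int, is_be: int) -> bytes:
    # `!le`, then `!be` if `is_be` is true, then `[val : 16]`.
    fmt = ">H" if is_be else "<H"
    return struct.pack(fmt, val)


def macro_b(rep: int, is_be: int) -> bytes:
    out = bytearray()
    it = 1  # like `{iter = 1}`

    for _ in range(rep):
        out += macro_a(it * 3, is_be)  # like `m:A({iter * 3}, is_be)`
        it += 1                        # like `{iter = iter + 1}`

    return bytes(out)


data = macro_b(5, 1) + macro_b(3, 0)
print(" ".join("{:02x}".format(b) for b in data))
# prints: 00 03 00 06 00 09 00 0c 00 0f 03 00 06 00 09 00
```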
Input:
!macro flt32be(val) !be [val : 32] !end "CHEETOS" m:flt32be(-42.17) m:flt32be(56.23e-4)
Output:
43 48 45 45 54 4f 53 c2 28 ae 14 3b b8 41 25 ┆ CHEETOS•(••;•A%
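The bytes above are consistent with 32-bit big-endian IEEE 754 floating point numbers. As a sketch for cross-checking, and assuming that is how a 32-bit fixed-length number encodes a floating point value, the same bytes can be produced with the standard `struct` module:

```python
import struct

# ">f" is a big-endian 32-bit float, mirroring `!be [val : 32]`.
data = b"CHEETOS" + struct.pack(">f", -42.17) + struct.pack(">f", 56.23e-4)
print(" ".join("{:02x}".format(b) for b in data))
# prints: 43 48 45 45 54 4f 53 c2 28 ae 14 3b b8 41 25
```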
A post-item repetition represents the bytes of an item repeated a given number of times.
A post-item repetition is:
- One of those items:
  - An LEB128 integer.
  - A string.
  - A group.
- The * character.
- One of:
  - A positive integer (hexadecimal starting with 0x or 0X accepted) which is the number of times to repeat the previous item.
  - The { prefix, a valid Python 3 expression of which the evaluation result type is int or bool (automatically converted to int), and the } suffix.
    For a post-item repetition at some source location L, this expression may contain:
    The value of the special name ICITTE (int type) in this expression is the current offset (before handling the items to repeat).
  - A valid Python 3 name.
    For the name NAME, this is equivalent to the {NAME} form above.
You may also use a repetition block. The form ITEM * X is equivalent to !repeat X ITEM !end.
Input:
[end - ICITTE - 1 : 8] * 0x100 <end>
Output:
ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ •••••••••••••••• ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ •••••••••••••••• df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ •••••••••••••••• cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ •••••••••••••••• bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ •••••••••••••••• af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ •••••••••••••••• 9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ •••••••••••••••• 8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ •••••••••••••••• 7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp 6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba` 5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP 4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@ 3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210 2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"! 1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ •••••••••••••••• 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
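The output above follows from evaluating the expression once per repetition, each time with a different value of ICITTE. A minimal Python sketch of that evaluation, where `countdown_bytes()` is a hypothetical helper and the initial offset is assumed to be 0:

```python
def countdown_bytes(count: int = 0x100) -> bytes:
    start = 0            # assumed initial current offset
    end = start + count  # value of the `end` label after the repetition
    out = bytearray()

    for _ in range(count):
        icitte = start + len(out)     # current offset for this repetition
        out.append(end - icitte - 1)  # like `[end - ICITTE - 1 : 8]`

    return bytes(out)


data = countdown_bytes()
print(data[:4].hex(), data[-4:].hex())
```

Each repetition generates one byte, so ICITTE grows by one each time and the generated values count down from 0xff to 0x00.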
Input:
{times = 1} aa bb cc dd ( <here> (ee ff) * {here + 1} 11 22 33 * {times} {times = times + 1} ) * 3 "coucou!"
Output:
aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••" 33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3••••••••••••••• ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ •••••••••••••••• ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33••••••• ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ •••••••••••••••• ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ •••••••••••••••• ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ •••••••••••••••• ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ •••••••••••••••• ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ •••••••••••••••• ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ •••••••••••••••• ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3 33 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
If you installed the normand package, then you can use the normand command-line tool:
$ normand <<< '"ma gang de malades"' | hexdump -C
00000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
00000010 65 73 |es|
If you copy the normand.py module to your own project, then you can run the module itself:
$ python3 -m normand <<< '"ma gang de malades"' | hexdump -C
00000000 6d 61 20 67 61 6e 67 20 64 65 20 6d 61 6c 61 64 |ma gang de malad|
00000010 65 73 |es|
Without a path argument, the normand tool reads from the standard input.
The normand tool prints the generated binary data to the standard output.
Various options control the initial state of the processor: use the --help option to learn more.
The whole normand
package/module public API is:
# Byte order.
class ByteOrder(enum.Enum):
    # Big endian.
    BE = ...

    # Little endian.
    LE = ...


# Text location.
class TextLocation:
    # Line number.
    @property
    def line_no(self) -> int:
        ...

    # Column number.
    @property
    def col_no(self) -> int:
        ...


# Parsing error message.
class ParseErrorMessage:
    # Message text.
    @property
    def text(self):
        ...

    # Source text location.
    @property
    def text_location(self):
        ...


# Parsing error.
class ParseError(RuntimeError):
    # Parsing error messages.
    #
    # The first message is the most _specific_ one.
    @property
    def messages(self):
        ...


# Variables dictionary type (for type hints).
VariablesT = typing.Dict[str, typing.Union[int, float]]

# Labels dictionary type (for type hints).
LabelsT = typing.Dict[str, int]


# Parsing result.
class ParseResult:
    # Generated data.
    @property
    def data(self) -> bytearray:
        ...

    # Updated variable values.
    @property
    def variables(self) -> VariablesT:
        ...

    # Updated main group label values.
    @property
    def labels(self) -> LabelsT:
        ...

    # Final offset.
    @property
    def offset(self) -> int:
        ...

    # Final byte order.
    @property
    def byte_order(self) -> typing.Optional[ByteOrder]:
        ...


# Parses the `normand` input using the initial state defined by
# `init_variables`, `init_labels`, `init_offset`, and `init_byte_order`,
# and returns the corresponding parsing result.
def parse(normand: str,
          init_variables: typing.Optional[VariablesT] = None,
          init_labels: typing.Optional[LabelsT] = None,
          init_offset: int = 0,
          init_byte_order: typing.Optional[ByteOrder] = None) -> ParseResult:
    ...
The normand parameter is the actual Normand input while the other parameters control the initial state.
The parse() function raises a ParseError instance should it fail to parse the normand string for any reason.
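As a sketch of how a caller might combine parse() and ParseError, the hypothetical `generate_or_report()` helper below takes the parsing function and the error type as arguments so that it doesn't depend on how the module is imported. Reporting only the first message, and the `LINE:COL - MESSAGE` layout of the report, are choices made for this example.

```python
def generate_or_report(parse, error_type, text):
    """Return `(data, None)` on success or `(None, report)` on failure."""
    try:
        # `data` is documented as a `bytearray`; copy it to `bytes`.
        return bytes(parse(text).data), None
    except error_type as exc:
        # The first message is documented as the most specific one.
        msg = exc.messages[0]
        loc = msg.text_location
        report = "{}:{} - {}".format(loc.line_no, loc.col_no, msg.text)
        return None, report


# With the package installed, a call could look like this:
#
#     import normand
#
#     data, report = generate_or_report(normand.parse, normand.ParseError,
#                                       '"lol" * 10 0a')
```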
Normand is a Poetry project.
To develop it, install it through Poetry and enter the virtual environment:
$ poetry install
$ poetry shell
$ normand <<< '"lol" * 10 0a'
normand.py
is processed by:
Licensing and copyright follow the REUSE specification and are checked with the reuse tool.
Use pytest to test Normand once the package is part of your virtual environment, for example:
$ poetry install
$ poetry run pip3 install pytest
$ poetry run pytest
The pytest project is currently not a development dependency in pyproject.toml due to backward compatibility issues with Python 3.4.
In the tests directory, each *.nt file is a test. The file name prefix indicates what it’s meant to test:
- pass-
-
Everything above the --- line is the valid Normand input to test.
Everything below the --- line is the expected data (whitespace-separated hexadecimal bytes).
- fail-
-
Everything above the --- line is the invalid Normand input to test.
Everything below the --- line is the expected error message having this form: LINE:COL - MESSAGE
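To make the test file layout concrete, here is a minimal Python sketch that splits the content of a *.nt file and decodes its expected part. The `split_test()`, `expected_bytes()`, and `expected_error()` helpers are hypothetical; the project's actual test runner may handle details such as trailing whitespace differently.

```python
def split_test(content: str):
    """Split the content of a `*.nt` file at its `---` line."""
    lines = content.splitlines()
    idx = lines.index("---")  # raises `ValueError` without a `---` line
    return "\n".join(lines[:idx]), "\n".join(lines[idx + 1:])


def expected_bytes(below: str) -> bytes:
    # For `pass-` tests: whitespace-separated hexadecimal bytes.
    return bytes(int(tok, 16) for tok in below.split())


def expected_error(below: str):
    # For `fail-` tests: `LINE:COL - MESSAGE`.
    location, _, message = below.strip().partition(" - ")
    line, _, col = location.partition(":")
    return int(line), int(col), message


normand_text, below = split_test('"lol" * 2\n---\n6c 6f 6c\n6c 6f 6c\n')
print(normand_text, expected_bytes(below))
```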