Author: Jack Robbins
This project defines and implements a Pushdown Automaton(PDA) that recognizes a context-free language
Define the context-free language
Here is an example of a string in abba(1.2*(1.-2.+3.1))abba
Although it may not be immediately obvious,
For an exact definition of the langauge,
-
$V = \{S,T,C,H,Y,N\}$ is the set of variables, also called nonterminals -
$\Sigma = \{.,0,1,2,...,9,+,-,*,/,(,),a,b\}$ , where$\Sigma$ is the alphabet, also known as the set of terminals -
$S$ is the start variable
And
-
$S \rightarrow$ a$T$ a -
$T \rightarrow$ b$T$ b | a$C$ a -
$C \rightarrow$ $C+C$ |$C-C$ |$C*C$ |$C/C$ |$(C)$ |$H$ -
$H \rightarrow$ $Y.Y$ |$Y.$ |$.Y$ -
$Y \rightarrow$ $NY$ |$N$ -
$N \rightarrow$ 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
With this context-free grammar
To make this context-free grammar more concrete, here is an example of how abba(1.2*(1.-2.+3.1))abba
Note
Before introducing the Pushdown Automato(PDA), let's first understand why we need to use a PDA in for this language. It has already shown that the language
To prove that
Recall the pumping lemma for regular languages:
If
$A$ is a regular language, then$\exists$ number$p$ , the pumping length, where if$s \in A$ and$|s| \geq p$ , then the string$s$ can be split into three pieces$s = xyz$ , which must satisfy the following properties:
$xy^iz \in A$ for each$i \geq 0$ $|y| \gt 0$ $|xy| \leq p$
With this in mind, let's begin the proof.
Proof:
Suppose that
Now consider the string
So, by the pumping lemma, we can split
This leaves us with:
-
$x = a$ , so$|x| = 1$ -
$y = b^k$ , for some$0 \lt k \leq p-1$ -
$z = b^ma(1.0)ab^pa$ for some$0 \lt m \lt p-1$
So
-
$s = ab^kb^ma(1.))ab^pa$ , so it follows that$k+m = p$
This split satisfies properties 2 & 3 above because:
-
$y = |k| \gt 0$ , satisfying property 2 -
$|xy| = |x| + |y|$ which is at most$p-1 + 1 = p \leq p$ , satisfying property 3
Property 1 implies that
-
$xyyz = ab^kb^kb^ma(1.0)ab^pa = ab^{k+k+m}a(1.0)ab^pa = ab^{p+k}a(1.0)ab^pa \notin A$ because$k+m = p$ and$k \gt 0$ .
Therefore, we have a contradiction, so
End proof
Since b
symbols, to ensure that the front and back of the
string have an even number of them, and we also need to match (
with )
to make sure that the floating point expression component of the string is valid as well. With the extra features of a PDA, we can design a finite state machine
The most intuitive way to understand the PDA
Note
Notice how the transitions are of the form
Let's see how this Pushdown Automaton processes the example string: abba(1.1*(2.1/3.1)+((2.2*.8)+(1.2)))abba
. First, visually inspect the string and verify that it is in the langauge
In q1. Read ε, pop ε, push $. Move to q2.
In q2. Read a, pop ε, push a. Move to q3.
In q3. Read b, pop ε, push b. Move to q3.
In q3. Read b, pop ε, push b. Move to q3.
In q3. Read a, pop ε, push a. Move to q4.
In q4. Read (, pop ε, push (. Move to q4.
In q4. Read 1, pop ε, push ε. Move to q5.
In q5. Read ., pop ε, push ε. Move to q7.
In q7. Read 1, pop ε, push ε. Move to q7.
In q7. Read *, pop ε, push ε. Move to q4.
In q4. Read (, pop ε, push (. Move to q4.
In q4. Read 2, pop ε, push ε. Move to q5.
In q5. Read ., pop ε, push ε. Move to q7.
In q7. Read 1, pop ε, push ε. Move to q7.
In q7. Read /, pop ε, push ε. Move to q4.
In q4. Read 3, pop ε, push ε. Move to q5.
In q5. Read ., pop ε, push ε. Move to q7.
In q7. Read 1, pop ε, push ε. Move to q7.
In q7. Read ), pop (, push ε. Move to q8.
In q8. Read +, pop ε, push ε. Move to q4.
In q4. Read (, pop ε, push (. Move to q4.
In q4. Read (, pop ε, push (. Move to q4.
In q4. Read 2, pop ε, push ε. Move to q5.
In q5. Read ., pop ε, push ε. Move to q7.
In q7. Read 2, pop ε, push ε. Move to q7.
In q7. Read *, pop ε, push ε. Move to q4.
In q4. Read ., pop ε, push ε. Move to q6.
In q6. Read 8, pop ε, push ε. Move to q7.
In q7. Read ), pop (, push ε. Move to q8.
In q8. Read +, pop ε, push ε. Move to q4.
In q4. Read (, pop ε, push (. Move to q4.
In q4. Read 1, pop ε, push ε. Move to q5.
In q5. Read ., pop ε, push ε. Move to q7.
In q7. Read 2, pop ε, push ε. Move to q7.
In q7. Read ), pop (, push ε. Move to q8.
In q8. Read ), pop (, push ε. Move to q8.
In q8. Read ), pop (, push ε. Move to q8.
In q8. Read a, pop a, push ε. Move to q9.
In q9. Read b, pop b, push ε. Move to q9.
In q9. Read b, pop b, push ε. Move to q9.
In q9. Read a, pop a, push ε. Move to q10.
In q10. Read ε, pop $, push ε. Move to q11.
In q11, the accepting state. String has been fully processed. String is accepted.
Let's also look at an example of a string that is not in abbbba(1.2*2.1))abbbba
. Upon visual inspection, we can see that the parenthesis are not properly matched, so the expression is not valid. Let's see how
In q1. Read ε, pop ε, push $. Move to q2.
In q2. Read a, pop ε, push a. Move to q3.
In q3. Read b, pop ε, push b. Move to q3.
In q3. Read b, pop ε, push b. Move to q3.
In q3. Read b, pop ε, push b. Move to q3.
In q3. Read b, pop ε, push b. Move to q3.
In q3. Read a, pop ε, push a. Move to q4.
In q4. Read (, pop ε, push (. Move to q4.
In q4. Read 1, pop ε, push ε. Move to q5.
In q5. Read ., pop ε, push ε. Move to q7.
In q7. Read 2, pop ε, push ε. Move to q7.
In q7. Read *, pop ε, push ε. Move to q4.
In q4. Read 2, pop ε, push ε. Move to q5.
In q5. Read ., pop ε, push ε. Move to q7.
In q7. Read 1, pop ε, push ε. Move to q7.
In q7. Read ), pop (, push ε. Move to q8.
In q8. Read ), pop ε, push ε. PDA crashes.
Notice how upon reading the unmatched )
, the PDA crashes. This happens because it has no valid transitions out of the state q8 with the current state of the stack, meaning that the parenthesis must not be matched. A PDA crashing is a way of rejecting the string, so we
know that abbbba(1.2*2.1))abbbba
is not in the language
The C++ program pda.cpp implements the PDA
The runner script run.sh is provided for ease of compilation and execution. To run this program using the runner script, download the entirety of this project and navigate to the folder where it was downloaded. Following this, run the following commands:
example@bash:~$ chmod +x run.sh
example@bash:~$ ./run.sh
Do you want to use the given test cases?[Y/n]:
At this stage, if you type Y
, the program will be run with the 15 test cases that are provided in the tests folder. This is a convenient and typing-free option if you just want to see what some output on example
strings looks like. If you type n
, then the program control will be given to you, and you will enter the strings that you want to process one by one.