ch01.tex

\chapter[Computer Systems: Simple Principles Lead to Complex Behavior]{Computer Systems: Simple Principles, Complex Behavior}

\chaptermark{Computer Systems}

\section{Hardware and Software}

Computer systems, both hardware and software, are some of the
most complicated artifacts that people have ever created.
Yet, computer systems are
applications of principles of logic that philosophers
have been developing for thousands of years.
The logic that arose from this long effort,
together with some engineering artifacts
that emerged from its framework,
will be the primary topics of this book.

The \index{hardware}hardware component of a computer system is made up
of physical pieces of equipment such as monitors,
keyboards, printers, web cameras, and USB drives.
Hardware also includes components inside the computer enclosure,
such as chips, cables, hard drives, and circuit boards.
The properties of hardware devices are largely fixed when the system is
constructed. For example, the capacity of a hard drive is
determined when it is manufactured. A three-terabyte drive
stores three terabytes of data, not more.
A particular cable may be constructed to carry twenty-five different signals
simultaneously. Need to send twenty-six signals at the same time?
Get a different cable, or more of them.

The hardware that makes up a computer system is not much different
than the electronics inside a television set or a navigational device,
but computer hardware can do something that most
consumer electronic devices cannot do.
It can respond to the software component of the computer system
to exhibit an unlimited range of behaviors.
The \index{software}software component is a collection of computer programs,
and programs can be added to the collection or removed from it
at any time.
So, a computer system is a multipurpose device in the extreme,
capable of doing things that its designers never thought about.

You have no doubt heard of various computer chips.
Maybe your laptop is powered by an Intel i7 or
an Apple A10.
Software for these chips consists of a list of
instructions that, when carried out one by one, in sequence,
modify the state of the computer system to make it perform
the function the software designer had in mind.
However, software designers rarely put together programs using
the \index{instruction set}instruction set of the computer chip
that performs the computation.
Instead, they use a programming language in which
the program tends to be shorter and
in some ways easier to understand than it would
be if it were a list of chip instructions.
Another piece of software
translates the program into instructions that the
chip is wired to carry out.

There are hundreds of programming languages
that software designers can choose from.
In most cases, software expressed in these languages consists
of a list of instructions to be performed, one by one,
in the manner of chip instructions,
but each instruction in the programming language gives
rise to a sequence of at least a few and sometimes many
commands in the instruction set of the underlying computer chip.

However, software as a list of instructions is
not the only alternative.
Another alternative uses drawings of
system configurations to build software.
For example, scientists use a programming language called
\index{LabView}LabView
to design software to control laboratory equipment.
Engineers use it to design digital systems,
and college students use it in projects to
design electronic devices.
Computer programs in LabView look like engineering drawings,
not like lists of instructions.
Another visual language,
\index{Scratch}Scratch, created by the \index{MIT Media Lab}MIT Media Lab,
is used by children to construct software for games
and other educational projects.
LabView and Scratch programs don't
look anything like chip instructions,
but they have the same capability for specifying computations.

A related alternative is to specify software as
a collection of equations in which
the left-hand side of the equation names a computation to be carried out
and the right-hand side reduces the computation to
more basic components, each of which produces part of the result.
The partial results are then combined to deliver the whole.
An advantage of the equation-based approach
is that a form of reasoning similar to that used
in ordinary, high school algebra can be used to derive
guarantees that the software produces results consistent
with design requirements.
Another advantage is that the equations themselves
are often inspired directly by design requirements,
so that if the software designer can think of tests
that the software should pass, the software itself
can emerge from specifications of the tests.

Instruction lists, visual programming languages, and equation-based
software are fundamentally different ways to describe computer programs,
but they are fully equivalent in terms of capabilities.
Since the primary focus of this text is on logical reasoning
about hardware and software components of computer systems,
we will use the equation-based approach,
but specification of the hardware and software artifacts that we study
is not limited to that approach.

It is software that gives a computer system its power and flexibility.
An iPhone, for example, has a screen that
can display millions of pixels,
but it is the software that determines whether the
iPhone displays an album cover or a weather update.
Software makes hardware useful by extending its range of behavior.
Although iPhone audio hardware may be able to produce
only a single tone at a time, the software controlling it
can direct it to produce a sequence of tones
that sound like Beethoven's Ninth.

\begin{aside}{aside-model-of-computation}{Models of Computation}
Logicians and mathematicians have been studying
\index{computation models}\index{model!of computation}models of computation
since before computers were invented. This was done in part to
answer a deep mathematical question: What parts of mathematics can,
in principle, be fully automated?  Is it possible to
build a machine that can prove all mathematical truths?

Many different models of computation were developed to study this question:
\index{Turing machine}Turing machines, \index{lambda calculus}lambda calculus,
partial recursive functions,
unrestricted grammars, Post production rules, and random-access machines,
to name a few.
Historically, Turing machines have provided the canonical foundation for computation,
but the random-access model conforms more closely to modern computers,
and the lambda calculus model is used in both computer science theory and practice.
The equation-based model of computation that is the focus of this book
falls into the lambda calculus bailiwick.

It is truly remarkable that all of these models of
computation are equivalent.  That is,
any computation that can be described in one of the models can
also be described in any of the others.
A conjecture known as the Church--Turing thesis
is based on this equivalence.
It says that all realizable computations can
be carried out by a Turing machine or by
any equivalent model of computation, such as
the lambda calculus model.
Much of computer science theory
revolves around this hypothesis.

Another remarkable fact is that some problems cannot be solved
by computer hardware or software.
Alan \index{Turing, Alan}Turing, one of the first computer scientists,
was the first to describe
\index{uncomputable}\seeonlyindex{incomputable}{uncomputable}uncomputable problems,
and soon afterwards the logician Kurt \index{Godel, Kurt@G\"odel, Kurt}G\"odel
showed that all mathematical systems that are at least as
powerful as arithmetic
are \index{incompleteness}incomplete. No formal system can be used
to prove all mathematical truths.
The theorems of Turing and G\"odel
prove that it is not possible to build a machine that can verify all mathematical
truths, even in principle.
%\caption{Models of Computation}
%\label{aside-model-of-computation}
\end{aside}

You can think of the hardware as
the parts of a computer that you can see
and the \index{software}software as information that tells the computer what to do.
However, the distinction between hardware and software is not as
clear cut as this suggests. Many hardware components
encode software directly and control other pieces of the system,
and many techniques used in the design of
hardware were originated to build software.
Major elements of hardware designs look like software
and can use the same language and notation.
A major theme of this book
is that both hardware and software are realizations of formal logic.
In this sense, computer systems are
\index{logic!in action}logic in action.

\section{Structure of a Program}

The distinction between hardware and software
leaves many questions to the imagination:
%\begin{quote}
\begin{enumerate}
\item How can software control hardware?\\
      Example: Instruct an audio device to emit a sound.
\item How can software detect the status of hardware?\\
      Example: Determine whether or not a switch is pressed.
\item What kinds of instructions can software give to hardware?\\
      Examples: Add numbers. Replace one formula by another. Select a formula.
\end{enumerate}
%\end{quote}

A model of computation
(box~\ref{aside-model-of-computation}, page \pageref{aside-model-of-computation})
is a way for software to control hardware,
so there are as many answers to the first of these questions
as there are models of computation, and
there are a lot of models.
Logicians have made sure of that.
Since there are many equivalent models,
the requirements of a project can guide the choice of models.
Generally, a programming language based on any model of computation
has intrinsic arithmetic and logic operators
(adders, multipliers, inverters, and so on)
and provides the ability to combine
intrinsic operators and operators
defined elsewhere in the software to
define new operators.

\begin{aside}{operations-and-functions}{Operators, Operands, Functions, Parameters, Arguments}
We will use the terms ``operator'' and ``function'' interchangeably.
Some treatments use these terms to mean different things,
but we find it more convenient in this presentation
to think of them as the same thing.
So, when we say \index{function}function we mean \index{operator}operator and vice versa.
What we're talking about when we use either term
is a transformation that delivers results when supplied with input.
We refer to the results delivered by an operator as its \emph{value},
and we refer to the input supplied
to the operator as its \index{operand}\emph{operands}.
Sometimes we use the term \index{argument}``argument''
or \index{parameter}``parameter'' instead of ``operand,''
usually in connection with the term ``function,''
but the terms carry the same meaning in any case.

To summarize, an operator (aka, function) is supplied with operands
(aka, parameters or arguments) and delivers a value.
A formula like $x + y$ denotes the value delivered
by the addition operator when supplied with the operands $x$ and $y$.
More generally, an operator $f$, when supplied with operands $x$ and $y$,
produces a value, usually denoted algebraically as $f(x,y)$.
If $f$ were the arithmetic addition operator ($+$),
then $f(x,y)$ would stand for the value $x+y$, $f(2,2)$ would denote 4,
$f(3,7)$ would stand for 10, and $f(2x,5)$ would mean $2x+5$.
%\caption{Operators, Operands, Functions, Parameters, Arguments}
%\label{operations-and-functions}
\end{aside}

Once we take the familiar operators
and the ability to define new ones
as basic elements of a computation model,
we can talk about what it is possible for software to do.
Software can affect
hardware by the values that an operator delivers.
For example, a computer program can tell an iPhone what to display
on its screen by delivering a matrix of pixels.
Each entry in the matrix
can be a number that represents a color: 16,777,215 for
red or 65,280 for green, for example.
Similarly, the hardware can inform the
software of the status of a component by
invoking an operator and supplying the status as an operand.
For example, touching the screen of an iPhone might
trigger a software operator and supply the operator with
the coordinates of the pixel selected by the touch.
The operator could then control the device's response.
Other gestures, such as tapping or
scrolling, would trigger different operators.

To be more specific, let's consider a program to control
a device that plays the game of \index{rock-paper-scissors}\index{game, rock-paper-scissors}rock-paper-scissors against a human opponent.
The device has three buttons, allowing the human player to select rock, paper, or
scissors.  It also has a display unit.
When the human player presses a button,
the display unit shows the device's choice (rock, paper, or scissors)
and shows the winner of the round.
The program will compute the device's choice
by invoking an operator called \textsf{emily}.\footnote{This operator is
named after a daughter of one of the authors who, when she was young,
played the rock-paper-scissors game just like the program described here.}
This operator will deliver
rock or paper or scissors as its value.
We could use numbers like 0, 1, and 2 as shorthand for these values,
but most programming languages include many
kinds of information, not just numbers,
so we are going to stick with the longer names
to make it easier for us to keep track of what things mean.
Another part of the software will compare the human player's
selection with that of the device and determine the winner of the round.

To make the game fair, the program will use
separate operators for the device's choice and
for examining the choice of the human player to determine the winner.
That way, the device won't know the human player's choice before making its own.
The operator \textsf{emily} will deliver the device's choice for a round of the game.
It needs some kind of information because an operator with no operands
always delivers the same value, and that would make for a very boring game.
Any player will have seen the choices of the other player for
previous rounds, so it seems fair to use some of that information in
making a choice for the next round.
The player for the device, which is the operator \textsf{emily},
will receive as its operand the choice that the human player selected
in the immediately previous round.
The operator \textsf{emily}
(and this is the heart of the device's ``prowess''
at the game, if prowess is the right term, and it probably isn't)
will, in turn, choose a play according to the following scheme.
The operand $u$ is the human player's choice for the previous round.
\begin{displaymath}
\textsf{emily}(u) =
   \left\{
        \begin{array}{ll}
        \mbox{\textsf{rock}}     & \mbox{if } u = \mbox{\textsf{scissors}} \\
        \mbox{\textsf{paper}}    & \mbox{if } u = \mbox{\textsf{rock}} \\
        \mbox{\textsf{scissors}} & \mbox{otherwise}
        \end{array}
   \right.
\end{displaymath}

The other operator, which we will call \textsf{score},
has two operands: ($d$, $u$).
The first operand, $d$,
indicates the  device's choice for the round.
It will be \textsf{rock}, \textsf{paper}, or \textsf{scissors}.
The second operand, $u$, indicates the human player's choice for the round.

The \textsf{score} operator delivers a value with two components: ($w$, $u$).
The first component, $w$, indicates the winner of the round.
It will be \textsf{device}, \textsf{human}, or \textsf{none}
(in case both players make the same choice in a round).
The second component, $u$, indicates the human player's choice for the round
(\textsf{rock}, \textsf{paper}, or \textsf{scissors}).
The \textsf{score} operator reports the human player's choice so that
in the next round the software
can tell the operator \textsf{emily}
the choice that the human player made in the previous round,
and \textsf{emily} can use that information in deciding on a choice
for the next round.

The following table specifies
the two-component value, ($w$, $u$), that the \textsf{score} operator
delivers, given the two operands indicating the player's choices, ($d$, $u$):
\begin{displaymath}
\textsf{score}(d,u) =
   \left\{
        \begin{array}{ll}
        (\mbox{\textsf{none}}, u)   & \mbox{if } d = u \\
        (\mbox{\textsf{device}}, u) & \mbox{if } (d,u) = (\mbox{\textsf{rock}}, \mbox{\textsf{scissors}}) \\
        (\mbox{\textsf{human}}, u)  & \mbox{if } (d,u) = (\mbox{\textsf{rock}}, \mbox{\textsf{paper}}) \\
        (\mbox{\textsf{device}}, u) & \mbox{if } (d,u) = (\mbox{\textsf{paper}}, \mbox{\textsf{rock}}) \\
        (\mbox{\textsf{human}}, u)  & \mbox{if } (d,u) = (\mbox{\textsf{paper}}, \mbox{\textsf{scissors}}) \\
        (\mbox{\textsf{device}}, u) & \mbox{if } (d,u) = (\mbox{\textsf{scissors}}, \mbox{\textsf{paper}}) \\
        (\mbox{\textsf{human}}, u)  & \mbox{if } (d,u) = (\mbox{\textsf{scissors}}, \mbox{\textsf{rock}}) \\
        \end{array}
   \right.
\end{displaymath}

More sophisticated software could make the device play a better game of
rock-paper-scissors.
For example, the \textsf{score} operator might
keep track of the number of times that the human player
has selected scissors or rock or paper
in previous rounds or it might keep track of all three.
The software controlling the device's choice of plays could
then take more information into account and play a better game.
That is, the operator \textsf{emily} could be upgraded.
Such is the flexibility of software.

The way we have described the operators \textsf{emily} and \textsf{score}
illustrates an important point about our computational model.
A program in our model consists of a collection of equations
defining mathematical functions that start each computation from scratch
based only on their input. They cannot ``remember'' anything from previous computations.
This will come as a surprise to programmers accustomed
to other computational models such as those on which
programming languages like Java or C++ are based.
In those models, programs use variables to record values
and can update recorded values later in the computation.
In our equation-based model,
programs consist of collections of mathematical functions (operators)
that base their results entirely on their operands.
Operators do not make use of recorded values
beyond the data supplied in operands
and unchanging elements in the equations as they stand.
This makes it possible to
understand equation-based programs in terms of classical
logic and traditional algebraic formulas.

To get back to our discussion of the game,
after the first round the display component of the device
would show the first element $w_1$ of the pair $(w_1, u_1)$ delivered by the formula
\textsf{score}(\textsf{emily}(\textsf{none}), $b_1$),
where $b_1$ stands for rock, paper, or scissors, depending on which button
the human player pressed.
For the next round, the outcome would be
$(w_2, u_2)$ $=$ \textsf{score}(\textsf{emily}($u_1$), $b_2$),
where $b_2$ is the button pressed by the human player for the new round,
and the display would show the value of $w_2$.
A game of five rounds would correspond to a sequence of equations,
each showing the outcome of a particular round.
The display hardware would show the winner of each round,
and the software would supply the button press of the previous round, which
is delivered by the \textsf{score} operator as the second
element of a pair to the \textsf{emily} operator for the next round.
\begin{displaymath}
%   \left\{
        \begin{array}{ll}
        (w_1, u_1) = \mbox{\textsf{score}}(\mbox{\textsf{emily}}(\mbox{\textsf{none}}), b_1) \\
        (w_2, u_2) = \mbox{\textsf{score}}(\mbox{\textsf{emily}}(u_1), b_2) \\
        (w_3, u_3) = \mbox{\textsf{score}}(\mbox{\textsf{emily}}(u_2), b_3) \\
        (w_4, u_4) = \mbox{\textsf{score}}(\mbox{\textsf{emily}}(u_3), b_4) \\
        (w_5, u_5) = \mbox{\textsf{score}}(\mbox{\textsf{emily}}(u_4), b_5)
        \end{array}
%   \right.
\end{displaymath}

A more complicated version of the software
could define an operator \textit{play} that would
generate a sequence of $n$ plays, where $n$ stands for
an operand supplied to \textit{play}.
In such a scenario, there would need to be some way to supply an operand $n$
to the operator \textit{play}, where $n$
might be a fixed number, like five or ten, or
it might be a number selected by having the human
player press the buttons in a special sequence.
The point is that the hardware provides a fixed set of
components such as displays and buttons,
and the software specifies a way to use those components.

\section{Deep Blue and Inductive Definitions}

The equation-based model of computation has the same
capabilities for specifying computations
as any other known model. Knowing the great diversity of things that computers can do,
it seems reasonable to expect them to be much more complicated than,
say, the rock-paper-scissors device, but in fundamental terms,
they have the same basic structure.
Complex behavior from modest beginnings, a bargain if there ever was one.

Consider \index{Deep Blue}Deep Blue, the
computer that beat world champion Gary Kasparov in a chess match on May 11,
1997. It was the first time a computer had performed so well in a
game that challenges the mental capacities of human players.
The Deep Blue chess-playing software can be specified as an operator 
with an operand that is an $8 \times 8$ matrix showing the positions of the pieces
on the board after a move by the human player (Kasparov, in the 1997 game).
The operator would deliver a result showing
either the positions on the board after its move or a special
white-flag token used to resign the game.

In principle, a chess-playing computer program can be an operator
in software with the following characteristics.
Given a matrix describing the board at a particular point in a game,
the operator determines all possible legal moves.  If there
are no legal moves, the operator delivers the white flag,
signaling that Deep Blue has resigned.
If there is a legal move that results
in checkmate, the operator delivers the $8 \times 8$ matrix specifying the
new board position after that move.  Otherwise, for each legal move,
the operator considers
each of the possible moves in response by the human opponent.
Each one of those moves leads to a new board that can be
examined by the chess-playing operator
to choose a move based on a calculation that estimates the value of
board positions.

An operator that follows this strategy can be defined
in a circular way, circular in the sense that
the operator invokes itself with new operands.\footnote{To
\index{invoke (invocation)}\emph{invoke}
an operator is to apply it to its operands
to make a computation.
An \emph{invocation} is a formula that invokes an operator.}

Circular definitions of this kind are common in mathematics.
We call them inductive definitions.\footnote{Operators with
\seeonlyindex{inductive definition}{definition}\index{definition!inductive (circular)}inductive definitions
are sometimes called \index{recursive}``recursive'' functions.
We try to avoid that term because it is
often associated with specialized ways to carry out the computation,
but we may slip up from time to time and call them recursive functions.
Besides, you should know the term so you'll know what people are
talking about if they mention it.}
The trick that makes inductive definitions useful in mathematics
is that at the point of circularity,
the invocation of the operator being defined
has operands that are closer to those in a noncircular part of the
definition than to the original operands.

Inductive definitions
will play a central role throughout the book
and will be discussed in great detail, but we want to provide an
introduction of sorts at this point to get the ball rolling.
A natural number is a whole number that is not negative: $0, 1, 2, 3, \dots$.
The sum of the five \index{reciprocal}reciprocals of the
natural numbers $1$ through $5$
can be written as an algebraic formula, as in the following equation:
\begin{displaymath}
h5 = \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5}
\end{displaymath}
This is a noninductive definition of $h5$,
and $h5$ has no operands, so it simply stands for a number,
namely, $\frac{137}{60}$, which is about $2.3$.

We could, of course, have written a formula for the sum
of the first ten reciprocals or a sum with any number of terms.
Sums of this form comprise a mathematical entity known as
the \index{harmonic series}harmonic \index{series, harmonic}series.
Specifically, they are the partial sums of the series,
and we can define an operator $h$ that computes such a sum
with any specified number $n$ of terms.
\begin{displaymath}
h(n) = \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \dots  \frac{1}{n}
\end{displaymath}

As a step in that direction, observe that for any natural number $n$,
the sum $h(n+1)$ of the first $n+1$ reciprocals
is the same as the sum of the first $n$ reciprocals, that is, $h(n)$, plus $\frac{1}{n+1}$.
Suppose we specify that $h(0)$ stands for zero.\footnote{Our
informal definition is that $h(n)$ stands for the sum of the first $n$ reciprocals.
When $n$ is zero, there would be no terms in the sum,
so the standard convention that the sum of an empty set of numbers
is zero is consistent with the equation $h(0) = 0$.}
Then, the following equations express the specification for $h(0)$ and
the observation about the relationship between $h(n)$ and $h(n+1)$:\index{equation, by name!\{h0\}, \{h1\}}
\begin{center}
\begin{tabular}{ll}
$h(n+1)$ $=$ $h(n) + \frac{1}{n+1}$ & \{h1\}\\
$h(0)$   $=$ $0$                    & \{h0\}\\
\end{tabular}
\end{center}\label{reciprocalsdef}

It will be a common practice in this text to
attach names to equations to make it easy to discuss them.
In this case, we have used the names \{h1\} and \{h0\}.
The equation \{h1\} is \seeonlyindex{circular definition}{definition}circular
because both sides of the equation refer to a value delivered by the operator $h$.
However, the operand $n$ in the circular reference
(that is, the formula $h(n)$ on the right-hand side of the \{h1\} equation)
is closer to zero than the operand $(n+1)$ on the left-hand side of the equation.
Furthermore, in equation \{h0\}, the operand on the left-hand side is zero and
the equation is not circular.
So, the operand on the right-hand side of the circular equation, \{h1\},
is closer to the operand in the part of the definition
that is not circular, \{h0\}, than it is to the operand on the
left-hand side of the circular equation.

It turns out, and we will study this in great detail later,
that a collection of circular equations with
these two characteristics
(a reduced operand on the right-hand side in circular equations
and a specific operand on the left-hand side in noncircular equations)
provides a useful definition of an operator.
Therefore, we can say that equations \{h1\} and \{h0\}
define the operator $h$.
The equations comprise an inductive definition.

The definition is mathematically rigorous
and also computational.
To see how this works, observe that, according to equation \{h1\},
$h(5) = h(4)+\frac{1}{5}$.
Furthermore, \{h1\} also says that $h(4) = h(3)+\frac{1}{4}$.
Combining these equations algebraically leads to the equation
$h(5) = h(3)+\frac{1}{4}+\frac{1}{5}$.
Continuing the analysis in this way, using equation \{h1\} at each step,
produces the following sequence of equations:
\begin{center}
\begin{tabular}{l}
$h(5) = h(4) + \frac{1}{5}$ \\
$h(5) = h(3) + \frac{1}{4} + \frac{1}{5}$\\
$h(5) = h(2) + \frac{1}{3} + \frac{1}{4} + \frac{1}{5}$\\
$h(5) = h(1) + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5}$\\
$h(5) = h(0) + \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5}$\\
\end{tabular}
\end{center}

Equation \{h0\} says $h(0) = 0$,
so the last equation is equivalent to the following one:
\begin{center}
\begin{tabular}{l}
$h(5) = 0 + \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5}$\\
\end{tabular}
\end{center}

This equation needs no analysis, just a little arithmetic
to compute $h(5) = \frac{137}{60}$.
In this way, the equations \{h1\} and \{h0\} provide not just
a definition of the operator $h$ but also a scheme for computing $h(n)$
for any natural number $n$.
Furthermore, other properties of the operator $h$,
such as the fact that when $n$ is a natural number,
the formula $h(n+1)$ must in all cases stand for a strictly positive number
and that its value must be a ratio of whole numbers,
can be derived from \{h1\} and \{h0\}.
The fact that for any number $x$, no matter how large,
there is a natural number $n$ such that $h(n) > x$
is another property of the harmonic series that can be
derived from equations \{h0\} and \{h1\}.
The fact that $h(n)$ grows larger at about
the same rate as $log(n)$ is also a consequence
of \{h1\} and \{h0\}. All the properties of the
harmonic series are a consequence of those two equations.
Simple beginnings. Big results.

It turns out that every computable function has a definition consisting
of some equations specifying properties of the function
and that all the properties of the function derive
from those equations.
This notion provides the basis for most of the reasoning
discussed in this text.
The implications of inductive equations
are broader than at first they seem.

To get back to the chess-playing computer, the problem with our
na\"ive approach is that there are too many moves to consider.
Although the function can, \emph{in principle},
select the best move given a particular arrangement of pieces on the chess board,
\emph{in practice} the computation
would take too much time.
The sun in our solar system would be long dead
before the computer could decide on its first move.

\index{Deep Blue}\index{chess}Deep Blue
did play in more or less this way, but it didn't look at board positions
all the way to the end of the game. Most of the time, it looked
six to eight moves ahead. Even that is a very big computation
but feasible for Deep Blue because it had
thousands of processing units and could consider many moves at the same time.
That gave it the ability to analyze about 200 million board positions per second.
The combination of massive computational power and limited look-ahead
made it possible for Deep Blue to outplay
the most accomplished human chess player of the time.
Now, two decades after the big match,
chess-playing software on laptops
gives accomplished players a run for their money.
They use chess-playing software as part of their
practice regimen. They learn things from the way
the computer plays. Simple beginnings.
Astonishing outcomes.

Deep Blue is a combination of a complicated computer
and a computer program with a long, complex definition.
The Deep Blue software conformed to a conventional computation model,
not the equation-based model we used to describe it,
but the program could have been designed
using equations in a manner similar to the
way equations provided a definition of the operator $h$
but a lot more of them, of course.
Not so simple, but founded on a few simple principles.

\begin{exercises}
\exer {Can you use equations \{h0\} and \{h1\}
%(page \pageref{reciprocalsdef})
to compute $h(-1)$?  Or $h(\frac{1}{2})$? Why not?}

\exer{Define an operator \emph{factor}$(k, n)$: true
if $k$ is a factor of $n$, false otherwise.\\
\emph{Note}: Natural numbers only. And
$k$ is a factor $n$
if there is a number $m$ such that $n = km$.\\
\emph{Note}: $mod(k,n) =$ remainder
in $k \div n$: $mod(17,5)=2$, $mod(9,2)=1$, $mod(12,4)=0$.\\
\emph{Hint}: One equation is enough. Use $mod$.}

\exer {\label{largest-factor}%
Define an operator \emph{lf}$(n) =$
        largest factor of $n$ other than $n$:
        \emph{lf}$(30)=15$, \emph{lf}$(15)=5$.\\
        \emph{Note}: Use \emph{lft}$(n,k) =$ largest factor of
        $n$ up to $k$:
        \emph{lft}$(30,7)=6$, \emph{lft}$(120,10)=10$.\\
        \emph{Note}: Use $\lfloor n \div k\rfloor = n \div k$ rounded down to an integer.}

\exer {\label{prime-predicate}%
Define an operator to compute $p(n)$, which is true
        if $n$ is a prime number and false otherwise.
        Refer to the operator \emph{lf}
        from exercise \ref{largest-factor} in your definition.\\
        \emph{Note}: A natural number $p \geq 2$ is a prime number if
        it has no factors other than $p$ and $1$.}

\vspace*{8mm}  %% proofreader asked to move all of next exercise to next page
\exer {Define an operator to compute $rp(n)$,
the sum of the reciprocals of all the prime numbers that are less than or equal to
the natural number $n$.
Use the equations that define the operator
$h$ (page~\pageref{reciprocalsdef})
as a model for your definition of $rp(n)$.\footnote{The
\index{harmonic series}harmonic \index{series, harmonic}series, $h(n)$,
grows without bound as $n$ becomes large
but very slowly, at about the same rate of growth as $log(n)$.
The number $rp(n)$ grows even more slowly, of course,
but it too grows without bound.
However, the sum of the squares of the reciprocals is bounded
by $\pi^2/6$, which is about $1.64$.
Leonhard Euler proved these facts over 200 years ago, during
a kind of Cambrian explosion of mathematics initiated a hundred years earlier
by Isaac \index{Newton, Isaac}Newton and Gottfried \index{Leibniz, Gottfried}Leibniz
with their invention of the infinitesimal calculus.
In their development of calculus,
Newton and Leibniz conjured up mathematical entities known as infinitesimal numbers
to justify the new mathematics that they introduced in the 1600s,
but it wasn't until the 1960s that Abraham \index{Robinson, Abraham}Robinson invented a formal logic
for reasoning about infinitesimal numbers.
The proof engine \index{ACL2r}ACL2r, which is an extension of ACL2,
the mechanized logic used extensively in this book,
employs Robinson's \index{nonstandard analysis}nonstandard analysis
to support partially automated, formal reasoning about functions with
numeric operands of infinite precision.}\\
\emph{Hint}: Use three equations, one for numbers $n$ that are prime numbers,
another for nonzero numbers that are not prime,
and a third for the case when $n = 0$.
It will be helpful to refer to the operator $p$ from exercise \ref{prime-predicate}.}
\end{exercises}


%%% Local Variables:
%%% mode: latex
%%% TeX-master: "book"
%%% End: