---
jupyter: python3
---
```{python}
#| echo: false
qr_setting = None
#
import numpy as np
import matplotlib as mp
import pandas as pd
import matplotlib.pyplot as plt
import laUtilities as ut
import slideUtilities as sl
import demoUtilities as dm
from matplotlib import animation
from IPython.display import HTML
```
<!--
This comment somehow suppresses the title page
-->
## (Getting Serious About) Numbers
![](images/William_Kahan_2008_(cropped).jpeg){width=200 fig-alt="Photo of William Kahan" fig-align="center" fig-scap="William Kahan"}
::: {.content-hidden when-profile="slides"}
::::{.column-margin}
Photo Credit:
<a href="https://commons.wikimedia.org/wiki/File:William_Kahan_2008_(cropped).jpg">George M. Bergman</a>, <a href="https://creativecommons.org/licenses/by-sa/4.0">CC BY-SA 4.0</a>, via Wikimedia Commons
::::
:::
::: {.ctrd}
__William Kahan,__ creator of the IEEE-754 standard.<br>
Turing Award Winner, 1989
:::
>I have a number in my head <br>
>Though I don't know why it's there<br>
>When numbers get serious<br>
>You see their shape everywhere
>
>Paul Simon
::: {.content-visible when-profile="slides"}
##
:::
One of the themes of this course will be shifting between mathematical and computational views of various concepts.
Today we need to talk about why the answers we get from computers can be different from the answers we get mathematically
-- for the same question!
::: {.fragment}
The root of the problem has to do with how __numbers__ are manipulated in a computer.
In other words, how numbers are __represented.__
:::
## Representations
A number is a mathematical concept -- an abstract idea.
::: {.fragment}
> God made the integers, <br>all else is the work of man.
>
> Leopold Kronecker (1823 - 1891)
:::
::: {.fragment}
In a computer we assign __bit patterns__ to correspond to certain numbers.
We say the bit pattern is the number's _representation._
:::
::: {.fragment}
For example, the number '3.14' might have the representation '01000000010010001111010111000011' (here, a 32-bit pattern).
For reasons of efficiency, we use a fixed number of bits for these representations. In most computers nowadays we use __64 bits__ to represent a number.
:::
::: {.fragment}
Let's look at some number representations and see what they imply about computations.
:::
## Integers
::: {.fragment}
Kronecker believed that integers were the only 'true' numbers.
And for the most part, using integers in a computer is not complicated.
Integer representation is essentially the same as binary numerals.
For example, in a 64-bit computer, the representation of the concept of 'seven' would be '0..0111' (with 61 zeros in the front).
There is a size limit on the largest value that can be stored as an integer, but it's so big we don't need to concern ourselves with it in this course.
:::
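::: {.fragment}
As a quick check, NumPy's `binary_repr` function formats an integer as a binary string; padding it to 64 bits shows the pattern described above.
```{python}
# The 64-bit binary pattern for the integer 7: sixty-one 0s followed by 111
np.binary_repr(7, width=64)
```
:::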
::: {.fragment}
So for our purposes, an integer can be stored exactly.
In other words, there is a 1-1 correspondence between each (computational) representation and the corresponding (mathematical) integer.
:::
::: {.fragment}
So, what happens when we compute with integers?
For (reasonably sized) integers, computation is __exact__ .... as long as it only involves __addition, subtraction, and multiplication.__
In other words, there are no errors introduced when adding, subtracting or multiplying integers.
:::
::: {.fragment}
However, it is a different story when we come to division, because the integers are not closed under division.
For example, 2/3 is not an integer. ... It is, however, a __real__ number.
:::
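::: {.fragment}
You can see this in Python directly: dividing one integer by another with `/` produces a floating point number, not an integer.
```{python}
# Dividing two integers with / takes us out of the integers: the result is a float
print(2 / 3)
print(type(2 / 3))
```
:::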
## Real Numbers and Floating-Point Representations
::: {.fragment}
Representing a real number in a computer is a __much__ more complicated matter.
In fact, for many decades after electronic computers were developed, there was no accepted "best" way to do this!
Eventually (in the 1980s) a widely accepted standard emerged, called IEEE-754. This is what almost all computers now use.
:::
::: {.fragment}
The style of representation used is called __floating point.__
:::
::: {.fragment}
Conceptually, it is similar to "scientific notation."
$$
123456 = \underbrace{1.23456}_{\text{significand}}\times {\underbrace{10}_{\text{base}}}^{\overbrace{5}^{\text{exponent}}}
$$
:::
::: {.fragment}
Except that it is encoded in binary:
$$17 = \underbrace{1.0001}_{\text{significand}}\times {\underbrace{2}_{\text{base}}}^{\overbrace{4}^{\text{exponent}}}$$
:::
::: {.fragment}
The sign, significand, and exponent are all contained within the 64 bits.
:::
::: {.fragment}
![](images/IEEE_754_Double_Floating_Point_Format.png){width=450 fig-alt="Bits in IEEE-754" fig-align="center" fig-scap="Bits in IEEE-754"}
::: {.content-hidden when-profile="slides"}
:::: {.column-margin}
By Codekaizen (Own work) [<a href="http://www.gnu.org/copyleft/fdl.html">GFDL</a> or <a href="http://creativecommons.org/licenses/by-sa/4.0-3.0-2.5-2.0-1.0">CC BY-SA 4.0-3.0-2.5-2.0-1.0</a>], <a href="https://commons.wikimedia.org/wiki/File%3AIEEE_754_Double_Floating_Point_Format.svg">via Wikimedia Commons</a>
::::
:::
:::
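::: {.fragment}
You can inspect this encoding yourself. Python's built-in `float.hex()` shows a float's binary significand (as hex digits) and its power-of-two exponent, and the `struct` module exposes the raw 64-bit pattern.
```{python}
import struct

# 17 = 1.0001 (binary) x 2^4: '0x1.1...' is the significand, 'p+4' the exponent
print((17.0).hex())

# The raw 64 bits (sign, exponent, and significand fields) as 16 hex digits
print(struct.pack('>d', 17.0).hex())
```
:::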
::: {.fragment}
Because only a fixed number of bits are used, __most real numbers cannot be represented exactly in a computer.__
Another way of saying this is that, usually, a floating point number is an approximation of some particular real number.
:::
::: {.fragment}
Generally when we try to store a real number in a computer, __what we wind up storing is the closest floating point number that the computer can represent.__
:::
::: {.content-visible when-profile="slides"}
##
:::
### The Relative Error of a Real Number stored in a Computer
::: {.content-hidden when-profile="slides"}
:::: {.column-margin}
You can experiment with floating point representations to see how errors arise using [this interactive tool](https://baseconvert.com/ieee-754-floating-point).
::::
:::
::: {.fragment}
The way to think about working with floating point (in fact, how the hardware actually does it) is:
1. Represent each input as the __nearest__ representable floating point number.
2. Compute the result exactly from the floating point representations.
3. Return the __nearest__ representable floating point number to the result.
:::
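::: {.fragment}
One way to see the "nearest representable" value that actually gets stored is to convert a float to an exact `Decimal`. For example, here is the double-precision value closest to 0.1.
```{python}
from decimal import Decimal

# The exact value of the float that is stored when we write 0.1
Decimal(0.1)
```
:::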
::: {.fragment}
What does "__nearest__" mean? Long story short, it means "round to the nearest representable value."
Let's say we have a particular real number $r$ and we represent it as a floating point value $f$.
Then $r = f + \epsilon$ where $\epsilon$ is the amount that $r$ was rounded when represented as $f$.
:::
::: {.fragment}
So $\epsilon$ is the difference between the value we want, and the value we get.
How big can this difference be? Let's say $f$ is
$$f = \underbrace{1.010...01}_\text{53 bits}\times 2^n$$
:::
::: {.fragment}
Then $|\epsilon|$ is bounded by the value of the last bit of the significand:
$$|\epsilon| < \underbrace{0.000...01}_\text{53 bits}\times 2^n.$$
:::
::: {.fragment}
So as a _relative error_,
$$ \text{relative error} = \frac{|\epsilon|}{f} < \frac{{0.000...01}\times 2^n}{\underbrace{1.000...00}_\text{53 bits}\times 2^n} = 2^{-52} \approx 10^{-16}$$
:::
::: {.fragment}
This value $10^{-16}$ is an important one to remember.
It is approximately __the largest relative error that can be introduced any time a real number is stored in a computer.__
:::
::: {.fragment}
Another way of thinking about this is that you __only have about 16 digits of accuracy__ in a floating point number.
:::
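::: {.fragment}
NumPy reports this bound directly; it is often called the _machine epsilon_ of the 64-bit float type.
```{python}
# The gap between 1.0 and the next representable float: 2**-52, about 2.2e-16
np.finfo(np.float64).eps
```
:::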
::: {.content-visible when-profile="slides"}
##
:::
### Implications of Representation Error
::: {.fragment}
Problems arise when we work with floating point numbers and confuse them with real numbers.
It is important to remember that most of the time we are not storing the real number exactly, but only a floating point number that is close to it.
:::
::: {.fragment}
Let's look at some examples. First:
$$ \left( \frac{1}{8} \cdot 8 \right) - 1 $$
:::
::: {.fragment}
```{python}
# ((1/8)*8)-1
a = 1/8
b = 8
c = 1
(a*b)-c
```
:::
::: {.fragment}
It turns out that 1/8, 8, and 1 can all be stored exactly in IEEE-754 floating point format.
So, we are
* storing the inputs exactly (1/8, 8 and 1)
* computing the results exactly (by definition of IEEE-754), yielding $(1/8) \cdot 8 = 1$
* and representing the result exactly (zero)
:::
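::: {.fragment}
You can confirm that 1/8 needs no rounding: its binary significand is exactly 1.0 and its exponent is $-3$.
```{python}
# 1/8 = 1.0 (binary) x 2^-3 -- stored exactly, with no rounding
(1/8).hex()
```
:::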
::: {.fragment}
OK, here is another example:
$$ \left( \frac{1}{7} \cdot 7 \right) - 1 $$
:::
::: {.fragment}
```{python}
# ((1/7)*7)-1
a = 1/7
b = 7
c = 1
a * b - c
```
:::
::: {.fragment}
Here the situation is different.
1/7 can __not__ be stored exactly in IEEE-754 floating point format.
In binary, 1/7 is $0.001\overline{001}$, an infinitely repeating pattern that obviously cannot be represented as a finite sequence of bits.
:::
::: {.fragment}
Nonetheless, the computation $(1/7) \cdot 7$ still yields exactly 1.0.
Why? Because the rounding of $0.001\overline{001}$ to its closest floating point representation, when multiplied by 7, yields a value whose closest floating point representation is 1.0.
:::
::: {.fragment}
Now, let's do something that seems very similar:
$$ \left( \frac{1}{70} \cdot 7 \right) - 0.1 $$
:::
::: {.fragment}
```{python}
# ((1/70)*7)-0.1
a = 1/70
b = 7
c = 0.1
a * b - c
```
:::
::: {.fragment}
In this case, neither 1/70 nor 0.1 can be stored __exactly.__
More importantly, rounding 1/70 to its closest floating point representation and then multiplying by 7 yields a value whose closest floating point representation is __not__ the same as the value stored for 0.1.
:::
::: {.fragment}
However, that floating point representation is very __close__ to 0.1.
Let's look at the difference: -1.3877787807814457e-17.
This is about $-1 \cdot 10^{-17}$.
:::
::: {.fragment}
In other words, about -0.00000000000000001.
Compared to 0.1, this is a very small number. The relative error is about:
$$ \frac{|-0.00000000000000001|}{0.1} $$
which is about $10^{-16}.$
:::
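::: {.fragment}
We can check that estimate directly, reusing the values from the computation above.
```{python}
# Relative error of the computed (1/70)*7, measured against 0.1
np.abs(a * b - c) / c
```
:::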
::: {.fragment}
This suggests that when a floating point calculation is not exact, the error (in a relative sense) is usually very small.
:::
::: {.fragment}
Notice also that in our example the size of the relative error is about $10^{-16}$.
Recall that the fraction part of the significand in IEEE-754 uses 52 bits.
Now, note that $2^{-52} \approx 10^{-16}$.
There's our "sixteen digits of accuracy" principle again.
:::
::: {.content-visible when-profile="slides"}
##
:::
### Special Values
::: {.fragment}
There are three kinds of special values defined by IEEE-754:
1. NaN, which means "Not a Number"
2. Infinity -- both positive and negative
3. Zero -- both positive and negative.
:::
::: {.fragment}
__NaN__ and __Inf__ behave about as you'd expect.
If you get one of these values in a computation you should be able to reason about how it happened. Note that these are values, and can be assigned to variables.
:::
::: {.fragment}
```{python}
#| warning: false
np.sqrt(-1)
```
:::
::: {.fragment}
```{python}
#| warning: false
np.log(0)
```
:::
::: {.fragment}
```{python}
#| warning: false
1/np.log(0)
```
:::
::: {.fragment}
As far as we are concerned, there is no difference between positive and negative zero. You can ignore the minus sign in front of a negative zero. (If you are curious why there is a negative zero, see the online notes.)
:::
::: {.content-hidden when-profile="slides"}
:::: {.column-margin}
The reason for having a negative and positive zero is the following.
Remember that, due to the limitations of floating point representation, we can only store the __nearest representable__ number to the one we'd like to store.
So, let's say we try to store a number $x$ that is so close to zero that the nearest representable floating point value is zero itself (for 64-bit floats, this happens once $|x|$ drops below roughly $10^{-324}$). Then zero is what gets stored. This is known as "underflow".
But ... the number $x$ that we were _trying_ to store could have been positive or negative. So the standard defines a positive and a negative zero, and the sign of the stored zero tells us "which direction" the underflow came from.
This can be useful in some numerical algorithms.
::::
:::
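::: {.fragment}
For example, a nonzero value whose magnitude is far too small to represent is stored as zero, but its sign survives.
```{python}
# Literals too small to represent underflow to zero; the sign is preserved
print(1e-400)
print(-1e-400)
```
:::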
::: {.fragment}
```{python}
var = np.nan
var + 7
```
:::
::: {.fragment}
```{python}
var = np.inf
var + 7
```
:::
::: {.content-visible when-profile="slides"}
##
:::
## _Mathematical_ Computation vs. _Mechanical_ Computation
::: {.fragment}
When working mathematically with (idealized) numbers, it is always true that:
If
$$c = 1/a$$
then
$$abc = b.$$
In other words,
$$(ab)/a = b.$$
:::
::: {.fragment}
Let's test whether this is always true in actual computation.
:::
::: {.fragment}
```{python}
a = 7
b = 1/10
c = 1/a
a*c*b
```
:::
::: {.fragment}
```{python}
b*c*a
```
:::
::: {.fragment}
```{python}
a*c*b == b*c*a
```
:::
::: {.fragment}
Here is another example:
```{python}
0.1 + 0.1 + 0.1
```
:::
::: {.fragment}
```{python}
3 * (0.1) - 0.3
```
:::
::: {.content-visible when-profile="slides"}
##
:::
__What does all this mean for us in practice?__
I will now give you three principles to keep in mind when computing with floating point numbers.
::: {.content-visible when-profile="slides"}
##
:::
### Principle 1: Do not compare floating point numbers for equality
::: {.fragment}
Two floating point computations that _should_ yield the same result mathematically, may not do so due to rounding error.
However, in general, if two numbers should be equal, the relative error of their floating point difference should be small.
So, instead of asking whether two floating point numbers are equal, we should ask whether the relative error of their difference is small.
:::
::: {.fragment}
```{python}
r1 = a * b * c
r2 = b * c * a
np.abs(r1-r2)/r1
```
:::
::: {.fragment}
```{python}
print(r1 == r2)
```
:::
::: {.fragment}
```{python}
# compare the relative difference to the float type's resolution (about 1e-15)
print(np.abs(r1 - r2)/np.max([r1, r2]) < np.finfo('float').resolution)
```
:::
::: {.fragment}
This kind of test is needed often enough that `numpy` provides a function for it:
```{python}
np.isclose(r1, r2)
```
:::
::: {.fragment}
So another way to state Principle 1 for our purposes is:
... __always__ use `np.isclose()` to compare floating point numbers for equality!
:::
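::: {.fragment}
Applying this to the earlier example of adding 0.1 three times:
```{python}
# A direct equality test fails, but np.isclose gives the right answer
print(0.1 + 0.1 + 0.1 == 0.3)
print(np.isclose(0.1 + 0.1 + 0.1, 0.3))
```
:::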
::: {.content-visible when-profile="slides"}
##
:::
::: {.fragment}
Next, we will generalize this idea a bit:
beyond the fact that numbers that should be equal may not be in practice,
we can also observe that it can be hard to be accurate about the __difference__ between two numbers that are __nearly__ equal. This leads to the next two principles.
:::
::: {.content-visible when-profile="slides"}
##
:::
### Principle 2: Beware of ill-conditioned problems
::: {.fragment}
An __ill-conditioned__ problem is one in which the outputs depend in a very sensitive manner on the inputs.
That is, a small change in the inputs can yield a very large change in the outputs.
The simplest example is computing $1/(a-b)$.
:::
::: {.fragment}
```{python}
print(f'r1 is {r1}')
print(f'r2 is {r2} (very close to r1)')
r3 = r1 + 0.0001
print(f'r3 is {r3}')
```
:::
::: {.fragment}
Let's look at
$$ \frac{1}{r1 - r2} \text{ versus } \frac{1}{r3-r2} $$
:::
::: {.fragment}
```{python}
print(f'1/(r1 - r2) = {1/(r1 - r2)}')
print(f'1/(r3 - r2) = {1/(r3 - r2)}')
```
If $a$ is close to $b$, small changes in either make a big difference in the output.
:::
::: {.fragment}
Because the inputs to a problem may not be exact, if the problem is ill-conditioned, the outputs may be wrong by a large amount.
:::
::: {.fragment}
Later on we will see that the notion of ill-conditioning comes up for matrices as well, in particular when we solve certain problems involving matrices.
:::
::: {.content-visible when-profile="slides"}
##
:::
### Principle 3: Relative error can be magnified during subtractions
::: {.fragment}
Two numbers, each with small relative error, can yield a value with large relative error if subtracted.
:::
::: {.fragment}
Let's say we represent a = 1.2345 as 1.2345002 -- the relative error is 0.0000002.
:::
::: {.fragment}
Let's say we represent b = 1.234 as 1.2340001 -- the relative error is 0.0000001.
:::
::: {.fragment}
Now, subtract a - b: the result is 0.0005001.
:::
::: {.fragment}
What is the relative error? (0.0005001 - 0.0005) / 0.0005 = 0.0002
:::
::: {.fragment}
The relative error of the result is 1000 times larger than the relative error of the inputs.
:::
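::: {.fragment}
Here is the same toy calculation carried out in code; the "stored" values below are the hypothetical rounded inputs from above.
```{python}
# Hypothetical exact inputs and their slightly-off stored versions
a_true, b_true = 1.2345, 1.234
a_stored, b_stored = 1.2345002, 1.2340001

# Relative error of an input vs. relative error of the difference
print(abs(a_stored - a_true) / a_true)
print(abs((a_stored - b_stored) - (a_true - b_true)) / (a_true - b_true))
```
:::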
::: {.fragment}
Here's an example in practice:
```{python}
a = 1.23456789
b = 1.2345678
print(0.00000009)
print(a-b)
```
:::
::: {.fragment}
```{python}
print(np.abs(a-b-0.00000009)/ 0.00000009)
```
:::
::: {.fragment}
We know the relative error in the inputs is on the order of $10^{-16}$, but the relative error of the output is on the order of $10^{-9}$ -- i.e., about $10^7$ times larger.
:::
::: {.content-hidden when-profile="slides"}
### Further Reading
* Further information about how Python handles issues around floating point is at [Floating Point Arithmetic: Issues and Limitations](https://docs.python.org/3/tutorial/floatingpoint.html).
* An excellent, in-depth introduction to floating point is [What Every Computer Scientist Should Know About Floating-Point Arithmetic](https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html).
:::