ML1_Ex9.Rmd

---
title: "Machine Learning 1 - Workshop 9 - Non-Linear Regression Models"
author: "Arndt Allhorn"
date: "June 6, 2018"
output: html_notebook
---

# Generalised additive Models

## The `Work` data

Work through Lab 7.8.3 in James et al. starting on page 294, up to the command `plot(gam.lo.i)` on page 296.  
The first example uses `lm()` to fit the model. Check that the function `gam()` with the same arguments outputs the same coefficients. 

Hints:  

* if you have not already done so you need to start the `gam` package, and to obtain the coefficients of a statistical model use the function `coef()`  
* The function `plot.gam` should be `plot.Gam`

---

We now fit a GAM to predict `wage` using natural spline functions of `year`
and `age`, treating `education` as a qualitative predictor, as in (7.16). Since
this is just a big linear regression model using an appropriate choice of
basis functions, we can simply do this using the `lm()` function.

```{r}
library (ISLR)
attach (Wage)
gam1=lm( wage ~ ns(year,4) + ns(age,5) + education , data=Wage )
```

We now fit the model (7.16) using smoothing splines rather than natural
splines. In order to fit more general sorts of GAMs, using smoothing splines
or other components that cannot be expressed in terms of basis functions
and then fit using least squares regression, we will need to use the `gam`
library in R.
The `s()` function, which is part of the `gam` library, is used to indicate that
we would like to use a smoothing spline. We specify that the function of
`year` should have 4 degrees of freedom, and that the function of `age` will
have 5 degrees of freedom. Since `education` is qualitative, we leave it as is,
and it is converted into four dummy variables. We use the `gam()` function in
order to fit a GAM using these components. All of the terms in (7.16) are
fit simultaneously, taking each other into account to explain the response.

```{r}
library (gam)
gam.m3=gam (wage~s(year,4)+s(age,5)+education, data= Wage)
```

In order to produce Figure 7.12, we simply call the `plot()` function:

```{r}
par(mfrow=c(1,3))
plot(gam.m3,se=TRUE,col="blue")
```

The generic `plot()` function recognizes that `gam.m3` is an object of class `gam`,
and invokes the appropriate `plot.Gam()` method. Conveniently, even though
`gam1` is not of class `gam` but rather of class `lm`, we can *still* use `plot.gam()`
on it. Figure 7.11 was produced using the following expression:

```{r}
par(mfrow=c(1,3))
plot.Gam(gam1, se=TRUE, col="red")
```

Notice here we had to use `plot.Gam()` rather than the generic `plot()`
function.  
In these plots, the function of `year` looks rather linear. We can perform a
series of ANOVA tests in order to determine which of these three models is
best: a GAM that excludes `year` (M1), a GAM that uses a linear function
of `year` (M2), or a GAM that uses a spline function of `year` (M3).

```{r}
gam.m1=gam(wage~s(age,5)+education, data=Wage )
gam.m2=gam(wage~year+s(age,5)+education, data=Wage)
anova(gam.m1, gam.m2, gam.m3, test ="F")
```

We find that there is compelling evidence that a GAM with a linear function of `year` is better than a GAM that does not include `year` at all (p-value=0.00014). However, there is no evidence that a non-linear function of `year` is needed (p-value=0.349). In other words, based on the results of this ANOVA, M2 is preferred.
The `summary()` function produces a summary of the gam fit.

```{r}
summary(gam.m3)
```

The p-values for `year` and `age` correspond to a null hypothesis of a linear
relationship versus the alternative of a non-linear relationship. The large
p-value for year reinforces our conclusion from the ANOVA test that a linear function is adequate for this term. However, there is very clear evidence
that a non-linear term is required for `age`.
We can make predictions from `gam` objects, just like from `lm` objects,
using the `predict()` method for the class `gam`. Here we make predictions on
the training set.

```{r}
preds = predict(gam.m2, newdata=Wage)
```

We can also use local regression fits as building blocks in a GAM, using
the `lo()` function.

```{r}
par(mfrow=c(1,3))
gam.lo= gam(wage~s(year,df =4)+lo(age,span=0.7)+education, data=Wage)
plot.Gam(gam.lo, se=TRUE, col="green")
```

Here we have used local regression for the `age` term, with a span of 0.7.
We can also use the `lo()` function to create interactions before calling the
`gam()` function. For example,

```{r}
gam.lo.i=gam(wage~lo(year,age,span=0.5)+education,data=Wage)
```

fits a two-term model, in which the first term is an interaction between
`year` and `age`, fit by a local regression surface. We can plot the resulting
two-dimensional surface if we first install the `akima` package.

```{r}
par(mfrow=c(1,2))
library(akima)
plot(gam.lo.i)
```