---
title: "LSTM: Multivariate, Multi-Step Timeseries Prediction"
author: "Bert Gollnick"
output:
  html_document:
    toc: true
    toc_float: true
    code_folding: hide
    number_sections: true
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
```

# Introduction
We will work with data from a sine wave and try to predict the next values. This differs from the univariate sine wave example: here, we build the target wave from two different sine waves.

# Packages
```{r}
library(dplyr)    # data manipulation
library(tidyr)    # data reshaping
library(ggplot2)  # visualization
library(keras)    # deep learning; also re-exports %<-% from zeallot
library(tfruns)   # training-run flags and tuning
```

# Data Preparation
We start very simple and try to predict the next values of a sine wave. We take three complete waves: the first for training, the second for validation, and the third for testing.
```{r}
sine_wave <- tibble(x = seq(0, 6 * pi, by = 0.01),
                    y1 = sin(x),
                    y2 = sin(x)^2,
                    y = y1 + y2)

sine_wave$class <- NA
sine_wave$class[sine_wave$x < 2 * pi] <- "train"
sine_wave$class[sine_wave$x > 4 * pi] <- "test"
sine_wave$class[is.na(sine_wave$class)] <- "valid"
```
Let's take a look at our training, validation, and testing data.

```{r}
ggplot(sine_wave, aes(x, y, col = class)) +
  geom_point()
```
## Excursion: Timestep Size and Data Shape

We use a timestep size of 5 for this illustration (the model below uses 20). Assume the data follows the sequence from 1 to n. This results in the input samples

    1, 2, 3, 4, 5
    2, 3, 4, 5, 6
    ...

and the corresponding target samples

    6, 7, 8, 9, 10
    7, 8, 9, 10, 11
    ...
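A minimal sketch of this windowing on a toy sequence (the sequence `1:12` and the timestep size of 5 are illustrative only):

```{r}
# illustration only: slide a window of 5 timesteps over the sequence 1:12;
# each input window is paired with the next 5 values as its target
seq_data <- 1:12
steps <- 5

for (i in 1:(length(seq_data) - 2 * steps)) {
  input  <- seq_data[i:(i + steps - 1)]
  target <- seq_data[(i + steps):(i + 2 * steps - 1)]
  cat("input:", input, "-> target:", target, "\n")
}
```
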
## Train / Test Split
As always, we split our data into training, validation, and testing.
```{r}
train <- sine_wave %>%
  filter(class == "train")

valid <- sine_wave %>%
  filter(class == "valid")

test <- sine_wave %>%
  filter(class == "test")
```
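A quick, optional check of the split sizes:

```{r}
# number of observations per split
table(sine_wave$class)
```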
The input data needs to have the shape

(number of observations, number of timesteps, number of features)
```{r}
n_timesteps <- 20  # length of each input window and of each forecast
n_features <- 2    # the two input signals: y1 and y2
```
```{r}
lstm_reshape <- function(feature1, feature2, target, n_timesteps, n_features = 2) {
  n_obs <- length(target) - 2 * n_timesteps

  # initialize result arrays
  X_arr <- array(data = NA, dim = c(n_obs, n_timesteps, n_features))
  y_arr <- array(data = NA, dim = c(n_obs, n_timesteps, 1))

  # each sample i holds n_timesteps values of both features;
  # its target holds the n_timesteps values that follow this window
  for (i in 1:n_obs) {
    X_arr[i, 1:n_timesteps, 1] <- feature1[i:(i + n_timesteps - 1)]
    X_arr[i, 1:n_timesteps, 2] <- feature2[i:(i + n_timesteps - 1)]
    y_arr[i, 1:n_timesteps, 1] <- target[(i + n_timesteps):(i + 2 * n_timesteps - 1)]
  }

  list(X_arr, y_arr)
}
```
The independent (X) and dependent (y) variables are now built for each split. The `%<-%` operator, which keras re-exports from the zeallot package, unpacks the returned list into two variables.
```{r}
c(X_train, y_train) %<-% lstm_reshape(feature1 = train$y1,
                                      feature2 = train$y2,
                                      target = train$y,
                                      n_timesteps = n_timesteps,
                                      n_features = n_features)

c(X_valid, y_valid) %<-% lstm_reshape(feature1 = valid$y1,
                                      feature2 = valid$y2,
                                      target = valid$y,
                                      n_timesteps = n_timesteps,
                                      n_features = n_features)

c(X_test, y_test) %<-% lstm_reshape(feature1 = test$y1,
                                    feature2 = test$y2,
                                    target = test$y,
                                    n_timesteps = n_timesteps,
                                    n_features = n_features)
```
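As a sanity check, the arrays should have the shapes discussed above:

```{r}
dim(X_train)  # (n_obs, n_timesteps, n_features)
dim(y_train)  # (n_obs, n_timesteps, 1)
```
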
# Modeling
We define flags that can later be used to specify parameters of the model, for example when tuning with `tfruns`.
```{r parameters}
FLAGS <- tfruns::flags(
  flag_integer("batch_size", 10),
  flag_integer("n_epochs", 10),
  flag_integer("n_timesteps", 10),      # number of timesteps per window
  flag_numeric("dropout", 0.2),
  flag_string("loss", "logcosh"),
  flag_string("optimizer_type", "sgd"),
  flag_integer("n_units", 128)          # number of units in the LSTM layer
)
```
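As a hedged sketch of how the flags would be used: `tfruns::training_run()` can override the defaults for individual runs. The file name `"train.R"` is a placeholder, assuming the training code has been saved as a script:

```{r, eval=FALSE}
# hypothetical tuning run; "train.R" stands in for a script with this code
tfruns::training_run("train.R",
                     flags = list(batch_size = 32,
                                  n_epochs = 50))
```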
Next, we define a function that creates and compiles the model.
```{r}
create_model <- function() {
  keras_model_sequential() %>%
    layer_lstm(units = FLAGS$n_units,
               return_sequences = TRUE,  # emit an output at every timestep
               input_shape = c(n_timesteps, n_features)) %>%
    layer_dense(units = 1) %>%
    compile(loss = FLAGS$loss,
            optimizer = "adam",  # note: hardcoded; FLAGS$optimizer_type is not used here
            metrics = "mean_squared_error")
}
```
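Because `return_sequences = TRUE`, the dense layer is applied at every timestep, so the model's output shape is (batch size, n_timesteps, 1), matching the target arrays. This can optionally be verified:

```{r, eval=FALSE}
# optional: inspect layer output shapes and parameter counts
summary(create_model())
```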
We create the model and fit it to the data.
```{r}
lstm_model <- create_model()

history <- lstm_model %>%
  keras::fit(x = X_train,
             y = y_train,
             verbose = 0,
             validation_data = list(X_valid, y_valid),
             batch_size = FLAGS$batch_size,
             epochs = FLAGS$n_epochs)
```
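The returned history can be plotted to inspect the loss curves (optional):

```{r, eval=FALSE}
# training and validation loss / MSE per epoch
plot(history)
```
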
## Model Evaluation
For the evaluation, we use our test data.
```{r}
predictions <- lstm_model %>%
  predict(X_test)
```
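Before the visual check, a single-number summary of the fit on the test windows:

```{r}
# mean squared error over all test windows and timesteps
mean((predictions - y_test)^2)
```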
Finally, we check the performance visually by plotting the forecast for a specific point in time.
```{r}
start_point <- 190

# predictions[start_point, , 1] forecasts the n_timesteps values that follow
# the input window, i.e. indices (start_point + n_timesteps) onwards
predicted_series <- tibble(
  x = test$x[(start_point + n_timesteps):(start_point + 2 * n_timesteps - 1)],
  y = predictions[start_point, 1:n_timesteps, 1])

ggplot(test, aes(x, y)) +
  geom_point(alpha = .1) +
  geom_point(data = test[start_point:(start_point + n_timesteps - 1), ], col = "blue") +
  geom_point(data = predicted_series, col = "red")
```
- Black points represent the overall test data.
- Blue points mark the input window the model sees (the features y1 and y2 over this range).
- Red points indicate the predicted values.

The predictions are quite good and follow the overall shape of the data.