This project explored various statistical methods and deep learning models for multivariate time series analysis. Techniques such as Naive Forecasting, Moving Average Forecasting, Differenced Moving Average Forecasting, and Differenced Moving Average Forecasting with Smoothing were examined. Within the realm of deep learning, Simple Neural Networks, Deep Neural Networks, Single-Layer LSTMs, Single-Layer Regularized LSTMs, Bi-Directional Regularized LSTMs, Regularized Stacked GRUs, and Convolutional Layers with Stacked GRUs and Fully Connected Layers were analyzed. Through rigorous comparison and evaluation, the most effective methodology for accurate and reliable weather prediction was sought. This involved establishing baselines, selecting the best model with the help of a learning-rate scheduler, and comparing its performance against those baselines.
Naive Forecasting: A simple forecasting method where the prediction is the last observed value, assuming no change or trend.
Moving Average Forecasting: A method that predicts future values by averaging a set of recent past values, smoothing out short-term fluctuations.
Differenced Moving Average Forecasting: Extends moving average forecasting by first differencing the data (subtracting consecutive observations) to remove trends or seasonality.
Differenced Moving Average Forecasting with Smoothing: Further refines differenced moving average forecasting by applying additional smoothing techniques to the differenced data to reduce noise.
Simple Neural Networks: Basic neural networks with a single hidden layer, used for pattern recognition in data with limited complexity.
Deep Neural Networks: Advanced neural networks with multiple hidden layers, capable of learning complex representations from large datasets.
Single-Layer LSTMs: Long Short-Term Memory (LSTM) networks with one layer, designed to handle sequential data by retaining information over time.
Bi-Directional LSTMs: LSTMs that process data in both forward and backward directions to enhance performance on sequential data.
GRUs: Gated Recurrent Units (GRUs) that efficiently capture dependencies in sequential data.
CNNs: Convolutional Neural Networks (CNNs) that adaptively learn and extract hierarchical spatial features from data using convolutional layers, commonly used for image and video processing.
In this phase, individual variables are analyzed to understand their distribution and normality. Utilizing histograms and quantile-quantile (qq) plots, we gain insights into their characteristics.
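As a minimal sketch of this step (using a synthetic stand-in for the temperature column, not the project's actual data), the histogram counts and Q-Q plot coordinates can be computed directly; plotting them with matplotlib is then straightforward:

```python
import numpy as np
from statistics import NormalDist

# synthetic stand-in for a temperature column, assumed roughly normal
rng = np.random.default_rng(0)
temps = rng.normal(loc=9.5, scale=8.0, size=1000)

# histogram: bin counts reveal the shape of the distribution
counts, edges = np.histogram(temps, bins=30)

# Q-Q coordinates: ordered sample values vs theoretical normal quantiles
sample_q = np.sort(temps)
probs = (np.arange(1, len(temps) + 1) - 0.5) / len(temps)
theo_q = np.array([NormalDist().inv_cdf(p) for p in probs])
r = np.corrcoef(theo_q, sample_q)[0, 1]  # close to 1 => approximately normal
```

A correlation `r` near 1 between the theoretical and sample quantiles indicates the variable is close to normally distributed.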
Histograms
Quantile-Quantile Plots
Exploring relationships between variables, correlation analysis employs Pearson correlation coefficients. A correlation matrix visualized through a heatmap highlights the strengths of correlations with 'T (degC)', offering valuable insights into inter-variable relationships and dependencies.
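A hedged sketch of this analysis, with synthetic columns named after the Jena climate dataset's conventions (the relationships below are illustrative, not the project's measured values):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500
t = rng.normal(9.5, 8.0, n)
df = pd.DataFrame({
    "T (degC)": t,
    "Tpot (K)": t + 273.15 + rng.normal(0, 0.5, n),  # near-duplicate of T
    "rh (%)": 80 - 1.5 * t + rng.normal(0, 5, n),    # inversely related to T
})

corr = df.corr(method="pearson")             # full Pearson correlation matrix
corr_with_t = corr["T (degC)"].sort_values(ascending=False)
# e.g. seaborn.heatmap(corr, annot=True) renders the matrix as a heatmap
```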
Heatmap of Correlation Coefficients
Time series plots depict temperature variations over time, revealing both long-term trends and short-term fluctuations within seasonal cycles. Annual temperature trend analysis showcases maximum, average, and minimum temperatures annually, aiding in the interpretation of climate data and identification of seasonal patterns.
Seasonality
Seasonality without Noise
Seasonality (First Season Cycle)
Temperature over the Years
A systematic partitioning approach divides temperature data into training and testing sets. Data from 2012 to 2014 are allocated for training to enable model learning from historical data, while data from subsequent years are reserved for validation and testing, ensuring accurate predictions of future temperatures.
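The year-based split can be sketched as follows (daily frequency and column name are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# hypothetical daily temperature frame spanning 2012-2016
idx = pd.date_range("2012-01-01", "2016-12-31", freq="D")
df = pd.DataFrame(
    {"T (degC)": np.random.default_rng(0).normal(9.5, 8.0, len(idx))},
    index=idx,
)

train = df[df.index.year <= 2014]  # 2012-2014: model learning
later = df[df.index.year >= 2015]  # subsequent years: validation/testing
```

Splitting by calendar year rather than at random preserves chronological order, so the model is never evaluated on data earlier than what it was trained on.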
Predictions are based solely on the last observed temperature, serving as a baseline for accuracy assessment.
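On a toy series, the naive baseline and its error can be computed in a few lines:

```python
import numpy as np

series = np.array([5.0, 6.2, 7.1, 6.8, 7.5, 8.0])  # toy temperature series
split = 3
# each prediction is simply the previous observed value
preds = series[split - 1:-1]
actual = series[split:]
mae = np.mean(np.abs(actual - preds))  # baseline error to beat
```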
Average temperatures over defined window sizes are computed to smooth short-term fluctuations and highlight long-term trends.
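A trailing moving average over a fixed window can be expressed as a convolution:

```python
import numpy as np

series = np.arange(10, dtype=float)  # toy stand-in for temperatures
window = 3
# mean of each run of `window` consecutive observations
ma = np.convolve(series, np.ones(window) / window, mode="valid")
```

`mode="valid"` keeps only windows that fit entirely inside the series, so the result is shorter than the input by `window - 1` values.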
By differencing to remove trends and seasonality before applying a moving average, this method refines predictions and improves accuracy.
The seasonality and trend are then added back to the differenced moving average to recover forecasts on the original scale.
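A sketch of the idea on a synthetic series with yearly seasonality (the season length and window size here are illustrative assumptions):

```python
import numpy as np

# synthetic series: yearly seasonality plus noise
rng = np.random.default_rng(0)
t = np.arange(730)
series = 10 + 8 * np.sin(2 * np.pi * t / 365) + rng.normal(0, 0.5, len(t))

season, window = 365, 30
diff = series[season:] - series[:-season]  # seasonal differencing

# forecast at time i: the value one season ago (restoring seasonality
# and trend) plus a moving average of the recent seasonal differences
start = season + window
preds = np.array([
    series[i - season] + diff[i - season - window:i - season].mean()
    for i in range(start, len(series))
])
mae = np.mean(np.abs(series[start:] - preds))
```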
A centered approach is used to smooth the data at each step. For instance, to smooth the data point at t = 365 with a window size of 11, we compute the average of the values from t = 360 to t = 370.
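Centered smoothing is also a convolution, this time keeping the output aligned with the input:

```python
import numpy as np

series = np.arange(20, dtype=float)
window = 11  # 5 points on each side of the centre
# the value at t becomes the mean of t-5 .. t+5
smoothed = np.convolve(series, np.ones(window) / window, mode="same")
# interior points are exact centered means; the first and last few points
# are distorted by zero padding and would need separate edge handling
```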
Various deep learning models, including Basic Neural Network, Deep Neural Network, LSTM, Regularized LSTM, Bi-Directional LSTM, Stacked GRUs, and Convolutional layer with stacked GRUs and Fully Connected Layers are explored for temperature forecasting, each tailored to leverage sequential data characteristics for enhanced prediction accuracy.
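As one example, the last architecture listed (a convolutional layer feeding stacked GRUs and fully connected layers) might be sketched in Keras as below; the sequence length, feature count, and layer sizes are illustrative assumptions, not the project's exact settings:

```python
import tensorflow as tf

seq_len, n_features = 120, 14  # assumed window length and feature count

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(seq_len, n_features)),
    tf.keras.layers.Conv1D(32, kernel_size=5, activation="relu"),
    tf.keras.layers.GRU(32, return_sequences=True, dropout=0.1),
    tf.keras.layers.GRU(16, dropout=0.1),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),  # next-step temperature
])
model.compile(optimizer="adam", loss="mae")
```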
Temperature data are split into training, validation, and testing sets, preserving chronological order and accounting for seasonality, which is essential for effective model training and evaluation.
Data normalization using MinMaxScaler ensures consistent scaling, particularly beneficial for non-normally distributed data and when training neural networks with features of different scales.
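The key detail is fitting the scaler on training data only and reusing those statistics on later splits, as in this small sketch:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

train = np.array([[5.0], [10.0], [25.0]])  # toy training temperatures
test = np.array([[15.0]])

scaler = MinMaxScaler()                     # maps each feature to [0, 1]
train_scaled = scaler.fit_transform(train)  # fit statistics on training only
test_scaled = scaler.transform(test)        # reuse the training min/max
```

Fitting on the training split alone avoids leaking information from the validation and test periods into the model.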
Sequences are generated from input array data using TensorFlow's timeseries_dataset_from_array, facilitating training, validation, and testing of models with specified sequence lengths.
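A minimal example of the windowing (toy data; the real project would pass the scaled temperature array and its chosen sequence length):

```python
import numpy as np
import tensorflow as tf

data = np.arange(10, dtype=np.float32)
seq_len = 3
# inputs are sliding windows; the target is the value right after each window
ds = tf.keras.utils.timeseries_dataset_from_array(
    data=data[:-1],          # drop the final value: it has no following target
    targets=data[seq_len:],  # target for the window starting at i is data[i + seq_len]
    sequence_length=seq_len,
    batch_size=4,
)
x, y = next(iter(ds))  # first batch: x[0] = [0, 1, 2], y[0] = 3
```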
The two most promising models, determined by their low loss and Mean Absolute Error (MAE), were selected for further refinement. They underwent fine-tuning using a learning rate schedule to identify the optimal learning rate and were then retrained on the dataset. Among these models, the one exhibiting the best performance with the new learning rate was chosen as the final selection.
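A common way to run such a learning-rate sweep in Keras is a `LearningRateScheduler` callback that grows the rate each epoch; the base rate and growth factor below are typical choices, not necessarily the project's exact values:

```python
import tensorflow as tf

# sweep the learning rate upward across epochs; plotting training loss
# against the learning rate afterwards suggests the largest stable rate
lr_schedule = tf.keras.callbacks.LearningRateScheduler(
    lambda epoch: 1e-5 * 10 ** (epoch / 20)
)
# passed to training as: model.fit(..., callbacks=[lr_schedule])
```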
Training Loss versus Learning Rate for Model 1
Training Loss versus Learning Rate for Model 2
Model performance is evaluated using the Mean Absolute Error (MAE) metric on the test dataset, comparing predictions against actual values to quantify forecasting accuracy.
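MAE is simply the average absolute deviation between forecasts and observations:

```python
import numpy as np

actual = np.array([10.0, 12.0, 11.0])  # held-out test temperatures
preds = np.array([9.0, 12.5, 11.5])    # model forecasts
mae = np.mean(np.abs(actual - preds))  # average absolute error in degC
```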
Trained models are utilized to predict future temperature values, leveraging the learned patterns and dependencies in the data to provide accurate forecasts.
- Modify the number of units.
- Change the dropout ratio.
- Test different learning rates.
- Experiment with batch sizes.
- Add more dense layers.
- Alter the sequence length.
https://s3.amazonaws.com/keras-datasets/jena_climate_2009_2016.csv.zip
This project is licensed under the Raza Mehar License. See the LICENSE.md file for details.
For any questions or clarifications, please contact Raza Mehar at [raza.mehar@gmail.com].