Short term prediction
A simple short-term predictive LightGBM model can be found in this notebook. The model was trained on 30 days of data to predict 3 days. This was done separately for winter and for summer.
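As a minimal sketch of the 30-day/3-day setup, the split could look like the following. It assumes the 3 predicted days immediately follow the 30 training days, which the text does not state explicitly; the helper name `split_window`, the `timestamp` column name, and the start date are hypothetical.

```python
import pandas as pd


def split_window(df, train_start, train_days=30, test_days=3, time_col="timestamp"):
    """Return (train, test): `train_days` of rows followed by `test_days` of rows.

    Assumption: the test window starts right where the training window ends.
    """
    train_start = pd.Timestamp(train_start)
    train_end = train_start + pd.Timedelta(days=train_days)
    test_end = train_end + pd.Timedelta(days=test_days)
    ts = df[time_col]
    train = df[(ts >= train_start) & (ts < train_end)]
    test = df[(ts >= train_end) & (ts < test_end)]
    return train, test


# Toy hourly series covering 40 days; the start date is arbitrary.
demo = pd.DataFrame({
    "timestamp": pd.date_range("2017-01-01", periods=40 * 24, freq=pd.Timedelta(hours=1)),
})
demo["meter_reading"] = range(len(demo))

train, test = split_window(demo, "2017-01-01")
print(len(train), len(test))  # 720 hourly rows for training, 72 for testing
```

Running the same helper with a winter and a summer start date would give the two seasonal experiments.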
Simple feature engineering was performed based on the exploratory data analysis (EDA) of the meter readings, which showed that:
- Healthcare, Food sales and services, and Utility usages show the highest meter reading values.
- The hotwater meter shows the highest meter reading values.
- Monthly behaviour (meter-reading median) shows higher readings in the warm season.
- Hourly behaviour (meter-reading median) shows higher values from 6 to 19 h.
- Weekday behaviour: readings are lower during weekends.
The selected, transformed, and created features are listed below.
The following features were selected from each data set:
- Building metadata
  - Building ID*
  - Site ID*
  - Primary space usage
  - Building size (sqm)
- Weather data
  - Timestamp*
  - Site ID*
  - Air temperature
- Meter reading data
  - Timestamp*
  - Building ID*
  - meter
  - meter reading (target)
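A possible way to combine the three data sets is sketched below with pandas. It assumes that the columns marked with an asterisk act as join keys, and it uses tiny stand-in tables; all column names other than `primaryspaceusage`, `meter`, and `meter_reading` are hypothetical.

```python
import pandas as pd

# Tiny stand-in tables. Column names such as `building_id`, `site_id`, `sqm`
# and `air_temperature` are assumptions made for illustration.
metadata = pd.DataFrame({
    "building_id": ["b1", "b2"],
    "site_id": ["s1", "s1"],
    "primaryspaceusage": ["Healthcare", "Office"],
    "sqm": [1200.0, 800.0],
    "yearbuilt": [1990, 2005],  # extra column that is not selected
})
weather = pd.DataFrame({
    "timestamp": pd.to_datetime(["2017-01-01 00:00", "2017-01-01 01:00"]),
    "site_id": ["s1", "s1"],
    "air_temperature": [3.5, 3.1],
    "wind_speed": [2.0, 2.4],  # extra column that is not selected
})
readings = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2017-01-01 00:00", "2017-01-01 00:00", "2017-01-01 01:00"]
    ),
    "building_id": ["b1", "b2", "b1"],
    "meter": ["electricity", "electricity", "electricity"],
    "meter_reading": [10.0, 4.0, 11.0],
})

# Attach building attributes by building, then weather by site and time.
data = (
    readings
    .merge(metadata[["building_id", "site_id", "primaryspaceusage", "sqm"]],
           on="building_id", how="left")
    .merge(weather[["timestamp", "site_id", "air_temperature"]],
           on=["timestamp", "site_id"], how="left")
)
print(data.columns.tolist())
```

Left joins keep every meter reading even when weather data is missing for a timestamp.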
The following features were transformed:
- primaryspaceusage: categories (16) were reduced to food sales and services, healthcare, utility and other
- meter: categories (8) were preserved
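The category reduction can be sketched as follows. The exact label spellings in `KEPT_USAGES` and the helper name `reduce_usage` are assumptions and should be adjusted to the values actually present in the metadata.

```python
import pandas as pd

# Assumed label spellings; adjust them to match the metadata.
KEPT_USAGES = {"Food sales and service", "Healthcare", "Utility"}


def reduce_usage(usage: pd.Series) -> pd.Series:
    """Keep the three selected usages and map everything else to 'Other'.

    Missing values are also mapped to 'Other' in this sketch.
    """
    return usage.where(usage.isin(KEPT_USAGES), "Other")


usage = pd.Series(["Healthcare", "Office", "Education", "Utility", None])
print(reduce_usage(usage).tolist())
# ['Healthcare', 'Other', 'Other', 'Utility', 'Other']
```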
The following features were created:
- day of the week
- hour of the day
The resulting data set contained the following features:
- Timestamp*
- Site ID
- Building ID
- Hour
- Day of the week
- Usage (4 levels: healthcare, food, utility, other)
- Building size (sqm)
- Air temperature
- Meter (8 levels)
- Meter reading / target
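The two created features, hour and day of the week, can be derived from the timestamp. The sketch below assumes a `timestamp` column; the output column names `hour` and `dayofweek` and the helper name are hypothetical.

```python
import pandas as pd


def add_time_features(df, time_col="timestamp"):
    """Return a copy of `df` with hour-of-day and day-of-week columns added."""
    out = df.copy()
    ts = pd.to_datetime(out[time_col])
    out["hour"] = ts.dt.hour            # 0..23
    out["dayofweek"] = ts.dt.dayofweek  # Monday=0 ... Sunday=6
    return out


demo = pd.DataFrame({"timestamp": ["2017-01-01 06:00", "2017-01-02 19:00"]})
print(add_time_features(demo)[["hour", "dayofweek"]].values.tolist())
# [[6, 6], [19, 0]]  (1 January 2017 was a Sunday)
```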
Parameters for this model were not systematically tuned, but were manually adjusted to perform better than the defaults.
- "objective": "regression"
- "metric": "rmse"
- "random_state": 55
- "learning_rate": 0.01 (default 0.1)
- "max_bin": 761 (default 255)
- "num_leaves": 2197 (default 31)
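For reference, the parameters above could be passed to LightGBM roughly as follows. This is a sketch rather than the notebook's code: the helper `train_model` and the value of `num_boost_round` are assumptions, and any parameter not listed keeps its LightGBM default.

```python
# Parameters exactly as listed above.
params = {
    "objective": "regression",
    "metric": "rmse",
    "random_state": 55,
    "learning_rate": 0.01,  # default 0.1
    "max_bin": 761,         # default 255
    "num_leaves": 2197,     # default 31
}


def train_model(X_train, y_train, num_boost_round=100):
    """Fit a LightGBM regressor; `num_boost_round` is an assumed value."""
    # Imported lazily so the parameter dict can be inspected without LightGBM.
    import lightgbm as lgb

    train_set = lgb.Dataset(X_train, label=y_train)
    return lgb.train(params, train_set, num_boost_round=num_boost_round)
```

With such a small learning rate, the number of boosting rounds has a large effect on the fit, so the assumed value should be replaced by whatever the notebook uses.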
Performance, as expected, was poor for this model (R2 is negative for every meter in both seasons), but it can serve as a baseline for more complex models.
Figure 1: meter_reading real and predicted values (short-term winter model) vs. timestamp.
Figure 2: meter_reading predicted with the short-term winter model vs. real values.
meter/metric | RMSE | RMSLE | CVRMSE | MBE | R2 |
---|---|---|---|---|---|
all | 63793.3281 | 3.585 | 893.9526 | -1.3002 | -0.343 |
electricity | 519.7323 | 2.9424 | 365.1851 | -204.7482 | -3.3066 |
water | 2870.6986 | 4.1811 | 370.3285 | 25.1321 | -0.0163 |
chilledwater | 102821.528 | 3.9423 | 545.8142 | 10.6727 | -0.2252 |
hotwater | 186015.753 | 5.4836 | 334.6407 | -8.7227 | -0.7154 |
gas | 3242.0113 | 5.3402 | 385.3021 | -6.2478 | -0.3415 |
steam | 2473.409 | 2.5389 | 290.4014 | -15.6519 | -0.5368 |
solar | 809.158 | 5.5228 | 1504.3307 | -1199.6183 | -58.916 |
irrigation | 1844.7737 | 5.7132 | 470.3007 | -40.5796 | -0.1421 |
Table 1: metrics for the short-term winter model, calculated for all meters together and for each meter individually.
Figure 3: meter_reading real and predicted values (short-term summer model) vs. timestamp.
Figure 4: meter_reading predicted with the short-term summer model vs. real values.
meter/metric | RMSE | RMSLE | CVRMSE | MBE | R2 |
---|---|---|---|---|---|
all | 160223.041 | 5.1053 | 1076.0289 | -9.3812 | -0.4323 |
electricity | 3840.3759 | 4.8381 | 2602.8031 | -2593.092 | -187.2267 |
water | 3852.6936 | 6.3158 | 999.4388 | -957.9767 | -14.316 |
chilledwater | 373663.17 | 3.5728 | 504.8544 | 10.4382 | -0.5017 |
hotwater | 56641.9717 | 6.4026 | 286.6051 | 12.3044 | -0.1629 |
gas | 4284.2112 | 6.7518 | 751.958 | -627.9519 | -2.9741 |
steam | 4008.5469 | 5.5218 | 1399.8411 | -1331.0957 | -13.0241 |
solar | 3899.8942 | 7.4042 | 82942.7713 | -82941.967 | -251333.771 |
irrigation | 26549.0901 | 8.3586 | 12386.1993 | -5364.6018 | -342.9041 |
Table 2: metrics for the short-term summer model, calculated for all meters together and for each meter individually.
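The metrics in the tables can be computed along the following lines. The exact definitions used in the notebook are not given here, so this sketch makes several assumptions: CVRMSE and MBE are expressed as percentages of the mean (or sum) of the real values, the error is taken as real minus predicted, and negative values are clipped to zero before the logarithm in RMSLE. The function name is hypothetical.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return RMSE, RMSLE, CVRMSE, MBE and R2 under the assumed conventions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred  # assumed sign convention: real minus predicted

    rmse = np.sqrt(np.mean(err ** 2))
    # Assumption: clip negatives to 0 so that log1p is defined.
    log_diff = np.log1p(np.clip(y_true, 0, None)) - np.log1p(np.clip(y_pred, 0, None))
    rmsle = np.sqrt(np.mean(log_diff ** 2))
    cvrmse = 100 * rmse / np.mean(y_true)      # percent of the mean real value
    mbe = 100 * np.sum(err) / np.sum(y_true)   # percent; negative = overprediction
    r2 = 1 - np.sum(err ** 2) / np.sum((y_true - np.mean(y_true)) ** 2)
    return {"RMSE": rmse, "RMSLE": rmsle, "CVRMSE": cvrmse, "MBE": mbe, "R2": r2}


# A constant overprediction is worse than predicting the mean, so R2 < 0.
m = regression_metrics([1, 2, 3], [3, 3, 3])
print(m["R2"], m["MBE"])  # -1.5 -50.0
```

Under these conventions, a negative R2 means the model is worse than always predicting the mean of the real values, and a large negative MBE indicates systematic overprediction.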