
[Major] Support Re-Training #1635

Open · wants to merge 17 commits into main
Conversation

ourownstory (Owner):

Continuation of #1605 on a repository branch so that the metrics CI can run.

github-actions bot commented Aug 23, 2024

Model Benchmark

| Benchmark | Metric | main | current | diff |
|---|---|---|---|---|
| YosemiteTemps | MAE_val | 0.59734 | 0.58159 | -2.64% |
| YosemiteTemps | RMSE_val | 0.88884 | 0.86935 | -2.19% |
| YosemiteTemps | Loss_val | 0.00046 | 0.00044 | -4.33% |
| YosemiteTemps | train_loss | 0.00126 | 0.0012 | -4.44% |
| YosemiteTemps | reg_loss | 0 | 0 | 0.0% |
| YosemiteTemps | MAE | 0.97461 | 0.94811 | -2.72% |
| YosemiteTemps | RMSE | 1.70806 | 1.66847 | -2.32% |
| YosemiteTemps | Loss | 0.00126 | 0.0012 | -4.44% |
| YosemiteTemps | time | 141.75 | 139.93 | -1.28% |
| EnergyPriceDaily | MAE_val | 5.64247 | 5.42935 | -3.78% |
| EnergyPriceDaily | RMSE_val | 7.19972 | 6.88991 | -4.3% |
| EnergyPriceDaily | Loss_val | 0.02893 | 0.02655 | -8.24% 🎉 |
| EnergyPriceDaily | train_loss | 0.02957 | 0.02758 | -6.73% 🎉 |
| EnergyPriceDaily | reg_loss | 0 | 0 | 0.0% |
| EnergyPriceDaily | MAE | 6.39654 | 6.15242 | -3.82% |
| EnergyPriceDaily | RMSE | 8.56207 | 8.26192 | -3.51% |
| EnergyPriceDaily | Loss | 0.02936 | 0.02739 | -6.73% 🎉 |
| EnergyPriceDaily | time | 40.4214 | 40.35 | -0.18% |
| AirPassengers | MAE_val | 30.8306 | 30.081 | -2.43% |
| AirPassengers | RMSE_val | 31.8167 | 30.9826 | -2.62% |
| AirPassengers | Loss_val | 0.01301 | 0.01234 | -5.17% 🎉 |
| AirPassengers | train_loss | 0.00071 | 0.00071 | -0.86% |
| AirPassengers | reg_loss | 0 | 0 | 0.0% |
| AirPassengers | MAE | 6.88169 | 6.86203 | -0.29% |
| AirPassengers | RMSE | 8.92083 | 8.81789 | -1.15% |
| AirPassengers | Loss | 0.00076 | 0.00074 | -2.27% |
| AirPassengers | time | 10.2224 | 10.06 | -1.59% |
| PeytonManning | MAE_val | 0.35533 | 0.35447 | -0.24% |
| PeytonManning | RMSE_val | 0.50396 | 0.50324 | -0.14% |
| PeytonManning | Loss_val | 0.01802 | 0.01796 | -0.32% |
| PeytonManning | train_loss | 0.01466 | 0.01461 | -0.3% |
| PeytonManning | reg_loss | 0 | 0 | 0.0% |
| PeytonManning | MAE | 0.34756 | 0.34738 | -0.05% |
| PeytonManning | RMSE | 0.4945 | 0.49347 | -0.21% |
| PeytonManning | Loss | 0.01465 | 0.01461 | -0.3% |
| PeytonManning | time | 25.6879 | 25.48 | -0.81% |
Model training plots (training-curve images): PeytonManning, YosemiteTemps, AirPassengers, EnergyPriceDaily

ourownstory changed the title from [Major] Train continue to [Major] Support Custom Learning Rate Scheduler on Aug 26, 2024
@@ -104,18 +122,21 @@ class Train:
n_data: int = field(init=False)
loss_func_name: str = field(init=False)
lr_finder_args: dict = field(default_factory=dict)
optimizer_state: dict = field(default_factory=dict)
ourownstory (Owner Author): move to separate PR
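For context, a rough sketch of what a field like this enables: the Train config gains a slot where a torch optimizer `state_dict` can be parked between fit calls. The stand-in dataclass below is illustrative, not the PR's actual config:

```python
from dataclasses import dataclass, field

import torch

@dataclass
class Train:  # stand-in; the real config has many more fields
    learning_rate: float = 1e-3
    optimizer_state: dict = field(default_factory=dict)

cfg = Train()
model = torch.nn.Linear(2, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=cfg.learning_rate)

# After a first training run, park the optimizer state on the config so a
# continued run can restore momentum/Adam moments instead of starting cold.
cfg.optimizer_state = optimizer.state_dict()
```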

"three_phase": True,
}
)
if self.continue_training:
ourownstory (Owner Author): move to other PR

@@ -239,6 +304,9 @@ def get_reg_delay_weight(self, e, iter_progress, reg_start_pct: float = 0.66, re
delay_weight = 1
return delay_weight

def set_optimizer_state(self, optimizer_state: dict):
ourownstory (Owner Author): move to other PR
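The method body is not visible in this hunk; presumably it just stashes the dict for later restoration when the optimizer is rebuilt. A minimal guess at its shape:

```python
class Train:  # stand-in for the real config class
    def __init__(self):
        self.optimizer_state: dict = {}

    def set_optimizer_state(self, optimizer_state: dict):
        # Sketch only: store a torch optimizer state_dict; configure_optimizers
        # can later apply it via optimizer.load_state_dict(self.optimizer_state).
        self.optimizer_state = optimizer_state
```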

--------
>>> from neuralprophet import NeuralProphet
>>> # Step Learning Rate scheduler
>>> m = NeuralProphet(scheduler="StepLR")
ourownstory (Owner Author): add scheduler args example
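Along the lines requested above, the docstring could also show scheduler arguments. Hedged sketch: the `scheduler_args` keyword is assumed from this PR's direction, and the keys mirror `torch.optim.lr_scheduler.StepLR`:

```python
>>> from neuralprophet import NeuralProphet
>>> # Step Learning Rate scheduler with custom arguments
>>> # (scheduler_args keyword assumed; keys follow torch's StepLR)
>>> m = NeuralProphet(
...     scheduler="StepLR",
...     scheduler_args={"step_size": 10, "gamma": 0.5},  # halve the LR every 10 epochs
... )
```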

newer_samples_start=newer_samples_start,
trend_reg_threshold=self.config_trend.trend_reg_threshold,
)
self.learning_rate = learning_rate
ourownstory (Owner Author): redo init of Train

self.n_forecasts = n_forecasts

# Lightning Config
self.config_train = config_train
self.config_normalization = config_normalization
self.compute_components_flag = compute_components_flag

# Continued training
ourownstory (Owner Author): separate PR

# Optimizer
optimizer = self._optimizer(self.parameters(), lr=self.learning_rate, **self.config_train.optimizer_args)

if self.continue_training:
ourownstory (Owner Author): separate PR
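A self-contained sketch of the pattern in this hunk, using plain torch outside Lightning: build a fresh optimizer each time, and load the saved state only when continuing (the real configure_optimizers also sets up the LR scheduler):

```python
import torch

model = torch.nn.Linear(3, 1)

def build_optimizer(continue_training: bool, optimizer_state: dict) -> torch.optim.Optimizer:
    # A fresh optimizer is created on every (re)configuration.
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
    # When continuing training, restore the previous run's state so the
    # optimizer's moments and step counts survive the restart.
    if continue_training and optimizer_state:
        optimizer.load_state_dict(optimizer_state)
    return optimizer

first = build_optimizer(continue_training=False, optimizer_state={})
saved = first.state_dict()
resumed = build_optimizer(continue_training=True, optimizer_state=saved)
```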

return config_train_params


def test_continue_training():
ourownstory (Owner Author): modify to do without checkpoints
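A sketch of a checkpoint-free version, per the comment: fit once, then call fit again on the same model with the continue flag, and compare losses. The fixture path and the exact continue_training signature are assumptions here:

```python
import pandas as pd
from neuralprophet import NeuralProphet

def test_continue_training():
    df = pd.read_csv(AIR_FILE)  # hypothetical fixture path to a ds/y dataset
    m = NeuralProphet(epochs=5, learning_rate=0.01)
    metrics = m.fit(df, freq="MS")

    # Second fit on the same model instance, no Lightning checkpoint involved;
    # continue_training is the flag this PR is introducing.
    metrics2 = m.fit(df, freq="MS", continue_training=True)

    # Continued training should not end up worse than the first run.
    assert metrics["Loss"].min() >= metrics2["Loss"].min()
```

The scheduler-selection variant below would presumably differ only in passing a scheduler choice (e.g. `scheduler="StepLR"`) for the continued run.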

assert metrics["Loss"].min() >= metrics2["Loss"].min()


def test_continue_training_with_scheduler_selection():
ourownstory (Owner Author): modify to work without checkpoints

assert metrics["Loss"].min() >= metrics2["Loss"].min()


def test_save_load_continue_training():
ourownstory (Owner Author): modify to work without checkpoints
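For the save/load variant, a rough sketch, assuming neuralprophet's save/load helpers (exact import path may differ) and the same continue_training flag:

```python
import pandas as pd
from neuralprophet import NeuralProphet, save, load

def test_save_load_continue_training():
    df = pd.read_csv(AIR_FILE)  # hypothetical fixture path to a ds/y dataset
    m = NeuralProphet(epochs=5)
    metrics = m.fit(df, freq="MS")

    # Round-trip the fitted model through disk instead of a Lightning checkpoint.
    save(m, "test_model.np")
    m2 = load("test_model.np")

    # Resume training on the reloaded model.
    metrics2 = m2.fit(df, freq="MS", continue_training=True)
    assert metrics["Loss"].min() >= metrics2["Loss"].min()
```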

ourownstory changed the title from [Major] Support Custom Learning Rate Scheduler to [Major] Support Re-Training on Aug 28, 2024