
[Major] Speedup dataset get_item #1636

Merged: 24 commits merged from speedup_dataset_next into main on Sep 3, 2024
Conversation

MaiBe-ctrl (Collaborator):
No description provided.

github-actions bot commented Aug 24, 2024

Model Benchmark

| Benchmark | Metric | main | current | diff |
| --- | --- | --- | --- | --- |
| YosemiteTemps | MAE_val | 0.57299 | 0.57299 | 0.0% |
| YosemiteTemps | RMSE_val | 0.84755 | 0.84755 | 0.0% |
| YosemiteTemps | Loss_val | 0.00042 | 0.00042 | 0.0% |
| YosemiteTemps | MAE | 0.93964 | 0.93964 | 0.0% |
| YosemiteTemps | RMSE | 1.65623 | 1.65623 | 0.0% |
| YosemiteTemps | Loss | 0.00118 | 0.00118 | -0.0% |
| YosemiteTemps | LR | 0.0001 | 0.0001 | 0.0% |
| YosemiteTemps | time | 148.134 | 59.45 | -59.87% 🎉 |
| EnergyPriceDaily | MAE_val | 5.4019 | 5.4019 | 0.0% |
| EnergyPriceDaily | RMSE_val | 6.70638 | 6.70638 | -0.0% |
| EnergyPriceDaily | Loss_val | 0.02518 | 0.02518 | 0.0% |
| EnergyPriceDaily | MAE | 5.91313 | 5.91313 | 0.0% |
| EnergyPriceDaily | RMSE | 7.94013 | 7.94013 | 0.0% |
| EnergyPriceDaily | Loss | 0.02554 | 0.02554 | 0.0% |
| EnergyPriceDaily | LR | 0.00029 | 0.00029 | 0.0% |
| EnergyPriceDaily | time | 39.613 | 19.81 | -49.99% 🎉 |
| AirPassengers | MAE_val | 30.1289 | 30.1289 | -0.0% |
| AirPassengers | RMSE_val | 31.082 | 31.0819 | -0.0% |
| AirPassengers | Loss_val | 0.01242 | 0.01242 | -0.0% |
| AirPassengers | MAE | 6.12209 | 6.12209 | 0.0% |
| AirPassengers | RMSE | 7.80953 | 7.80953 | 0.0% |
| AirPassengers | Loss | 0.00064 | 0.00064 | 0.0% |
| AirPassengers | LR | 0.0004 | 0.0004 | 0.0% |
| AirPassengers | time | 9.16971 | 8.38 | -8.61% 🎉 |
| PeytonManning | MAE_val | 0.35028 | 0.35028 | 0.0% |
| PeytonManning | RMSE_val | 0.50092 | 0.50092 | 0.0% |
| PeytonManning | Loss_val | 0.01774 | 0.01774 | 0.0% |
| PeytonManning | MAE | 0.34667 | 0.34667 | 0.0% |
| PeytonManning | RMSE | 0.49358 | 0.49358 | -0.0% |
| PeytonManning | Loss | 0.01466 | 0.01466 | 0.0% |
| PeytonManning | LR | 0.00032 | 0.00032 | 0.0% |
| PeytonManning | time | 25.532 | 14.91 | -41.6% 🎉 |
Model training plots

[Model training plots for PeytonManning, YosemiteTemps, AirPassengers, and EnergyPriceDaily]

ourownstory (Owner) commented:
@MaiBe-ctrl Looking great! 2x Speedup!!

ourownstory (Owner) left a review:
This is looking great! I did a detailed review and noted things that you can make more consistent and raised a few questions. Thank you!

# config_train=self.config_train, # no longer needed since JIT tabularization.
)
loader = DataLoader(dataset, batch_size=min(4096, len(df)), shuffle=False, drop_last=False)
predicted = {}
for name in self.config_seasonality.periods:
predicted[name] = list()
for inputs, _, meta in loader:
for inputs_tensor, meta in loader:
inputs = unpack_sliced_tensor(
ourownstory (Owner):
was this not needed for the other predict_xyz_components methods?

"; ".join(["{}: {}".format(inp, values.shape) for inp, values in inputs.items()])
)
)
input, meta = dataset.__getitem__(0)
ourownstory (Owner):
We should have one test that tests the unpacking of the tensor.

MaiBe-ctrl (Collaborator, Author):
All of the upfront unpacking logic will be removed; instead, unpacking will be done incrementally inside the forward call, when each component needs it.

@@ -987,3 +987,153 @@ def configure_trainer(
# config["replace_sampler_ddp"] = False

return pl.Trainer(**config), checkpoint_callback


def unpack_sliced_tensor(
ourownstory (Owner):
I think it would be good to have this function in the same file as stack_all_features

ourownstory (Owner):
Let's also add a docstring to this function and mention that the returned tensors may not be contiguous.
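For illustration, a minimal sketch of what such a docstring could look like, assuming unpack_sliced_tensor maps named inclusive (start, end) column ranges back to per-component tensors. The parameter names and simplified behavior here are assumptions, not the PR's final implementation:

```python
# Hypothetical sketch only, not the PR's implementation.
import torch


def unpack_sliced_tensor(sliced_tensor: torch.Tensor, feature_indices: dict) -> dict:
    """Split a stacked feature tensor back into named component tensors.

    Parameters
    ----------
    sliced_tensor : torch.Tensor
        Stacked batch of features, shape (batch_size, total_feature_dim).
    feature_indices : dict
        Maps component names (e.g. "time", "targets") to inclusive
        (start_idx, end_idx) column ranges within ``sliced_tensor``.

    Returns
    -------
    dict
        Component name -> tensor. Note: the returned tensors are views into
        ``sliced_tensor`` and may not be contiguous; call ``.contiguous()``
        before operations that require contiguous memory.
    """
    return {
        name: sliced_tensor[:, start_idx : end_idx + 1]
        for name, (start_idx, end_idx) in feature_indices.items()
    }
```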

MaiBe-ctrl (Collaborator, Author):
This function will be completely removed.

MaiBe-ctrl (Collaborator, Author):
Actually, we need to make the target tensors contiguous anyway, because doing it like this didn't really solve the problem.
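A quick standalone illustration of the issue (not code from this PR): column slices of a stacked tensor are views with the parent's strides, so they are not contiguous, and a ``.contiguous()`` copy is needed before operations that require contiguous memory.

```python
import torch

stacked = torch.randn(8, 10)          # (batch_size, total_feature_dim)
targets = stacked[:, 3:5]             # column slice is a view, not a copy
print(targets.is_contiguous())        # False

targets = targets.contiguous().unsqueeze(1)  # copy into contiguous memory, add step dim
print(targets.is_contiguous())        # True
```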

@@ -774,7 +775,17 @@ def loss_func(self, inputs, predicted, targets):
return loss, reg_loss

def training_step(self, batch, batch_idx):
ourownstory (Owner):
Might there be a "create_batch" function called before, where we could place the unpack_sliced_tensor?

Or should we just move the unpacking of each component to each of the components' respective forward call?

ourownstory (Owner):
Note: we can do this in a subsequent PR.

MaiBe-ctrl (Collaborator, Author):
All of the unpacking will be done incrementally in the forward pass.
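As a rough sketch of that direction (all names here are hypothetical toy stand-ins, not the actual NeuralProphet API): each component's slice is pulled from the stacked batch only at the point where that component is computed, rather than unpacking everything up front.

```python
import torch

# Toy illustration of incremental unpacking in the forward pass.
feature_indices = {"time": (0, 0), "seasonalities": (1, 3)}

def unstack_component(name, stacked):
    start_idx, end_idx = feature_indices[name]
    return stacked[:, start_idx : end_idx + 1]

def forward(stacked_batch, config_seasonality=None):
    # Trend only needs the time feature, so only that slice is unpacked here.
    time_input = unstack_component("time", stacked_batch)
    out = time_input * 0.1  # stand-in for the trend component

    # Seasonality features are unpacked only if seasonality is configured.
    if config_seasonality is not None:
        seasonal_input = unstack_component("seasonalities", stacked_batch)
        out = out + seasonal_input.sum(dim=1, keepdim=True)  # stand-in for seasonality
    return out

print(forward(torch.randn(5, 4), config_seasonality={"yearly": 6}).shape)  # torch.Size([5, 1])
```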

@@ -51,6 +52,7 @@ def __init__(
config_regressors: Optional[configure.ConfigFutureRegressors] = None,
config_events: Optional[configure.ConfigEvents] = None,
config_holidays: Optional[configure.ConfigCountryHolidays] = None,
config_model: Optional[configure.ConfigModel] = None,
ourownstory (Owner):
Is this optional?


else:
start_idx, end_idx = feature_indices["time"]
inputs["time"] = sliced_tensor[:, start_idx : end_idx + 1]
ourownstory (Owner):
Why include the end_idx here, unlike when max_lags>0?


if "targets" in feature_indices:
targets_start_idx, targets_end_idx = feature_indices["targets"]
inputs["targets"] = sliced_tensor[:, targets_start_idx : targets_end_idx + 1].unsqueeze(1)
ourownstory (Owner):
Why include the end_idx here, unlike when max_lags>0?
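For reference, a small standalone example of why the +1 appears (not code from the PR): feature_indices apparently stores inclusive (start, end) ranges, and Python slices exclude the stop index; selecting with a bare index instead of a slice would also drop the feature dimension.

```python
import torch

x = torch.arange(12).reshape(3, 4)
start_idx, end_idx = 1, 2                    # inclusive column range

print(x[:, start_idx : end_idx + 1].shape)   # torch.Size([3, 2]) -- keeps the feature dim
print(x[:, start_idx].shape)                 # torch.Size([3])    -- bare index drops it
```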

lagged_regressor_start_idx,
]

else:
ourownstory (Owner):
Please add a comment explaining the dimensions and why most of these have an unsqueeze(1) added.
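For context, a standalone illustration of the unsqueeze(1). The shape interpretation is an assumption based on the surrounding discussion: with max_lags == 0 there is a single forecast step, and unsqueeze(1) inserts that length-1 step dimension so the slice matches the (batch, n_forecasts, features) layout used when max_lags > 0.

```python
import torch

batch_size, n_features = 32, 4
flat = torch.randn(batch_size, n_features)   # (batch, features) when max_lags == 0

stepped = flat.unsqueeze(1)                   # insert the length-1 forecast-step dimension
print(stepped.shape)                          # torch.Size([32, 1, 4])
```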

# Unpack multiplicative event and holiday features
if "multiplicative_events" in feature_indices:
events_start_idx, events_end_idx = feature_indices["multiplicative_events"]
if "events" not in inputs:
ourownstory (Owner):
Add a check for additive as well.

# Unpack multiplicative regressor features
if "multiplicative_regressors" in feature_indices:
regressors_start_idx, regressors_end_idx = feature_indices["multiplicative_regressors"]
if "regressors" not in inputs:
ourownstory (Owner):
Add a check for additive as well.

# Unpack and process seasonalities
seasonalities_input = None
if self.config_seasonality and self.config_seasonality.periods:
print("++++seasonalities ++++")
ourownstory (Owner):
Leftover print statement from debugging.

MaiBe-ctrl (Collaborator, Author):
Still working on cleaning up the code 🙈. I'll let you know once it's ready for review!

max_lags=0,
n_forecasts=1,
config_seasonality=self.config_seasonality,
lagged_regressor_config=self.config_lagged_regressors,
ourownstory (Owner):
Should set self.config_lagged_regressors to None here.

@@ -142,11 +147,16 @@ def __init__(
# General
self.config_model = config_model
self.n_forecasts = n_forecasts
self.train_components_stacker = train_components_stacker
ourownstory (Owner):
store as dict

@@ -310,6 +320,16 @@ def ar_weights(self) -> torch.Tensor:
if isinstance(layer, nn.Linear):
return layer.weight

def set_components_stacker(self, components_stacker, mode):
ourownstory (Owner):
store/call as dict

def forward(
self,
input_tensor: torch.Tensor,
components_stacker=ComponentStacker,
ourownstory (Owner):
The components stacker should not be passed to forward, but rather inferred from a mode flag.
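A minimal sketch of the suggested refactor (class and method bodies are placeholders, not the merged implementation): store one stacker per mode and let forward() look it up from a mode flag instead of receiving the stacker object as an argument.

```python
import torch
import torch.nn as nn


class TimeNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # One ComponentStacker per mode, e.g. "train", "val", "test", "predict".
        self.components_stacker = {}

    def set_components_stacker(self, components_stacker, mode):
        self.components_stacker[mode] = components_stacker

    def forward(self, input_tensor: torch.Tensor, mode: str = "train") -> torch.Tensor:
        stacker = self.components_stacker[mode]   # inferred from the mode flag
        # ... unpack the needed component slices via `stacker` and compute the forecast ...
        return input_tensor                       # placeholder return
```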

ourownstory merged commit 9c4963c into main on Sep 3, 2024 (10 of 11 checks passed).
ourownstory deleted the speedup_dataset_next branch on Sep 3, 2024, 18:16.