MultiHorizonMetric weighted loss not working #942

Open
clianga opened this issue Apr 6, 2022 · 1 comment

Comments

@clianga

clianga commented Apr 6, 2022

  • PyTorch-Forecasting version: 0.9.0
  • PyTorch version: 1.7.1
  • Python version: 3.6.13
  • Operating System: Amazon SageMaker Jupyter notebook

Expected behavior

I'm trying to apply a weight to each sample (and its losses) to remove the COVID effect. Say I have 100 time series from 2016 to 2022, and I want the model not to update its parameters on data from 2020 Mar 1st to 2020 Jul 30th. Hence I created a weight column in my pandas DataFrame and set the weight to 0 when the time is between 2020 Mar 1st and 2020 Jul 30th, and to 1 elsewhere, for all 100 time series.
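Such a weight column can be built along these lines (a minimal sketch; the calendar_date datetime column is a hypothetical name, since the date column used as time_idx below is an integer index):

# hypothetical: flag the COVID window on a datetime column and derive 0/1 weights
covid_mask = tot_data["calendar_date"].between("2020-03-01", "2020-07-30")
tot_data["weight"] = (~covid_mask).astype(float)  # 0.0 inside the window, 1.0 elsewhere

The dataset creation code is here: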

from pytorch_forecasting import TimeSeriesDataSet
from pytorch_lightning.callbacks import EarlyStopping, LearningRateMonitor
from pytorch_lightning.loggers import TensorBoardLogger

context_length = 365
prediction_length = 60
training_cutoff = tot_data["date"].max() - prediction_length

training = TimeSeriesDataSet(
    tot_data[lambda x: x.date <= training_cutoff],
    group_ids=["combined_group"],
    target="contact",
    weight="weight",
    time_idx="date",
    static_categoricals=["marketplace_id", "pg_rollup", "order_channel"],
    time_varying_known_reals=numeric_col,
    time_varying_unknown_reals=["contact"],
    lags={"contact": list(range(1, 31))},
    min_encoder_length=context_length,
    max_encoder_length=context_length * 2,
    min_prediction_length=prediction_length,
    max_prediction_length=prediction_length,
)

# create validation set (predict=True) which means to predict the last max_prediction_length points in time
# for each series
validation = TimeSeriesDataSet.from_dataset(training, tot_data, predict=True, stop_randomization=True)

# create dataloaders for model
batch_size = 128  # set this between 32 to 128
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size * 10)

trainx, trainy = next(iter(train_dataloader))
display(trainy[0].size())
valx, valy = next(iter(val_dataloader))
display(valy[0].size())
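
Since weight is set on the dataset, the y returned by the dataloader is a (target, weight) tuple in pytorch-forecasting, so the weights themselves can be inspected the same way (a small sketch under that assumption):

# trainy[1] holds the per-sample weights and should mirror the target's
# (batch_size, decoder_length) shape
display(trainy[1].size())
display(trainy[1].unique())  # expect only 0. and 1. given the weight column above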

I checked the tensor sizes for training and validation, and they match. Then I run this code:

# Weighted loss: custom MAE metric; MultiHorizonMetric multiplies these
# per-timestep losses by the sample weights in its update() step
from pytorch_forecasting.metrics import MultiHorizonMetric

class MAE(MultiHorizonMetric):
    def loss(self, y_pred, target):
        loss = (self.to_prediction(y_pred) - target).abs()
        return loss

import pytorch_lightning as pl

# configure network and trainer
early_stop_callback = EarlyStopping(monitor="val_loss", min_delta=1e-4, patience=10, verbose=False, mode="min")
lr_logger = LearningRateMonitor()  # log the learning rate
logger = TensorBoardLogger("lightning_logs")  # logging results to a tensorboard

trainer = pl.Trainer(
    max_epochs=200,
    auto_lr_find=True,
    auto_scale_batch_size=True,
    gpus=1,
    # weights_summary="top",
    gradient_clip_val=0.1,
    limit_train_batches=30,  # comment in for training, runs validation every 30 batches
    # fast_dev_run=True,  # comment in to check that network or dataset has no serious bugs
    callbacks=[lr_logger, early_stop_callback],
    logger=logger,
)


from pytorch_forecasting import RecurrentNetwork

lstm_net = RecurrentNetwork.from_dataset(
    dataset=training,
    cell_type="LSTM",
    hidden_size=128,
    rnn_layers=2,
    dropout=0.1,
    loss=MAE(),
)
print(f"Number of parameters in network: {lstm_net.size()/1e3:.1f}k")

to create the model, and it ran without error. However, when I try to train it with:

# fit network
trainer.fit(
    lstm_net,
    train_dataloaders=train_dataloader,
    val_dataloaders=val_dataloader,
)

It produces the following error:

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/site-packages/pytorch_forecasting/metrics.py in update(self, y_pred, target, encoder_target, encoder_lengths)
    854         # weight samples
    855         if weight is not None:
--> 856             losses = losses * weight.unsqueeze(-1)
    857
    858         self._update_losses_and_lengths(losses, lengths)

RuntimeError: The size of tensor a (119) must match the size of tensor b (60) at non-singleton dimension 1
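
A minimal way to see where the shapes diverge is to print them inside the custom loss before the weights are applied (a hypothetical debugging sketch, not part of the original run):

class MAE(MultiHorizonMetric):
    def loss(self, y_pred, target):
        pred = self.to_prediction(y_pred)
        # the traceback says losses and weight disagree at dimension 1 (119 vs 60)
        print(pred.shape, target.shape)
        return (pred - target).abs()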

Can you tell me how to fix this? Or is there another way to apply weights to the losses (a different weight for each sample)?
Thank you!

@fnavruzov

I've encountered the same RuntimeError (a shape mismatch) when using AggregationMetric(some_metric) as the loss.
To my understanding, it comes from a mismatch between the prediction and actual shapes:

  • actuals are of shape (n_samples, n_timesteps), i.e. ndim = 2,
  • predictions (even with non-quantile losses) are of shape (n_samples, n_timesteps, n_outputs), i.e. ndim = 3, which in my case was (batch_size, forecasting_horizon, 1), so check your model's output shape.

Can you check whether the original MAE() works for you?
If not, as a quick workaround I suggest averaging over the last axis in the loss method:

def loss(self, y_pred, target):
    y_pred = self.to_prediction(y_pred)
    if y_pred.ndim == 3:
        # maybe some other checks
        y_pred = y_pred.mean(axis=-1)
    loss = (y_pred - target).abs()
    return loss
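
Dropped into the custom metric from the original post, the workaround would look like this (a sketch under the same assumptions):

from pytorch_forecasting.metrics import MultiHorizonMetric

class MAE(MultiHorizonMetric):
    def loss(self, y_pred, target):
        y_pred = self.to_prediction(y_pred)
        if y_pred.ndim == 3:
            # collapse the trailing output dimension so the loss has the
            # (n_samples, n_timesteps) shape that the weights are multiplied onto
            y_pred = y_pred.mean(axis=-1)
        return (y_pred - target).abs()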

Hope this helps you solve the issue.
