MultiHorizonMetric weighted loss not working #942

Open
clianga opened this issue Apr 6, 2022 · 1 comment

Comments

@clianga

clianga commented Apr 6, 2022

  • PyTorch-Forecasting version: 0.9.0
  • PyTorch version: 1.7.1
  • Python version: 3.6.13
  • Operating System: Amazon SageMaker Jupyter notebook

Expected behavior

I'm trying to apply a weight to each sample (and its losses) to remove the COVID effect. Say I have 100 time series from 2016 to 2022, and I want the model not to update its parameters on data from 2020 Mar 1st to 2020 Jul 30th. Hence I created a weight column in my pandas DataFrame and set the weight to 0 when the time is between 2020 Mar 1st and 2020 Jul 30th, and to 1 elsewhere, for all 100 time series.
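Such a weight column can be built along these lines (a minimal sketch; the calendar_date datetime column is a hypothetical name, since the date column used as time_idx below is an integer index):

# hypothetical: flag the COVID window on a datetime column and derive 0/1 weights
covid_mask = tot_data["calendar_date"].between("2020-03-01", "2020-07-30")
tot_data["weight"] = (~covid_mask).astype(float)  # 0.0 inside the window, 1.0 elsewhere

The dataset creation code is here: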

from pytorch_forecasting import TimeSeriesDataSet
from pytorch_lightning.callbacks import EarlyStopping, LearningRateMonitor
from pytorch_lightning.loggers import TensorBoardLogger

context_length = 365
prediction_length = 60
training_cutoff = tot_data["date"].max() - prediction_length

training = TimeSeriesDataSet(
    tot_data[lambda x: x.date <= training_cutoff],
    group_ids=["combined_group"],
    target="contact",
    weight="weight",
    time_idx="date",
    static_categoricals=["marketplace_id", "pg_rollup", "order_channel"],
    time_varying_known_reals=numeric_col,
    time_varying_unknown_reals=["contact"],
    lags={"contact": list(range(1, 31))},
    min_encoder_length=context_length,
    max_encoder_length=context_length * 2,
    min_prediction_length=prediction_length,
    max_prediction_length=prediction_length,
)

# create validation set (predict=True) which means to predict the last max_prediction_length points in time
# for each series
validation = TimeSeriesDataSet.from_dataset(training, tot_data, predict=True, stop_randomization=True)

# create dataloaders for model
batch_size = 128  # set this between 32 to 128
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size * 10)

trainx, trainy = next(iter(train_dataloader))
display(trainy[0].size())
valx, valy = next(iter(val_dataloader))
display(valy[0].size())
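
Since weight is set on the dataset, the y returned by the dataloader is a (target, weight) tuple in pytorch-forecasting, so the weights themselves can be inspected the same way (a small sketch under that assumption):

# trainy[1] holds the per-sample weights and should mirror the target's
# (batch_size, decoder_length) shape
display(trainy[1].size())
display(trainy[1].unique())  # expect only 0. and 1. given the weight column above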

I checked the tensor sizes for training and validation, and they match. Then I run this code:

# Weighted loss: custom MAE metric; MultiHorizonMetric multiplies these
# per-timestep losses by the sample weights in its update() step
from pytorch_forecasting.metrics import MultiHorizonMetric

class MAE(MultiHorizonMetric):
    def loss(self, y_pred, target):
        loss = (self.to_prediction(y_pred) - target).abs()
        return loss

import pytorch_lightning as pl

# configure network and trainer
early_stop_callback = EarlyStopping(monitor="val_loss", min_delta=1e-4, patience=10, verbose=False, mode="min")
lr_logger = LearningRateMonitor()  # log the learning rate
logger = TensorBoardLogger("lightning_logs")  # logging results to a tensorboard

trainer = pl.Trainer(
    max_epochs=200,
    auto_lr_find=True,
    auto_scale_batch_size=True,
    gpus=1,
    # weights_summary="top",
    gradient_clip_val=0.1,
    limit_train_batches=30,  # comment in for training, runs validation every 30 batches
    # fast_dev_run=True,  # comment in to check that network or dataset has no serious bugs
    callbacks=[lr_logger, early_stop_callback],
    logger=logger,
)


from pytorch_forecasting import RecurrentNetwork

lstm_net = RecurrentNetwork.from_dataset(
    dataset=training,
    cell_type="LSTM",
    hidden_size=128,
    rnn_layers=2,
    dropout=0.1,
    loss=MAE(),
)
print(f"Number of parameters in network: {lstm_net.size()/1e3:.1f}k")

to create the model, and it ran without error. However, when I try to train it with:

# fit network
trainer.fit(
    lstm_net,
    train_dataloaders=train_dataloader,
    val_dataloaders=val_dataloader,
)

It produces the following error:

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/site-packages/pytorch_forecasting/metrics.py in update(self, y_pred, target, encoder_target, encoder_lengths)
    854         # weight samples
    855         if weight is not None:
--> 856             losses = losses * weight.unsqueeze(-1)
    857
    858         self._update_losses_and_lengths(losses, lengths)

RuntimeError: The size of tensor a (119) must match the size of tensor b (60) at non-singleton dimension 1
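
A minimal way to see where the shapes diverge is to print them inside the custom loss before the weights are applied (a hypothetical debugging sketch, not part of the original run):

class MAE(MultiHorizonMetric):
    def loss(self, y_pred, target):
        pred = self.to_prediction(y_pred)
        # the traceback says losses and weight disagree at dimension 1 (119 vs 60)
        print(pred.shape, target.shape)
        return (pred - target).abs()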

Can you tell me how to fix this? Or is there another way to apply weights to the losses (a different weight for each sample)?
Thank you!

@fnavruzov

I've encountered the same RuntimeError (a shape mismatch) when using AggregationMetric(some_metric) as the loss.
To my understanding, it comes from a mismatch between the prediction and actual shapes:

  • actuals are of shape (n_samples, n_timesteps), i.e. ndim = 2,
  • predictions (even with non-quantile losses) are of shape (n_samples, n_timesteps, n_outputs), i.e. ndim = 3, which in my case was (batch_size, forecasting_horizon, 1), so check your model's output shape.

Can you check whether the original MAE() works for you?
If not, as a quick workaround I suggest averaging over the last axis in the loss method:

def loss(self, y_pred, target):
    y_pred = self.to_prediction(y_pred)
    if y_pred.ndim == 3:
        # maybe some other checks
        y_pred = y_pred.mean(axis=-1)
    loss = (y_pred - target).abs()
    return loss
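
Dropped into the custom metric from the original post, the workaround would look like this (a sketch under the same assumptions):

from pytorch_forecasting.metrics import MultiHorizonMetric

class MAE(MultiHorizonMetric):
    def loss(self, y_pred, target):
        y_pred = self.to_prediction(y_pred)
        if y_pred.ndim == 3:
            # collapse the trailing output dimension so the loss has the
            # (n_samples, n_timesteps) shape that the weights are multiplied onto
            y_pred = y_pred.mean(axis=-1)
        return (y_pred - target).abs()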

Hope this helps you solve the issue.
