
training_epoch_end's outputs doesn't have 'loss' key #2372

Closed
xiadingZ opened this issue Jun 26, 2020 · 13 comments · Fixed by #2428
Assignees: williamFalcon
Labels: bug (Something isn't working), help wanted (Open to be worked on), priority: 0 (High priority task)

Comments

xiadingZ commented Jun 26, 2020

pytorch-lightning: build from master

Traceback (most recent call last):
  File "main.py", line 140, in <module>
    main(hparams)
  File "main.py", line 72, in main
    trainer.fit(model)
  File "/mnt/lustre/maxiao1/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 881, in fit
    self.ddp_train(task, model)
  File "/mnt/lustre/maxiao1/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/distrib_data_parallel.py", line 539, in ddp_train
    self.run_pretrain_routine(model)
  File "/mnt/lustre/maxiao1/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1091, in run_pretrain_routine
    self.train()
  File "/mnt/lustre/maxiao1/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 376, in train
    self.run_training_epoch()
  File "/mnt/lustre/maxiao1/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 510, in run_training_epoch
    self.run_training_epoch_end(epoch_output)
  File "/mnt/lustre/maxiao1/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 535, in run_training_epoch_end
    epoch_output = model.training_epoch_end(epoch_output)
  File "/mnt/lustre/maxiao1/PVM/models/baseline.py", line 335, in training_epoch_end
    avg_loss = torch.stack([x['loss'] for x in outputs]).mean()
  File "/mnt/lustre/maxiao1/PVM/models/baseline.py", line 335, in <listcomp>
    avg_loss = torch.stack([x['loss'] for x in outputs]).mean()
KeyError: 'loss'

This is my code:

    def training_step(self, batch, batch_idx):
        ...
        return {'loss': loss, "train_acc": acc}

    def training_epoch_end(self, outputs):
        avg_loss = torch.stack([x['loss'] for x in outputs]).mean()
        avg_acc = torch.stack([x['train_acc'] for x in outputs]).mean()
        logs = {'loss': avg_loss, 'train_acc': avg_acc}
        progress_bar = {'train_loss': avg_loss, 'train_acc': avg_acc}
        results = {
            'log': logs,
            'progress_bar': progress_bar
        }
        return results
xiadingZ added the bug (Something isn't working) and help wanted (Open to be worked on) labels on Jun 26, 2020
williamFalcon self-assigned this on Jun 26, 2020
williamFalcon added the priority: 0 (High priority task) label on Jun 26, 2020
rohitgr7 (Contributor)

Try: avg_loss = torch.stack([x['batch_loss'] for x in outputs]).mean()

xiadingZ (Author) commented Jun 27, 2020

Thanks, that works, but the 'train_acc' key doesn't exist either, and neither does batch_train_acc. How can I access the other keys returned from training_step?

rohitgr7 (Contributor) commented Jun 27, 2020

As of now in Lightning you can access them with x['callback_metrics']['loss'] and x['callback_metrics']['train_acc'], but I think this should be handled the same way as validation_epoch_end and test_epoch_end.
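
For reference, a minimal sketch of that workaround on the affected master build, assuming (as described above) that the per-step loss is exposed as 'batch_loss' and the keys returned from training_step end up under 'callback_metrics'; exact key names may differ across versions:

    def training_epoch_end(self, outputs):
        # Workaround sketch: on the affected build, the loss from training_step
        # is wrapped under 'batch_loss', and custom keys such as 'train_acc'
        # live under 'callback_metrics'.
        avg_loss = torch.stack([x['batch_loss'] for x in outputs]).mean()
        avg_acc = torch.stack([x['callback_metrics']['train_acc'] for x in outputs]).mean()
        logs = {'loss': avg_loss, 'train_acc': avg_acc}
        progress_bar = {'train_loss': avg_loss, 'train_acc': avg_acc}
        return {'log': logs, 'progress_bar': progress_bar}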

Pet222 commented Jun 29, 2020

Hi! One hint: for me it works with "loss" under Windows but not under Ubuntu.

rohitgr7 (Contributor)

Weird!! Why would this be platform dependent?? 🤔

Red-Eyed (Contributor)

@Pet222, are you sure the versions on Ubuntu and Windows are the same?

captainvera

Hey @williamFalcon, is this intended behaviour? I was surprised to see this breaking change introduced with no warning. If it is intended, why not keep the behaviour consistent with validation_epoch_end and test_epoch_end?

If it is not intended, as the "bug" label seems to suggest, are you working on it or should I make a PR for this?

williamFalcon (Contributor)

What is the behavior? That the "loss" key is not in training_epoch_end? If so, that's a bug, because it should be there.

Red-Eyed (Contributor) commented Jun 30, 2020

@williamFalcon, on the latest version the loss key was changed to batch_loss. I think it was changed here.

captainvera commented Jun 30, 2020 via email

williamFalcon (Contributor)

@captainvera would love a PR :)

williamFalcon (Contributor) commented Jun 30, 2020

@captainvera @xiadingZ sorry about that! It was a bad bug.

Made a PR #2428 and added tests to make sure this doesn't happen again!

Try master now! We'll push a new minor release again since this is a key bug (and we have a few other key bugs).

captainvera

Well, that was fast, thanks!
