What is trainer/global_step in wandb logging? #20377
Unanswered
edmcman
asked this question in
Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 1 comment
-
Now that it's past the first epoch, it's clear that this wasn't the logger "falling behind". The second epoch started at approximately global step 7049, and according to the progress bar there are 27907 batches per epoch. So, again, the batch count is about four times the global step. Is this a bug?
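To put numbers on the "factor of four", here is a small sketch that computes the batch-to-step ratio from the figures quoted in this thread. As far as I know, Lightning documents `trainer.global_step` as the number of optimizer steps taken rather than the number of batches seen, so one possible explanation (an assumption, not something confirmed in this thread) is gradient accumulation, for example `accumulate_grad_batches=4` on the `Trainer`. The helper names below are hypothetical and only do arithmetic; they do not call Lightning or wandb.

```python
import math


def batches_per_step(batches_seen: int, global_step: int) -> float:
    """Ratio of batches processed to the logged global step."""
    return batches_seen / global_step


def optimizer_steps_per_epoch(batches_per_epoch: int, accumulate_grad_batches: int) -> int:
    """Optimizer steps in one epoch under gradient accumulation.

    Assumption: a trailing partial group of batches at the end of the
    epoch still triggers one optimizer step, hence the ceiling.
    """
    return math.ceil(batches_per_epoch / accumulate_grad_batches)


# Numbers quoted in the thread.
mid_epoch_ratio = batches_per_step(5000, 1199)   # progress bar vs. logged step
epoch_ratio = batches_per_step(27907, 7049)      # batches per epoch vs. step at epoch 2

print(f"mid-epoch ratio: {mid_epoch_ratio:.2f}")
print(f"epoch ratio:     {epoch_ratio:.2f}")

# Hypothetical setting: accumulate_grad_batches=4 (not confirmed by the thread).
predicted_steps = optimizer_steps_per_epoch(27907, 4)
print(f"predicted steps per epoch with accumulation of 4: {predicted_steps}")
```

Both ratios come out close to four but not exactly four, and the predicted step count under an accumulation factor of 4 is near, not equal to, the observed 7049. So this hypothesis is only a rough fit, and it is worth checking the actual `Trainer` arguments and whether more than one optimizer is stepping per batch group.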
-
What does the value `trainer/global_step` mean in wandb logging?

I am not using distributed training, and only looking during the first epoch. I would think that this is the number of batches processed, but it doesn't seem like it. What is it supposed to be? For example, the latest logged value is 1199. But my progress bar shows the current batch is 5000. I thought maybe the wandb logger is lagging a bit, but I doubt it is lagging that much. So where is this factor of about four difference coming from?
To make things concrete, I took these two screenshots at roughly the same time: