Validation loader flat loss #59
@Dewald928 Thanks for reporting this. After a quick run on that notebook ("examples/lrfinder_mnist.ipynb"), I was able to reproduce the flat validation loss you describe.

UPDATE: Currently, it seems there is something wrong in commit 52c189a.
OK, I figured out the reason why this happens.

In `LRFinder.range_test()`:

```python
# @LRFinder.range_test()
# ...
for iteration in tqdm(range(num_iter)):
    # Train on batch and retrieve loss
    loss = self._train_batch(
        train_iter,
        accumulation_steps,
        non_blocking_transfer=non_blocking_transfer,
    )
    if val_loader:
        loss = self._validate(
            val_iter, non_blocking_transfer=non_blocking_transfer
        )
    # ...
```

However, `val_iter` is created only once before this loop and is never reset, so it runs out of values after the first iteration and `self._validate()` no longer evaluates the validation set correctly. Whatever value it returns then feeds straight into the loss tracking:

```python
# Track the best loss and smooth it if smooth_f is specified
if iteration == 0:
    self.best_loss = loss
else:
    if smooth_f > 0:
        loss = smooth_f * loss + (1 - smooth_f) * self.history["loss"][-1]
    if loss < self.best_loss:
        self.best_loss = loss
```

And that's why the lr-loss curve goes flat like the result you provided. I'll make a patch for it later.
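For context, the underlying behaviour is easy to reproduce in isolation. The following toy sketch (plain Python, not the library's actual code) shows why reusing a single iterator across iterations only works the first time:

```python
# Toy illustration of the val_iter problem: an iterator that is created once
# is exhausted after the first full pass, so every later pass sees no batches.
val_batches = [("x1", "y1"), ("x2", "y2"), ("x3", "y3")]
val_iter = iter(val_batches)  # created once, like val_iter in range_test()

def validate(it):
    # A for-loop over an already-created iterator drains it permanently.
    return sum(1 for _ in it)

for iteration in range(3):
    print(iteration, validate(val_iter))
# 0 3   <- first iteration sees the whole "validation set"
# 1 0   <- iterator is exhausted, nothing left to evaluate
# 2 0   <- and it stays empty, so the reported loss stops changing
```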
… the syntax of normal `iterator` (#60)

* MAINT: make `ValDataLoaderIter` able to be reset only when it is acquired by the syntax of normal `iterator`

  In `LRFinder.range_test()`, `val_iter` won't be reset after it runs out of values, which makes `LRFinder._validate()` fail to work correctly after the first iteration of `range_test()`. To fix it, we add a counter that tracks how many times a `ValDataLoaderIter` has run (i.e. how many times `__next__()` has been called), and reset it only when its `__iter__()` is called, so that it isn't reset automatically the way `TrainDataLoaderIter` is. See also #59 and the docstring of `ValDataLoaderIter` for further details.

* TST: add tests for `TrainDataLoaderIter` and `ValDataLoaderIter`
* MAINT: remove redundant argument in `ValDataLoaderIter.__init__()`
* TST: add tests to check valid types of `train_loader` and `val_loader` used in `range_test()`
* TST: explicitly specify the message of the exception we expect to get
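As a rough illustration of the reset scheme the commit message describes (the class and attribute names below are made up for the sketch, not taken from the library), a wrapper can count `__next__()` calls and re-create the underlying iterator only when `__iter__()` is invoked:

```python
from torch.utils.data import DataLoader

# Illustrative sketch of a counter-based reset, loosely following the commit
# message above; not the library's actual ValDataLoaderIter implementation.
class ResettableValIter:
    def __init__(self, data_loader: DataLoader):
        self.data_loader = data_loader
        self.run_counter = 0              # number of __next__() calls so far
        self._iterator = iter(data_loader)

    def __iter__(self):
        # Reset only when the object is used through normal iterator syntax,
        # e.g. `for inputs, labels in val_iter:` inside _validate().
        if self.run_counter > 0:
            self._iterator = iter(self.data_loader)
            self.run_counter = 0
        return self

    def __next__(self):
        self.run_counter += 1
        return next(self._iterator)       # StopIteration ends each pass normally
```

With a scheme like this, each validation pass starts from a fresh iterator, while the train-side wrapper can keep auto-resetting inside `__next__()` as before.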
Hey, man. Any progress on this issue?
@NaleRaphael fixed the issue in #60, which has been merged into the master branch. I'm closing this issue; thanks for reporting it, and feel free to reopen if needed.
Yeah, that was my bad, I didn't see those commits. I updated the package, and now the evaluation with the validation loader takes much longer to finish, so I guess it works now.
I copied your example notebook to Colab and ran the code without changing anything, but the validation loss I get goes flat, which is clearly wrong when compared to your example. I also see this with my other networks; the loss just goes flat.
You can see my results from Colab and your example in the figures below.
EDIT: If I replace `val_iter` with `val_loader` inside `loss = self._validate(...)`, it does seem to "work" as I'd expect. So somewhere there seems to be a mistake in how `val_iter` is iterated.
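For reference, a minimal end-to-end check along the lines of the library's README-style usage (the synthetic data, model, and hyperparameter values below are placeholders chosen just to exercise the API):

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset
from torch_lr_finder import LRFinder

# Tiny synthetic classification task, only to exercise the API.
x, y = torch.randn(512, 20), torch.randint(0, 2, (512,))
train_loader = DataLoader(TensorDataset(x[:384], y[:384]), batch_size=32, shuffle=True)
val_loader = DataLoader(TensorDataset(x[384:], y[384:]), batch_size=32)

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-5)

lr_finder = LRFinder(model, optimizer, criterion, device="cpu")
# Passing val_loader makes every range_test iteration evaluate the whole
# validation set, so this runs noticeably slower than the train-loss-only mode.
lr_finder.range_test(train_loader, val_loader=val_loader, end_lr=1, num_iter=50)
lr_finder.plot()   # after the #60 fix the validation curve should vary, not flat-line
lr_finder.reset()  # restore model and optimizer to their initial state
```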