Cannot determine `batch_size` from a list of string while running `range_test()` with `val_loader` #57
Comments
The current code is like that so that it can handle last batches that don't have the same batch size. Your suggestion is cleaner and works if we force it. I'll try to think about this a bit more and come back tomorrow or in the next few days. But I think if we can't find a reasonable way of having both, then we should make this change and document that.
After some experimenting, replacing it with

```python
if isinstance(inputs, tuple) or isinstance(inputs, list):
    batch_size = len(inputs[0])
else:
    batch_size = len(inputs)
```

works, but I'm not sure if this would fail for some other type of dataset.
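The isinstance-based check can be exercised on its own. Here is a minimal sketch (plain-Python stand-ins, no torch required; `infer_batch_size` is a hypothetical helper, not part of torch-lr-finder) showing both where it works and the failure mode behind this issue:

```python
# Hypothetical helper mirroring the isinstance-based batch-size inference
# discussed above; not part of the torch-lr-finder codebase.

def infer_batch_size(inputs):
    if isinstance(inputs, tuple) or isinstance(inputs, list):
        return len(inputs[0])
    return len(inputs)

# Works when the batch is a collated pair of per-sample sequences:
print(infer_batch_size(([1, 2, 3, 4], [0, 1, 0, 1])))  # 4

# Silently wrong when the batch is a plain list of strings: it returns
# the length of the first string (5), not the batch size (2).
print(infer_batch_size(["short", "a much longer sentence"]))  # 5
```

This is why inspecting the batch itself is fragile: the shape of `inputs` depends entirely on what the `Dataset` returns.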
You are right, I forgot that case. I was thinking about making some changes in `DataLoaderIter`:

```python
class DataLoaderIter(object):
    def get_current_batch_size(self, batch):
        # Users can override this according to their dataset
        return batch.size(0)
```

so that we can get the batch size dynamically:

```python
class LRFinder(object):
    def _validate(self, val_iter, non_blocking_transfer=True):
        # ...
        batch_size = val_iter.get_current_batch_size(batch)
        # ...
```

But I suddenly got an idea: the size of each batch is already accessible from the labels in `LRFinder._validate()`:

```python
class LRFinder(object):
    def _validate(self, val_iter, non_blocking_transfer=True):
        # ...
        with torch.no_grad():
            for inputs, labels in val_iter:
                # Move data to the correct device
                inputs, labels = self._move_to_device(
                    inputs, labels, non_blocking=non_blocking_transfer
                )

                # Forward pass and loss computation
                outputs = self.model(inputs)
                loss = self.criterion(outputs, labels)
                running_loss += loss.item() * labels.size(0)
        # ...
```

It seems like an easier solution for this issue. 🤔
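To illustrate why reading the batch size from the labels sidesteps the problem: loss targets are tensors even when the inputs are not, so `labels.size(0)` is always the batch size. A minimal sketch, with `FakeTensor` as a made-up stand-in for `torch.Tensor` so it runs without torch installed:

```python
# FakeTensor is a stub for illustration only; it mimics the small part
# of the torch.Tensor interface used here (size(dim)).

class FakeTensor:
    def __init__(self, shape):
        self.shape = shape

    def size(self, dim=None):
        return self.shape if dim is None else self.shape[dim]

# A non-tensor batch of inputs, as produced by a text Dataset:
inputs = ["a sentence", "another sentence", "a third one"]
# The corresponding targets are still a tensor, one entry per sample:
labels = FakeTensor((3,))

batch_size = labels.size(0)
print(batch_size)  # 3
```

The shape of `inputs` never matters here, which is exactly the property the fix relies on.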
Ya, that looks like a nice solution. From what I can see, all loss functions expect the label/target to be a tensor. Either way, you can create a PR with this change.
Great, I'll create a PR for it later.
Merged #58. Thanks @NaleRaphael for raising the issue and fixing it |
It has been my pleasure to help. 😎 |
Hey @davidtvs, this issue was found while I was writing an example of utilizing this package with huggingface/transformers for #55.

Condition

- `Dataset` returns non-tensor values (here, a list of strings)
- `range_test()` is run with a `val_loader`

Error message

Description
In the current implementation, `batch_size` is determined dynamically according to the shape of `inputs` in `LRFinder._validate()` (v0.2.0, L399-L402). That code works normally only when the given `inputs` is a `torch.Tensor`, and that's why it failed when `inputs` is a list of strings.

Maybe it's not a usual case that a `Dataset` returns non-`torch.Tensor` values, but I think it would be easier to access the batch size from `DataLoader.batch_size`, since `LRFinder._validate()` is going to iterate a `val_loader` anyway.

Hence I proposed a fix for this in that notebook: it simply adds a line

```python
batch_size = val_iter.data_loader.batch_size
```

before entering the loop and removes those if-else statements; you can check it out here.

But I'm having doubts about adding a property `batch_size` in `DataLoaderIter`. With this property, the proposed fix can be simplified a little.

What do you think of it?