fix(train): in case of last batch <=2, move to validation if possible #3036
…ining if possible
for more information, see https://pre-commit.ci
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #3036 +/- ##
==========================================
- Coverage 84.96% 84.53% -0.44%
==========================================
Files 178 178
Lines 15048 15061 +13
==========================================
- Hits 12786 12732 -54
- Misses 2262 2329 +67
…rought the batch_size and drop_last in the data split validation
…_than1' into Ori-3035-fix_min_bathc_size_less_than1
…c_size_less_than1
if batch_size is not None:
    num_of_cells = n_train % batch_size
    if (num_of_cells < 3 and num_of_cells > 0) and not (
        num_of_cells == 1 and drop_last is True
This one is confusing: if drop_last is set, it will drop the last batch no matter how many cells it contains.
True, but this part is not about the drop_last logic.
This part of the code decides when to show the warning. The warning is shown if the last batch has 1-2 cells, except when it has exactly 1 cell and the user selected drop_last. Whoever did that doesn't need to see the warning, because training will not fail for them.
Only if train_size_is_none is also set are the cells adaptively transferred to the validation set, if one exists.
Again the check doesn't need to be num_of_cells == 1 and drop_last is True. It should be: (num_of_cells < 3 and num_of_cells > 0) and drop_last is False
ok
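A minimal sketch comparing the two checks discussed in this thread, assuming both are wrapped in small helper functions. The names `warns_original` and `warns_suggested` are hypothetical and not part of the PR. It only shows for which inputs the two predicates disagree:

```python
def warns_original(num_of_cells: int, drop_last: bool) -> bool:
    # Check as written in the diff: skip the warning only for 1 leftover cell with drop_last.
    return (0 < num_of_cells < 3) and not (num_of_cells == 1 and drop_last is True)


def warns_suggested(num_of_cells: int, drop_last: bool) -> bool:
    # Check proposed in review: any 1-2 leftover cells, but only when drop_last is off.
    return (0 < num_of_cells < 3) and drop_last is False


# Enumerate small cases and collect the inputs where the two checks differ.
differences = [
    (n, d)
    for n in range(4)
    for d in (False, True)
    if warns_original(n, d) != warns_suggested(n, d)
]
print(differences)  # [(2, True)]
```

The only disagreement is a last batch of 2 cells with drop_last enabled: the original check still warns, although that batch would be dropped anyway.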
if n_val > 0:
    n_val += num_of_cells
    warnings.warn(
        f"{num_of_cells} cells moved from training set to validation set",
to avoid error during training. Set train_size to a fixed size to avoid this.
It is 0.9 if given as None during init.
But as we discussed, we adapt the last batch only if it was in fact None; this is why we have the flag train_size_is_none.
Did you mean something else?
I think we want a message explaining how to avoid moving cells from training to validation. The user avoids it by stating train_size=0.9 explicitly.
ok
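One way the extended message requested in this thread could look, as a rough sketch. The helper name `warn_cells_moved` is hypothetical, and the wording simply joins the existing message with the reviewer's suggested text; the message that was actually merged may differ:

```python
import warnings


def warn_cells_moved(num_of_cells: int) -> None:
    # Hypothetical helper: tells the user what happened and how to opt out.
    warnings.warn(
        f"{num_of_cells} cells moved from training set to validation set "
        "to avoid error during training. "
        "Set train_size to a fixed size (e.g. train_size=0.9) to avoid this.",
        UserWarning,
        stacklevel=2,
    )


warn_cells_moved(2)
```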
f"Last batch will have a small size of {num_of_cells} "
f"samples. Consider changing settings.batch_size or batch_size in model.train "
f"from currently {batch_size} to avoid errors during model training, "
f"or use drop_last parameter if there is 1 cell left",
this one is wrong
What's wrong?
f"samples. Consider changing settings.batch_size or batch_size in model.train "
f"from currently {settings.batch_size} to avoid errors during model training "
f"or change the given external indices accordingly or use drop_last parameter if "
f"there is 1 cell left",
This one too.
This is confusing. See comment above - line 179 should be gone.
yes
@@ -132,6 +164,20 @@ def validate_data_split_with_external_indexing(
    n_train = len(external_indexing[0])
    n_val = len(external_indexing[1])

    if batch_size is not None:
        num_of_cells = n_train % batch_size
        if (num_of_cells < 3 and num_of_cells > 0) and not (
Update drop_last here.
… to validation if possible
If train_size is None and the last training batch would contain 1 or 2 samples, we adaptively move those samples from the training set to the validation set, if possible. If train_size is set by the user, we do not fix this and instead let the user change their train_size or selected indices, or use the drop_last option.
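The behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: the function name `adapt_last_batch` and its signature are hypothetical, the warning texts are shortened, the check uses the form agreed on in review (`drop_last is False`), and the sketch assumes that cells added to the validation count are also subtracted from the training count.

```python
import warnings


def adapt_last_batch(n_train, n_val, batch_size, drop_last=False, train_size_is_none=True):
    """Return (n_train, n_val) after handling a last training batch of 1-2 cells."""
    if batch_size is None:
        return n_train, n_val

    num_of_cells = n_train % batch_size  # size of the last, incomplete batch
    if not (0 < num_of_cells < 3) or drop_last:
        # Either the last batch is large enough, or it will be dropped anyway.
        return n_train, n_val

    if train_size_is_none and n_val > 0:
        # train_size was left as None and a validation set exists: move the leftover cells.
        warnings.warn(f"{num_of_cells} cells moved from training set to validation set")
        return n_train - num_of_cells, n_val + num_of_cells

    # The user fixed train_size (or there is no validation set): only warn.
    warnings.warn(f"Last batch will have a small size of {num_of_cells} samples")
    return n_train, n_val


# 898 % 128 == 2, so two cells are moved to the validation set.
print(adapt_last_batch(898, 102, batch_size=128))  # (896, 104)
```

With `train_size_is_none=False` the same call returns the counts unchanged and only emits the small-batch warning, which matches the intent of leaving the fix to the user.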
close #3035