Passed the scheduling argument through the `*_generator` function. #7236

Conversation
Previously, the scheduling parameter was not passed to the OrderedEnqueuer, even though it is needed during training. Since the scheduling parameter is currently a binary decision, I've also altered the value to a boolean.
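For illustration, here is a minimal usage sketch under the API as it stands in this revision (the toy `Sequence` and model below are assumptions made for the example, and the `ordered` kwarg is renamed to `shuffle` later in this PR):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils.data_utils import Sequence


class ToySequence(Sequence):
    """Toy Sequence: 10 batches of random data, indexed by batch number."""

    def __init__(self, batches=10, batch_size=4):
        self.x = np.random.rand(batches * batch_size, 8)
        self.y = np.random.rand(batches * batch_size, 1)
        self.batch_size = batch_size

    def __len__(self):
        return len(self.x) // self.batch_size

    def __getitem__(self, idx):
        s = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        return self.x[s], self.y[s]

    def on_epoch_end(self):
        pass  # no-op; this hook is added as an abstract method later in the PR


model = Sequential([Dense(1, input_shape=(8,))])
model.compile(optimizer='sgd', loss='mse')

seq = ToySequence()
# With this change the flag reaches the OrderedEnqueuer, so batches are drawn
# in shuffled order when ordered=False (sequential order when True).
model.fit_generator(seq, steps_per_epoch=len(seq), epochs=2,
                    use_multiprocessing=False, ordered=False)
```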
keras/engine/training.py
Outdated
@@ -1932,7 +1940,8 @@ def evaluate_generator(self, generator, steps,
         try:
             if is_sequence:
                 enqueuer = OrderedEnqueuer(generator,
-                                           use_multiprocessing=use_multiprocessing)
+                                           use_multiprocessing=use_multiprocessing,
+                                           ordered=ordered)
Does it make sense? `evaluate_generator(..., ordered=True) == evaluate_generator(..., ordered=False)`
Same as for predict: to keep the kwargs consistent.
keras/engine/training.py
Outdated
@@ -2015,6 +2025,8 @@ def predict_generator(self, generator, steps,
             non picklable arguments to the generator
             as they can't be passed
             easily to children processes.
+        ordered: Sequential querying of data if `True`,
+            random otherwise.
What's the point of shuffling for predict?
That was mostly to keep kwargs consistent.
keras/utils/data_utils.py
Outdated
         self.sequence = sequence
         self.use_multiprocessing = use_multiprocessing
-        self.scheduling = scheduling
+        self.ordered = ordered
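Roughly, the flag then only has to decide whether the batch indices are shuffled before being submitted to the workers. A simplified standalone sketch of that decision (not the enqueuer's actual run loop):

```python
import random


def batch_index_order(sequence_len, ordered=True):
    """Return the order in which batch indices would be requested.

    `ordered` replaces the old integer `scheduling` flag: True keeps the
    Sequence order, False shuffles the indices once per pass.
    """
    indices = list(range(sequence_len))
    if not ordered:
        random.shuffle(indices)
    return indices


print(batch_index_order(5))                 # [0, 1, 2, 3, 4]
print(batch_index_order(5, ordered=False))  # e.g. [3, 0, 4, 1, 2]
```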
Should we test that ordered works? Probably.
What do you mean, test that it works?
Just a test that would validate that if `ordered=False`, there is some shuffling. Nothing fancy, but it could prevent some bugs in the future.
Sure, I'll add it in, good spot.
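A minimal version of that test might look like the sketch below (the `DummySequence` helper and the exact enqueuer call are assumptions based on the diff; the kwarg is still named `ordered` at this point in the PR):

```python
import numpy as np
from keras.utils.data_utils import Sequence, OrderedEnqueuer


class DummySequence(Sequence):
    """Each 'batch' is just its own index, so the retrieval order is visible."""

    def __len__(self):
        return 100

    def __getitem__(self, idx):
        return np.array([idx])

    def on_epoch_end(self):
        pass


def test_ordered_enqueuer():
    for ordered, expect_sorted in [(True, True), (False, False)]:
        enqueuer = OrderedEnqueuer(DummySequence(),
                                   use_multiprocessing=False,
                                   ordered=ordered)
        enqueuer.start(workers=2, max_queue_size=10)
        gen = enqueuer.get()
        seen = [int(next(gen)[0]) for _ in range(100)]
        enqueuer.stop()
        # ordered=True must preserve the index order; ordered=False should
        # shuffle it (with 100 indices a sorted result is vanishingly unlikely).
        assert (seen == sorted(seen)) == expect_sorted
```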
keras/engine/training.py
Outdated
@@ -1680,6 +1681,8 @@ def fit_generator(self, generator,
             non picklable arguments to the generator
             as they can't be passed
             easily to children processes.
+        ordered: Sequential querying of data if `True`,
+            random otherwise.
Only for `Sequence`; no effect for generators.
We already have a …
Pretty much, but for `Sequence`. So every epoch the order of the batches will be shuffled.
1 - if this only affects …
So a property in `Sequence` that the Enqueuer can poke? My only concern about that would be that it's the Enqueuer's responsibility to know how to iterate over the `Sequence`.
1 - It had to be passed through the `*_generator` methods, since the OrderedEnqueuer isn't accessible elsewhere, unless we make users pass the enqueuer into these methods? The plus is that it would be even more flexible; the downside is more work for the user. 2 - I'll alter it when I'm back from vacation.
@fchollet It keeps timing out on this test: … However, I haven't touched anything related to it. How can I fix it?
@Dref360 Thanks for the suggestion!
keras/engine/training.py
Outdated
@@ -1680,6 +1681,9 @@ def fit_generator(self, generator,
             non picklable arguments to the generator
             as they can't be passed
             easily to children processes.
+        shuffle: whether to shuffle the data at the beginning of each
+            epoch. Only used with instances of Sequence (
+            keras.utils.Sequence).
Use code markers around code keywords (`)
keras/utils/data_utils.py
Outdated
@@ -351,6 +351,12 @@ def __len__(self):
         """
         raise NotImplementedError

+    @abstractmethod
+    def on_epoch_end(self):
+        """A function which is called at the end of the epoch.
"Method called at the end of every epoch."
The addition of this method appears unrelated to the `shuffle` arg?
It is and it isn't. The `shuffle` arg relates to the shuffling of the batch indices in the sequences; however, you may also wish to implement file-path shuffling in your `Sequence` subclass. This method allows that, by placing the shuffle operation inside `on_epoch_end`. The path shuffling would ensure that the files inside the batches differ as well as the batch ordering.
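For example, a hypothetical file-backed `Sequence` could do both (the path list and the `_load` stub below are placeholders for illustration, not part of this PR):

```python
import random

import numpy as np
from keras.utils.data_utils import Sequence


def _load(path):
    """Placeholder for whatever actually reads and preprocesses one sample."""
    return np.zeros(8)


class FileSequence(Sequence):
    """Hypothetical Sequence backed by a list of file paths."""

    def __init__(self, paths, batch_size=32):
        self.paths = list(paths)
        self.batch_size = batch_size

    def __len__(self):
        return int(np.ceil(len(self.paths) / float(self.batch_size)))

    def __getitem__(self, idx):
        batch = self.paths[idx * self.batch_size:(idx + 1) * self.batch_size]
        return np.array([_load(p) for p in batch])

    def on_epoch_end(self):
        # Shuffle the paths themselves, so the *contents* of each batch change
        # every epoch, on top of the batch-order shuffling done by `shuffle`.
        random.shuffle(self.paths)
```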
* commit '84ceb94055b831c486dbf4955fdf1ba0f63320d1': (42 commits)
  Fix conv reccurent test
  Style fix in conv recurrent tests.
  Support return_state parameter in ConvRecurrent2D (keras-team#7407)
  Small simplification in ResNet50 architecture
  Update FAQ with info about custom object loading.
  add example for passing in custom objects in load_model (keras-team#7420)
  Update applications.md (keras-team#7428)
  Cast constants in optimizer as floatx.
  Fix stop_gradient inconsistent API (keras-team#7416)
  Simplify static shape management in TF backend.
  Fixed warning showing up when channel axis is 1 (keras-team#7392)
  Throw exception in LSTM layer if timesteps=1 and unroll=True (keras-team#7387)
  Style fix
  Passed the scheduling argument through the `*_generator` function. (keras-team#7236)
  Fix typos. (keras-team#7374)
  Fix ImageDataGenerator.standardize to support batches (keras-team#7360)
  Fix learning phase info being left out in multi-input models (keras-team#7135)
  Fix PEP8
  Fix deserialization bug with layer sharing at heterogenous depths
  Bug fix: Support multiple outputs in Lambda layer (keras-team#7222)
  ...
Previously, the scheduling parameter was not passed to the OrderedEnqueuer, even though it is needed during training.
Since the scheduling parameter is currently a binary decision, I've also altered the value to a boolean and changed the kwarg to be more descriptive.
It also seems we may need a shuffle function inside the `Sequence` class. Currently, only the input is shuffled; whilst this ensures some randomness, it could be improved on via a function call.
Also, @fchollet, do you remember why you passed different seeds in the generators? This is also missing from the current code; however, it is trivial to implement.
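For reference, the usual motivation for distinct seeds is that forked worker processes inherit the parent's RNG state and would otherwise all draw identical "random" numbers. A generic sketch of per-worker seeding (the worker function and launch loop below are assumptions, not this PR's code):

```python
import multiprocessing
import random

import numpy as np


def _worker(task_queue, base_seed, worker_id):
    # Reseed each forked worker differently; otherwise every child starts
    # from the same inherited RNG state and produces the same sequence.
    np.random.seed(base_seed + worker_id)
    random.seed(base_seed + worker_id)
    for task in iter(task_queue.get, None):
        pass  # fetch / preprocess the batch for `task` here


if __name__ == '__main__':
    queue = multiprocessing.Queue()
    workers = [multiprocessing.Process(target=_worker, args=(queue, 1234, i))
               for i in range(4)]
    for w in workers:
        w.start()
    for _ in workers:
        queue.put(None)  # signal shutdown
    for w in workers:
        w.join()
```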