
Passed the scheduling argument through the *_generator function. #7236

Merged
merged 13 commits into from
Jul 19, 2017

Conversation

@joeyearsley (Contributor) commented Jul 4, 2017

Previously, the scheduling parameter was not passed to the OrderedEnqueuer, which was needed in training.
Since the scheduling parameter is currently a binary decision, I've also altered the value to a boolean and changed the kwarg to be more descriptive.

It also seems we may need a shuffle function inside the `Sequence` call. Currently only the input is shuffled; while this ensures some randomness, it could be improved via a function call.

Also, @fchollet, do you remember why you passed different seeds in the generators? This is also missing from the current code; however, it is trivial to implement.
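To make the "shuffle function inside the Sequence" idea concrete, here is a minimal sketch (not the actual Keras class; all names are hypothetical) of a `Sequence`-style dataset whose sample indices are reshuffled at the end of each epoch via an `on_epoch_end()` hook, independently of any batch-order shuffling done by the enqueuer:

```python
import numpy as np

# Hypothetical sketch of a Sequence-style dataset: on_epoch_end()
# reshuffles the sample indices, so the contents of each batch change
# between epochs in addition to any batch-order shuffling.
class ShufflingSequence:
    def __init__(self, x, y, batch_size, seed=0):
        self.x, self.y = np.asarray(x), np.asarray(y)
        self.batch_size = batch_size
        self.rng = np.random.RandomState(seed)
        self.indices = np.arange(len(self.x))

    def __len__(self):
        # Number of batches per epoch.
        return int(np.ceil(len(self.x) / self.batch_size))

    def __getitem__(self, idx):
        # Slice the (possibly shuffled) index array, not the raw data.
        batch = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size]
        return self.x[batch], self.y[batch]

    def on_epoch_end(self):
        # Shuffle sample indices so the samples *inside* each batch differ
        # next epoch.
        self.rng.shuffle(self.indices)
```

The key point is that `__getitem__` goes through the index array, so a single shuffle call in `on_epoch_end` changes batch composition without touching the data itself.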

Previously, the scheduling parameter was not passed to the OrderedEnqueuer, which was needed in training.
Since the scheduling parameter is currently a binary decision, I've also altered the value to a boolean.
@@ -1932,7 +1940,8 @@ def evaluate_generator(self, generator, steps,
         try:
             if is_sequence:
                 enqueuer = OrderedEnqueuer(generator,
-                                           use_multiprocessing=use_multiprocessing)
+                                           use_multiprocessing=use_multiprocessing,
+                                           ordered=ordered)
Reviewer (Contributor):

Does it make sense that `evaluate_generator(..., ordered=True)` == `evaluate_generator(..., ordered=False)`?

joeyearsley (Author):

Same as in predict, to keep the kwargs consistent.

@@ -2015,6 +2025,8 @@ def predict_generator(self, generator, steps,
                 non picklable arguments to the generator
                 as they can't be passed
                 easily to children processes.
+            ordered: Sequential querying of data if `True`,
+                random otherwise.
Reviewer (Contributor):

What's the point of shuffling for predict?

joeyearsley (Author):

That was mostly to keep kwargs consistent.

         self.sequence = sequence
         self.use_multiprocessing = use_multiprocessing
-        self.scheduling = scheduling
+        self.ordered = ordered
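For illustration, here is a hypothetical sketch (not the actual Keras implementation; names are assumptions) of how an enqueuer-style worker loop might consume such an `ordered` flag when deciding which batch indices to query each epoch:

```python
import random

def batch_order(num_batches, ordered=True, rng=None):
    """Return the order in which batch indices are queried for one epoch.

    ordered=True  -> sequential 0..n-1 (deterministic evaluation/prediction)
    ordered=False -> a fresh random permutation (useful during training)
    """
    order = list(range(num_batches))
    if not ordered:
        # rng lets callers seed the permutation for reproducibility.
        (rng or random).shuffle(order)
    return order
```

Either way the epoch still visits every batch exactly once; only the traversal order differs.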
@Dref360 (Contributor) commented Jul 5, 2017:

Should we test that ordered works? Probably.

joeyearsley (Author):

What do you mean test it worked?

Reviewer (Contributor):

Just a test that would validate that if ordered=False, there is some shuffling. Nothing fancy, but it could prevent some bugs in the future.
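A sketch of the kind of unit test suggested here (self-contained and hypothetical, with a stand-in `batch_order` helper rather than the real enqueuer):

```python
import random

def batch_order(n, shuffle, rng):
    # Stand-in for the enqueuer's index-ordering logic.
    order = list(range(n))
    if shuffle:
        rng.shuffle(order)
    return order

def test_shuffle_changes_order():
    rng = random.Random(123)
    orders = [batch_order(50, shuffle=True, rng=rng) for _ in range(5)]
    # Each epoch must still visit every batch exactly once...
    assert all(sorted(o) == list(range(50)) for o in orders)
    # ...but at least one epoch should deviate from sequential order
    # (with 50 batches, 5 fully sequential draws is astronomically unlikely).
    assert any(o != list(range(50)) for o in orders)
```

The permutation check guards against dropped or duplicated batches, while the inequality check catches a shuffle flag that is silently ignored.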

joeyearsley (Author):

Sure, I'll add it in, good spot.

@@ -1680,6 +1681,8 @@ def fit_generator(self, generator,
                 non picklable arguments to the generator
                 as they can't be passed
                 easily to children processes.
+            ordered: Sequential querying of data if `True`,
+                random otherwise.
Reviewer (Contributor):

Only for `Sequence`; no effect for generators.

@fchollet (Collaborator):

We already have a `shuffle` argument in `fit`. Is this the same thing?

@Dref360 (Contributor) commented Jul 10, 2017:

Pretty much, but for `Sequence`: every epoch, the order of the batches will be shuffled.

@fchollet (Collaborator):

1 - If this only affects `Sequence`, shouldn't it be part of its API, and not part of the `*_generator` API?
2 - The API should be consistent across the codebase, so we should use `shuffle`, not `ordered`.

@Dref360 (Contributor) commented Jul 10, 2017:

So a property on `Sequence` that the Enqueuer can poke?

My only concern is that it is the Enqueuer's responsibility to know how to iterate over the `Sequence`.

@joeyearsley (Author):

1 - It had to pass through the `*_generator` methods, since the OrderedEnqueuer isn't accessible elsewhere, unless we make users pass the enqueuer into these methods. The plus is that it would be even more flexible; the downside is more work for the user.

2 - I'll alter it when I'm back from vacation.

@joeyearsley (Author):

@fchollet It keeps timing out on this test: tests/keras/wrappers/scikit_learn_test.py::test_regression_class_build_fn

However, I haven't touched anything related to it. How can I fix it?

@joeyearsley joeyearsley reopened this Jul 18, 2017
@joeyearsley (Author):

@Dref360 Thanks for the suggestion!

@@ -1680,6 +1681,9 @@ def fit_generator(self, generator,
                 non picklable arguments to the generator
                 as they can't be passed
                 easily to children processes.
+            shuffle: whether to shuffle the data at the beginning of each
+                epoch. Only used with instances of Sequence (
fchollet (Collaborator):

Use code markers around code keywords (`)

@@ -351,6 +351,12 @@ def __len__(self):
         """
         raise NotImplementedError

+    @abstractmethod
+    def on_epoch_end(self):
+        """A function which is called at the end of the epoch.
fchollet (Collaborator):

"Method called at the end of every epoch."

@@ -351,6 +351,12 @@ def __len__(self):
         """
         raise NotImplementedError

+    @abstractmethod
+    def on_epoch_end(self):
fchollet (Collaborator):

The addition of this method appears unrelated to the shuffle arg?

joeyearsley (Author):

It is and it isn't.
The shuffle arg relates to the shuffling of the batch indices in the sequence; however, you may also wish to implement file-path shuffling in your `Sequence` subclass. Placing that shuffle operation in the `on_epoch_end` method allows this.
Path shuffling would ensure that the files inside the batches differ between epochs, as well as the batch ordering.
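The file-path shuffling described here could look roughly like this hypothetical sketch (not Keras code; class and file names are invented for illustration):

```python
import random

# Hypothetical Sequence-style loader over file paths: on_epoch_end()
# reshuffles the paths themselves, so batch composition changes every
# epoch in addition to any batch-order shuffling done by the enqueuer.
class FilePathSequence:
    def __init__(self, paths, batch_size, seed=0):
        self.paths = list(paths)
        self.batch_size = batch_size
        self.rng = random.Random(seed)

    def __len__(self):
        # Number of batches per epoch (last batch may be smaller).
        return (len(self.paths) + self.batch_size - 1) // self.batch_size

    def __getitem__(self, idx):
        # A real implementation would load and decode these files here.
        return self.paths[idx * self.batch_size:(idx + 1) * self.batch_size]

    def on_epoch_end(self):
        # Reshuffle the path list, changing which files share a batch.
        self.rng.shuffle(self.paths)
```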

@fchollet fchollet merged commit 78f26df into keras-team:master Jul 19, 2017
ahundt added a commit to ahundt/keras that referenced this pull request Jul 26, 2017
* commit '84ceb94055b831c486dbf4955fdf1ba0f63320d1': (42 commits)
  Fix conv reccurent test
  Style fix in conv recurrent tests.
  Support return_state parameter in ConvRecurrent2D (keras-team#7407)
  Small simplification in ResNet50 architecture
  Update FAQ with info about custom object loading.
  add example for passing in custom objects in load_model (keras-team#7420)
  Update applications.md (keras-team#7428)
  Cast constants in optimizer as floatx.
  Fix stop_gradient inconsistent API (keras-team#7416)
  Simplify static shape management in TF backend.
  Fixed warning showing up when channel axis is 1 (keras-team#7392)
  Throw exception in LSTM layer if timesteps=1 and unroll=True (keras-team#7387)
  Style fix
  Passed the scheduling argument through the `*_generator` function. (keras-team#7236)
  Fix typos. (keras-team#7374)
  Fix ImageDataGenerator.standardize to support batches (keras-team#7360)
  Fix learning phase info being left out in multi-input models (keras-team#7135)
  Fix PEP8
  Fix deserialization bug with layer sharing at heterogenous depths
  Bug fix: Support multiple outputs in Lambda layer (keras-team#7222)
  ...