This repository has been archived by the owner on Jul 1, 2024. It is now read-only.

Keras MXNet 2.1.6 Release #101

Merged
merged 51 commits on May 18, 2018
Commits (51)
b133264
Improve tests by designating dtype of sample data (#9834)
taehoonlee Apr 3, 2018
cfaa946
Document that "same" is inconsistent across backends with strides!=1 …
ozabluda Apr 3, 2018
c08ef61
#9642 Add kwarg and documentation for dilation_rate to SeparableConvs…
vkk800 Apr 3, 2018
2c8d1d0
fit/evaluate_generator supporting native tensors (#9816)
ghostplant Apr 4, 2018
736b7a7
Add h5py to dependencies
fchollet Apr 6, 2018
c5a3dde
Fixed typo. (#9866)
gabrieldemarmiesse Apr 7, 2018
be112e1
Fix image_ocr.py example ValueError (#9869)
olegantonyan Apr 8, 2018
af804d0
Fixed the NASNet issue. (#9865)
gabrieldemarmiesse Apr 8, 2018
e73199d
Removed generate dropout ones from recurrent. (#9892)
gabrieldemarmiesse Apr 11, 2018
c633281
Fix `in_test_phase` of CNTK and Add its tests (#9902)
taehoonlee Apr 11, 2018
12cf652
Fix dtype designation for `variable` of CNTK and Add its tests (#9903)
taehoonlee Apr 11, 2018
946a3b3
import `pydot`, improve error messages about `pydot` and GraphViz, bu…
johnyf Apr 11, 2018
a053416
Fix documentation of flow_from_directory() (#9910)
vkk800 Apr 11, 2018
10533e0
ModelCheckpoint: print previous best (#9911)
ozabluda Apr 12, 2018
388a9ab
multi_gpu_model supporting legacy/fullCPU/fullGPU (#9638)
ghostplant Apr 13, 2018
5caec33
Fix `batch_dot` of Theano when `axes=0` (#9920)
taehoonlee Apr 13, 2018
0bcd906
Fix `batch_dot` of CNTK when `axes=None` (#9921)
taehoonlee Apr 13, 2018
a341c01
Fix `batch_dot` of TensorFlow when `axes=None` (#9922)
taehoonlee Apr 13, 2018
adc321b
Fix stateful metrics when passing dict to compile (#9894)
nisargjhaveri Apr 13, 2018
083a41c
Added note to manually install h5py where needed (#9830)
FirefoxMetzger Apr 14, 2018
e246250
Add support for `constants` in Bidirectional wrapper (#9260)
nisargjhaveri Apr 14, 2018
6171b36
Updated for TF 1.7 (#9937)
jsaporta Apr 14, 2018
5422fdd
fix TimeSeriesGenerator glitch (#9899)
Apr 14, 2018
ce13af5
Added an error message for undefined shape on NASNet. (#9891)
gabrieldemarmiesse Apr 14, 2018
e609dce
Fix PEP8
fchollet Apr 14, 2018
3b44451
Allow shift_range to be 1-D array-like or int (#8869)
ozabluda Apr 15, 2018
919769c
Exclude multi-gpu utils when reporting coverages (#9942)
taehoonlee Apr 15, 2018
3578599
Make conv_invalid_use and pooling_invalid_use efficient (#9944)
taehoonlee Apr 16, 2018
a8ca67f
Chenta/cntk bn (#9952)
souptc Apr 17, 2018
d673afd
Immigrate reference operations to a separate module (#9948)
taehoonlee Apr 17, 2018
19ce757
Add MXNet Backend (#59)
sandeep-krishnamurthy Apr 20, 2018
5ac4675
rebase to latest Keras - April 20, 2018, fix bug and unit tests
roywei Apr 20, 2018
58e5a6e
Added detailed RNN results (#73)
karan6181 May 3, 2018
293d4c8
fix keras examples (#72)
roywei May 4, 2018
a5aa32b
Added Detailed RNN results (#77)
karan6181 May 4, 2018
987c934
Added API to extract metrics from a test and also added epoch paramet…
karan6181 May 13, 2018
21916d2
Add mxnet backend tutorial documents (#76)
roywei May 13, 2018
48364f4
Support exporting model as MXNet model (sym, params). (#80)
sandeep-krishnamurthy May 16, 2018
7f71f3c
add multi gpu model example (#85)
roywei May 16, 2018
e974689
Add additional logging for cnn benchmarks (#89)
roywei May 16, 2018
19c93e2
Log RNN benchmark results (#90)
sandeep-krishnamurthy May 16, 2018
81dd9e5
Merge branch 'master' into dev
roywei May 16, 2018
1ea1612
fix pytest errors (#93)
roywei May 17, 2018
b8836f7
Cherry pick keras-team/keras 2.1.6 missing 3 commits into awslabs/ker…
sandeep-krishnamurthy May 17, 2018
14620a4
update multi_gpu api in benchmark scripts (#95)
roywei May 17, 2018
9fa72de
Revamp keras-mxnet docs (#82)
sandeep-krishnamurthy May 17, 2018
de0a1e3
Update CNN benchmark result (#97)
roywei May 18, 2018
912e105
Update RNN benchmark results (#98)
sandeep-krishnamurthy May 18, 2018
45f7212
remove checking data format (#102)
roywei May 18, 2018
1948671
Merge branch 'master' into dev
roywei May 18, 2018
a15441d
update imagenet result (#103)
roywei May 18, 2018
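Among the commits above, #80 adds support for exporting a trained model in native MXNet format (symbol and params files). Below is a minimal sketch of how that export is meant to be used, based on the `save_mxnet_model` API name given in the keras-mxnet docs added in this PR; treat the exact signature and return values as assumptions.

```
import keras
from keras.models import save_mxnet_model  # keras-mxnet addition (#80); name per the docs

# A tiny illustrative model; any compiled Keras model should work.
model = keras.models.Sequential()
model.add(keras.layers.Dense(16, activation='relu', input_shape=(8,)))
model.add(keras.layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')

# Writes my_model-symbol.json and my_model-0000.params, loadable from
# native MXNet via mx.model.load_checkpoint('my_model', 0).
data_names, data_shapes = save_mxnet_model(model=model, prefix='my_model', epoch=0)
```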
44 changes: 22 additions & 22 deletions benchmark/README.md
@@ -62,28 +62,28 @@ NOTE:

| Instance Type | GPUs | Batch Size | Keras-MXNet (img/sec) | Keras-TensorFlow (img/sec) |
|---|---|---|---|---|
-| P3.8X Large | 1 | 32 | 165 | 50 |
-| P3.8X Large | 4 | 128 | 538 | 162 |
-| P3.16X Large | 8 | 256 | 728 | 212 |
+| P3.8X Large | 1 | 32 | 135 | 52 |
+| P3.8X Large | 4 | 128 | 536 | 162 |
+| P3.16X Large | 8 | 256 | 722 | 211 |

#### ResNet50-Synthetic Data

| Instance Type | GPUs | Batch Size | Keras-MXNet (img/sec) | Keras-TensorFlow (img/sec) |
|---|---|---|---|---|
-| C5.18X Large | 0 | 32 | 9 | 4 |
-| P3.8X Large | 1 | 32 | 229 | 164 |
-| P3.8X Large | 4 | 128 | 728 | 409 |
-| P3.16X Large | 8 | 256 | 963 | 164 |
+| C5.18X Large | 0 | 32 | 13 | 4 |
+| P3.8X Large | 1 | 32 | 194 | 184 |
+| P3.8X Large | 4 | 128 | 764 | 393 |
+| P3.16X Large | 8 | 256 | 1068 | 261 |


#### ResNet50-CIFAR10

| Instance Type | GPUs | Batch Size | Keras-MXNet (img/sec) | Keras-TensorFlow (img/sec) |
|---|---|---|---|---|
| C5.18X Large | 0 | 32 | 87 | 59 |
-| P3.8X Large | 1 | 32 | TBD | TBD |
-| P3.8X Large | 4 | 128 | 1792 | 1020 |
-| P3.16X Large | 8 | 256 | 1618 | 962 |
+| P3.8X Large | 1 | 32 | 831 | 509 |
+| P3.8X Large | 4 | 128 | 1783 | 699 |
+| P3.16X Large | 8 | 256 | 1680 | 435 |


You can see more benchmark experiments with different instance types, batch_size and other parameters in [detailed CNN
@@ -210,10 +210,10 @@ For MXNet backend benchmarks:

For TensorFlow backend benchmarks:
```
-$ sh run_tf_backend.sh cpu_config lstm_nietzsche False 20 # For CPU Benchmarks
-$ sh run_tf_backend.sh gpu_config lstm_nietzsche False 20 # For 1 GPU Benchmarks
-$ sh run_tf_backend.sh 4_gpu_config lstm_nietzsche False 20 # For 4 GPU Benchmarks
-$ sh run_tf_backend.sh 8_gpu_config lstm_nietzsche False 20 # For 8 GPU Benchmarks
+$ sh run_tf_backend.sh cpu_config lstm_nietzsche False 10 # For CPU Benchmarks
+$ sh run_tf_backend.sh gpu_config lstm_nietzsche False 10 # For 1 GPU Benchmarks
+$ sh run_tf_backend.sh 4_gpu_config lstm_nietzsche False 10 # For 4 GPU Benchmarks
+$ sh run_tf_backend.sh 8_gpu_config lstm_nietzsche False 10 # For 8 GPU Benchmarks
```

#### LSTM-WikiText2
@@ -230,10 +230,10 @@ For MXNet backend benchmarks:

For TensorFlow backend benchmarks:
```
-$ sh run_tf_backend.sh cpu_config lstm_wikitext2 False 20 # For CPU Benchmarks
-$ sh run_tf_backend.sh gpu_config lstm_wikitext2 False 20 # For 1 GPU Benchmarks
-$ sh run_tf_backend.sh 4_gpu_config lstm_wikitext2 False 20 # For 4 GPU Benchmarks
-$ sh run_tf_backend.sh 8_gpu_config lstm_wikitext2 False 20 # For 8 GPU Benchmarks
+$ sh run_tf_backend.sh cpu_config lstm_wikitext2 False 10 # For CPU Benchmarks
+$ sh run_tf_backend.sh gpu_config lstm_wikitext2 False 10 # For 1 GPU Benchmarks
+$ sh run_tf_backend.sh 4_gpu_config lstm_wikitext2 False 10 # For 4 GPU Benchmarks
+$ sh run_tf_backend.sh 8_gpu_config lstm_wikitext2 False 10 # For 8 GPU Benchmarks
```


@@ -251,10 +251,10 @@ For MXNet backend benchmarks:

For TensorFlow backend benchmarks:
```
-$ sh run_tf_backend.sh cpu_config lstm_synthetic False 20 # For CPU Benchmarks
-$ sh run_tf_backend.sh gpu_config lstm_synthetic False 20 # For 1 GPU Benchmarks
-$ sh run_tf_backend.sh 4_gpu_config lstm_synthetic False 20 # For 4 GPU Benchmarks
-$ sh run_tf_backend.sh 8_gpu_config lstm_synthetic False 20 # For 8 GPU Benchmarks
+$ sh run_tf_backend.sh cpu_config lstm_synthetic False 10 # For CPU Benchmarks
+$ sh run_tf_backend.sh gpu_config lstm_synthetic False 10 # For 1 GPU Benchmarks
+$ sh run_tf_backend.sh 4_gpu_config lstm_synthetic False 10 # For 4 GPU Benchmarks
+$ sh run_tf_backend.sh 8_gpu_config lstm_synthetic False 10 # For 8 GPU Benchmarks
```

## References
20 changes: 14 additions & 6 deletions benchmark/benchmark_result/RNN_result.md
@@ -9,16 +9,24 @@

Please see [RNN with Keras-MXNet document](../docs/mxnet_backend/using_rnn_with_mxnet_backend.md) for more details on
the poor CPU training performance and unsupported functionalities.


+### Configuration
+| | |
+| :--------------- | :----------------------------------------------------------- |
+| Keras | v2.1.6 |
+| TensorFlow | v1.8.0 |
+| MXNet | v1.2.0 |
+| CUDA | v9.0.176 |
+| cuDNN | v7.0.1 |

### LSTM-Nietzsche

| Instance Type | GPUs | Batch Size | Keras-MXNet (Time/Epoch), (GPU Mem) | Keras-TensorFlow (Time/Epoch), (GPU Mem) |
|---|---|---|---|---|
| C5.18X Large | 0 | 128 | 78 sec, N/A | 55 sec, N/A|
-| P3.8X Large | 1 | 128 | 52 sec, 792 MB | 51 sec, 15360 MB |
-| P3.8X Large | 4 | 128 | 47 sec, 770 MB | 87 sec, 15410 MB |
-| P3.16X Large | 8 | 128 | TBD | TBD |
+| P3.8X Large | 1 | 128 | 52 sec, 792 MB | 83 sec, 15360 MB |
+| P3.8X Large | 4 | 128 | 47 sec, 770 MB | 117 sec, 15410 MB |
+| P3.16X Large | 8 | 128 | 72 sec, 826 MB | 183 sec, 15408 MB |

### LSTM-WikiText2

@@ -27,7 +35,7 @@ Please see [RNN with Keras-MXNet document](../docs/mxnet_backend/using_rnn_with_
| C5.18X Large | 0 | 128 | 1345 sec, N/A | 875 sec, N/A |
| P3.8X Large | 1 | 128 | 868 sec, 772 MB | 817 sec, 15360 MB |
| P3.8X Large | 4 | 128 | 775 sec, 764 MB | 1468 sec, 15410 MB |
-| P3.16X Large | 8 | 128 | TBD | TBD |
+| P3.16X Large | 8 | 128 | 1214 sec, 826 MB | 3176 sec, 15410 MB |

### Synthetic Data

@@ -36,7 +44,7 @@ Please see [RNN with Keras-MXNet document](../docs/mxnet_backend/using_rnn_with_
| C5.18X Large | 0 | 128 | 24 sec, N/A | 14 sec, N/A|
| P3.8X Large | 1 | 128 | 13 sec, 792 MB | 12 sec, 15360 MB|
| P3.8X Large | 4 | 128 | 12 sec, 770 MB | 21 sec, 15410 MB |
-| P3.16X Large | 8 | 128 | TBD | TBD |
+| P3.16X Large | 8 | 128 | 19 sec, 826 MB | 49 sec, 15360 MB |


# Detailed RNN Benchmark Results
11 changes: 5 additions & 6 deletions benchmark/scripts/benchmark_resnet.py
@@ -86,8 +86,7 @@

# prepare logging
# file name: backend_data_format_dataset_model_batch_size_gpus.log
-log_file = K.backend() + '_' + K.image_data_format() + '_' + args.dataset + '_resnet_v' + args.version + \
-    '_' + args.layers + '_batch_size' + str(batch_size) + '_' + str(num_gpus) + 'gpus'
+log_file = K.backend() + '_' + K.image_data_format() + '_' + args.dataset + '_resnet_v' + args.version + '_' + args.layers + '_batch_size' + str(batch_size) + '_' + str(num_gpus) + 'gpus'  # nopep8
logFormatter = logging.Formatter('%(asctime)s [%(threadName)-12.12s] [%(levelname)-5.5s] %(message)s')
rootLogger = logging.getLogger()

@@ -300,9 +299,9 @@ def lr_schedule(epoch):
batch_time = 1000 * (end_time - start_time)
speed = batch_size * 1000.0 / batch_time if batch_time != 0 else 0
rootLogger.info('batch {}/{} loss: {} accuracy: {} '
-                'time: {}ms speed: {}'.format(int(current_index / batch_size),
-                int(nice_n / batch_size), loss, accuracy,
-                batch_time, speed))
+                'time: {}ms speed: {}'.format(int(current_index / batch_size),
+                                              int(nice_n / batch_size), loss, accuracy,
+                                              batch_time, speed))

rootLogger.info('finish epoch {}/{} total epoch time: {}ms'.format(i, epochs, total_time))

@@ -323,4 +322,4 @@ def lr_schedule(epoch):
# Score trained model.
scores = model.evaluate(x_test, y_test, verbose=1)
rootLogger.info('Test loss: %.4f' % scores[0])
-rootLogger.info('Test accuracy: %.4f'% scores[1])
+rootLogger.info('Test accuracy: %.4f' % scores[1])
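All of these benchmark scripts share the `backend_data_format_dataset_model_batch_size_gpus.log` naming scheme noted in their comments. A minimal sketch of the filename the concatenation above produces (values illustrative, not from a real run):

```
# Illustrative stand-ins for the keras.backend values and CLI arguments
# that benchmark_resnet.py reads at runtime.
backend, data_format = 'mxnet', 'channels_first'
dataset, version, layers = 'cifar10', '1', '56'
batch_size, num_gpus = 128, 4

# Mirrors the string concatenation in the diff above.
log_file = (backend + '_' + data_format + '_' + dataset + '_resnet_v' +
            version + '_' + layers + '_batch_size' + str(batch_size) +
            '_' + str(num_gpus) + 'gpus')
print(log_file)  # mxnet_channels_first_cifar10_resnet_v1_56_batch_size128_4gpus
```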
4 changes: 1 addition & 3 deletions benchmark/scripts/models/lstm_synthetic.py
@@ -44,9 +44,7 @@ def __init__(self):
def run_benchmark(self, gpus=0, inference=False, use_dataset_tensors=False, epochs=20):
# prepare logging
# file name: backend_data_format_dataset_model_batch_size_gpus.log
-log_file = keras.backend.backend() + '_' + keras.backend.image_data_format() + \
-    '_lstm_synthetic_batch_size_' + \
-    str(self.batch_size) + '_' + str(gpus) + 'gpus.log'
+log_file = keras.backend.backend() + '_' + keras.backend.image_data_format() + '_lstm_synthetic_batch_size_' + str(self.batch_size) + '_' + str(gpus) + 'gpus.log'  # nopep8
logging.basicConfig(level=logging.INFO, filename=log_file)

self.epochs = epochs
4 changes: 1 addition & 3 deletions benchmark/scripts/models/lstm_text_generation.py
@@ -49,9 +49,7 @@ def __init__(self, dataset_name=None):
def run_benchmark(self, gpus=0, inference=False, use_dataset_tensors=False, epochs=20):
# prepare logging
# file name: backend_data_format_dataset_model_batch_size_gpus.log
-log_file = keras.backend.backend() + '_' + keras.backend.image_data_format() + \
-    '_lstm_test_generation_' + self.dataset_name + '_batch_size_' + \
-    str(self.batch_size) + '_' + str(gpus) + 'gpus.log'
+log_file = keras.backend.backend() + '_' + keras.backend.image_data_format() + '_lstm_test_generation_' + self.dataset_name + '_batch_size_' + str(self.batch_size) + '_' + str(gpus) + 'gpus.log'  # nopep8
logging.basicConfig(level=logging.INFO, filename=log_file)

self.epochs = epochs
7 changes: 2 additions & 5 deletions benchmark/scripts/models/resnet50_benchmark.py
@@ -17,8 +17,6 @@
from keras import backend as K




def crossentropy_from_logits(y_true, y_pred):
return keras.backend.categorical_crossentropy(target=y_true,
output=y_pred,
@@ -39,12 +37,11 @@ def __init__(self):
def run_benchmark(self, gpus=0, inference=False, use_dataset_tensors=False, epochs=20):
self.epochs = epochs
if gpus > 1:
-self.batch_size = self.batch_size*gpus
+self.batch_size = self.batch_size * gpus

# prepare logging
# file name: backend_data_format_dataset_model_batch_size_gpus.log
-log_file = K.backend() + '_' + K.image_data_format() + '_synthetic_resnet50_batch_size_' + \
-    str(self.batch_size) + '_' + str(gpus) + 'gpus.log'
+log_file = K.backend() + '_' + K.image_data_format() + '_synthetic_resnet50_batch_size_' + str(self.batch_size) + '_' + str(gpus) + 'gpus.log'  # nopep8
logging.basicConfig(level=logging.INFO, filename=log_file)

print("Running model ", self.test_name)
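The `self.batch_size * gpus` scaling above keeps the per-GPU batch constant when a script runs data-parallel through `keras.utils.multi_gpu_model` (see commits #85 and #95 in the list above). A minimal sketch of that pattern, with an illustrative model and random data standing in for the benchmark's real inputs:

```
import numpy as np
import keras
from keras.utils import multi_gpu_model

gpus = 4
batch_size = 32 * gpus  # scale the total batch so each replica sees 32 samples

# A tiny illustrative model; the benchmarks use ResNet50 here.
model = keras.models.Sequential()
model.add(keras.layers.Dense(10, activation='softmax', input_shape=(100,)))

# multi_gpu_model replicates the model and splits each batch across GPUs.
parallel_model = multi_gpu_model(model, gpus=gpus)
parallel_model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

x_train = np.random.random((1024, 100))
y_train = keras.utils.to_categorical(np.random.randint(10, size=(1024, 1)), 10)
parallel_model.fit(x_train, y_train, batch_size=batch_size, epochs=2)
```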
2 changes: 1 addition & 1 deletion keras/__init__.py
@@ -23,4 +23,4 @@
from .models import Model
from .models import Sequential

-__version__ = '2.1.6'
\ No newline at end of file
+__version__ = '2.1.6'
4 changes: 2 additions & 2 deletions keras/backend/mxnet_backend.py
@@ -2593,13 +2593,13 @@ def rnn(step_function, inputs, initial_states,
'Ex: new_x_train = keras.preprocessing.sequence.pad_sequences(old_x_train, '
'maxlen=MAX_LEN_OF_INPUT_SAMPLE_TYPE_INT). '
'More Details - '
-'https://github.com/awslabs/keras-apache-mxnet/tree/master/docs/mxnet_backend/using_rnn_with_mxnet_backend.md') #nopep8
+'https://github.com/awslabs/keras-apache-mxnet/tree/master/docs/mxnet_backend/using_rnn_with_mxnet_backend.md') # nopep8

if not unroll and dshape[1] is not None:
warnings.warn('MXNet Backend: `unroll=False` is not supported yet in RNN. Since the input_shape is known, '
'setting `unroll=True` and continuing the execution.'
'More Details - '
-'https://github.com/awslabs/keras-apache-mxnet/tree/master/docs/mxnet_backend/using_rnn_with_mxnet_backend.md', stacklevel=2) #nopep8
+'https://github.com/awslabs/keras-apache-mxnet/tree/master/docs/mxnet_backend/using_rnn_with_mxnet_backend.md', stacklevel=2) # nopep8

# Split the inputs across time dimension and generate the list of inputs
# with shape `(samples, ...)` (no time dimension)
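The error message in this hunk tells users to pad variable-length input with `keras.preprocessing.sequence.pad_sequences` before training RNNs on the MXNet backend, since unrolling requires a static time dimension. A minimal sketch of that workaround (the data and `MAX_LEN` value are illustrative):

```
from keras.preprocessing.sequence import pad_sequences

# Variable-length integer sequences, e.g. tokenized text samples.
old_x_train = [[3, 14, 15], [9, 2], [6, 5, 3, 5, 8, 9]]

MAX_LEN = 6  # illustrative; use the longest expected sample length
# Pads (or truncates) every sample to MAX_LEN, making the time dimension
# static so the MXNet backend can unroll the RNN.
new_x_train = pad_sequences(old_x_train, maxlen=MAX_LEN)
print(new_x_train.shape)  # (3, 6)
```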
47 changes: 20 additions & 27 deletions keras/utils/np_utils.py
@@ -84,34 +84,27 @@ def to_channels_first_helper(np_data):
if data_format not in {'channels_first', 'channels_last'}:
raise ValueError('Unknown data_format ' + str(data_format))

-if data_format == 'channels_first':
-    shape = np_data.shape
-    if len(shape) == 5:
-        np_data = np.transpose(np_data, (0, 4, 1, 2, 3))
-    elif len(shape) == 4:
-        np_data = np.transpose(np_data, (0, 3, 1, 2))
-    elif len(shape) == 3:
-        raise ValueError(
-            'Your data is either a textual data of shape '
-            '`(num_sample, step, feature)` or a grey scale image of '
-            'shape `(num_sample, rows, cols)`. '
-            'Case 1: If your data is time-series or a textual data'
-            '(probably you are using Conv1D), then there is no need of '
-            'channel conversion.'
-            'Case 2: If your data is image(probably you are using '
-            'Conv2D), then you need to reshape the tension dimensions '
-            'as follows:'
-            '`shape = x_input.shape`'
-            '`x_input = x_input.reshape(shape[0], 1, shape[1], shape[2])`'
-            'Note: Do not use `to_channels_fir()` in above cases.')
-    else:
-        raise ValueError('Your input dimension tensor is incorrect.')
+shape = np_data.shape
+if len(shape) == 5:
+    np_data = np.transpose(np_data, (0, 4, 1, 2, 3))
+elif len(shape) == 4:
+    np_data = np.transpose(np_data, (0, 3, 1, 2))
+elif len(shape) == 3:
+    raise ValueError(
+        'Your data is either a textual data of shape '
+        '`(num_sample, step, feature)` or a grey scale image of '
+        'shape `(num_sample, rows, cols)`. '
+        'Case 1: If your data is time-series or a textual data'
+        '(probably you are using Conv1D), then there is no need of '
+        'channel conversion.'
+        'Case 2: If your data is image(probably you are using '
+        'Conv2D), then you need to reshape the tension dimensions '
+        'as follows:'
+        '`shape = x_input.shape`'
+        '`x_input = x_input.reshape(shape[0], 1, shape[1], shape[2])`'
+        'Note: Do not use `to_channels_fir()` in above cases.')
else:
-    warnings.warn(
-        '`to_channels_first()` method transform the data from'
-        '`channels_last` format to `channels_first` format. Please '
-        'check the `image_data_format` and `backend` in `keras.json` '
-        'file.', stacklevel=2)
+    raise ValueError('Your input dimension tensor is incorrect.')
return np_data

assert data is not None, "A Numpy data should not be None"
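For reference, the channel move this helper performs is a plain `np.transpose`. A minimal sketch on illustrative shapes:

```
import numpy as np

# Batch of 4 RGB images in channels_last layout: (samples, rows, cols, channels).
x = np.zeros((4, 32, 32, 3))
x_first = np.transpose(x, (0, 3, 1, 2))  # the 4-D branch above
print(x_first.shape)  # (4, 3, 32, 32)

# The 3-D case is ambiguous (text vs. grayscale images), so the helper
# raises instead of guessing; grayscale data can be reshaped manually:
g = np.zeros((4, 28, 28))
g_first = g.reshape(g.shape[0], 1, g.shape[1], g.shape[2])  # (4, 1, 28, 28)
```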