
BUG: Setting initial state on ConvLSTM2D with input variables #9761

Closed
mharradon opened this issue Mar 27, 2018 · 11 comments

@mharradon

mharradon commented Mar 27, 2018

Should this work, or is this an unsupported use case?

import keras
from keras import layers as L

x = L.Input((4,6,6,3))
init_state = L.Input((6,6,3))
y = L.ConvLSTM2D(filters=3,kernel_size=(3,3),padding='same',return_sequences=True)(x,initial_state=[init_state,init_state])
Using TensorFlow backend.
Traceback (most recent call last):
  File "ConvLSTM2DTest.py", line 10, in <module>
    y = L.ConvLSTM2D(filters=3,kernel_size=(3,3),padding='same',return_sequences=True)(x,initial_state=[init_state,init_state])
  File ".../keras/layers/convolutional_recurrent.py", line 319, in __call__
    output = super(ConvRNN2D, self).__call__(full_input, **kwargs)
  File ".../keras/layers/recurrent.py", line 496, in __call__
    inputs, initial_state, constants)
  File ".../keras/layers/recurrent.py", line 655, in _standardize_args
    assert initial_state is None and constants is None
AssertionError

Thanks!

@mizima

mizima commented Apr 13, 2018

Hello,
I think the issue is related to #7612. The discussion there resulted in an API change. Now (2.1.5), the initial_state is taken from the inputs according to
initial_state = inputs[1:] # part of the _standardize_args implementation
and the original initial_state argument is checked to be None (the AssertionError you see). Therefore, your code should have been written as:

y = L.ConvLSTM2D(filters=3, kernel_size=(3,3), padding='same', return_sequences=True)([x, init_state, init_state])

BUT, it seems that _standardize_args is called twice for ConvLSTM2D (once by the ConvRNN2D implementation, and then again via super(ConvRNN2D, self).__call__, which propagates to the RNN implementation). Therefore, the assertion fires even if you fix the code on the client side. This issue does not occur for the (dense) LSTM discussed in #7612.

I think the solution is to remove the (redundant?)

inputs, initial_state, constants = self._standardize_args(
            inputs, initial_state, constants)

from ConvRNN2D.__call__. It fixes the AssertionError for me.

Can anybody confirm?
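
For reference, here is a minimal end-to-end sketch of the list-style call described above, using the same shapes as the original report. It assumes the redundant _standardize_args call has been removed from ConvRNN2D.__call__ so that the second standardization no longer fires (untested sketch, not a confirmed fix):

from keras import layers as L
from keras.models import Model

x = L.Input((4, 6, 6, 3))        # (timesteps, rows, cols, channels)
init_state = L.Input((6, 6, 3))  # one tensor per LSTM state (h and c)

# Fold the initial states into the input list instead of passing the
# initial_state keyword argument:
y = L.ConvLSTM2D(filters=3, kernel_size=(3, 3), padding='same',
                 return_sequences=True)([x, init_state, init_state])

model = Model(inputs=[x, init_state], outputs=y)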

@ChengeLi

ChengeLi commented Jun 6, 2018

I had the same problem as mizima; it would be really helpful if someone could confirm this hacky solution.

@ribhupathria

We are hitting the same issue with a Seq2Seq LSTM model based on https://arxiv.org/pdf/1409.3215.pdf. The issue is seen when converting the model to a TF Estimator. Is there a fix planned anytime soon?

@Gerryflap

I am still running into this bug at the moment. I've tried removing those (seemingly redundant) standardize calls, but haven't had much luck with that approach. If it cannot be made to work, is there any other way of implementing convolutional seq2seq models in Keras?

@fangzuliang

(Quoting @mizima's suggestion above: remove the redundant _standardize_args call from ConvRNN2D.__call__.)

I had hit the same AssertionError; thanks for the suggestion, it solved the problem for me.

@dzhv

dzhv commented Jun 5, 2019

This issue is still not addressed in the latest release.
What can be done for it to receive attention?

Also, although the proposed solution (quoted below) fixes the model for single-GPU training, I still face issues due to the initial_state when training on multiple GPUs using keras.utils.multi_gpu_model.

I think the solution is to remove the (redundant?)

inputs, initial_state, constants = self._standardize_args(
            inputs, initial_state, constants)

from ConvRNN2D.__call__. It fixes the AssertionError for me.

@aharchaoumehdi

Do we have any update on this issue?

In my seq2seq model based on ConvLSTM2DCell and ConvRNN2D, I have to pass the encoder_states as the initial_state to the decoder (see code below). For now, this

decoder_outputs, _, _ = decoder(decoder_inputs, initial_state=encoder_states) # a tensor

leads to an AssertionError.

Here is the function that tries to implement a seq2seq ConvLSTM2D model for sequences of video frames as input.

from keras.layers import Input
from keras.layers.convolutional_recurrent import ConvLSTM2DCell, ConvRNN2D
from keras.models import Model
from keras.optimizers import Adam
from keras.utils import multi_gpu_model

def s2s_convlstm(nfilters, kernel_size, nrows, ncols, nchannels, learning_rate, num_layers, num_gpus):
    optimiser = Adam(lr=learning_rate)
    loss = "mse"
    # encoder
    encoder_inputs = Input(shape=(None, nrows, ncols, nchannels))
    encoder_cells = ConvLSTM2DCell(filters=nfilters, kernel_size=kernel_size, padding='same')
    encoder = ConvRNN2D(encoder_cells, return_state=True)
    encoder_output, hidden_state, cell_state = encoder(encoder_inputs)
    encoder_states = [hidden_state, cell_state]
    print(encoder_states)
    # decoder
    decoder_inputs = Input(shape=(None, nrows, ncols, nchannels))
    decoder_cells = ConvLSTM2DCell(filters=nfilters, kernel_size=kernel_size, padding='same')
    decoder = ConvRNN2D(decoder_cells, return_sequences=True, return_state=True)
    print(decoder_inputs)
    decoder_outputs, _, _ = decoder(decoder_inputs, initial_state=encoder_states)  # raises AssertionError
    # model
    model = Model(inputs=[encoder_inputs, decoder_inputs], outputs=decoder_outputs)
    # multi_gpu_model replicates the model across num_gpus GPUs; Keras rebuilds
    # the graph from inputs to outputs behind the scenes, which works because
    # the outputs were obtained by repeatedly transforming the inputs.
    parallel_model = multi_gpu_model(model, gpus=num_gpus)
    parallel_model.compile(optimizer=optimiser, loss=loss, metrics=['mae'])
    parallel_model.summary()
    return parallel_model
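
If the keyword form keeps raising, a hedged variant of the decoder call is to fold the states into the input list, per @mizima's workaround above (untested sketch; assumes the ConvRNN2D patch is applied):

    decoder_outputs, _, _ = decoder([decoder_inputs] + encoder_states)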

@kisckorea

Hello, I have the same problem too. Does anybody have a solution?
Thanks!

import keras
from keras.layers import ConvLSTM2D

num_input_features = (201, 201, 1)
num_output_features = (201, 201, 1)

encoder_inputs = keras.layers.Input(shape=(None, num_input_features[0], num_input_features[1], num_input_features[2]))

encoder = ConvLSTM2D(filters=8, kernel_size=(3, 3), input_shape=(None, 201, 201, 1),
                     data_format='channels_last', padding='same', activation='tanh',
                     return_state=True)
encoder_outputs_and_states = encoder(encoder_inputs)
encoder_states = encoder_outputs_and_states[1:]

decoder_inputs = keras.layers.Input(shape=(None, num_output_features[0], num_output_features[1], num_output_features[2]))
decoder = ConvLSTM2D(filters=8, kernel_size=(3, 3), input_shape=(None, 201, 201, 1),
                     data_format='channels_last', padding='same', activation='tanh',
                     return_sequences=True, return_state=True)
decoder_outputs_and_states = decoder(decoder_inputs, initial_state=encoder_states)
decoder_outputs = decoder_outputs_and_states[0]

model = keras.models.Model(inputs=[encoder_inputs, decoder_inputs], outputs=decoder_outputs)


Traceback (most recent call last):
  File "//model/Conv_LSTM_seq2seq_draft4.py", line 73, in <module>
    decoder_outputs_and_states = decoder(decoder_inputs, initial_state=encoder_states)
  File "//anaconda3/envs/dev_tf1.13/lib/python3.6/site-packages/keras/layers/convolutional_recurrent.py", line 321, in __call__
    output = super(ConvRNN2D, self).__call__(full_input, **kwargs)
  File "//anaconda3/envs/dev_tf1.13/lib/python3.6/site-packages/keras/layers/recurrent.py", line 529, in __call__
    inputs, initial_state, constants, self._num_constants)
  File "//anaconda3/envs/dev_tf1.13/lib/python3.6/site-packages/keras/layers/recurrent.py", line 2336, in _standardize_args
    assert initial_state is None and constants is None
AssertionError


Removing the following from ConvRNN2D.__call__, as suggested above, does not work for me:

inputs, initial_state, constants = self._standardize_args(
            inputs, initial_state, constants)

@gmrhub

gmrhub commented Dec 20, 2019

A better fix would be to follow the tf.keras issue. But that approach has an issue of its own:

but after loading the saved model weights in a completely new Python session, the results on validation data don't match at all
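
For reference, the tf.keras route mentioned above would presumably look like the following; whether initial_state is accepted here depends on the TensorFlow version, so treat this as an untested sketch:

import tensorflow as tf
from tensorflow.keras import layers as L

x = L.Input((4, 6, 6, 3))
init_state = L.Input((6, 6, 3))
y = L.ConvLSTM2D(filters=3, kernel_size=(3, 3), padding='same',
                 return_sequences=True)(x, initial_state=[init_state, init_state])
model = tf.keras.Model(inputs=[x, init_state], outputs=y)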

@flyinskybtx

I think the problem comes from lines 307-337 in convolutional_recurrent.py.
After the initial state is concatenated to the inputs, the code also passes the initial state via kwargs, which is forwarded to the superclass __call__ and triggers _standardize_args() again. That is where the problem arises.

I think a possible fix is not to update kwargs with "initial_state". So here I remove line 308:
kwargs['initial_state'] = initial_state

and things work for me.
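
To make the double standardization concrete, here is a simplified, self-contained model of the control flow described above (not the actual Keras source; names and bodies are heavily reduced):

def _standardize_args(inputs, initial_state, constants):
    # If `inputs` is a list, the states were already folded into it,
    # so the keyword arguments must be empty by now.
    if isinstance(inputs, list):
        assert initial_state is None and constants is None  # the failing assert
        inputs, initial_state = inputs[0], inputs[1:]
    return inputs, initial_state, constants

def rnn_call(inputs, initial_state=None, constants=None):
    # Stands in for RNN.__call__: it standardizes its arguments again.
    return _standardize_args(inputs, initial_state, constants)

def conv_rnn2d_call(inputs, initial_state=None, constants=None):
    # First standardization (ConvRNN2D.__call__, around line 307):
    inputs, initial_state, constants = _standardize_args(
        inputs, initial_state, constants)
    full_input = [inputs] + list(initial_state)  # states folded into inputs
    # Line 308: initial_state also travels on via the keyword argument,
    # so the parent's second standardization sees both and asserts.
    return rnn_call(full_input, initial_state=initial_state)

conv_rnn2d_call('x', initial_state=['h', 'c'])  # raises AssertionError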

@gyla1993

gyla1993 commented Feb 5, 2020

I am hitting the same problem with keras==2.3.1.
