
[WIP] save previous output and state of RNNs #235

Closed
wants to merge 5 commits into streaming_mode from save_rnn_prev_out_and_state

Conversation

superbock
Collaborator

This change enables recurrent neural networks to be initialised with an initial output/state (#230), to be streaming-compatible (#185), and also fixes some GRU-related parameter glitches (#234).

Unfortunately, this change breaks the API, since parameters are reordered and renamed.
Any thoughts?

@superbock superbock added this to the v0.15 milestone Dec 4, 2016
@superbock superbock force-pushed the save_rnn_prev_out_and_state branch 2 times, most recently from 718f934 to 4ebd9ce Compare December 4, 2016 20:03
@fdlm
Contributor

fdlm commented Dec 5, 2016

I see two issues here:

  • Now, I don't like the name of the parameter (prev_output) :), because from a user's perspective I don't understand why I should give an RNN layer a "previous output" when creating it. I might want to set the "first value" or the "initial value" of the layer, but the name prev_output seems counter-intuitive for this (even though it is conceptually the same under the hood!). I would rather go with init (and set self.prev_output = init in the __init__ method); see the sketch after this list.
  • I didn't check this, but in non-streaming mode, calling the RNN twice for the same input would return different results, because the initial value of the units is overwritten (it would be the last output of the first call). I think this will be a general problem for all "stateful" processors (HMM, CRF, ...), and I don't have a solution except having a different process method for streaming and non-streaming mode.
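
To make the naming suggestion concrete, here is a minimal sketch of what the constructor could look like; the RNNLayer signature and weight parameters below are purely illustrative, not the actual implementation:

```python
import numpy as np

class RNNLayer(object):
    """Illustrative sketch only; the weight parameters are made up."""

    def __init__(self, weights, bias, recurrent_weights,
                 activation_fn=np.tanh, init=None):
        self.weights = weights
        self.bias = bias
        self.recurrent_weights = recurrent_weights
        self.activation_fn = activation_fn
        # user-facing name is `init`, internally it is kept as the
        # previous output of the layer
        self.prev_output = init
```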

@@ -309,38 +316,46 @@ def activate(self, data):
             Activations for this data.

         """
+        # init previous output
+        if not hasattr(self, 'prev_output') or self.prev_output is None:
Contributor

Is this necessary to ensure backwards compatibility with older models?

Collaborator Author


Yes, otherwise we cannot load the old models any more, since they don't have these attributes.

@@ -287,12 +291,15 @@ class LSTMLayer(Layer):
     """

     def __init__(self, input_gate, forget_gate, cell, output_gate,
-                 activation_fn=tanh):
+                 activation_fn=tanh, prev_output=None, prev_state=None):
Contributor


Shouldn't these initialisations go to the Gate class, as discussed in #233?

Collaborator Author


I thought about this as well, but they are technically not connected to the gates. prev_output is the previous output of the unit; prev_state is the state, which is the cell weighted by the input gate plus the previous state weighted by the forget gate. Thus I thought it would be easiest to add them to the layer directly.
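
To illustrate how the two quantities enter the computation, here is a simplified sketch of a single LSTM step; the gate activations are assumed to be pre-computed and the names are illustrative, not the actual implementation:

```python
import numpy as np

def lstm_step(ig, fg, og, cell_candidate, prev_state, activation_fn=np.tanh):
    """One simplified LSTM step, given pre-computed gate activations.

    The gate activations (ig, fg, og) and the cell candidate would themselves
    be computed from the current input and the previous output of the unit.
    """
    # new state: cell (candidate) weighted by the input gate plus the
    # previous state weighted by the forget gate
    state = ig * cell_candidate + fg * prev_state
    # new output of the unit
    output = og * activation_fn(state)
    return output, state
```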

@superbock superbock changed the title Save previous output and state of RNNs [WIP] save previous output and state of RNNs Dec 5, 2016
@superbock
Collaborator Author

Regarding the issues you mentioned:

  • As indicated elsewhere, I also voted for init instead of prev_output, at least in the __init__() method. I just used these names for clarity for now, since this PR is WIP; I should have indicated this somewhere.
  • Good point! I thought about adding a reset() method to BufferProcessor (or to all classes that buffer something) to clear the buffer. Although this would work, it would still be a problem, because people will forget to reset the buffers (just like I forgot to think about this in the first place).

I tried (and still want) to avoid having two different process methods, simply because that would introduce a lot of duplicated code.

How about the following:

We add a second buffer() method to buffer the data and then just use the old, unmodified process() method to do the actual work. This would require additional buffer() calls in processors.process_online(), but would not change the behaviour of existing code nor require code changes (hopefully). Maybe we need two methods, one called before process and one after, but you get the idea.

Processor will simply raise a NotImplementedError if these methods are called. This way we can easily control which Processors are streaming-capable, simply by implementing these methods.
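
A rough sketch of what this could look like; the class and method bodies are only illustrative, not actual library code:

```python
class Processor(object):
    """Base class: by default a processor is not streaming-capable."""

    def process(self, data):
        raise NotImplementedError('must be implemented by subclasses')

    def buffer(self, data):
        # only streaming-capable processors implement this hook
        raise NotImplementedError('this processor is not streaming-capable')


class RecurrentLayer(Processor):
    """A streaming-capable processor implements the buffer() hook."""

    def __init__(self, init=None):
        self.prev_output = init

    def buffer(self, data):
        # remember the last output so the next call can continue from it
        self.prev_output = data[-1]

    def process(self, data):
        # the old, unmodified processing code works on `data` as before,
        # starting from self.prev_output (actual computation omitted)
        return data
```

In streaming mode, processors.process_online() would then call buffer() around the unchanged process(); in non-streaming mode nothing changes.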

Any thoughts?

@superbock
Collaborator Author

The current state allows setting the initial state of recurrent layers (and cell for LSTMLayer), but does not deal with proper resetting yet.

As outlined here, it involves a lot of if/elif/else statements to get this right. Waiting for some feedback before proceeding further.
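
For illustration only, a reset helper could look roughly like this; the attribute names and branching are assumptions based on the layers discussed above, not actual code:

```python
def reset_layer(layer, init=None, init_state=None):
    """Illustrative sketch: reset the hidden state of a recurrent layer."""
    if hasattr(layer, 'prev_state'):
        # LSTM-like layer: both a previous output and an internal cell state
        layer.prev_output = init
        layer.prev_state = init_state
    elif hasattr(layer, 'prev_output'):
        # plain RNN / GRU layer: only a previous output to reset
        layer.prev_output = init
    else:
        # stateless (e.g. feed-forward) layer: nothing to do
        pass
```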

@superbock superbock changed the base branch from master to streaming_mode December 6, 2016 14:17
Sebastian Böck added 2 commits December 8, 2016 15:41
added initialisation of hidden states to layers; fixes #230
renamed GRU parameters, to be consistent with all other layers
@superbock superbock mentioned this pull request Jan 18, 2017
@superbock
Collaborator Author

Closing, since #243 is merged.

@superbock superbock closed this Jan 20, 2017
@superbock superbock deleted the save_rnn_prev_out_and_state branch February 3, 2017 08:36