backward_pass now clears nodes which will not be used #221
Attempting to help with the issues mentioned in #219.
This change removes references to Array nodes that will not be revisited, since they have already passed all of their gradient information to their parents.
In the special case of the start_node, the gradient is not lost: setting outgrads[node] to None only deletes the dictionary's reference, not the underlying value. So as long as cur_outgrad holds a reference to the gradient, all is fine.
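To illustrate the idea, here is a minimal, simplified sketch of a reverse-mode backward pass that clears each node's entry in `outgrads` once the node has been processed. This is not autograd's actual implementation; the `Node` class, the `(parent, vjp)` pair representation, and the `topo_order` argument are simplifications for demonstration. The key point is the line that sets `outgrads[node] = None`: the dict drops its reference so the node's accumulated gradient can be garbage-collected once no one else needs it, while `cur_outgrad` keeps the value alive for the current step.

```python
class Node:
    """A graph node. `parents` is a list of (parent_node, vjp) pairs,
    where vjp maps this node's output gradient to the parent's
    gradient contribution (a vector-Jacobian product)."""
    def __init__(self, parents=()):
        self.parents = list(parents)

def backward_pass(g, topo_order):
    """Accumulate gradients over `topo_order`, a list of nodes from the
    output node to the inputs (reverse topological order), starting
    from output gradient `g`. Returns the gradient at the last node."""
    outgrads = {topo_order[0]: g}
    for node in topo_order:
        cur_outgrad = outgrads[node]
        # Clear the dict's reference so the gradient can be freed once
        # cur_outgrad goes out of scope; the value itself is not deleted.
        outgrads[node] = None
        for parent, vjp in node.parents:
            contrib = vjp(cur_outgrad)
            prev = outgrads.get(parent)
            outgrads[parent] = contrib if prev is None else prev + contrib
    return cur_outgrad

# Example: y = 3*x + 2*x, so dy/dx = 5.
x = Node()
a = Node([(x, lambda g: 3 * g)])   # a = 3*x
b = Node([(x, lambda g: 2 * g)])   # b = 2*x
y = Node([(a, lambda g: g), (b, lambda g: g)])  # y = a + b
print(backward_pass(1.0, [y, a, b, x]))  # → 5.0
```

In a long graph this keeps only the gradients of nodes still waiting to be visited, rather than every gradient ever computed, which is where the memory savings come from.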
On a test dataset, using a proprietary algorithm that can be roughly described as two RNNs feeding into each other, I obtained the following gains in performance:
https://cloud.githubusercontent.com/assets/6620250/25814415/6a8f085a-33eb-11e7-9237-ce58fa09300a.png
In addition, I compared the numerical results of my change against the previous version, and they were identical.