Issue #96 #118
Conversation
Thanks for the contribution!
src/nn/residual.rs (outdated)
/// ```
#[derive(Debug, Clone, Default)]
-pub struct Residual<F>(F);
+pub struct Residual<F, R>(F, R);
I like the idea of having both Residual & ResidualAdd. I imagine most people will probably use ResidualAdd, so let's rename them:
- Residual does f(x) + x
- GeneralizedResidual does f(x) + r(x)
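Roughly, the split being discussed could look like the sketch below. This is only an illustration: the Module trait and the plain f32 inputs are simplifying assumptions for this comment, not dfdx's actual trait or tensor types.

```rust
// Simplified sketch of the proposed naming split; the Module trait and the
// use of plain f32 values are assumptions for illustration, not dfdx's API.
trait Module<Input> {
    type Output;
    fn forward(&self, input: Input) -> Self::Output;
}

/// Residual: computes f(x) + x.
struct Residual<F>(F);

/// GeneralizedResidual: computes f(x) + r(x).
struct GeneralizedResidual<F, R>(F, R);

impl<F: Module<f32, Output = f32>> Module<f32> for Residual<F> {
    type Output = f32;
    fn forward(&self, x: f32) -> f32 {
        self.0.forward(x) + x
    }
}

impl<F: Module<f32, Output = f32>, R: Module<f32, Output = f32>> Module<f32>
    for GeneralizedResidual<F, R>
{
    type Output = f32;
    fn forward(&self, x: f32) -> f32 {
        self.0.forward(x) + self.1.forward(x)
    }
}

/// Stand-in layer that doubles its input.
struct Double;
impl Module<f32> for Double {
    type Output = f32;
    fn forward(&self, x: f32) -> f32 {
        2.0 * x
    }
}

fn main() {
    assert_eq!(Residual(Double).forward(3.0), 9.0); // 2*3 + 3
    assert_eq!(GeneralizedResidual(Double, Double).forward(3.0), 12.0); // 2*3 + 2*3
}
```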
What about SkipConnection instead of GeneralizedResidual?
ok, after re-reading some wiki pages i think that i understand it better now - but shouldn't the gradients of F and R have to be summed AFTER the backprop of F and R (so the addition would have to be pushed to the tape first?) or am I not understanding binary_map?
All the operations on the tape are executed in reverse, so for this function the add/binary_map backward op will be executed first, then the f(x) backward op is executed using the gradient from add(), and then the r(x) backward op using the gradient from add(). Does that help?

Let's keep both Residual and SkipConnection, and then GeneralizedResidual. SkipConnection can just be a type alias (e.g. type SkipConnection<F, R> = Residual<F, R>;).
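To illustrate that execution order, here is a toy tape sketch (this is not dfdx's real tape type; it just records named backward ops during the forward pass and runs them in reverse):

```rust
// Toy tape: backward ops are recorded in forward order and then executed in
// reverse, so the add's backward op runs before f's and r's backward ops.
// Which of f/r runs first among themselves depends on recording order; the
// key point is that the add runs before both.
fn main() {
    type BackwardOp = Box<dyn FnMut(&mut Vec<String>)>;
    let mut tape: Vec<BackwardOp> = Vec::new();

    // Forward pass records backward ops in forward order: f(x), r(x), add.
    for name in ["f(x)", "r(x)", "add"] {
        tape.push(Box::new(move |log: &mut Vec<String>| {
            log.push(format!("backward of {name}"))
        }));
    }

    // Backward pass: execute everything in reverse.
    let mut log = Vec::new();
    for op in tape.iter_mut().rev() {
        op(&mut log);
    }
    assert_eq!(log, ["backward of add", "backward of r(x)", "backward of f(x)"]);
}
```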
do you have an idea why the numpy test fails?
add_grad + f_grad + r_grad

but if the add(y, &x) is executed before 2.a in the backprop, how can f_grad be fetched from the gradients to calculate x's gradients for 1?
The three contributions are summed independently. After add's backprop, x's gradient is only 1/3 complete (add_grad). Once f's backward op is executed, x is 2/3 complete (add_grad + f_grad), and finally after r's backward op x is complete (add_grad + f_grad + r_grad).
It works because x is input for add, f, and r, not the result. Gradients only need to be fully "complete" for backward ops in which they are the result.
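As a scalar sketch of that accumulation (the contribution values below are made up; only the += pattern matters):

```rust
// Scalar sketch of how x's gradient accumulates across the three backward
// ops; the individual contribution values are made up for illustration.
fn main() {
    let (add_grad, f_grad, r_grad) = (1.0, 0.5, 0.25);

    let mut x_grad = 0.0;
    x_grad += add_grad; // after add's backward op: partially accumulated
    x_grad += f_grad;   // after f's backward op: still partial
    x_grad += r_grad;   // after r's backward op: complete
    assert_eq!(x_grad, add_grad + f_grad + r_grad);
}
```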
but for the last backward op the gradient of x = r_grad + f_grad. in which line is this done? i can only find the backprops for r_grad and f_grad, but not their addition
it depends on what f & r are. for binary map the call to addmul does rhs_grad += rhs_deriv * result_grad, so if rhs_grad is x's grad that's where part of the accumulation happens.

If F and R were both using map for instance then both accumulations would happen in map (where x's gradient is g).
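A sketch of that accumulation pattern, assuming slice-shaped gradients (this is not dfdx's actual addmul signature, just the shape of the idea):

```rust
// Sketch of the "grad += deriv * result_grad" accumulation pattern described
// above; each backward op that has x as an input adds its own contribution
// into x's gradient rather than overwriting it. Derivative values are made up.
fn addmul(grad: &mut [f32], deriv: &[f32], result_grad: &[f32]) {
    for ((g, d), rg) in grad.iter_mut().zip(deriv).zip(result_grad) {
        *g += d * rg;
    }
}

fn main() {
    let mut x_grad = vec![0.0_f32; 3];
    let result_grad = [1.0_f32; 3];

    addmul(&mut x_grad, &[1.0; 3], &result_grad);  // skip/add path
    addmul(&mut x_grad, &[0.5; 3], &result_grad);  // f's backward op
    addmul(&mut x_grad, &[0.25; 3], &result_grad); // r's backward op

    assert_eq!(x_grad, [1.75_f32; 3]);
}
```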
Linear<10, 5>

i mean that linear layer, what gradients will it use? it will be lhs_grad, won't it? but when is rhs_grad added to lhs_grad?
hmmm, file deleted and new file oof
should be ready for merge now
finally the tests all passed 😅
with the new
@M1ngXU this looks good to me for merging - can you update or rebase your branch to current master (e.g. the dropout change was already merged into main)
how exactly do i do that? 😅
This article should walk you through it https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork
ok, should work now. i thought that i already tried that, but i apparently synced main and not BetterResidual.
@coreylowman how about this?
I like the idea about moving Residual to use GeneralizedResidual, but I think we'd need something like
the name Identity makes more sense
See Issue #96