When a fully connected net has more than 3 layers, the backprop gradient and the numerical gradient show a significant difference. The issue can be reproduced in Dropout.ipynb (in the cell "Fully-connected nets with Dropout"):
# Imports as in the assignment notebook (assuming the standard cs231n package layout)
import numpy as np
from cs231n.classifiers.fc_net import FullyConnectedNet
from cs231n.gradient_check import eval_numerical_gradient

def rel_error(x, y):
    """Relative error helper, as defined in the assignment notebooks."""
    return np.max(np.abs(x - y) / (np.maximum(1e-8, np.abs(x) + np.abs(y))))

N, D, H1, H2, C = 2, 15, 20, 30, 10
X = np.random.randn(N, D)
y = np.random.randint(C, size=(N,))
s = np.random.randint(1)  # always 0, so the dropout seed is effectively fixed

for dropout in [0, 0.25, 1.0]:
    print 'Running check with dropout = ', dropout
    # Seven hidden layers, well past the 3-layer point where the check starts to fail
    model = FullyConnectedNet([H1, 10, 10, 10, 10, 10, H2], input_dim=D, num_classes=C,
                              weight_scale=5e-2, dtype=np.float64,
                              dropout=dropout, seed=s)
    loss, grads = model.loss(X, y)
    print 'Initial loss: ', loss
    # Compare the analytic gradient against a numerical gradient for every parameter
    for name in sorted(grads):
        f = lambda _: model.loss(X, y)[0]
        grad_num = eval_numerical_gradient(f, model.params[name], verbose=False, h=1e-5)
        print '%s relative error: %.2e' % (name, rel_error(grad_num, grads[name]))
    print
I have tried several random seeds, and the bias gradient on the last layer is always correct, but the bias errors become extremely large starting from the last hidden layer and going backwards. The errors on the W gradients, however, look correct every time. I first noticed this weird behavior in my own implementation, and the same thing happens with your implementation. Any ideas?
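For what it's worth, here is a minimal sketch (not from the notebook; it assumes the same FullyConnectedNet, eval_numerical_gradient, and rel_error helpers as above, plus the 'b1', 'b2', ... parameter naming) that restricts the check to the bias gradients only, which makes it easier to see layer by layer where the error blows up:

np.random.seed(231)  # hypothetical fixed seed, only so the run is repeatable
N, D, C = 2, 15, 10
X = np.random.randn(N, D)
y = np.random.randint(C, size=(N,))

# Same seven-hidden-layer configuration as in the reproduction above, with dropout on
model = FullyConnectedNet([20, 10, 10, 10, 10, 10, 30], input_dim=D,
                          num_classes=C, weight_scale=5e-2,
                          dtype=np.float64, dropout=0.25, seed=0)
loss, grads = model.loss(X, y)

for name in sorted(grads):
    if not name.startswith('b'):
        continue  # skip the W gradients, which come out fine in my runs
    f = lambda _: model.loss(X, y)[0]
    grad_num = eval_numerical_gradient(f, model.params[name], verbose=False, h=1e-5)
    print '%s relative error: %.2e' % (name, rel_error(grad_num, grads[name]))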