
do not regularize beta and bias #11953

Closed
wants to merge 1 commit

Conversation

@piiswrong (Contributor) commented Jul 31, 2018

In Module, weight decay is only applied to variables whose names end in "_weight" or "_gamma", while in Gluon we are currently regularizing everything.

This PR removes weight-decay regularization on bias and beta parameters; a sketch of the manual workaround available in current Gluon follows the context list below. Further issues to discuss:

  1. Should we regularize the Embedding layer's weight? (It is currently regularized in Module.)
  2. Should we regularize alpha in PReLU?

For some context:

  1. PyTorch by default regularizes everything. Users need to manually specify parameter groups (param_groups) to exclude some weights from regularization; see the second sketch below.
  2. Keras by default doesn't regularize anything. Users need to manually attach a regularizer to each weight.
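
For reference, a minimal sketch of the manual workaround in current Gluon, assuming the Parameter.wd_mult attribute and the regex form of collect_params; the layer sizes and hyperparameters are placeholders:

    # Hypothetical sketch: exclude bias/beta from weight decay in Gluon today
    # by zeroing the per-parameter weight-decay multiplier.
    from mxnet import gluon

    net = gluon.nn.HybridSequential()
    with net.name_scope():
        net.add(gluon.nn.Dense(128, activation='relu'))
        net.add(gluon.nn.BatchNorm())
        net.add(gluon.nn.Dense(10))
    net.initialize()

    # collect_params() accepts a regex selector; match parameters whose
    # names end in "bias" or "beta" and turn off their weight decay.
    for param in net.collect_params('.*bias|.*beta').values():
        param.wd_mult = 0.0

    # The wd passed to the Trainer then only affects the remaining parameters.
    trainer = gluon.Trainer(net.collect_params(), 'sgd',
                            {'learning_rate': 0.1, 'wd': 1e-4})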
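
And a minimal sketch of the PyTorch pattern mentioned in point 1, splitting parameters into groups so biases and BatchNorm beta/gamma get no weight decay (model shape and hyperparameters are placeholders):

    # Hypothetical sketch of PyTorch parameter groups with per-group weight decay.
    import torch
    from torch import nn

    model = nn.Sequential(nn.Linear(784, 128), nn.BatchNorm1d(128),
                          nn.ReLU(), nn.Linear(128, 10))

    decay, no_decay = [], []
    for name, param in model.named_parameters():
        # Biases and BatchNorm scale/shift are 1-D tensors.
        if param.dim() == 1 or name.endswith('.bias'):
            no_decay.append(param)
        else:
            decay.append(param)

    optimizer = torch.optim.SGD(
        [{'params': decay, 'weight_decay': 1e-4},
         {'params': no_decay, 'weight_decay': 0.0}],
        lr=0.1)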

@piiswrong requested a review from szha as a code owner July 31, 2018 19:12
@eric-haibin-lin (Member) left a comment

Does it break all existing training scripts....?

@szha (Member) left a comment

Maybe let's make this a 2.0 change?...

Labels
Feature request, Gluon, pr-awaiting-response (PR is reviewed and waiting for contributor to respond)

4 participants