Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix to bug 1729: no bias regularization for BFGS not working #1794

Merged
merged 3 commits into from
Mar 20, 2019

Conversation

pavithraks
Copy link
Collaborator

@pavithraks pavithraks commented Mar 19, 2019

fixes #1729

Cause of problem: weights[constant] always returns 0, so there was no subtraction happening at all with the below two lines:

(&weights[constant])[W_GT] -= regularization * weights[constant]; ret -= 0.5 * regularization * (weights[constant]) * (weights[constant]);

This is because we weren't accessing it the right way.

weights.strided_index(constant) gives us what we want. Thus, the correct code should probably be:

(&weights.strided_index(constant))[W_GT] -= regularization * (weights.strided_index(constant)); ret -= 0.5*regularization*(weights.strided_index(constant))*(weights.strided_index(constant));

if (all.no_bias)
{
if (b.regularizers == nullptr)
{
(&weights[constant])[W_GT] -= regularization * weights[constant];
ret -= 0.5 * regularization * (weights[constant]) * (weights[constant]);
(&weights.strided_index(constant))[W_GT] -= regularization * (weights.strided_index(constant));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a use case to NOT using strided_index? If not then perhaps we should make the operator[] private to avoid this bug in future?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know vw well enough to comment on that but like we discussed in our conversation earlier today, it would be a big change, probably out of scope of this PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely out of scope of the PR and if we decide we want to look into this I can create and issue to track it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, there is. Currently, in setup_example all features are multipled by a stride which allows direct access to the parameters via [] as per https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/vowpalwabbit/gd_predict.h#L33

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So does offset correspond with the stride? Maybe the default should be to access using the stride and then explicitly state when you want unstrided? It depends on what the more common/intuitive usage is

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly.

Stride controls floats per parameter. Offset indexes into logically distinct sets of parameters.

@JohnLangford
Copy link
Member

Pavithra, thanks for this. Let's merge as soon as tests pass.

@pavithraks pavithraks merged commit 5a8c495 into VowpalWabbit:master Mar 20, 2019
@pavithraks pavithraks deleted the pks_fix_bug1729 branch March 20, 2019 18:36
jackgerrits pushed a commit to jackgerrits/vowpal_wabbit that referenced this pull request May 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

no_bias_regularization not working
3 participants