Fix "average loss" bug in cb_explore.cc #1825

Merged: 3 commits, Apr 18, 2019


marco-rossi29
Collaborator

@marco-rossi29 marco-rossi29 commented Apr 3, 2019

action and observation->action used different index bases: one was 1-indexed and the other was 0-indexed.

Fixes #1811
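
For illustration, here is a minimal Python sketch of this kind of index mismatch. It is not the code from cb_explore.cc: the names `probs`, `lookup_buggy`, and `lookup_fixed` are hypothetical, and the sketch assumes a 1-based action label being looked up in a 0-based probability list. The two probabilities are the ones quoted for Test 121 later in this thread.

```python
# Hypothetical illustration of a 1-based vs 0-based index mismatch.
# This is NOT the actual cb_explore.cc code.

def lookup_buggy(probs, action):
    """Use the 1-based action directly as a 0-based index (off by one)."""
    return probs[action]


def lookup_fixed(probs, action):
    """Convert the 1-based action to a 0-based index before the lookup."""
    return probs[action - 1]


# Two actions: action 1 has probability 0.975, action 2 has 0.025.
probs = [0.975, 0.025]
action = 1  # 1-based action label

print(lookup_buggy(probs, action))  # picks the probability of action 2
print(lookup_fixed(probs, action))  # picks the probability of action 1
```

With only two actions the buggy lookup still returns a valid probability, which is one reason a mismatch like this can go unnoticed: nothing crashes, the reported number is just wrong.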

@marco-rossi29
Collaborator Author

@jackgerrits This changes the default behavior, so tests are failing. How do you suggest we proceed?

@JohnLangford
Member

Update the tests.

@marco-rossi29
Collaborator Author

marco-rossi29 commented Apr 18, 2019

To prove that this was indeed a bug, we can take a look at Test 121.

The first example in the data file (train-sets/rcv1_raw_cb_small.vw) is:
1:1:0.5 | tuesday year million short compan vehicl ...

Action 1 is played with p = 0.975 in both the buggy version and the corrected version.
The command-line output for this first example in the old version was:

average  since         example        example  current  current  current
loss     last          counter         weight    label  predict features
0.050000 0.050000            1            1.0        1 1:0.975000      280

The command-line output for the same example in the new version is:

average  since         example        example  current  current  current
loss     last          counter         weight    label  predict features
1.950000 1.950000            1            1.0        1 1:0.975000      280

Which one is correct?

By definition:
ave. loss = Cost / P_log * P_policy = 1 / 0.5 * 0.975 = 1.95.
In the buggy version, ave. loss = 0.05 = 1 / 0.5 * 0.025, where 0.025 was the probability of playing action 2. The mismatch was caused by the bug.
QED
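
The arithmetic can be checked with a short Python sketch. The function name `ips_loss` is hypothetical; it only evaluates the formula as written in this comment and is not VW's implementation. It treats the second field of the label `1:1:0.5` as the cost, following VW's `action:cost:probability` label format.

```python
def ips_loss(cost, p_log, p_policy):
    """Evaluate cost / p_log * p_policy, the formula from the comment."""
    return cost / p_log * p_policy


# Label 1:1:0.5 -> action 1, cost 1, logged probability 0.5.
# Correct version: weight by the policy's probability of action 1.
correct = ips_loss(cost=1.0, p_log=0.5, p_policy=0.975)
# Buggy version: weight by the probability of action 2 instead.
buggy = ips_loss(cost=1.0, p_log=0.5, p_policy=0.025)

print(f"{correct:.6f}")  # 1.950000
print(f"{buggy:.6f}")    # 0.050000
```

The two printed values match the "average loss" column in the new and old outputs, respectively.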

@marco-rossi29 marco-rossi29 changed the title Fix loss bug in cb_explore.cc Fix ave. loss bug in cb_explore.cc Apr 18, 2019
@marco-rossi29 marco-rossi29 changed the title Fix ave. loss bug in cb_explore.cc Fix "average loss" bug in cb_explore.cc Apr 18, 2019
@marco-rossi29 marco-rossi29 added Bug Bug in learning semantics, critical by default and removed Bug Bug in learning semantics, critical by default labels Apr 18, 2019
@JohnLangford JohnLangford merged commit 70ca7d7 into master Apr 18, 2019
@JohnLangford
Member

Merged, thanks :-)

@marco-rossi29 marco-rossi29 deleted the marco-rossi29/fix_cb_explore_loss branch April 18, 2019 19:35
jackgerrits pushed a commit to jackgerrits/vowpal_wabbit that referenced this pull request May 15, 2019
* Fix bug in cb_explore.cc

action and observation->action were one 1-index and 0-index based

* cb_explore: expected output
Successfully merging this pull request may close these issues.

Error in calculating reported cost under cb_explore.