
Warm start for cbify #1534

Merged: 150 commits merged into master, Apr 2, 2019

Conversation

@zcc1307 (Contributor) commented Jul 16, 2018

This patch adds a new mode to VW: contextual bandit learning with warm start (CB-WS). It is mainly based on modifying cbify.cc - in its predict_or_learn_adf, learning is now broken into two phases: 1. a warm start phase and 2. an interaction phase.
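
Schematically, the dispatch looks roughly like this (a simplified sketch, not the actual patch; the stub types and the two phase helpers are illustrative stand-ins for the real VW code):

    #include <cstddef>

    // Minimal stand-ins so the sketch is self-contained; the real types
    // live in VW (cbify.cc / learner.h).
    struct multi_ex {};
    struct multi_learner {};
    struct cbify { size_t example_counter; size_t warm_start_period; };

    template <bool is_learn>
    void warm_start_phase(cbify&, multi_learner&, multi_ex&) { /* supervised (cost-sensitive) update */ }
    template <bool is_learn>
    void interaction_phase(cbify&, multi_learner&, multi_ex&) { /* CB explore, observe cost, update */ }

    // Sketch of cbify's two-phase predict_or_learn_adf.
    template <bool is_learn>
    void predict_or_learn_adf(cbify& data, multi_learner& base, multi_ex& ec_seq)
    {
      if (data.example_counter < data.warm_start_period)
        warm_start_phase<is_learn>(data, base, ec_seq);  // phase 1: warm start
      else
        interaction_phase<is_learn>(data, base, ec_seq); // phase 2: interaction
      data.example_counter++;
    }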

Remarks:

  1. As of now, the CB-WS mode only works if the base exploration algorithm is epsilon-greedy. For more complex exploration algorithms (e.g. cover/bagging), the difficulty is that we need to initialize all the base learners in cover/bagging with warm start examples - this seems to require us to change the code of predict_or_learn_cover/predict_or_learn_bag in cb_explore_adf.cc.

  2. We additionally scale the CB examples' importance weights by 1/num_actions in the mtr mode of cb_adf.cc - this has the effect of giving the warm start examples and the CB examples the same weight (see the sketch after this list).

  3. Fixed an offset issue in predict_or_learn_greedy (it should use the example's offset - otherwise the behavior is wrong when multiple cb_explore learners are initialized in cbify) and in multiline_predict_or_learn (the examples' offsets are now stored and restored properly).

  4. Added some simple test cases.

  5. There are lots of debugging cout's - perhaps I should delete them?
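
For remark 2, the intended arithmetic is roughly the following (a stand-alone illustration with a hypothetical helper name; the actual change lives in the mtr code path of cb_adf.cc):

    #include <cstddef>

    // Illustrative sketch of remark 2's weight adjustment: scaling a CB
    // example's importance weight by 1/num_actions so that, per the remark
    // above, warm start and CB examples end up with the same weight.
    float scaled_cb_weight(float importance_weight, size_t num_actions)
    {
      return importance_weight / static_cast<float>(num_actions);
    }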

@zcc1307 commented Feb 8, 2019

Some more questions:

  1. I think test case 173 fails because we are scaling the importance weight of each example in the MTR reduction by 1/num_actions - shall I change the reference test results instead?

On a related note, I am confused about the implementation of cost regression in csoaa: is each example (x, c) converted to the loss \ell(f, (x,c)) = \sum_{a=1}^K (f(x,a) - c(a))^2, or to \ell(f, (x,c)) = \frac{1}{K} \sum_{a=1}^K (f(x,a) - c(a))^2? (Both forms are written out after this list.)

If it is the former, then I think we shouldn't scale the loss in the MTR reduction by 1/K, since before scaling \mathbb{E}[\hat{\ell}(f, (x,c))] = \sum_{a=1}^K (f(x,a) - c(a))^2. (Basically, we would like the optimization objective to be exactly the same as the one in our paper, Appendix A.)

  2. Are we happy with the VW doubling progress report counting examples starting from #(warm start examples)? (See e.g. the first VW output on this page.) In my implementation, I set the examples' importance weights to zero outside the interaction stage, so that only the interaction stage contributes to the reported average loss.

  3. We explicitly use --warm_start_update and --interaction_update as two input options indicating whether VW turns on updates in the respective stages. Are these names too long for users?
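
For reference, the two candidate csoaa objectives from question 1, written out (this only restates the question; which form csoaa actually uses is exactly what is being asked):

    % Candidate 1: the per-example loss sums over the K actions
    \ell_{\text{sum}}(f,(x,c)) = \sum_{a=1}^{K} \bigl(f(x,a) - c(a)\bigr)^2
    % Candidate 2: the per-example loss averages over the K actions
    \ell_{\text{avg}}(f,(x,c)) = \frac{1}{K} \sum_{a=1}^{K} \bigl(f(x,a) - c(a)\bigr)^2
    % Under candidate 1, additionally scaling the MTR importance weights by
    % 1/K would make the interaction stage optimize a mismatched objective.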

@JohnLangford assigned jackgerrits and unassigned lokitoth Feb 14, 2019
@jackgerrits (Member) commented:
@zcc1307 I'm going to help you get this in. Grab me at some point and let me know how I can help here.

GEN_CS::call_cs_ldf<true>(
    base, mydata.gen_cs.mtr_ec_seq, mydata.cb_labels, mydata.cs_labels, mydata.prepped_cs_labels, mydata.offset);
examples[mydata.gen_cs.mtr_example]->weight *=
    1.f / examples[mydata.gen_cs.mtr_example]->l.cb.costs[0].probability
    * ((float)mydata.gen_cs.event_sum / (float)mydata.gen_cs.action_sum)
    * (1.f / (float)examples.size());
GEN_CS::call_cs_ldf<true>(
    base, mydata.gen_cs.mtr_ec_seq, mydata.cb_labels, mydata.cs_labels, mydata.prepped_cs_labels, mydata.offset);
@zcc1307 (Contributor, Author) commented on this line, Feb 25, 2019:
Sorry, I am a little confused about this line - what is ((float)mydata.gen_cs.event_sum / (float)mydata.gen_cs.action_sum)? Is it 1/K?

@zcc1307 commented Feb 25, 2019

@jackgerrits Thanks, Jack! Yes, I think I have made changes to pass the checks.

@JohnLangford As we now additionally scale the importance weights in the mtr reduction by 1/K, some of regcb's test results changed. I am not sure whether I should change the lambda set setting, as we discussed before, to accommodate this 1/K change in mtr, because we also hope the algorithm works for the other reductions (ips/dr)?

Instructions for using the --warm_cb option can be found at https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Warm-starting-contextual-bandits.
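
For illustration, an invocation looks roughly like this (only --warm_cb, --warm_start_update, and --interaction_update are named in this thread; the remaining flags follow the wiki page above and should be checked there):

    vw --warm_cb 10 --cb_explore_adf --epsilon 0.05 --warm_start 50 \
       --interaction 4000 --warm_start_update --interaction_update -d train.dat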

@JohnLangford (Member) commented:
@zcc1307 I'm kind of confused. What is the question exactly?

We have some merge conflicts to resolve.

@JohnLangford (Member) commented:
Is this ready to merge?

@zcc1307 commented Apr 2, 2019

Yes, it is ready to merge! I have also updated the documentation:
https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Warm-starting-contextual-bandits

@jackgerrits (Member) commented Apr 2, 2019 via email

@JohnLangford JohnLangford merged commit 31859a3 into VowpalWabbit:master Apr 2, 2019
@JohnLangford (Member) commented:
Merged, thanks :-)

jackgerrits pushed a commit to jackgerrits/vowpal_wabbit that referenced this pull request May 15, 2019
* /

* not sure if the cost vector retrieved is correct

* not sure if the cost vector retrieved is correct

* added cbify warm start code

* commented out the multiple lambda code in cbify

* commented out the multiple lambda code in cbify

* the cbexplore approach seems not to work, as the first stage cannot prepare multiple copies of the weights

* .

* properly store the temp labels

* back

* .

* fixed the bug with assigning the cb label before cost sensitive prediction - the ec.l field is a union

* the cumulative costs become diverse

* modified csoaa so that it can take example weights now.

* .

* added some results of warm starting

* added some results of warm starting

* before modifying cbify adf code

* start modifying cbify adf code

* unknown segfault error

* everything good except for the cost sensitive learn part

* .

* .

* fixed the bug of empty example cost wrongly set

* fixed the bug of empty example cost wrongly set

* partially fix the importance weight issue

* fixed memory leak bug

* start changing the sample size parameters

* adding the bandit period as an explicit option

* file reorg

* tweak the python script

* added scatterplot script

* retracted the matplotlib inclusion

* .

* .

* regexp based line parsing for vw output (not tested yet)

* .

* .

* tweaked the scripts

* .

* .

* label corruption code

* supervised dataset validation

* lambda script

* weighting scheme

* .

* start properly copying the examples

* model is not updating in the supervised phase

* change to using proper copy example functions. Memory leak issues persist.

* .

* updated the lambda tuning scheme

* .

* fixed bug on zero warm start examples on small datasets

* added a refined weighting scheme and cumulative var calculation (not tested yet)

* warm start = 0 does not work

* fixed the csl label zero problem - now the label is set properly: 1, 2, ..., K

* .

* make the lambda weighting more modular

* make adf modular

* the version where there is an error on memory free

* finished cleanup (need to double check the cb label swap in the adf case)

* adjusted the output of the script so that it is more systematic

* a more complete summary file

* bring back the pairwise comparison plot

* added type 3 noise

* (warm start type = 2, adf) setting gives wrong results

* (warm start type = 2, adf) setting gives wrong results

* fixed the place of weight multiplier calculation

* force the changes

* before modifying the baseline of no update

* a new parameter enumeration scheme

* .

* .

* updated scripts

* cleaned up the run vw script; need more tests on more choices of param settings

* fixed memory-lost problems; still-reachable problems not yet resolved

* started cleaning up the cost-sensitive mc to cs conversion

* begin changing the cb learning w/o adf part

* finished cleaning up the no adf part

* before cleaning up adf

* mwt explorer kept outputting action 0

* roll back to a state before reorg that is working

* intermediate state

* fixed a problem in noadf: lambda selection now happens before update

* there is still a memory leak issue for ecs[0].pred.a_s

* lines for respective validation methods

* commented out matplotlib

* commented out matplotlib

* rename running script

* trial on compiling vw in one of the subtasks

* before merging

* cleaned up all errors except for calling cost sensitive learning

* fixed offset bugs in cb_explore and multiline_predict_or_learn

* fixed error on split/nosplit swapping

* fixed all memory leaks in warm start ground truth

* fixed memory leaks in supervised ground truth

* added cbify warm start test cases

* removed unnecessary include path prefix

* cleaning up script

* finished updating the running vw script

* .

* removed running scripts

* removed spurious changes

* removed spurious changes

* undoing the weight scaling by 1/k in mtr

* updated tests

* added warm_cb as a separate file

* .

* removed part on non-adf

* redoing the importance weight scaling by a factor of 1/k

* .

* comma typo

* removed redundant comments

* resolve conflicts

* compile error on peeking epsilon in warm_cb.cc

* fixed sim-bandit option, disallow cost-sensitive corruption

* begin fixing importance weight in cs examples

* revert cost_sensitive.cc

* fixed the weighting issue in cs examples

* .

* edited vw_core.vcxproj

* added new warm cb test cases

* overwrote regcb test results, as we additionally scale the importance weight of each example in the mtr reduction by 1/num_actions

* corrected a mistake in new regcb test result

* reorder reduction stack

* changed the weight scaling back without 1/K; changed the central value of lambda

* changed back regcbopt test results; undo changes in cb_adf.cc