test: [RLOS_2023] test for contextual bandit #4612

michiboo · 2023-06-08T17:05:11Z

No description provided.

python/tests/pytest.json

python/tests/test_cb.json

ataymano · 2023-06-09T13:17:17Z

python/tests/core.py

+
+CURR_DICT = os.path.dirname(os.path.abspath(__file__))
+
+def combine_list_cmds_grids(cmds, base_grid):


Seems like a lot of this is already implemented as the add/mul operators of Grid and can be reused.
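For illustration, a minimal sketch of what reusing add/mul operators could look like. `ToyGrid` is a hypothetical stand-in, not the actual Grid class used by the tests; it assumes `+` concatenates the points of two grids and `*` takes their Cartesian product. The option names are only examples.

```python
from itertools import product


class ToyGrid:
    """Hypothetical stand-in for a grid of option dicts (not the real Grid)."""

    def __init__(self, points):
        # Each point is one dict of command-line options.
        self.points = [dict(p) for p in points]

    def __add__(self, other):
        # Assumed semantics: keep every point from both grids.
        return ToyGrid(self.points + other.points)

    def __mul__(self, other):
        # Assumed semantics: merge every pair of points (Cartesian product).
        return ToyGrid({**a, **b} for a, b in product(self.points, other.points))


base = ToyGrid([{"--cb_explore_adf": True}])
explorers = ToyGrid([{"--epsilon": 0.1}]) + ToyGrid([{"--epsilon": 0.2}])
combined = base * explorers  # two points, one per epsilon value
```

With operators like these, a helper such as combine_list_cmds_grids could reduce to a single expression over grids.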

ataymano · 2023-06-12T14:53:59Z

python/tests/probability_functions.py

@@ -0,0 +1,7 @@
+def constant_probability(chosen_action, **kwargs):


seems to be even_probability(1)?

The fact that this function can be expressed as even_probability and does not use any of its arguments is confusing.
Can we either remove it or make it take no arguments?
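A rough sketch of the relationship described above, assuming the `1` in even_probability(1) means a single available action. The `num_actions` parameter is hypothetical; the PR's actual signature takes `**kwargs`.

```python
def even_probability(chosen_action: int, num_actions: int) -> float:
    """Uniform logging probability: every action is equally likely."""
    if not 1 <= chosen_action <= num_actions:
        raise ValueError("chosen_action must be in 1..num_actions")
    return 1.0 / num_actions


def constant_probability() -> float:
    """No-argument variant: with one action, the probability is always 1."""
    return even_probability(chosen_action=1, num_actions=1)
```

Under this assumption, constant_probability adds nothing beyond even_probability with one action, which is the point of the comment.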

ataymano · 2023-06-12T14:54:48Z

python/tests/reward_functions.py

+    return reward[chosen_action - 1]
+
+
+def fixed_reward_for_diff_context(**kwargs):


name seems to be misleading

python/tests/core.py

ataymano · 2023-06-14T14:47:37Z

python/tests/probability_functions.py

+    return 1
+
+
+def even_probability(chosen_action, **kwargs):


Why do we need kwargs here rather than regular named arguments?

ataymano · 2023-06-14T14:49:04Z

python/tests/reward_functions.py

+    return 1
+
+
+def constant_reward(**kwargs):


kwargs seem to be replaceable with regular arguments

ataymano · 2023-06-14T14:51:00Z

python/tests/test_cb.json

+                "name": "fixed_reward",
+                "params": {}
+            },
+            "probability_function": {


better to rename to "logging_policy"

ataymano · 2023-06-19T17:15:52Z

python/tests/reward_functions.py

+    return 1
+
+
+def constant_reward(**kwargs):


Suggested change

-def constant_reward(**kwargs):
+def constant_reward(chosen_action: int, reward: float):

ataymano · 2023-06-19T17:32:18Z

python/tests/reward_functions.py

@@ -0,0 +1,22 @@
+def fixed_reward(**kwargs):


Let's use a fixed signature here:
def reward_func(context: int, chosen_action: int)
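A minimal sketch of how a fixed signature might be wired to the config, assuming reward functions are looked up by the "name" field of an entry like the one in test_cb.json. The registry, the `call_reward` helper and the return value of 1.0 are hypothetical.

```python
from typing import Callable, Dict

# Every reward function shares one signature: (context, chosen_action) -> reward.
RewardFunc = Callable[[int, int], float]


def fixed_reward(context: int, chosen_action: int) -> float:
    # Ignores both inputs; the named arguments keep the contract explicit.
    return 1.0


# Hypothetical registry mapping a config "name" to a function.
REWARD_FUNCTIONS: Dict[str, RewardFunc] = {"fixed_reward": fixed_reward}


def call_reward(config: dict, context: int, chosen_action: int) -> float:
    func = REWARD_FUNCTIONS[config["name"]]
    # A misspelled argument now fails loudly instead of vanishing into **kwargs.
    return func(context=context, chosen_action=chosen_action)


reward = call_reward({"name": "fixed_reward", "params": {}}, context=1, chosen_action=2)
```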

ataymano · 2023-06-19T17:42:44Z

python/tests/reward_functions.py

+def fixed_reward_two_action(**kwargs):
+    chosen_action = kwargs["chosen_action"]
+    chosen_context = kwargs["chosen_context"]
+    if chosen_context == 1 and chosen_action == 2:


Better to expose this logic to the config file somehow:

* either via a more descriptive function name
* or via an extra parameter (like a reward matrix)

But let's do it in the next PR.

Can you please replace chosen_context with context? The agent is not choosing it.
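One way the reward matrix option could look, as a sketch only. The function name, the `reward_matrix` parameter and the matrix values are hypothetical; 1-based indexing follows the `reward[chosen_action - 1]` pattern seen elsewhere in this PR.

```python
from typing import Sequence


def matrix_reward(
    context: int,
    chosen_action: int,
    reward_matrix: Sequence[Sequence[float]],
) -> float:
    # Rows are contexts, columns are actions; both are 1-based.
    return reward_matrix[context - 1][chosen_action - 1]


# Hypothetical values: each context rewards a different action.
REWARD_MATRIX = [
    [0.0, 1.0],  # context 1: action 2 pays off
    [1.0, 0.0],  # context 2: action 1 pays off
]

best_for_context_1 = matrix_reward(1, 2, REWARD_MATRIX)
```

Since the matrix could live in the config file, the context/action logic would be visible there instead of being hidden in an if-chain.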

* intro notebook
* test: [RLOS_2023][WIP] updated test for regression weight (#4600)
* test: add test for regression weight
* test: make test more reusable by using json to specify pytest
* test: minor fix on naming
* test: add and option to python json test
* test: [RLOS_2023] test for contextual bandit (#4612)
* test: add basic cb test and configuration
* test: add shared context data generation
* add test for cb_explore_adf
* test: dynamically create pytest test case
* test: give fixed reward function signature
* test: [RLOS_2023] [WIP] Support + and * expression for grids (#4618)
* test: add basic cb test and configuration
* test: add shared context data generation
* add test for cb_explore_adf
* test: dynamically create pytest test case
* test: give fixed reward function signature
* test: support + and * expression for grids
* fix empty expression bugs
* test: [RLOS2023] [WIP] add more arguments for reg&cb tests (#4619)
* test: add more arguments for reg&cb tests
* test: fix minor bug in generate expression & add loss funcs to tests
* test: [RLOS2023] [WIP] add classification test (#4623)
* test: add more arguments for reg&cb tests
* test: fix minor bug in generate expression & add loss funcs to tests
* test: add test for classification
* test: organize test framework structure (#4624)
* test: [RLOS2023][WIP] add option for storing output and grid language redefinition (#4627)
* test: redesign grid lang
* test: add option for store output
* test: change list to dict for config vars
* test: [RLOS2023] add test for slate (#4629)
* test: add test for slate
* test: test cleanup and slate test update
* test: minor cleanup and change assert_loss function to equal instead of lower
* test: [RLOS2023] add test for cb with continous action (#4630)
* test: add test for slate
* test: test cleanup and slate test update
* test: minor cleanup and change assert_loss function to equal instead of lower
* test: add test for cb with continous action
* modify blocker testcase
* test: [RLOS2023] clean for e2e testing framework v2 (#4633)
* test: clean for e2e test v2
* test:change seed to same value for all tests
* test: add datagen driver (#4638)
* python black
* python black 2
* minor demo cleanup

---------

Co-authored-by: Alexey Taymanov <ataymano@microsoft.com>
Co-authored-by: Alexey Taymanov <41013086+ataymano@users.noreply.github.com>

ataymano reviewed Jun 9, 2023

python/tests/pytest.json (outdated, resolved)

ataymano reviewed Jun 9, 2023

python/tests/test_cb.json (outdated, resolved)

ataymano reviewed Jun 9, 2023

michiboo changed the base branch from master to rlos2023/test June 9, 2023 15:25

test: add basic cb test and configuration

6991585

michiboo force-pushed the py_cb_test branch from 46b4402 to 6991585 Compare June 9, 2023 16:09

test: add shared context data generation

c884f53

michiboo force-pushed the py_cb_test branch from db3bb69 to c884f53 Compare June 11, 2023 11:12

add test for cb_explore_adf

e0245d4

ataymano reviewed Jun 12, 2023

ataymano reviewed Jun 14, 2023

python/tests/core.py (outdated, resolved)

ataymano reviewed Jun 14, 2023

python/tests/core.py (outdated, resolved)

ataymano reviewed Jun 14, 2023

michiboo force-pushed the py_cb_test branch 3 times, most recently from b17b944 to bab17ed Compare June 18, 2023 13:58

test: dynamically create pytest test case

1dd3401

michiboo force-pushed the py_cb_test branch from bab17ed to 1dd3401 Compare June 18, 2023 14:18

ataymano reviewed Jun 19, 2023

michiboo force-pushed the py_cb_test branch 2 times, most recently from c4e4079 to cd1b870 Compare June 20, 2023 14:10

michiboo changed the title ~~test: [RLOS_2023][WIP] test for contextual bandit~~ test: [RLOS_2023] test for contextual bandit Jun 20, 2023

test: give fixed reward function signature

d3fdf18

michiboo force-pushed the py_cb_test branch from cd1b870 to d3fdf18 Compare June 20, 2023 15:48

ataymano merged commit d5f3c96 into VowpalWabbit:rlos2023/test Jun 20, 2023
