-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: [RLOS_2023] test for contextual bandit #4612
Conversation
python/tests/core.py
Outdated
|
||
CURR_DICT = os.path.dirname(os.path.abspath(__file__)) | ||
|
||
def combine_list_cmds_grids(cmds, base_grid): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like a lot of it is already implemented as add/mul operators of Grid and can be reused
@@ -0,0 +1,7 @@ | |||
def constant_probability(chosen_action, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems to be even_probability(1)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the fact that this function both can be represented as even_probability and not using any of its arguments is confuing.
Can we either remove it or make it with no arguments?
python/tests/reward_functions.py
Outdated
return reward[chosen_action - 1] | ||
|
||
|
||
def fixed_reward_for_diff_context(**kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name seems to be misleading
return 1 | ||
|
||
|
||
def even_probability(chosen_action, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to use kwargs here but not regular named arguments?
python/tests/reward_functions.py
Outdated
return 1 | ||
|
||
|
||
def constant_reward(**kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kwargs seem to be replaceable with regular arguments
python/tests/test_cb.json
Outdated
"name": "fixed_reward", | ||
"params": {} | ||
}, | ||
"probability_function": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to rename to "logging_policy"
b17b944
to
bab17ed
Compare
python/tests/reward_functions.py
Outdated
return 1 | ||
|
||
|
||
def constant_reward(**kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def constant_reward(**kwargs): | |
def constant_reward(chosen_action: int, reward: float): |
python/tests/reward_functions.py
Outdated
@@ -0,0 +1,22 @@ | |||
def fixed_reward(**kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do fixed signature here:
def reward_func(context: int, chosen_action: int)
python/tests/reward_functions.py
Outdated
def fixed_reward_two_action(**kwargs): | ||
chosen_action = kwargs["chosen_action"] | ||
chosen_context = kwargs["chosen_context"] | ||
if chosen_context == 1 and chosen_action == 2: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to expose this logic to config file somehow:
- either via more descriptive function name
- or via extra parameter (like reward matrix)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but let's do it in next Pr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please replace chosen_context with context? - agent is not choosing it
c4e4079
to
cd1b870
Compare
* intro notebook * test: [RLOS_2023][WIP] updated test for regression weight (#4600) * test: add test for regression weight * test: make test more reusable by using json to specify pytest * test: minor fix on naming * test: add and option to python json test * test: [RLOS_2023] test for contextual bandit (#4612) * test: add basic cb test and configuration * test: add shared context data generation * add test for cb_explore_adf * test: dynamically create pytest test case * test: give fixed reward function signature * test: [RLOS_2023] [WIP] Support + and * expression for grids (#4618) * test: add basic cb test and configuration * test: add shared context data generation * add test for cb_explore_adf * test: dynamically create pytest test case * test: give fixed reward function signature * test: support + and * expression for grids * fix empty expression bugs * test: [RLOS2023] [WIP] add more arguments for reg&cb tests (#4619) * test: add more arguments for reg&cb tests * test: fix minor bug in generate expression & add loss funcs to tests * test: [RLOS2023] [WIP] add classification test (#4623) * test: add more arguments for reg&cb tests * test: fix minor bug in generate expression & add loss funcs to tests * test: add test for classification * test: organize test framework structure (#4624) * test: [RLOS2023][WIP] add option for storing output and grid language redefinition (#4627) * test: redesign grid lang * test: add option for store output * test: change list to dict for config vars * test: [RLOS2023] add test for slate (#4629) * test: add test for slate * test: test cleanup and slate test update * test: minor cleanup and change assert_loss function to equal instead of lower * test: [RLOS2023] add test for cb with continous action (#4630) * test: add test for slate * test: test cleanup and slate test update * test: minor cleanup and change assert_loss function to equal instead of lower * test: add test for cb with continous action * modify blocker testcase * test: [RLOS2023] clean for e2e testing framework v2 (#4633) * test: clean for e2e test v2 * test:change seed to same value for all tests * test: add datagen driver (#4638) * python black * python black 2 * minor demo cleanup --------- Co-authored-by: Alexey Taymanov <ataymano@microsoft.com> Co-authored-by: Alexey Taymanov <41013086+ataymano@users.noreply.github.com>
No description provided.