title: Stochastic Contextual Dueling Bandits under Linear Stochastic Transitivity Models
booktitle: Proceedings of the 39th International Conference on Machine Learning
abstract: We consider the regret minimization task in a dueling bandits problem with context information. In every round of the sequential decision problem, the learner makes a context-dependent selection of two choice alternatives (arms) to be compared with each other and receives feedback in the form of noisy preference information. We assume that the feedback process is determined by a linear stochastic transitivity model with contextualized utilities (CoLST), and the learner’s task is to include the best arm (with highest latent context-dependent utility) in the duel. We propose a computationally efficient algorithm, \Algo{CoLSTIM}, which makes its choice based on imitating the feedback process using perturbed context-dependent utility estimates of the underlying CoLST model. If each arm is associated with a $d$-dimensional feature vector, we show that \Algo{CoLSTIM} achieves a regret of order $\tilde O(\sqrt{dT})$ after $T$ learning rounds. We also establish the optimality of \Algo{CoLSTIM} by showing a lower bound for the weak regret that refines the existing average regret analysis. Our experiments demonstrate its superiority over state-of-the-art algorithms for special cases of CoLST models.
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: bengs22a
month: 0
tex_title: Stochastic Contextual Dueling Bandits under Linear Stochastic Transitivity Models
firstpage: 1764
lastpage: 1786
page: 1764-1786
order: 1764
cycles: false
bibtex_author: Bengs, Viktor and Saha, Aadirupa and H{\"u}llermeier, Eyke
author:
- given: Viktor
  family: Bengs
- given: Aadirupa
  family: Saha
- given: Eyke
  family: Hüllermeier
date: 2022-06-28
address:
container-title: Proceedings of the 39th International Conference on Machine Learning
volume: 162
genre: inproceedings
issued:
  date-parts:
  - 2022
  - 6
  - 28
pdf: https://proceedings.mlr.press/v162/bengs22a/bengs22a.pdf
extras:
- label: Other Files
  link: https://media.icml.cc/Conferences/ICML2022/supplementary/bengs22a-supp.zip
