This repository is the implementation of SOFA, the Simulator for OFfline leArning and evaluation.
Keeping Dataset Biases out of the Simulation: A Debiased Simulator for Reinforcement Learning based Recommender Systems. Jin Huang, Harrie Oosterhuis, Maarten de Rijke, Herke van Hoof. RecSys 2020.
The framework shows how RL4Rec typically interacts with a simulation-based environment. A state is the user's historical interactions, an action is an item recommended by the RS, and a reward is related to user feedback.
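The interaction described above can be sketched as a simple loop. This is a minimal illustration only: `ToySimulator`, `run_episode`, and `round_robin_policy` are hypothetical names and do not reflect SOFA's actual API, and the click model (a fixed per-item probability) is an assumption made for the example.

```python
import random


class ToySimulator:
    """Hypothetical stand-in for a simulation-based environment (not SOFA's API)."""

    def __init__(self, click_prob, seed=0):
        # click_prob[item] = assumed probability of positive feedback for that item
        self.click_prob = click_prob
        self.rng = random.Random(seed)
        self.history = []

    def reset(self):
        # The state is the user's interaction history, empty at the start.
        self.history = []
        return tuple(self.history)

    def step(self, item):
        # The action is the recommended item; the reward reflects user feedback.
        clicked = self.rng.random() < self.click_prob[item]
        reward = 1.0 if clicked else 0.0
        self.history.append((item, reward))
        return tuple(self.history), reward


def run_episode(env, policy, horizon):
    """Roll out one episode and return the final state and the summed reward."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(horizon):
        action = policy(state)
        state, reward = env.step(action)
        total_reward += reward
    return state, total_reward


def round_robin_policy(state):
    # Placeholder policy: cycle through the three items in order.
    return len(state) % 3


env = ToySimulator(click_prob={0: 0.9, 1: 0.2, 2: 0.5})
final_state, total_reward = run_episode(env, round_robin_policy, horizon=6)
print(len(final_state), total_reward)
```

In a real experiment the placeholder policy would be replaced by a learned one, such as the DQN-based policy used in the paper.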
To counter the effect of bias in logged data, we introduce a debiasing step in the simulation pipeline that corrects for these biases before the data is used to simulate user behavior.
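To illustrate why such a correction matters, the sketch below uses inverse propensity scoring (IPS), a common technique for correcting selection bias. It is an assumption chosen for illustration and is not taken from this repository's code; the function names, the toy data, and the propensities (assumed known here) are all hypothetical.

```python
def naive_mean(logged):
    """Average over observed ratings only; biased when observation is not uniform."""
    return sum(rating for _, _, rating in logged) / len(logged)


def ips_mean(logged, propensity, num_users, num_items):
    """IPS estimate of the mean rating over ALL user-item pairs.

    Each observed rating is weighted by 1 / P(observed), so rarely observed
    items count for more.
    """
    weighted = sum(rating / propensity[item] for _, item, rating in logged)
    return weighted / (num_users * num_items)


# Toy setup (assumed): 4 users, 2 items. Every user would rate A as 5 and B as 1,
# so the true mean over all 8 user-item pairs is 3.0.
# Item A is always observed; item B is observed for only 1 of 4 users.
logged = [(user, "A", 5.0) for user in range(4)] + [(0, "B", 1.0)]
propensity = {"A": 1.0, "B": 0.25}

naive = naive_mean(logged)
debiased = ips_mean(logged, propensity, num_users=4, num_items=2)
print(naive, debiased)
```

The naive average overestimates user satisfaction because the well-liked item is over-represented in the log, while the IPS-weighted estimate recovers the true mean in this toy case. In practice propensities are not known and have to be estimated.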
$ cd examples
$ python run_dqn.py
We provide the details of the DQN-based policy used in the experiments and the related hyperparameters (see Appendix). We also provide the slides used for the presentation at RecSys 2020.
If you use our code, please cite our paper:
@inproceedings{huang2020keeping,
title={Keeping Dataset Biases out of the Simulation: A Debiased Simulator for Reinforcement Learning based Recommender Systems},
author={Huang, Jin and Oosterhuis, Harrie and de Rijke, Maarten and van Hoof, Herke},
booktitle={Fourteenth ACM Conference on Recommender Systems},
pages={190--199},
year={2020}
}