Bilkent-CYBORG/OCLOK-UCB

Contextual Combinatorial Volatile Bandits via Gaussian Processes

This repository is the official implementation of Contextual Combinatorial Volatile Bandits via Gaussian Processes, submitted to Machine Learning.

[Figure: Illustration of our algorithm, O'CLOK-UCB.]

Requirements

To install requirements:

pip install -r requirements.txt

We use the gpflow library for all GP-related computations, and gpflow in turn depends on tensorflow. Our code also uses the TIM+ algorithm, so you must link the C++ TIM+ code to Python. Follow here for linking instructions. Once the library has been generated, place a copy both in the root directory, where main.py is, and inside the tim_plus directory.

Running the simulations

We ran three simulations in total. None of the algorithms that we implement and test perform offline learning, so there is no 'training' to be done. However, to make the simulations repeatable and faster, we first generate the arm contexts, rewards, and other setup-related information and save them as HDF5 (Simulation I) or as pickled DataFrames (Simulations II & III). Links to the generated datasets that we used are at the bottom of this README file. By default, running the script (main.py) generates new datasets and runs the simulations on them.

Simulation I (movie recommendation)

To run Simulation I, provide the argument sim_1 to the main.py script. For example, to re-generate this simulation's datasets and run the simulation on the newly generated datasets, use

python main.py sim_1

and to run the simulation using the pre-generated dataset, which must be in the root directory, use

python main.py sim_1 --use_saved_dataset

Simulation II (Foursquare)

To run Simulation II, use the argument sim_2. You can provide the --use_saved_dataset argument to use pre-generated and saved datasets.

Simulation III (Varying kernel parameters)

To run Simulation III, use the argument sim_3. You can provide the --use_saved_dataset argument to use pre-generated and saved datasets.
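The invocations for the three simulations follow the same pattern, so they can be assembled programmatically. The following is a minimal sketch and not part of the repository: the helper name `build_command` is hypothetical, and it assumes that the simulation arguments are exactly `sim_1`, `sim_2`, and `sim_3` and that `--use_saved_dataset` is accepted alongside each of them, as described in this README.

```python
import sys

# Simulation arguments as named in this README (assumed to be the exact CLI values).
SIMULATIONS = ("sim_1", "sim_2", "sim_3")


def build_command(simulation, use_saved_dataset=False):
    """Return the argument list for one invocation of main.py (hypothetical helper)."""
    if simulation not in SIMULATIONS:
        raise ValueError(f"unknown simulation: {simulation!r}")
    command = [sys.executable, "main.py", simulation]
    if use_saved_dataset:
        command.append("--use_saved_dataset")
    return command


if __name__ == "__main__":
    # Print the commands instead of running them; nothing is executed here.
    for name in SIMULATIONS:
        print(" ".join(build_command(name, use_saved_dataset=True)))
```

Each returned list can be passed to `subprocess.run(..., check=True)` from the root directory to actually launch a simulation.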

Generating plots and figures

After the script has run the simulations, it automatically plots the reward and regret curves and saves them as PDFs. If you later want to re-generate the plots without running the whole simulation again, pass the --only_plot argument to the main.py script.

Generated datasets

You can download the generated datasets that we ran the simulations with below:

Simulation I dataset

The HDF5 dataset file can be downloaded here. Make sure to place the 'movielens_simulation.hdf5' file in the root directory where main.py is.

Simulation II datasets

A zip file of the pickled DataFrames used for the Foursquare simulation (Simulation II) can be downloaded here. Make sure to extract both 'fs_tky_simulation_df_uni' and 'fs_tky_simulation_df_nuni' and place them in the root directory where main.py is.

We use the Wolfram Engine to learn the distribution of the TKY dataset's locations; thus to generate the dataset, you must have the Wolfram Engine installed. It can be installed for free here. Moreover, you must download the exported LearnedDistribution file available here and set its absolute path in fs_problem_model.py. Note that if you download the saved datasets (‘fs_tky_simulation_df_uni’ and ‘fs_tky_simulation_df_nuni’), you will NOT need to download the Wolfram Engine. The Wolfram Engine is only needed to generate the datasets.
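Before running with --use_saved_dataset, it can help to confirm that the downloaded files are where the script expects them. The following is a minimal sketch, not part of the repository: the helper name `missing_datasets` and the `SAVED_DATASETS` mapping are hypothetical, and only the file names quoted in this README are checked (the Simulation III file names are not listed here, so they are left out).

```python
from pathlib import Path

# File names as given in this README; they are expected next to main.py.
SAVED_DATASETS = {
    "sim_1": ["movielens_simulation.hdf5"],
    "sim_2": ["fs_tky_simulation_df_uni", "fs_tky_simulation_df_nuni"],
}


def missing_datasets(simulation, root="."):
    """Return the expected dataset files for `simulation` that are absent from `root`."""
    root = Path(root)
    return [name for name in SAVED_DATASETS[simulation] if not (root / name).is_file()]


if __name__ == "__main__":
    # Report, for each simulation, whether its saved datasets are in the current directory.
    for sim in SAVED_DATASETS:
        missing = missing_datasets(sim)
        status = "ready" if not missing else "missing: " + ", ".join(missing)
        print(f"{sim}: {status}")
```

Run it from the root directory; an empty result for a simulation means its saved datasets are in place.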

Simulation III datasets

A zip file of the pickled DataFrames used for the varying arm codependence simulation (Simulation III) can be downloaded here. Make sure to extract all five files, each corresponding to a different kernel lengthscale, and place them in the root directory where main.py is.

Results

Our algorithm outperforms the current contextual combinatorial volatile multi-armed bandit (CCV-MAB) algorithm, ACC-UCB. The figure below shows the time-averaged reward of the sparse version of our algorithm (SO'CLOK_UCB) and of ACC-UCB on a movie-recommendation simulation (Simulation I) using the MovieLens dataset. Notice that even with just 2 inducing points, we manage to outperform ACC-UCB. See Section 5 of the paper for a detailed explanation of the setup and in-depth analysis.

[Figure: Main results figure]
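For readers re-plotting the curves, the time-averaged reward at round t is taken here to mean the cumulative reward up to round t divided by t. This is the usual definition and an assumption about how the curve is computed, not a statement taken from the repository's code. A minimal sketch, with a hypothetical helper name and made-up reward values:

```python
from itertools import accumulate


def time_averaged_reward(rewards):
    """Return the running mean: (sum of rewards in rounds 1..t) / t for each round t."""
    return [total / t for t, total in enumerate(accumulate(rewards), start=1)]


if __name__ == "__main__":
    example_rewards = [1.0, 3.0, 2.0]  # illustrative values only, not results from the paper
    print(time_averaged_reward(example_rewards))  # [1.0, 2.0, 2.0]
```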
