PO4ISR is a simple yet effective paradigm for intent-aware session recommendation (ISR), motivated by the advanced reasoning capability of large language models (LLMs). Specifically, we first create an initial prompt that instructs an LLM to predict the next item a user will interact with by inferring the varying user intents reflected in a session. Then, an effective prompt optimization mechanism automatically refines the prompt through iterative self-reflection. Finally, a prompt selection module efficiently chooses among the optimized prompts, leveraging the robust generalizability of LLMs across diverse domains.
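Conceptually, the optimization loop alternates between evaluating the current prompt, asking the LLM to reflect on its failures, and rewriting the prompt accordingly. Below is a minimal sketch of that loop; the `llm()` callable, the case layout, and both reflection templates are illustrative assumptions, not the actual PO4ISR implementation:

```python
def optimize_prompt(initial_prompt, train_cases, llm, n_iters=3):
    """Iteratively refine a prompt via LLM self-reflection (illustrative sketch).

    llm(text) is assumed to be a callable that sends `text` to an LLM
    and returns its completion as a string. Each case is assumed to be a
    dict with a "session" description and a "target" next item.
    """
    prompt = initial_prompt
    for _ in range(n_iters):
        # 1. Collect the training cases the current prompt gets wrong.
        errors = [c for c in train_cases
                  if c["target"] not in llm(prompt + "\n" + c["session"])]
        if not errors:
            break
        # 2. Ask the LLM to reflect on why the prompt failed on these cases.
        reflection = llm(
            f"The prompt below mispredicted the next item for some sessions.\n"
            f"Prompt: {prompt}\nFailed cases: {errors[:3]}\n"
            f"Explain the likely reasons for these failures."
        )
        # 3. Ask the LLM to rewrite the prompt based on the reflection.
        prompt = llm(
            f"Rewrite the prompt to fix the issues described.\n"
            f"Prompt: {prompt}\nIssues: {reflection}\nNew prompt:"
        )
    return prompt
```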
Install all dependencies via:

```bash
pip install -r requirements.txt
```
We adopt three real-world datasets from various domains: MovieLens-1M, Games, and Bundle. The datasets used in the experiments are located under the `Dataset` directory, each in both ID and text format. In addition to the randomly sampled training data, we also provide the full versions of the training data there. The files are described below (a short loading sketch follows the list).
- `train_sample_x.npy`: x sessions randomly selected from the full training data as the training set; x can be 50 or 150.
- `train.npy`: the full version of the training data.
- `valid.npy`: the validation set containing all validation sessions.
- `valid_candidate.npy`: the candidate set corresponding to each session in the validation set.
- `test.npy`: the test set containing all test sessions.
- `test_candidate_x.npy`: the candidate sets constructed with 5 different random seeds, corresponding to each session in the test set; x can be 0, 10, 42, 625, or 2023.
- `train_x.json`: the training set in text format, corresponding to the `train_sample_x.npy` file in ID format.
- `valid.json`: the validation set in text format containing both the validation sessions and the candidates, corresponding to the `valid.npy` and `valid_candidate.npy` files in ID format.
- `test_seed_x.json`: the test set in text format containing both the test sessions and the candidates, corresponding to the `test.npy` and `test_candidate_x.npy` files in ID format.
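A minimal sketch of loading these files, assuming the `.npy` files store variable-length sessions as object arrays (hence `allow_pickle=True`), the `.json` files store plain JSON, and datasets live in per-dataset subdirectories (the paths below are illustrative):

```python
import json
import numpy as np

# Load the ID-format session data. allow_pickle=True is assumed because
# sessions are variable-length item-ID sequences stored as object arrays.
train = np.load("Dataset/Games/train_sample_50.npy", allow_pickle=True)
test = np.load("Dataset/Games/test.npy", allow_pickle=True)
candidates = np.load("Dataset/Games/test_candidate_42.npy", allow_pickle=True)

# Load the corresponding text-format data used to build LLM prompts.
with open("Dataset/Games/test_seed_42.json") as f:
    test_text = json.load(f)

print(len(train), "training sessions;", len(test), "test sessions")
```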
The `tune.py` file corresponds to the prompt optimization process. Before running the code, you need to fill in your OpenAI API token in the `./PO4ISR/assets/openai.yaml` file and your wandb token in the `./PO4ISR/assets/overall.yaml` file.
```bash
python tune.py --dataset='dataset name' --sample_num='number of training data'
```
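For example, to optimize the prompt on the Games dataset with the 50-session training sample (assuming the argument values match the dataset names and sample sizes listed above):

```bash
python tune.py --dataset='Games' --sample_num=50
```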
The `test.py` file corresponds to the evaluation of the prompt.

```bash
python test.py --dataset='dataset name' --seed='value of the seed'
```
Note that all the optimal prompts are saved in the `PO4ISR/prompts.py` file. If you want to test the results with these prompts, you can replace them in `test.py` and run:

```bash
python test.py --dataset='dataset name' --seed='value of the seed' --api_key='your OpenAI API token'
```
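Conceptually, evaluation checks whether the ground-truth next item appears near the top of the LLM's ranking of the candidate set, repeated over the five random seeds listed above. A minimal scoring sketch under that reading; the metric name, data layout, and function names are illustrative assumptions, not the repo's API:

```python
def hit_at_k(ranked_candidates, target, k=5):
    """1 if the ground-truth item is among the top-k ranked candidates, else 0."""
    return int(target in ranked_candidates[:k])

def hit_rate(results, k=5):
    """Average HR@k over (ranked_candidates, target) pairs for one seed;
    repeat for seeds 0, 10, 42, 625, and 2023 and average the runs."""
    return sum(hit_at_k(r, t, k) for r, t in results) / len(results)
```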
We use the open-source framework Optuna to automatically find the optimal hyperparameters of all methods with 50 trials. For all methods, the item embedding size is searched in {32, 64, 128}, the learning rate in {1e-4, 1e-3, 1e-2}, and the batch size in {64, 128, 256}; an early-stopping mechanism halts model training, with a maximum of 100 epochs. The method-specific search spaces are:

- SKNN: K is searched in {50, 100, 150}.
- NARM: the hidden size is searched in [50, 200] with a step of 50, and the number of layers in {1, 2, 3}.
- GCE-GNN: the number of hops is searched in {1, 2}; the dropout rate for global aggregators in [0, 0.8] with a step of 0.2, and the dropout rate for local aggregators in {0, 0.5}.
- MCPRN: τ is searched in {0.01, 0.1, 1, 10}, and the number of purpose channels in {1, 2, 3, 4}.
- HIDE: the number of factors is searched in {1, 3, 5, 7, 9}; the regularization and balance weights in {1e-5, 1e-4, 1e-3, 1e-2}; the window size in [1, 10] with a step of 1; and the sparsity coefficient is set to 0.4.
- Atten-Mixer: the intent level L is searched in [1, 10] with a step of 1, and the number of attention heads in {1, 2, 4, 8}.

The optimal parameter settings are shown in Table 1, and a minimal Optuna sketch follows the table.
Table 1: Optimal Parameter Settings for Non-LLM Baselines (models: SKNN, FPMC, NARM, STAMP, GCE-GNN, MCPRN, HIDE, Atten-Mixer; datasets: Bundle, ML-1M, Games). The concrete optimal values are provided in `config.py`.
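The search described above maps directly onto Optuna's API. Below is a minimal sketch for tuning, e.g., NARM, where `train_and_validate()` is a hypothetical stand-in for the repo's training loop that returns the validation metric to maximize:

```python
import optuna

def objective(trial):
    # Shared search space from the setup above.
    embed_size = trial.suggest_categorical("item_embedding_size", [32, 64, 128])
    lr = trial.suggest_categorical("learning_rate", [1e-4, 1e-3, 1e-2])
    batch_size = trial.suggest_categorical("batch_size", [64, 128, 256])
    # NARM-specific space: hidden size in [50, 200] stepped by 50, layers in {1, 2, 3}.
    hidden_size = trial.suggest_int("hidden_size", 50, 200, step=50)
    n_layers = trial.suggest_categorical("num_layers", [1, 2, 3])

    # Hypothetical helper: trains with these hyperparameters (using early
    # stopping, up to 100 epochs) and returns the validation metric.
    return train_and_validate(embed_size, lr, batch_size, hidden_size, n_layers)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```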
You can run the following command to tune the model and find the optimal parameter combination:

```bash
python tune.py --dataset='dataset name' --sample_num='number of training data' --model='model name'
```
After completing the tuning process, you can find the optimal parameter settings in the `tune_log` directory. Additionally, we have placed the optimal parameters obtained during the experiments in `config.py`. You can also use the following command to directly test a model:

```bash
python test.py --dataset='dataset name' --model='model name' --seed='value of the seed'
```
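For example (again assuming the argument values match the dataset and model names used above):

```bash
python test.py --dataset='Games' --model='NARM' --seed=42
```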
We refer to the following repositories to implement the baselines in our code:
- Non-LLM-Baselines: Understanding-Diversity-in-SBRSs
- LLM-Baseline-NIR: LLM-Next-Item-Rec