Merge branch 'feat/add-random-algorithm' of https://github.com/guedes…
Paitesanshi committed Oct 10, 2023
2 parents 6d2ee98 + 4395650 commit e4fe4e9
Showing 34 changed files with 1,553 additions and 79 deletions.
3 changes: 2 additions & 1 deletion .github/workflows/python-package.yml
@@ -48,6 +48,7 @@ jobs:
pip install torch-scatter -f https://data.pyg.org/whl/torch-`python -c "import torch;print(torch.__version__)"`.html
pip install setuptools==59.5.0
pip install plotly
pip install kmeans-pytorch
# Use "python -m pytest" instead of "pytest" to fix imports
- name: Test Overall
run: |
@@ -90,4 +91,4 @@ jobs:
- name: Apply code-format changes
uses: stefanzweifel/git-auto-commit-action@v4
with:
commit_message: Format Python code according to PEP8
commit_message: Format Python code according to PEP8
50 changes: 25 additions & 25 deletions asset/dataset_list.json
@@ -8,7 +8,7 @@
"inter_num": "-",
"sparsity": "-",
"type": "Rating",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/MovieLens.md"
},
{
@@ -19,7 +19,7 @@
"inter_num": "7,813,737",
"sparsity": "99.05%",
"type": "Rating [-1, 1-10]",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/Anime.md"
},
{
@@ -30,7 +30,7 @@
"inter_num": "188,478",
"sparsity": "99.99%",
"type": "Rating [1-5]",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/Epinions.md"
},
{
@@ -41,7 +41,7 @@
"inter_num": "-",
"sparsity": "-",
"type": "Rating",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/Yelp.md"
},
{
@@ -52,7 +52,7 @@
"inter_num": "100,480,507",
"sparsity": "98.82%",
"type": "Rating [1-5]",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/Netflix.md"
},
{
@@ -63,7 +63,7 @@
"inter_num": "1,149,780",
"sparsity": "99.99%",
"type": "Rating [0-10]",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/Book-Crossing.md"
},
{
@@ -74,7 +74,7 @@
"inter_num": "4,136,360",
"sparsity": "44.22%",
"type": "Rating [-10, 10]",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/Jester.md"
},
{
@@ -85,18 +85,18 @@
"inter_num": "2,125,056",
"sparsity": "89.73%",
"type": "Rating [0, 5]",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/Douban.md"
},
{
"dataset": "Yahoo Music",
"dataset_link": "",
"user_num": "1,948,882",
"item_num": "98,211",
"inter_num": "11,557,943",
"inter_num": "111,557,943",
"sparsity": "99.99%",
"type": "Rating [0, 100]",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/YahooMusic.md"
},
{
@@ -107,7 +107,7 @@
"inter_num": "-",
"sparsity": "-",
"type": "Rating",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/KDD2010.md"
},
{
@@ -118,7 +118,7 @@
"inter_num": "-",
"sparsity": "-",
"type": "Rating [0, 5]",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/Amazon.md"
},
{
@@ -129,7 +129,7 @@
"inter_num": "1,445,622",
"sparsity": "99.74%",
"type": "-",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/Pinterest.md"
},
{
@@ -140,7 +140,7 @@
"inter_num": "6,442,892",
"sparsity": "99.99%",
"type": "Check-in",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/Gowalla.md"
},
{
@@ -151,7 +151,7 @@
"inter_num": "92,834",
"sparsity": "99.72%",
"type": "Click",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/LastFM.md"
},
{
@@ -162,7 +162,7 @@
"inter_num": "993,483",
"sparsity": "99.99%",
"type": "Click",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/DIGINETICA.md"
},
{
@@ -173,7 +173,7 @@
"inter_num": "7,793,069",
"sparsity": "99.99%",
"type": "Buy",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/Steam.md"
},
{
@@ -184,7 +184,7 @@
"inter_num": "817,741",
"sparsity": "99.89%",
"type": "Click",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/TaFeng.md"
},
{
@@ -195,7 +195,7 @@
"inter_num": "-",
"sparsity": "-",
"type": "Check-in",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/Foursquare.md"
},
{
@@ -206,7 +206,7 @@
"inter_num": "44,528,127",
"sparsity": "99.99%",
"type": "Click/Buy",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/Tmall.md"
},
{
@@ -217,7 +217,7 @@
"inter_num": "34,154,697",
"sparsity": "99.99%",
"type": "Click/Buy",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/YOOCHOOSE.md"
},
{
@@ -228,7 +228,7 @@
"inter_num": "2,756,101",
"sparsity": "99.99%",
"type": "View/Addtocart/Transaction",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/Retailrocket.md"
},
{
@@ -239,7 +239,7 @@
"inter_num": "1,088,161,692",
"sparsity": "99.71%",
"type": "Click",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/LFM-1b.md"
},
{
@@ -250,7 +250,7 @@
"inter_num": "-",
"sparsity": "-",
"type": "Click",
"link_name": "scipt",
"link_name": "script",
"link_url": "https://github.com/RUCAIBox/RecDatasets/blob/master/conversion_tools/usage/MIND.md"
},
{
@@ -452,4 +452,4 @@
"link_url": "https://tianchi.aliyun.com/dataset/dataDetail?dataId=56#1"
}
]
}
}
Binary file added asset/questionnaire.xlsx
Binary file not shown.
Binary file added docs/source/asset/diffrec.png
Binary file added docs/source/asset/ldiffrec.png
2 changes: 1 addition & 1 deletion docs/source/index.rst
@@ -11,7 +11,7 @@ Introduction
RecBole is a unified, comprehensive and efficient framework developed based on PyTorch.
It aims to help the researchers to reproduce and develop recommendation models.

In the lastest release, our library includes 87 recommendation algorithms `[Model List]`_, covering four major categories:
In the lastest release, our library includes 89 recommendation algorithms `[Model List]`_, covering four major categories:

- General Recommendation
- Sequential Recommendation
94 changes: 94 additions & 0 deletions docs/source/user_guide/model/general/diffrec.rst
@@ -0,0 +1,94 @@
DiffRec
===========

Introduction
---------------------

`[paper] <https://dl.acm.org/doi/10.1145/3539618.3591663>`_

**Title:** Diffusion Recommender Model

**Authors:** Wenjie Wang, Yiyan Xu, Fuli Feng, Xinyu Lin, Xiangnan He, Tat-Seng Chua

**Abstract:** Generative models such as Generative Adversarial Networks (GANs) and Variational Auto-Encoders (VAEs) are widely utilized to model the generative process of user interactions. However, they suffer from intrinsic limitations such as the instability of GANs and the restricted representation ability of VAEs. Such limitations hinder the accurate modeling of the complex user interaction generation procedure, such as noisy interactions caused by various interference factors. In light of the impressive advantages of Diffusion Models (DMs) over traditional generative models in image synthesis, we propose a novel Diffusion Recommender Model (named DiffRec) to learn the generative process in a denoising manner. To retain personalized information in user interactions, DiffRec reduces the added noises and avoids corrupting users’ interactions into pure noises like in image synthesis. In addition, we extend traditional DMs to tackle the unique challenges in recommendation: high resource costs for large-scale item prediction and temporal shifts of user preference. To this end, we propose two extensions of DiffRec: L-DiffRec clusters items for dimension compression and conducts the diffusion processes in the latent space; and T-DiffRec reweights user interactions based on the interaction timestamps to encode temporal information. We conduct extensive experiments on three datasets under multiple settings (e.g., clean training, noisy training, and temporal training). The empirical results validate the superiority of DiffRec with two extensions over competitive baselines.

.. image:: ../../../asset/diffrec.png
    :width: 500
    :align: center

Running with RecBole
-------------------------

**Model Hyper-Parameters:**

- ``noise_schedule (str)`` : The schedule for noise generating: ['linear', 'linear-var', 'cosine', 'binomial']. Defaults to ``'linear'``.
- ``noise_scale (float)`` : The scale for noise generating. Defaults to ``0.001``.
- ``noise_min (float)`` : Noise lower bound for noise generating. Defaults to ``0.0005``.
- ``noise_max (float)`` : Noise upper bound for noise generating. Defaults to ``0.005``.
- ``sampling_noise (bool)`` : Whether to use sampling noise. Defaults to ``False``.
- ``sampling_steps (int)`` : Steps of the forward process during inference. Defaults to ``0``.
- ``reweight (bool)`` : Whether to assign different weights to different timesteps. Defaults to ``True``.
- ``mean_type (str)`` : MeanType for diffusion: ['x0', 'eps']. Defaults to ``'x0'``.
- ``steps (int)`` : Diffusion steps. Defaults to ``5``.
- ``history_num_per_term (int)`` : The number of history items needed to calculate loss weight. Defaults to ``10``.
- ``beta_fixed (bool)`` : Whether to fix the variance of the first step to prevent overfitting. Defaults to ``True``.
- ``dims_dnn (list of int)`` : The dims for the DNN. Defaults to ``[300]``.
- ``embedding_size (int)`` : Timestep embedding size. Defaults to ``10``.
- ``mlp_act_func (str)`` : Activation function for MLP. Defaults to ``'tanh'``.
- ``time-aware (bool)`` : Whether to use T-DiffRec. Defaults to ``False``.
- ``w_max (float)`` : The upper bound of the time-aware interaction weight. Defaults to ``1``.
- ``w_min (float)`` : The lower bound of the time-aware interaction weight. Defaults to ``0.1``.
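
As a rough illustration of how ``noise_scale``, ``noise_min``, ``noise_max`` and ``steps`` might interact under the ``'linear'`` schedule, the sketch below assumes that the per-step noise level is interpolated evenly between ``noise_scale * noise_min`` and ``noise_scale * noise_max``. This interpolation rule and the helper name ``linear_noise_schedule`` are assumptions made for illustration and are not stated on this page; check the model source for the exact definition.

.. code:: python

    def linear_noise_schedule(noise_scale, noise_min, noise_max, steps):
        """Return `steps` evenly spaced noise levels (illustrative helper, not RecBole API)."""
        # Assumed bounds: the scale multiplies both the lower and the upper bound.
        start = noise_scale * noise_min
        end = noise_scale * noise_max
        if steps == 1:
            return [start]
        step_size = (end - start) / (steps - 1)
        return [start + i * step_size for i in range(steps)]


    # Defaults listed above: noise_scale=0.001, noise_min=0.0005, noise_max=0.005, steps=5.
    betas = linear_noise_schedule(0.001, 0.0005, 0.005, 5)
    for t, beta in enumerate(betas, start=1):
        print(f"step {t}: {beta:.3e}")

Under this assumption the defaults give very small noise levels, which would be in line with the abstract's remark that DiffRec reduces the added noise rather than corrupting interactions into pure noise.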


**A Running Example:**

Write the following code to a Python file, such as ``run.py``:

.. code:: python

    from recbole.quick_start import run_recbole

    run_recbole(model='DiffRec', dataset='ml-100k')

And then:

.. code:: bash

    python run.py

**Notes:**

- ``w_max`` and ``w_min`` are unused when ``time-aware`` is False.
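
To make the role of ``w_min`` and ``w_max`` more concrete, the sketch below assumes that T-DiffRec weights a user's interactions linearly by recency, from ``w_min`` for the oldest interaction to ``w_max`` for the newest. The linear rule and the helper name ``time_aware_weights`` are illustrative assumptions, not RecBole's API.

.. code:: python

    def time_aware_weights(timestamps, w_min=0.1, w_max=1.0):
        """Map each interaction to a weight in [w_min, w_max] by recency rank (hypothetical helper)."""
        n = len(timestamps)
        if n == 0:
            return []
        if n == 1:
            return [w_max]
        # Rank interactions from oldest (rank 0) to newest (rank n - 1).
        order = sorted(range(n), key=lambda i: timestamps[i])
        weights = [0.0] * n
        for rank, idx in enumerate(order):
            weights[idx] = w_min + (w_max - w_min) * rank / (n - 1)
        return weights


    # Four interactions of one user, given in arbitrary order.
    weights = time_aware_weights([300, 100, 400, 200])
    print([round(w, 2) for w in weights])

The weights are returned in the same order as the input timestamps, so they can be multiplied element-wise with the user's interaction vector.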

Tuning Hyper Parameters
-------------------------

If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, copy the following settings into a file named ``hyper.test``.

.. code:: bash

    learning_rate choice [1e-3,1e-4,1e-5]
    dims_dnn choice ['[300]','[200,600]','[1000]']
    steps choice [2,5,10,50]
    noise_scale choice [0,1e-5,1e-4,1e-3,1e-2,1e-1]
    noise_min choice [5e-4,1e-3,5e-3]
    noise_max choice [5e-3,1e-2]
    w_min choice [0.1,0.2,0.3]

Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are the optimal ranges for this model.
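
Each line of ``hyper.test`` lists independent choices, so an exhaustive search over the file above would try every combination. The sketch below counts them using only the Python standard library. It is a back-of-the-envelope aid rather than part of RecBole, and the dictionary simply restates the ranges above, with parameter names spelled as in the hyper-parameter list.

.. code:: python

    from math import prod

    # Candidate values copied from the hyper.test example above.
    search_space = {
        "learning_rate": [1e-3, 1e-4, 1e-5],
        "dims_dnn": ["[300]", "[200,600]", "[1000]"],
        "steps": [2, 5, 10, 50],
        "noise_scale": [0, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1],
        "noise_min": [5e-4, 1e-3, 5e-3],
        "noise_max": [5e-3, 1e-2],
        "w_min": [0.1, 0.2, 0.3],
    }

    # The number of full-grid trials is the product of the per-parameter choice counts.
    total_trials = prod(len(values) for values in search_space.values())
    print(f"{len(search_space)} parameters, {total_trials} combinations")

With several thousand combinations, a full grid is likely to be expensive, so it may be worth narrowing the ranges or tuning a few parameters at a time.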

Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the model:

.. code:: bash

    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test

For more details about parameter tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.


If you want to change parameters, dataset or evaluation settings, take a look at:

- :doc:`../../../user_guide/config_settings`
- :doc:`../../../user_guide/data_intro`
- :doc:`../../../user_guide/train_eval_intro`
- :doc:`../../../user_guide/usage`