Merge pull request #1383 from Sherry-XLL/1.1.x
FIX: rename neg_sampling and fix bugs
Sherry-XLL authored Aug 15, 2022
2 parents 5e21302 + 1a63916 commit 2cab6c8
Showing 30 changed files with 41 additions and 33 deletions.
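
In practice, the rename replaces the old shorthand flag with an explicit dictionary. A minimal sketch of the new usage (the config keys come from the doc changes below; the model and dataset are only illustrative, and the ``config_dict`` route mirrors the FPMC example in this commit):

from recbole.quick_start import run_recbole

# Old flag (pre-rename):  --neg_sampling="{'uniform': 1}"
# New flag (this commit): --train_neg_sample_args="{'distribution': 'uniform', 'sample_num': 1}"
# The same setting as a config dict; 'loss_type' is set to 'BPR' because,
# as the docs below note, CE loss needs no negative sampling.
parameter_dict = {
    'loss_type': 'BPR',
    'train_neg_sample_args': {'distribution': 'uniform', 'sample_num': 1},
}
run_recbole(model='GRU4Rec', dataset='ml-100k', config_dict=parameter_dict)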
2 changes: 1 addition & 1 deletion docs/source/user_guide/model/sequential/bert4rec.rst
@@ -47,7 +47,7 @@ Running with RecBole
- ``layer_norm_eps (float)`` : A value added to the denominator for numerical stability. Defaults to ``1e-12``.
- ``initializer_range (float)`` : The standard deviation for normal initialization. Defaults to ``0.02``.
- ``mask_ratio (float)`` : The probability of an item being replaced by the MASK token. Defaults to ``0.2``.
- - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--neg_sampling="{'uniform': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+ - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--train_neg_sample_args="{'distribution': 'uniform', 'sample_num': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
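
For readers new to the two objectives contrasted in the bullet above, a minimal sketch of pair-wise BPR versus multi-class CE (illustrative only, not RecBole's exact implementation):

import torch
import torch.nn.functional as F

def bpr_loss(pos_score: torch.Tensor, neg_score: torch.Tensor) -> torch.Tensor:
    # Pair-wise: maximize the gap between positive and negative item scores,
    # i.e., minimize -log(sigmoid(pos - neg)); requires sampled negatives.
    return -torch.log(torch.sigmoid(pos_score - neg_score) + 1e-10).mean()

def ce_loss(logits: torch.Tensor, target_items: torch.Tensor) -> torch.Tensor:
    # Multi-class: scores over all items ([batch, n_items]); the target item
    # is the ground-truth class, so no negative sampling is needed.
    return F.cross_entropy(logits, target_items)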


**A Running Example:**
2 changes: 1 addition & 1 deletion docs/source/user_guide/model/sequential/caser.rst
@@ -26,7 +26,7 @@ Running with RecBole
- ``n_v (int)`` : The number of vertical Convolutional filters. Defaults to ``8``.
- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``1e-4``.
- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.4``.
- - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--neg_sampling="{'uniform': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+ - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--train_neg_sample_args="{'distribution': 'uniform', 'sample_num': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.

**A Running Example:**

2 changes: 1 addition & 1 deletion docs/source/user_guide/model/sequential/core.rst
@@ -32,7 +32,7 @@ Running with RecBole
- ``hidden_act (str)`` : The activation function in feed-forward layer. Defaults to ``'gelu'``. Range in ``['gelu', 'relu', 'swish', 'tanh', 'sigmoid']``.
- ``layer_norm_eps (float)`` : A value added to the denominator for numerical stability. Defaults to ``1e-12``.
- ``initializer_range (float)`` : The standard deviation for normal initialization. Defaults to ``0.02``.
- - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--neg_sampling="{'uniform': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+ - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--train_neg_sample_args="{'distribution': 'uniform', 'sample_num': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
- ``dnn_type (str)`` : The type of DNN. If it is set to ``'trm'``, CORE will leverage a Transformer encoder to learn weights. If it is set to ``'ave'``, CORE will simply use mean pooling for session encoding. Defaults to ``'trm'``. Range in ``['trm', 'ave']``.
- ``sess_dropout (float)`` : The probability of an element of item embeddings in a session to be zeroed. Defaults to ``0.2``.
- ``item_dropout (float)`` : The probability of an element of candidate item embeddings to be zeroed. Defaults to ``0.2``.
2 changes: 1 addition & 1 deletion docs/source/user_guide/model/sequential/fdsa.rst
@@ -45,7 +45,7 @@ Running with RecBole
- ``initializer_range (float)`` : The standard deviation for normal initialization. Defaults to ``0.02``.
- ``selected_features (list)`` : The list of selected item features. Defaults to ``['class']`` for ml-100k dataset.
- ``pooling_mode (str)``: The intra-feature pooling mode. Defaults to ``'mean'``. Range in ``['max', 'mean', 'sum']``.
- - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--neg_sampling="{'uniform': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+ - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--train_neg_sample_args="{'distribution': 'uniform', 'sample_num': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.


**A Running Example:**
2 changes: 1 addition & 1 deletion docs/source/user_guide/model/sequential/fossil.rst
@@ -46,7 +46,7 @@ Running with RecBole
- ``order_len (int)`` : The number of the last items considered. Defaults to ``3``.
- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``0.0``.
- ``alpha (float)`` : The alpha parameter used when calculating the similarity. Defaults to ``0.6``.
- - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--neg_sampling="{'uniform': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+ - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--train_neg_sample_args="{'distribution': 'uniform', 'sample_num': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.

**A Running Example:**

8 changes: 6 additions & 2 deletions docs/source/user_guide/model/sequential/fpmc.rst
@@ -42,6 +42,7 @@ Running with RecBole
**Model Hyper-Parameters:**

- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``.
+ - ``loss_type (str)`` : The type of loss function. It is set to ``'BPR'``, so the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--train_neg_sample_args="{'distribution': 'uniform', 'sample_num': 1}"``. Defaults to ``'BPR'``. Range in ``['BPR']``.


**A Running Example:**
@@ -52,7 +53,10 @@ Write the following code to a python file, such as `run.py`
from recbole.quick_start import run_recbole
- run_recbole(model='FPMC', dataset='ml-100k')
+ parameter_dict = {
+     'train_neg_sample_args': None,
+ }
+ run_recbole(model='FPMC', dataset='ml-100k', config_dict=parameter_dict)
And then:

@@ -62,7 +66,7 @@ And then:
**Notes:**

- - Different from other sequential models, FPMC must be optimized in pair-wise way using negative sampling, so it needs ``neg_sampling="{'uniform': 1}"``.
+ - Different from other sequential models, FPMC must be optimized in pair-wise way using negative sampling, so it needs ``train_neg_sample_args="{'distribution': 'uniform', 'sample_num': 1}"``.
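
The committed example above passes ``None`` for ``train_neg_sample_args`` (presumably deferring to FPMC's own default config; that reading is an assumption). A minimal sketch that instead spells out the sampling setting exactly as the note gives it, with ml-100k as in the running example:

from recbole.quick_start import run_recbole

# FPMC is BPR-only (see the note above), so enable uniform negative
# sampling explicitly via the renamed option:
parameter_dict = {
    'train_neg_sample_args': {'distribution': 'uniform', 'sample_num': 1},
}
run_recbole(model='FPMC', dataset='ml-100k', config_dict=parameter_dict)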

Tuning Hyper Parameters
-------------------------
2 changes: 1 addition & 1 deletion docs/source/user_guide/model/sequential/gcsan.rst
@@ -43,7 +43,7 @@ Running with RecBole
- ``step (int)`` : The number of layers in GNN. Defaults to ``1``.
- ``weight (float)`` : The weight parameter controls the contributions of the self-attention representation and the last-clicked action; the original paper suggests setting w between 0.4 and 0.8. Defaults to ``0.6``.
- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``[5e-5]``.
- - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--neg_sampling="{'uniform': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+ - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--train_neg_sample_args="{'distribution': 'uniform', 'sample_num': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.

**A Running Example:**

2 changes: 1 addition & 1 deletion docs/source/user_guide/model/sequential/gru4rec.rst
@@ -33,7 +33,7 @@ Running with RecBole
- ``hidden_size (int)`` : The number of features in the hidden state. Defaults to ``128``.
- ``num_layers (int)`` : The number of layers in GRU. Defaults to ``1``.
- ``dropout_prob (float)``: The dropout rate. Defaults to ``0.3``.
- - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--neg_sampling="{'uniform': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+ - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--train_neg_sample_args="{'distribution': 'uniform', 'sample_num': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.


**A Running Example:**
2 changes: 1 addition & 1 deletion docs/source/user_guide/model/sequential/gru4recf.rst
@@ -42,7 +42,7 @@ Running with RecBole
- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.3``.
- ``selected_features (list)`` : The list of selected item features. Defaults to ``['class']`` for ml-100k dataset.
- ``pooling_mode (str)`` : The intra-feature pooling mode. Defaults to ``'sum'``. Range in ``['max', 'mean', 'sum']``.
- - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--neg_sampling="{'uniform': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+ - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--train_neg_sample_args="{'distribution': 'uniform', 'sample_num': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.


**A Running Example:**
2 changes: 1 addition & 1 deletion docs/source/user_guide/model/sequential/gru4reckg.rst
@@ -16,7 +16,7 @@ Running with RecBole
- ``num_layers (int)`` : The number of layers in GRU. Defaults to ``1``.
- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.1``.
- ``freeze_kg (bool)`` : Whether to freeze the pre-trained knowledge embedding feature. Defaults to ``True``.
- - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--neg_sampling="{'uniform': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+ - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--train_neg_sample_args="{'distribution': 'uniform', 'sample_num': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.


**A Running Example:**
2 changes: 1 addition & 1 deletion docs/source/user_guide/model/sequential/hgn.rst
@@ -42,7 +42,7 @@ Running with RecBole
- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``.
- ``pooling_type (str)`` : The type of pooling, either average pooling or max pooling. Defaults to ``average``.
- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``[0.00,0.00]``.
- - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--neg_sampling="{'uniform': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+ - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--train_neg_sample_args="{'distribution': 'uniform', 'sample_num': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.

**A Running Example:**

2 changes: 1 addition & 1 deletion docs/source/user_guide/model/sequential/hrm.rst
@@ -46,7 +46,7 @@ Running with RecBole
- ``pooling_type_layer_1 (str)`` : The type of pooling in the first layer, either average pooling or max pooling. Defaults to ``max``.
- ``pooling_type_layer_2 (str)`` : The type of pooling in the second layer, either average pooling or max pooling. Defaults to ``max``.
- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.2``.
- - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--neg_sampling="{'uniform': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+ - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--train_neg_sample_args="{'distribution': 'uniform', 'sample_num': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.

**A Running Example:**

2 changes: 1 addition & 1 deletion docs/source/user_guide/model/sequential/ksr.rst
@@ -27,7 +27,7 @@ Running with RecBole
- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.1``.
- ``freeze_kg (bool)`` : Whether to freeze the pre-trained knowledge embedding feature. Defaults to ``True``.
- ``gamma (float)`` : The scaling factor used in the read operation when calculating the attention weights of user preference over attributes. Defaults to ``10.0``.
- - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--neg_sampling="{'uniform': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+ - ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximizes the difference between the positive item and the negative one. In this way, negative sampling is necessary, such as setting ``--train_neg_sample_args="{'distribution': 'uniform', 'sample_num': 1}"``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.


**A Running Example:**
