
Commit

Some minor changes (including Readme)
i02momuj committed Dec 17, 2019
1 parent 990fef1 commit c270917
Showing 3 changed files with 33 additions and 22 deletions.
Binary file modified ExecuteMulan_1.3.jar
51 changes: 31 additions & 20 deletions README.md
@@ -6,49 +6,60 @@ In this repository, we created an executable .jar file to execute MLC and MTR me

To execute it, the following command has to be executed in a console:
```sh
-java -jar ExecuteMulan.jar [parameters]
+java -jar ExecuteMulan_1.3.jar [parameters]
```

-The jar file allows several parameters to indicate the method to execute, the dataset, and so on. The different parameters are the following:
+The jar file allows several parameters to indicate the method to execute, the dataset, and so on. Some of the parameters are common to all or almost all algorithms, and are the following (an example command is sketched after the list):
* With the ```-t``` parameter, we define the .arff filename of the training dataset. If the CV procedure is performed, it defines the full dataset.
-* With the ```-T``` parameter, we define the .arff filename of the test dataset. It should not be used if CV procedure is used.
+* With the ```-T``` parameter, we define the .arff filename of the test dataset. It should not be used if the CV procedure is used. If CV is not performed and test data is not given, the train data is also used for testing by default.
* With the ```-x``` parameter, we define the .xml filename of the dataset [(know more about the format of the datasets)](http://www.uco.es/kdis/mllresources/).
* With the ```-a``` parameter, the MLC or MTR method to be executed is indicated. A list with all the available methods in this jar is provided below.
-* With the ```-f``` parameter, we define the number of folds for the cross-validation procedure. If it is not set, the CV procedure is not performed.
-* With the ```-o``` parameter, we define the filename of the putput file where the results are stored as a .csv file.
-* With the ```-i``` parameter, the number of different seeds for random numbers (for those algorithms that need) is indicated; i.e., the number of executions for those algorithms that use random numbers. If this parameter is passed to a method that does not use random numbers, it is omitted. By default, its value is 10.
-* The ```-l``` parameter, indicates if, for the macro-averaged measures, the values of the measure for each label is reported or not. It could take the values 0 (false) and 1 (true). By default, its value is 0.
+* With the ```-f``` parameter, we define the number of folds for the cross-validation procedure. If it is not set, the CV procedure is not performed and both the train and test files are used.
+* With the ```-o``` parameter, we define the filename of the output file where the results are stored as a .csv file.
+* With the ```-i``` parameter, the number of different seeds for random numbers (for those algorithms that need them) is indicated; i.e., the number of executions for those algorithms that use random numbers. If this parameter is passed to a method that does not use random numbers, it is ignored. By default, its value is 1.
+* If the ```-l``` flag is included, the values of the macro-averaged measures for each label are reported. By default, they are not reported.
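As an illustration of the common parameters above, a typical train/test run could look like the following sketch; the dataset filenames (```emotions-train.arff```, ```emotions-test.arff```, ```emotions.xml```) and the output filename are hypothetical placeholders and may not match the exact files shipped in this repository.
```sh
# Hypothetical example: train on a train split, evaluate on a test split,
# run Binary Relevance (BR), and store the results in a .csv file.
java -jar ExecuteMulan_1.3.jar -t emotions-train.arff -T emotions-test.arff -x emotions.xml -a BR -o results_BR.csv
```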

Each method also accepts a number of different parameters to tune it. The methods _printDefaultParameters()_ and _printParametersDescription()_ of the _LearnerParameters_ class are very helpful to understand which parameters each method needs. However, we present some of the most common ones here (see the tuning example after these options):
* With the ```-c``` parameter, we indicate the base learner to use in the given method. Allowed values for this parameter are _J48_ (J48/C4.5 decision tree), _RT_ (Random Tree), and _SMO_ (SMO support vector machine) for multi-label classification algorithms, and also _REPTree_ for multi-target regression algorithms. However, if another base learner needs to be used, the full Java class name of the desired base learner can be given, for example: ```-c weka.classifiers.trees.DecisionStump```.
* For ensemble-based methods, with the parameter ```-n``` we indicate the number of members to use in the ensemble.
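For instance, an ensemble method could be tuned with these parameters as sketched below; the dataset filenames are again hypothetical placeholders. Here ```-a ECC``` selects Ensemble of Classifier Chains, ```-c J48``` sets the base learner, ```-n 10``` sets the number of members, and ```-i 5``` repeats the execution with 5 different random seeds.
```sh
# Hypothetical example: Ensemble of Classifier Chains with 10 J48 members,
# executed with 5 different random seeds.
java -jar ExecuteMulan_1.3.jar -t emotions-train.arff -T emotions-test.arff -x emotions.xml -a ECC -c J48 -n 10 -i 5 -o results_ECC.csv
```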

The full list of methods available in this jar is provided below. For each method, the value of the ```-a``` parameter and its full name are shown (a cross-validation example follows the list):
-* Multi-label classification methods:
-* ```AdaBoostMH```: AdaBoost.MH
-* ```BPMLL```: Back-Propagation for MLL
+* Multi-label classification. Problem Transformation methods:
* ```BR```: Binary Relevance
* ```CC```: Classifier Chains
-* ```CDE```: Chi-Dep Ensemble
* ```CLR```: Calibrated Label Ranking
-* ```EBR```: Ensemble of BRs
-* ```ECC```: Ensemble of Classifier Chains
-* ```EPS```: Ensemble of Pruned Sets
-* ```HOMER```: HOMER
-* ```IBLR```: Instance-Based Logistic Regression
* ```LP```: Label Powerset
* ```LPBR```: LP-BR (a.k.a. Chi-Dep)
-* ```PS```: Pruned Sets
+* Multi-label classification. Algorithm Adaptation methods:
+* ```AdaBoostMH```: AdaBoost.MH
+* ```BPMLL```: Back-Propagation for Multi-Label Learning
+* ```IBLR```: Instance-Based Logistic Regression for Multi-Label Learning
+* ```MLkNN```: Multi-Label k-Nearest Neighbors
+* Multi-label classification. Ensembles of Multi-Label Classifiers:
+* ```CDE```: Chi-Dep Ensemble
+* ```EBR```: Ensemble of Binary Relevances
+* ```ECC```: Ensemble of Classifier Chains
+* ```ELP```: Ensemble of Label Powersets
+* ```EPS```: Ensemble of Pruned Sets
+* ```HOMER```: Hierarchy Of Multilabel classifiERs
+* ```MLS```: Multi-Label Stacking
+* ```PS```: Pruned Sets
-* ```RAkEL```: RAkEL
+* ```RAkEL```: RAndom k-labELsets
* ```RFPCT```: Random Forest of Predictive Clustering Trees
* Multi-target regression methods:
* ```ERC```: Ensemble of Regressor Chains
* ```RC```: Regressor Chain
-* ```RLC```: Random Linear Combinations
+* ```RLC```: Random Linear Combinations (Normalized)
* ```ST```: Single-Target
* ```SST```: Stacked ST
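To close the list, a cross-validated multi-target regression run could be launched as in the sketch below; the _andro_ filenames are hypothetical placeholders. With ```-f 5```, a 5-fold cross-validation procedure is performed over the full dataset given with ```-t```, so no ```-T``` file is needed.
```sh
# Hypothetical example: 5-fold cross-validation of Ensemble of Regressor Chains (ERC)
# on the andro multi-target regression dataset.
java -jar ExecuteMulan_1.3.jar -t andro.arff -x andro.xml -a ERC -f 5 -o results_ERC.csv
```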

-Two multi-label datasets have been included in the repository as example; a wide variety of dataset are available at the [KDIS Research Group Repository](http://www.uco.es/kdis/mllresources/) and the [Mulan webpage](http://mulan.sourceforge.net/datasets.html).
+Two datasets have been included in the repository as examples: _Emotions_ [[Tso08]](#Tso08) (for multi-label classification), and _andro_ [[Hat08]](#Hat08) (for multi-target regression); a wide variety of datasets are available at the [KDIS Research Group Repository](http://www.uco.es/kdis/mllresources/) and the [Mulan webpage](http://mulan.sourceforge.net/datasets.html).

### References
<a name="Hat08"></a>**[Hat08]** E. V. Hatzikos, G. Tsoumakas, G. Tzanis, N. Bassiliades, and I. Vlahavas. (2008). An empirical study on sea water quality prediction. Knowledge-Based Systems, 21(6), 471-478.

<a name="Tso08"></a>**[Tso08]** G. Tsoumakas, I. Katakis, and I. Vlahavas. (2008). Effective and Efficient Multilabel Classification in Domains with Large Number of Labels. In Proc. ECML/PKDD 2008 Workshop on Mining Multidimensional Data (MMD’08), 53-59.

<a name="Tso11"></a>**[Tso11]** G. Tsoumakas, E. Spyromitros-Xioufis, J. Vilcek, and I. Vlahavas. (2011). Mulan: A java library for multi-label learning. Journal of Machine Learning Research, 12, 2411-2414.

<a name="Moy18"></a>**[Moy18]** J. M. Moyano and E. L. Gibaja and K. J. Cios and S. Ventura. (2018). Review of ensembles of multi-label classifiers: Models, experimental study and prospects. Information Fusion, 44, 33-45.
4 changes: 2 additions & 2 deletions src/parameters/Parameters.java
@@ -340,7 +340,7 @@ public static LinkedHashMap<String, String> includedClassificationAlgorithms(){
algorithms.put("CLR", "Calibrated Label Ranking.");
algorithms.put("LP", "Label Powerset.");
algorithms.put("LPBR", "LPBR.");
algorithms.put("PCC", "Parallel Classifier Chains.");
//algorithms.put("PCC", "Parallel Classifier Chains.");
algorithms.put("PS", "Pruned Sets.");

//Algorithm Adaptation (AAs)
@@ -373,7 +373,7 @@ public static LinkedHashMap<String, String> includedRegressionAlgorithms(){
LinkedHashMap<String, String> algorithms = new LinkedHashMap<String, String>(20);

algorithms.put("ERC", "Ensemble of Regressor Chains.");
algorithms.put("RC", "Regressor Chains.");
algorithms.put("RC", "Regressor Chain.");
algorithms.put("RLC", "Random Linear Combinations (Normalized).");
algorithms.put("ST", "Single Target.");
algorithms.put("SST", "Stacked ST.");
