Combine the fold metrics into one data view in CV macro. #207

Merged
merged 12 commits into from
May 24, 2018
1 change: 1 addition & 0 deletions ZBaselines/Common/EntryPoints/core_ep-list.tsv
@@ -8,6 +8,7 @@ Models.BinaryClassificationEvaluator Evaluates a binary classification scored da
Models.BinaryCrossValidator Cross validation for binary classification Microsoft.ML.Runtime.EntryPoints.CrossValidationBinaryMacro CrossValidateBinary Microsoft.ML.Runtime.EntryPoints.CrossValidationBinaryMacro+Arguments Microsoft.ML.Runtime.EntryPoints.CommonOutputs+MacroOutput`1[Microsoft.ML.Runtime.EntryPoints.CrossValidationBinaryMacro+Output]
Models.ClassificationEvaluator Evaluates a multi class classification scored dataset. Microsoft.ML.Runtime.Data.Evaluate MultiClass Microsoft.ML.Runtime.Data.MultiClassMamlEvaluator+Arguments Microsoft.ML.Runtime.EntryPoints.CommonOutputs+ClassificationEvaluateOutput
Models.ClusterEvaluator Evaluates a clustering scored dataset. Microsoft.ML.Runtime.Data.Evaluate Clustering Microsoft.ML.Runtime.Data.ClusteringMamlEvaluator+Arguments Microsoft.ML.Runtime.EntryPoints.CommonOutputs+CommonEvaluateOutput
Models.CrossValidationResultsCombiner Combine the metric data views returned from cross validation. Microsoft.ML.Runtime.EntryPoints.CrossValidationMacro CombineMetrics Microsoft.ML.Runtime.EntryPoints.CrossValidationMacro+CombineMetricsInput Microsoft.ML.Runtime.EntryPoints.CrossValidationMacro+CombinedOutput

@codemzs (Member) commented on May 23, 2018

@yaeldekel This seems like a positive change, but out of curiosity, what was the rationale behind it? I'll reference this change in my PR (not yet out) that augments the CV and TrainTest macros to work with the LearningPipeline framework. #Closed

@yaeldekel (Author) commented on May 23, 2018

Hi @codemzs, I made this change so that different users of this macro don't have to duplicate code when processing its outputs (for example, finding the weighted/unweighted metrics in each data view, computing the average and standard deviation of each metric, and making sure that multi-class metrics refer to the same classes in all folds). #Closed
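
To illustrate the kind of per-fold post-processing described above, here is a minimal Python sketch of averaging each metric across folds and computing its standard deviation. It is not the macro's actual implementation: the function name `combine_fold_metrics`, the dict-based representation of a fold's metrics, and the use of the population (rather than sample) standard deviation are all assumptions made for the example. It also assumes every fold reports the same set of metrics.

```python
import math
from typing import Dict, List


def combine_fold_metrics(folds: List[Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Combine per-fold metrics into one mean/std summary per metric.

    Hypothetical helper: each fold is a plain dict of metric name -> value,
    and all folds are assumed to contain the same metric names.
    """
    if not folds:
        raise ValueError("at least one fold is required")

    combined = {}
    for name in folds[0]:
        values = [fold[name] for fold in folds]
        mean = sum(values) / len(values)
        # Population standard deviation (divide by n); this choice is an assumption.
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        combined[name] = {"mean": mean, "std": math.sqrt(variance)}
    return combined


if __name__ == "__main__":
    folds = [
        {"AUC": 0.8, "Accuracy": 0.7},
        {"AUC": 0.9, "Accuracy": 0.9},
    ]
    for metric, stats in combine_fold_metrics(folds).items():
        print(metric, stats)
```

Doing this once, in a single place, is what saves each consumer of the cross-validation output from writing its own version of the loop.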

Models.CrossValidator Cross validation for general learning Microsoft.ML.Runtime.EntryPoints.CrossValidationMacro CrossValidate Microsoft.ML.Runtime.EntryPoints.CrossValidationMacro+Arguments Microsoft.ML.Runtime.EntryPoints.CommonOutputs+MacroOutput`1[Microsoft.ML.Runtime.EntryPoints.CrossValidationMacro+Output]
Models.CrossValidatorDatasetSplitter Split the dataset into the specified number of cross-validation folds (train and test sets) Microsoft.ML.Runtime.EntryPoints.CVSplit Split Microsoft.ML.Runtime.EntryPoints.CVSplit+Input Microsoft.ML.Runtime.EntryPoints.CVSplit+Output
Models.DatasetTransformer Applies a TransformModel to a dataset. Microsoft.ML.Runtime.EntryPoints.ModelOperations Apply Microsoft.ML.Runtime.EntryPoints.ModelOperations+ApplyTransformModelInput Microsoft.ML.Runtime.EntryPoints.ModelOperations+ApplyTransformModelOutput
130 changes: 114 additions & 16 deletions ZBaselines/Common/EntryPoints/core_manifest.json
@@ -1238,6 +1238,116 @@
"IEvaluatorOutput"
]
},
{
"Name": "Models.CrossValidationResultsCombiner",
"Desc": "Combine the metric data views returned from cross validation.",
"FriendlyName": null,
"ShortName": null,
"Inputs": [
{
"Name": "OverallMetrics",
"Type": {
"Kind": "Array",
"ItemType": "DataView"
},
"Desc": "Overall metrics datasets",
"Required": false,
"SortOrder": 1.0,
"IsNullable": false,
"Default": null
},
{
"Name": "PerInstanceMetrics",
"Type": {
"Kind": "Array",
"ItemType": "DataView"
},
"Desc": "Per instance metrics datasets",
"Required": false,
"SortOrder": 2.0,
"IsNullable": false,
"Default": null
},
{
"Name": "ConfusionMatrix",
"Type": {
"Kind": "Array",
"ItemType": "DataView"
},
"Desc": "Confusion matrix datasets",
"Required": false,
"SortOrder": 3.0,
"IsNullable": false,
"Default": null
},
{
"Name": "Warnings",
"Type": {
"Kind": "Array",
"ItemType": "DataView"
},
"Desc": "Warning datasets",
"Required": false,
"SortOrder": 4.0,
"IsNullable": false,
"Default": null
},
{
"Name": "LabelColumn",
"Type": "String",
"Desc": "The label column name",
"Aliases": [
"Label"
],
"Required": false,
"SortOrder": 6.0,
"IsNullable": false,
"Default": "Label"
},
{
"Name": "Kind",
"Type": {
"Kind": "Enum",
"Values": [
"SignatureBinaryClassifierTrainer",
"SignatureMultiClassClassifierTrainer",
"SignatureRankerTrainer",
"SignatureRegressorTrainer",
"SignatureMultiOutputRegressorTrainer",
"SignatureAnomalyDetectorTrainer",
"SignatureClusteringTrainer"
]
},
"Desc": "Specifies the trainer kind, which determines the evaluator to be used.",
"Required": true,
"SortOrder": 7.0,
"IsNullable": false,
"Default": "SignatureBinaryClassifierTrainer"
}
],
"Outputs": [
{
"Name": "Warnings",
"Type": "DataView",
"Desc": "Warning dataset"
},
{
"Name": "OverallMetrics",
"Type": "DataView",
"Desc": "Overall metrics dataset"
},
{
"Name": "PerInstanceMetrics",
"Type": "DataView",
"Desc": "Per instance metrics dataset"
},
{
"Name": "ConfusionMatrix",
"Type": "DataView",
"Desc": "Confusion matrix dataset"
}
]
},
{
"Name": "Models.CrossValidator",
"Desc": "Cross validation for general learning",
@@ -1368,34 +1478,22 @@
},
{
"Name": "Warnings",
-"Type": {
-"Kind": "Array",
-"ItemType": "DataView"
-},
+"Type": "DataView",
"Desc": "Warning dataset"
},
{
"Name": "OverallMetrics",
-"Type": {
-"Kind": "Array",
-"ItemType": "DataView"
-},
+"Type": "DataView",
"Desc": "Overall metrics dataset"
},
{
"Name": "PerInstanceMetrics",
-"Type": {
-"Kind": "Array",
-"ItemType": "DataView"
-},
+"Type": "DataView",
"Desc": "Per instance metrics dataset"
},
{
"Name": "ConfusionMatrix",
-"Type": {
-"Kind": "Array",
-"ItemType": "DataView"
-},
+"Type": "DataView",
"Desc": "Confusion matrix dataset"
}
]