WIP: Make sure Done() is called in cleanup if it hasn't been called #4817

Closed · wants to merge 3 commits
@@ -455,7 +455,7 @@ private protected override CalibratedModelParametersBase<LinearBinaryModelParameters
 CurrentWeights.CopyTo(ref weights, 1, CurrentWeights.Length - 1);
 return new ParameterMixingCalibratedModelParameters<LinearBinaryModelParameters, PlattCalibrator>(Host,
     new LinearBinaryModelParameters(Host, in weights, bias, _stats),
-    new PlattCalibrator(Host, -1, 0));
+    new PlattCalibrator(Host, 1, 0));

Review comment:

It seems that the bug is actually in the PlattCalibrator itself:

The XML doc comment says:

    /// The Platt calibrator calculates the probability following:
    /// P(x) = 1 / (1 + exp(-<see cref="PlattCalibrator.Slope"/> * x + <see cref="PlattCalibrator.Offset"/>))

But the actual calculation does this:

        public float PredictProbability(float output)
        {
            if (float.IsNaN(output))
                return output;
            return PredictProbability(output, Slope, Offset);
        }

        internal static float PredictProbability(float output, Double a, Double b)
        {
            return (float)(1 / (1 + Math.Exp(a * output + b)));
        }

This value should be changed to 1 only if we also change the computation in PlattCalibrator. (Looking at the baselines, the computation without the change looks better: label-1 examples get higher probabilities, and label-0 examples get lower probabilities.)
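
To make the sign question concrete, here is a minimal standalone sketch (an illustration for this discussion, not code from the PR) that evaluates the calibrator's actual formula, 1 / (1 + exp(a * output + b)), under both slope values:

    using System;

    class PlattSlopeDemo
    {
        // Mirrors the actual computation in PlattCalibrator.PredictProbability.
        static float PredictProbability(float output, double a, double b)
            => (float)(1 / (1 + Math.Exp(a * output + b)));

        static void Main()
        {
            foreach (var output in new[] { -2f, 0f, 2f })
            {
                // a = -1 (the current constructor argument) gives the usual increasing sigmoid;
                // a = +1 (this PR's value, with the formula unchanged) gives a decreasing curve.
                Console.WriteLine(
                    $"output={output,2}: a=-1 -> {PredictProbability(output, -1, 0):F3}, " +
                    $"a=+1 -> {PredictProbability(output, 1, 0):F3}");
            }
        }
    }

For output = 2 this prints 0.881 under a = -1 but 0.119 under a = +1: the two slopes yield mirrored probabilities, which is consistent with the flipped confusion tables in the baseline diffs below.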

 }

 [TlcModule.EntryPoint(Name = "Trainers.LogisticRegressionBinaryClassifier",
@@ -24,9 +24,9 @@ TRUTH ||======================
 ||======================
 Precision || 0.9489 | 0.9816 |
 OVERALL 0/1 ACCURACY: 0.968927
-LOG LOSS/instance: 0.143504
+LOG LOSS/instance: 8.202349
 Test-set entropy (prior Log-Loss/instance): 0.956998
-LOG-LOSS REDUCTION (RIG): 0.850048
+LOG-LOSS REDUCTION (RIG): -7.570916
 AUC: 0.994132
 Warning: The predictor produced non-finite prediction values on 8 instances during testing. Possible causes: abnormal data or the predictor is numerically unstable.
 TEST POSITIVE RATIO: 0.3191 (105.0/(105.0+224.0))
@@ -39,9 +39,9 @@ TRUTH ||======================
 ||======================
 Precision || 0.9697 | 0.9609 |
 OVERALL 0/1 ACCURACY: 0.963526
-LOG LOSS/instance: 0.111794
+LOG LOSS/instance: 6.914975
 Test-set entropy (prior Log-Loss/instance): 0.903454
-LOG-LOSS REDUCTION (RIG): 0.876260
+LOG-LOSS REDUCTION (RIG): -6.653935
 AUC: 0.997236

 OVERALL RESULTS
@@ -52,8 +52,8 @@ Positive precision: 0.959301 (0.0104)
 Positive recall: 0.942217 (0.0279)
 Negative precision: 0.971218 (0.0103)
 Negative recall: 0.977394 (0.0092)
-Log-loss: 0.127649 (0.0159)
-Log-loss reduction: 0.863154 (0.0131)
+Log-loss: 7.558662 (0.6437)
+Log-loss reduction: -7.112425 (0.4585)
 F1 Score: 0.950293 (0.0091)
 AUPRC: 0.991584 (0.0025)
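
A quick check on how the new numbers relate (an editorial aside, assuming the evaluator's standard definition of log-loss reduction as relative information gain over the prior):

    RIG = (prior log-loss - log-loss) / prior log-loss
        = (0.956998 - 8.202349) / 0.956998
        = -7.570916

which reproduces the first baseline above exactly; the large negative reductions are just the inflated per-instance log-losses restated relative to the test-set entropy, not an independent regression.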
@@ -1,4 +1,4 @@
 LogisticRegression
 AUC Accuracy Positive precision Positive recall Negative precision Negative recall Log-loss Log-loss reduction F1 Score AUPRC /l2 /ot /nt Learner Name Train Dataset Test Dataset Results File Run Time Physical Memory Virtual Memory Command Line Settings
-0.995684 0.966226 0.959301 0.942217 0.971218 0.977394 0.127649 0.863154 0.950293 0.991584 0.1 0.001 1 LogisticRegression %Data% %Output% 99 0 0 maml.exe CV tr=LogisticRegression{l1=1.0 l2=0.1 ot=1e-3 nt=1} threads=- norm=No dout=%Output% data=%Data% seed=1 /l2:0.1;/ot:0.001;/nt:1
+0.995684 0.966226 0.959301 0.942217 0.971218 0.977394 7.558662 -7.112425 0.950293 0.991584 0.1 0.001 1 LogisticRegression %Data% %Output% 99 0 0 maml.exe CV tr=LogisticRegression{l1=1.0 l2=0.1 ot=1e-3 nt=1} threads=- norm=No dout=%Output% data=%Data% seed=1 /l2:0.1;/ot:0.001;/nt:1

@@ -19,42 +19,42 @@ Confusion table
 ||======================
 PREDICTED || positive | negative | Recall
 TRUTH ||======================
-positive || 118 | 16 | 0.8806
-negative || 3 | 217 | 0.9864
+positive || 0 | 134 | 0.0000
+negative || 202 | 18 | 0.0818
 ||======================
-Precision || 0.9752 | 0.9313 |
-OVERALL 0/1 ACCURACY: 0.946328
-LOG LOSS/instance: 0.143504
+Precision || 0.0000 | 0.1184 |
+OVERALL 0/1 ACCURACY: 0.050847
+LOG LOSS/instance: 8.202349
 Test-set entropy (prior Log-Loss/instance): 0.956998
-LOG-LOSS REDUCTION (RIG): 0.850048
+LOG-LOSS REDUCTION (RIG): -7.570916
 AUC: 0.994132
 Warning: The predictor produced non-finite prediction values on 8 instances during testing. Possible causes: abnormal data or the predictor is numerically unstable.
 TEST POSITIVE RATIO: 0.3191 (105.0/(105.0+224.0))
 Confusion table
 ||======================
 PREDICTED || positive | negative | Recall
 TRUTH ||======================
-positive || 81 | 24 | 0.7714
-negative || 0 | 224 | 1.0000
+positive || 0 | 105 | 0.0000
+negative || 210 | 14 | 0.0625
 ||======================
-Precision || 1.0000 | 0.9032 |
-OVERALL 0/1 ACCURACY: 0.927052
-LOG LOSS/instance: 0.111794
+Precision || 0.0000 | 0.1176 |
+OVERALL 0/1 ACCURACY: 0.042553
+LOG LOSS/instance: 6.914975
 Test-set entropy (prior Log-Loss/instance): 0.903454
-LOG-LOSS REDUCTION (RIG): 0.876260
+LOG-LOSS REDUCTION (RIG): -6.653935
 AUC: 0.997236

 OVERALL RESULTS
 ---------------------------------------
 AUC: 0.995684 (0.0016)
-Accuracy: 0.936690 (0.0096)
-Positive precision: 0.987603 (0.0124)
-Positive recall: 0.826013 (0.0546)
-Negative precision: 0.917278 (0.0141)
-Negative recall: 0.993182 (0.0068)
-Log-loss: 0.127649 (0.0159)
-Log-loss reduction: 0.863154 (0.0131)
-F1 Score: 0.898229 (0.0273)
+Accuracy: 0.046700 (0.0041)
+Positive precision: 0.000000 (0.0000)
+Positive recall: 0.000000 (0.0000)
+Negative precision: 0.118034 (0.0004)
+Negative recall: 0.072159 (0.0097)
+Log-loss: 7.558662 (0.6437)
+Log-loss reduction: -7.112425 (0.4585)
+F1 Score: 0.000000 (0.0000)
 AUPRC: 0.991584 (0.0025)

 ---------------------------------------
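
The collapsed confusion tables above follow directly from the slope flip. A one-line identity (an editorial aside, not from the PR) shows why:

    1 / (1 + exp(x)) = 1 - 1 / (1 + exp(-x))

so flipping the slope maps every calibrated probability p to 1 - p. This baseline applies a probability threshold of 0.95 (eval=BinaryClassifier{threshold=0.95} in the command line below), so an example that previously scored 0.98 now scores 0.02 and is predicted negative, while a former 0.03 becomes 0.97 and is predicted positive, which is exactly the inversion the tables show.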
@@ -1,4 +1,4 @@
 LogisticRegression
 AUC Accuracy Positive precision Positive recall Negative precision Negative recall Log-loss Log-loss reduction F1 Score AUPRC /l2 /ot /nt Learner Name Train Dataset Test Dataset Results File Run Time Physical Memory Virtual Memory Command Line Settings
-0.995684 0.93669 0.987603 0.826013 0.917278 0.993182 0.127649 0.863154 0.898229 0.991584 0.1 0.001 1 LogisticRegression %Data% %Output% 99 0 0 maml.exe CV tr=LogisticRegression{l1=1.0 l2=0.1 ot=1e-3 nt=1} eval=BinaryClassifier{threshold=0.95 useRawScore=-} threads=- norm=No dout=%Output% data=%Data% seed=1 /l2:0.1;/ot:0.001;/nt:1
+0.995684 0.0467 0 0 0.118034 0.072159 7.558662 -7.112425 0 0.991584 0.1 0.001 1 LogisticRegression %Data% %Output% 99 0 0 maml.exe CV tr=LogisticRegression{l1=1.0 l2=0.1 ot=1e-3 nt=1} eval=BinaryClassifier{threshold=0.95 useRawScore=-} threads=- norm=No dout=%Output% data=%Data% seed=1 /l2:0.1;/ot:0.001;/nt:1
