Added Done() call in BaseTestBaseline.Cleanup and added related fixes #4823

harishsk · 2020-02-10T22:46:22Z

This PR is similar to #4817 but standardizes the default value for Slope in PlattCalibrator to -1 instead of 1 since -1 seems to be used everywhere in the code. (If this PR is approved, I will be closing the other PR)

mstfbl

Output on your most recent build:

2020-02-10T23:01:35.8655896Z X Microsoft.ML.Tests.Transformers.TextFeaturizerTests.LdaWorkoutEstimatorCore [155ms]
2020-02-10T23:01:35.8656992Z Error Message:
2020-02-10T23:01:35.8706041Z System.AggregateException : One or more errors occurred. (Different values in column 1 of row 0) (Odd same failed)
2020-02-10T23:01:35.8706491Z ---- System.InvalidOperationException : Different values in column 1 of row 0
2020-02-10T23:01:35.8706738Z ---- System.InvalidOperationException : Odd same failed
2020-02-10T23:01:35.8707131Z Stack Trace:
2020-02-10T23:01:35.8707356Z at Microsoft.ML.RunTests.BaseTestBaseline.Done() in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestBaseline.cs:line 161
2020-02-10T23:01:35.8707606Z at Microsoft.ML.RunTests.BaseTestBaseline.Cleanup() in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestBaseline.cs:line 133
2020-02-10T23:01:35.8707856Z at Microsoft.ML.TestFramework.BaseTestClass.System.IDisposable.Dispose() in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestClass.cs:line 76
2020-02-10T23:01:35.8708508Z at ReflectionAbstractionExtensions.DisposeTestClass(ITest test, Object testClass, IMessageBus messageBus, ExecutionTimer timer, CancellationTokenSource cancellationTokenSource) in C:\Dev\xunit\xunit\src\xunit.execution\Extensions\ReflectionAbstractionExtensions.cs:line 79
2020-02-10T23:01:35.8709014Z ----- Inner Stack Trace #1 (System.InvalidOperationException) -----
2020-02-10T23:01:35.8709656Z at Microsoft.ML.RunTests.BaseTestBaseline.Fail(String fmt, Object[] args) in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestBaseline.cs:line 184
2020-02-10T23:01:35.8709929Z ----- Inner Stack Trace #2 (System.InvalidOperationException) -----
2020-02-10T23:01:35.8710190Z at Microsoft.ML.RunTests.BaseTestBaseline.Fail(String fmt, Object[] args) in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestBaseline.cs:line 184
2020-02-10T23:01:35.8710408Z Standard Output Messages:
2020-02-10T23:01:35.8711219Z *** Failure: Different values in column 1 of row 0
2020-02-10T23:01:35.8711704Z *** Failure: Odd same failed
2020-02-10T23:01:35.8711990Z Test LdaWorkoutEstimatorCore: aborted: failed

Here is the data in question:

machinelearning/test/Microsoft.ML.Tests/Transformers/TextFeaturizerTests.cs

Lines 717 to 733 in 9d2b198

    
           public void LdaWorkoutEstimatorCore() 
        
           { 
        
               var ml = new MLContext(1); 
        
               var builder = new ArrayDataViewBuilder(Env); 
        
               var data = new[] 
        
               { 
        
                   new[] {  (float)1.0,  (float)0.0,  (float)0.0 }, 
        
                   new[] {  (float)0.0,  (float)1.0,  (float)0.0 }, 
        
                   new[] {  (float)0.0,  (float)0.0,  (float)1.0 }, 
        
               }; 
        
               builder.AddColumn("F1V", NumberDataViewType.Single, data); 
        
               var srcView = builder.GetDataView(); 
        
               var est = ml.Transforms.Text.LatentDirichletAllocation("F1V"); 
        
               TestEstimatorCore(est, srcView); 
        
           }

Is there a change your changes triggered this tests' failure? I'm not entirely sure how they're related in this case.

harishsk · 2020-02-10T23:42:21Z

Output on your most recent build:

2020-02-10T23:01:35.8655896Z X Microsoft.ML.Tests.Transformers.TextFeaturizerTests.LdaWorkoutEstimatorCore [155ms]
2020-02-10T23:01:35.8656992Z Error Message:
2020-02-10T23:01:35.8706041Z System.AggregateException : One or more errors occurred. (Different values in column 1 of row 0) (Odd same failed)
2020-02-10T23:01:35.8706491Z ---- System.InvalidOperationException : Different values in column 1 of row 0
2020-02-10T23:01:35.8706738Z ---- System.InvalidOperationException : Odd same failed
2020-02-10T23:01:35.8707131Z Stack Trace:
2020-02-10T23:01:35.8707356Z at Microsoft.ML.RunTests.BaseTestBaseline.Done() in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestBaseline.cs:line 161
2020-02-10T23:01:35.8707606Z at Microsoft.ML.RunTests.BaseTestBaseline.Cleanup() in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestBaseline.cs:line 133
2020-02-10T23:01:35.8707856Z at Microsoft.ML.TestFramework.BaseTestClass.System.IDisposable.Dispose() in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestClass.cs:line 76
2020-02-10T23:01:35.8708508Z at ReflectionAbstractionExtensions.DisposeTestClass(ITest test, Object testClass, IMessageBus messageBus, ExecutionTimer timer, CancellationTokenSource cancellationTokenSource) in C:\Dev\xunit\xunit\src\xunit.execution\Extensions\ReflectionAbstractionExtensions.cs:line 79
2020-02-10T23:01:35.8709014Z ----- Inner Stack Trace #1 (System.InvalidOperationException) -----
2020-02-10T23:01:35.8709656Z at Microsoft.ML.RunTests.BaseTestBaseline.Fail(String fmt, Object[] args) in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestBaseline.cs:line 184
2020-02-10T23:01:35.8709929Z ----- Inner Stack Trace #2 (System.InvalidOperationException) -----
2020-02-10T23:01:35.8710190Z at Microsoft.ML.RunTests.BaseTestBaseline.Fail(String fmt, Object[] args) in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestBaseline.cs:line 184
2020-02-10T23:01:35.8710408Z Standard Output Messages:
2020-02-10T23:01:35.8711219Z *** Failure: Different values in column 1 of row 0
2020-02-10T23:01:35.8711704Z *** Failure: Odd same failed
2020-02-10T23:01:35.8711990Z Test LdaWorkoutEstimatorCore: aborted: failed

Here is the data in question:

machinelearning/test/Microsoft.ML.Tests/Transformers/TextFeaturizerTests.cs

Lines 717 to 733 in 9d2b198

    
           public void LdaWorkoutEstimatorCore() 
        
           { 
        
               var ml = new MLContext(1); 
        
               var builder = new ArrayDataViewBuilder(Env); 
        
               var data = new[] 
        
               { 
        
                   new[] {  (float)1.0,  (float)0.0,  (float)0.0 }, 
        
                   new[] {  (float)0.0,  (float)1.0,  (float)0.0 }, 
        
                   new[] {  (float)0.0,  (float)0.0,  (float)1.0 }, 
        
               }; 
        
               builder.AddColumn("F1V", NumberDataViewType.Single, data); 
        
               var srcView = builder.GetDataView(); 
        
               var est = ml.Transforms.Text.LatentDirichletAllocation("F1V"); 
        
               TestEstimatorCore(est, srcView); 
        
           }

Is there a change your changes triggered this tests' failure? I'm not entirely sure how they're related in this case.

In reply to: 356336770 [](ancestors = 356336770)

yaeldekel · 2020-02-11T10:14:41Z

src/Microsoft.ML.Data/Prediction/Calibrator.cs

@@ -2086,7 +2086,7 @@ public sealed class NoArgumentsInput : CalibrateInputBase
        public sealed class FixedPlattInput : CalibrateInputBase
        {
            [Argument(ArgumentType.AtMostOnce, ShortName = "slope", HelpText = "The slope parameter of the calibration function 1 / (1 + exp(-slope * x + offset)", SortOrder = 1)]


-slope [](start = 141, length = 6)

Can you change this to exp(slope * x + offset)? (and in the Offset argument as well). #Resolved

yaeldekel · 2020-02-11T10:15:15Z

src/Microsoft.ML.Data/Prediction/Calibrator.cs

@@ -2086,7 +2086,7 @@ public sealed class NoArgumentsInput : CalibrateInputBase
        public sealed class FixedPlattInput : CalibrateInputBase
        {
            [Argument(ArgumentType.AtMostOnce, ShortName = "slope", HelpText = "The slope parameter of the calibration function 1 / (1 + exp(-slope * x + offset)", SortOrder = 1)]
-            public Double Slope = 1;
+            public Double Slope = -1;


Slope [](start = 26, length = 5)

FixedPlattCalibratorTrainer.Arguments also has a Slope field that defaults to 1. #Resolved

yaeldekel · 2020-02-11T10:16:32Z

/// P(x) = 1 / (1 + exp(-<see cref="PlattCalibrator.Slope"/> * x + <see cref="PlattCalibrator.Offset"/>)

This should also be fixed. #Resolved

Refers to: src/Microsoft.ML.Data/Prediction/Calibrator.cs:1621 in f86faac. [](commit_id = f86faac, deletion_comment = False)

frank-dong-ms-zz · 2020-02-11T21:06:16Z

    protected void Done()

In this new implementation, we can get rid of this Done() method and remove all usage. These assertions are done automatically in Cleanup method #Resolved

Refers to: test/Microsoft.ML.TestFramework/BaseTestBaseline.cs:145 in f86faac. [](commit_id = f86faac, deletion_comment = False)

frank-dong-ms-zz · 2020-02-11T21:08:46Z

test/Microsoft.ML.Predictor.Tests/TestPredictors.cs

@@ -948,7 +949,7 @@ public void TestMulticlassEnsembleCombiner()
                            predGetters[i](ref preds[i]);
                        }
                        if (scores.All(s => !float.IsNaN(s)))
-                            CompareNumbersWithTolerance(score, scores.Sum() / predCount);
+                            CompareNumbersWithTolerance(score, scores.Sum() / predCount, digitsOfPrecision: 5);


digitsOfPrecision [](start = 89, length = 17)

why we use less precise here? #Resolved

The default is 7 digits. All tests are not equally precise.

In reply to: 377899328 [](ancestors = 377899328)

frank-dong-ms-zz · 2020-02-11T21:22:19Z

This one test: LdaWorkoutEstimatorCore is failing on all configurations and seems like real fail #Resolved

yaeldekel · 2020-02-12T13:10:09Z

LDA has some non determinism in it (see issue #4143), most of the LDA tests are disabled (or they don't do any comparisons of the outputs), I guess this one slipped through the cracks because it doesn't call Done()...

In reply to: 584859078 [](ancestors = 584859078)

harishsk · 2020-02-13T04:45:31Z

@yaeldekel Thanks for the info. I have added it to the list of skipped tests for now.

In reply to: 585197699 [](ancestors = 585197699,584859078)

harishsk · 2020-02-13T05:00:20Z

    protected void Done()

There is a problem with the current implementation. The assertion happens in Cleanup() but when Cleanup is called, the test context is lost. I am still debating whether to remove Done completely or find a different way to force Done to be called in the context of the original function. For now, I am leaving it as is.

In reply to: 584851784 [](ancestors = 584851784)

Refers to: test/Microsoft.ML.TestFramework/BaseTestBaseline.cs:145 in f86faac. [](commit_id = f86faac, deletion_comment = False)

…st of skipped tests

…nInMemoryDataZeroBaseIndex

frank-dong-ms-zz · 2020-03-03T01:15:19Z

test/Microsoft.ML.Tests/TrainerEstimators/MatrixFactorizationTests.cs

@@ -485,10 +485,14 @@ public void OneClassMatrixFactorizationInMemoryDataZeroBaseIndex()
            var testPrediction = model.Transform(testDataView);

            var testResults = mlContext.Data.CreateEnumerable<OneClassMatrixElementZeroBasedForScore>(testPrediction, false).ToList();
+
+            // REVIEW: We are seeing lower precision on non-Windows platforms


REVIEW [](start = 15, length = 6)

maybe use below tags in comments so we can find this easily from task list:
TODO TEST_STABILITY:

frank-dong-ms-zz · 2020-03-03T01:28:51Z

test/Microsoft.ML.Tests/Transformers/TextFeaturizerTests.cs

@@ -714,6 +714,7 @@ public void LdaWorkout()
        }

        [Fact]
+        [Trait("Category", "SkipInCI")]
        public void LdaWorkoutEstimatorCore()


LdaWorkoutEstimatorCore [](start = 20, length = 23)

I haven't see this fail in full build and CI build in last 14 days, this should be low fail rate tests and I think it is okay to left this one enabled, if we see this fails often later we can go back to this one

frank-dong-ms-zz

harishsk requested a review from a team as a code owner February 10, 2020 22:46

mstfbl reviewed Feb 10, 2020

View reviewed changes

yaeldekel reviewed Feb 11, 2020

View reviewed changes

frank-dong-ms-zz reviewed Feb 11, 2020

View reviewed changes

harishsk added 4 commits March 2, 2020 14:23

Added Done() call in BaseTestBaseline.Cleanup and added related fixes

3a37863

Updated call stack collection mechanism to collect complete call stacks

bc5255a

Fixed defaults, documentation and added LdaWorkoutEstimatorCore to li…

3d9a630

…st of skipped tests

Updated baseline and reduced precision for OneClassMatrixFactorizatio…

073f247

…nInMemoryDataZeroBaseIndex

harishsk mentioned this pull request Mar 2, 2020

WIP: Make sure Done() is called in cleanup if it hasn't been called #4817

Closed

frank-dong-ms-zz reviewed Mar 3, 2020

View reviewed changes

frank-dong-ms-zz approved these changes Mar 3, 2020

View reviewed changes

Changed REVIEW tag to TODO TEST_STABILITY

20aef91

harishsk merged commit 5449b91 into dotnet:master Mar 3, 2020

harishsk deleted the testCleanup2 branch April 21, 2020 23:58

ghost locked as resolved and limited conversation to collaborators Mar 19, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added Done() call in BaseTestBaseline.Cleanup and added related fixes #4823

Added Done() call in BaseTestBaseline.Cleanup and added related fixes #4823

harishsk commented Feb 10, 2020 •

edited

Loading

mstfbl left a comment

harishsk commented Feb 10, 2020 •

edited

Loading

yaeldekel Feb 11, 2020 •

edited by harishsk

Loading

yaeldekel Feb 11, 2020 •

edited by harishsk

Loading

yaeldekel commented Feb 11, 2020 •

edited by harishsk

Loading

frank-dong-ms-zz commented Feb 11, 2020 •

edited by harishsk

Loading

frank-dong-ms-zz Feb 11, 2020 •

edited by harishsk

Loading

harishsk Feb 13, 2020

frank-dong-ms-zz commented Feb 11, 2020 •

edited by harishsk

Loading

yaeldekel commented Feb 12, 2020

harishsk commented Feb 13, 2020

harishsk commented Feb 13, 2020

frank-dong-ms-zz Mar 3, 2020

frank-dong-ms-zz Mar 3, 2020

frank-dong-ms-zz left a comment

	public void LdaWorkoutEstimatorCore()
	{
	var ml = new MLContext(1);

	var builder = new ArrayDataViewBuilder(Env);
	var data = new[]
	{
	new[] { (float)1.0, (float)0.0, (float)0.0 },
	new[] { (float)0.0, (float)1.0, (float)0.0 },
	new[] { (float)0.0, (float)0.0, (float)1.0 },
	};
	builder.AddColumn("F1V", NumberDataViewType.Single, data);
	var srcView = builder.GetDataView();

	var est = ml.Transforms.Text.LatentDirichletAllocation("F1V");
	TestEstimatorCore(est, srcView);
	}

Added Done() call in BaseTestBaseline.Cleanup and added related fixes #4823

Added Done() call in BaseTestBaseline.Cleanup and added related fixes #4823

Conversation

harishsk commented Feb 10, 2020 • edited Loading

mstfbl left a comment

Choose a reason for hiding this comment

harishsk commented Feb 10, 2020 • edited Loading

yaeldekel Feb 11, 2020 • edited by harishsk Loading

Choose a reason for hiding this comment

yaeldekel Feb 11, 2020 • edited by harishsk Loading

Choose a reason for hiding this comment

yaeldekel commented Feb 11, 2020 • edited by harishsk Loading

frank-dong-ms-zz commented Feb 11, 2020 • edited by harishsk Loading

frank-dong-ms-zz Feb 11, 2020 • edited by harishsk Loading

Choose a reason for hiding this comment

harishsk Feb 13, 2020

Choose a reason for hiding this comment

frank-dong-ms-zz commented Feb 11, 2020 • edited by harishsk Loading

yaeldekel commented Feb 12, 2020

harishsk commented Feb 13, 2020

harishsk commented Feb 13, 2020

frank-dong-ms-zz Mar 3, 2020

Choose a reason for hiding this comment

frank-dong-ms-zz Mar 3, 2020

Choose a reason for hiding this comment

frank-dong-ms-zz left a comment

Choose a reason for hiding this comment

harishsk commented Feb 10, 2020 •

edited

Loading

harishsk commented Feb 10, 2020 •

edited

Loading

yaeldekel Feb 11, 2020 •

edited by harishsk

Loading

yaeldekel Feb 11, 2020 •

edited by harishsk

Loading

yaeldekel commented Feb 11, 2020 •

edited by harishsk

Loading

frank-dong-ms-zz commented Feb 11, 2020 •

edited by harishsk

Loading

frank-dong-ms-zz Feb 11, 2020 •

edited by harishsk

Loading

frank-dong-ms-zz commented Feb 11, 2020 •

edited by harishsk

Loading