Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Done() call in BaseTestBaseline.Cleanup and added related fixes #4823

Merged
merged 5 commits into from
Mar 3, 2020
Merged

Added Done() call in BaseTestBaseline.Cleanup and added related fixes #4823

merged 5 commits into from
Mar 3, 2020

Conversation

harishsk
Copy link
Contributor

@harishsk harishsk commented Feb 10, 2020

This PR is similar to #4817 but standardizes the default value for Slope in PlattCalibrator to -1 instead of 1 since -1 seems to be used everywhere in the code. (If this PR is approved, I will be closing the other PR)

@harishsk harishsk requested a review from a team as a code owner February 10, 2020 22:46
Copy link
Contributor

@mstfbl mstfbl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Output on your most recent build:

2020-02-10T23:01:35.8655896Z X Microsoft.ML.Tests.Transformers.TextFeaturizerTests.LdaWorkoutEstimatorCore [155ms]
2020-02-10T23:01:35.8656992Z Error Message:
2020-02-10T23:01:35.8706041Z System.AggregateException : One or more errors occurred. (Different values in column 1 of row 0) (Odd same failed)
2020-02-10T23:01:35.8706491Z ---- System.InvalidOperationException : Different values in column 1 of row 0
2020-02-10T23:01:35.8706738Z ---- System.InvalidOperationException : Odd same failed
2020-02-10T23:01:35.8707131Z Stack Trace:
2020-02-10T23:01:35.8707356Z at Microsoft.ML.RunTests.BaseTestBaseline.Done() in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestBaseline.cs:line 161
2020-02-10T23:01:35.8707606Z at Microsoft.ML.RunTests.BaseTestBaseline.Cleanup() in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestBaseline.cs:line 133
2020-02-10T23:01:35.8707856Z at Microsoft.ML.TestFramework.BaseTestClass.System.IDisposable.Dispose() in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestClass.cs:line 76
2020-02-10T23:01:35.8708508Z at ReflectionAbstractionExtensions.DisposeTestClass(ITest test, Object testClass, IMessageBus messageBus, ExecutionTimer timer, CancellationTokenSource cancellationTokenSource) in C:\Dev\xunit\xunit\src\xunit.execution\Extensions\ReflectionAbstractionExtensions.cs:line 79
2020-02-10T23:01:35.8709014Z ----- Inner Stack Trace #1 (System.InvalidOperationException) -----
2020-02-10T23:01:35.8709656Z at Microsoft.ML.RunTests.BaseTestBaseline.Fail(String fmt, Object[] args) in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestBaseline.cs:line 184
2020-02-10T23:01:35.8709929Z ----- Inner Stack Trace #2 (System.InvalidOperationException) -----
2020-02-10T23:01:35.8710190Z at Microsoft.ML.RunTests.BaseTestBaseline.Fail(String fmt, Object[] args) in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestBaseline.cs:line 184
2020-02-10T23:01:35.8710408Z Standard Output Messages:
2020-02-10T23:01:35.8711219Z *** Failure: Different values in column 1 of row 0
2020-02-10T23:01:35.8711704Z *** Failure: Odd same failed
2020-02-10T23:01:35.8711990Z Test LdaWorkoutEstimatorCore: aborted: failed

Here is the data in question:

public void LdaWorkoutEstimatorCore()
{
var ml = new MLContext(1);
var builder = new ArrayDataViewBuilder(Env);
var data = new[]
{
new[] { (float)1.0, (float)0.0, (float)0.0 },
new[] { (float)0.0, (float)1.0, (float)0.0 },
new[] { (float)0.0, (float)0.0, (float)1.0 },
};
builder.AddColumn("F1V", NumberDataViewType.Single, data);
var srcView = builder.GetDataView();
var est = ml.Transforms.Text.LatentDirichletAllocation("F1V");
TestEstimatorCore(est, srcView);
}

Is there a change your changes triggered this tests' failure? I'm not entirely sure how they're related in this case.

@harishsk
Copy link
Contributor Author

harishsk commented Feb 10, 2020

Output on your most recent build:

2020-02-10T23:01:35.8655896Z X Microsoft.ML.Tests.Transformers.TextFeaturizerTests.LdaWorkoutEstimatorCore [155ms]
2020-02-10T23:01:35.8656992Z Error Message:
2020-02-10T23:01:35.8706041Z System.AggregateException : One or more errors occurred. (Different values in column 1 of row 0) (Odd same failed)
2020-02-10T23:01:35.8706491Z ---- System.InvalidOperationException : Different values in column 1 of row 0
2020-02-10T23:01:35.8706738Z ---- System.InvalidOperationException : Odd same failed
2020-02-10T23:01:35.8707131Z Stack Trace:
2020-02-10T23:01:35.8707356Z at Microsoft.ML.RunTests.BaseTestBaseline.Done() in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestBaseline.cs:line 161
2020-02-10T23:01:35.8707606Z at Microsoft.ML.RunTests.BaseTestBaseline.Cleanup() in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestBaseline.cs:line 133
2020-02-10T23:01:35.8707856Z at Microsoft.ML.TestFramework.BaseTestClass.System.IDisposable.Dispose() in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestClass.cs:line 76
2020-02-10T23:01:35.8708508Z at ReflectionAbstractionExtensions.DisposeTestClass(ITest test, Object testClass, IMessageBus messageBus, ExecutionTimer timer, CancellationTokenSource cancellationTokenSource) in C:\Dev\xunit\xunit\src\xunit.execution\Extensions\ReflectionAbstractionExtensions.cs:line 79
2020-02-10T23:01:35.8709014Z ----- Inner Stack Trace #1 (System.InvalidOperationException) -----
2020-02-10T23:01:35.8709656Z at Microsoft.ML.RunTests.BaseTestBaseline.Fail(String fmt, Object[] args) in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestBaseline.cs:line 184
2020-02-10T23:01:35.8709929Z ----- Inner Stack Trace #2 (System.InvalidOperationException) -----
2020-02-10T23:01:35.8710190Z at Microsoft.ML.RunTests.BaseTestBaseline.Fail(String fmt, Object[] args) in D:\a\1\s\test\Microsoft.ML.TestFramework\BaseTestBaseline.cs:line 184
2020-02-10T23:01:35.8710408Z Standard Output Messages:
2020-02-10T23:01:35.8711219Z *** Failure: Different values in column 1 of row 0
2020-02-10T23:01:35.8711704Z *** Failure: Odd same failed
2020-02-10T23:01:35.8711990Z Test LdaWorkoutEstimatorCore: aborted: failed

Here is the data in question:

public void LdaWorkoutEstimatorCore()
{
var ml = new MLContext(1);
var builder = new ArrayDataViewBuilder(Env);
var data = new[]
{
new[] { (float)1.0, (float)0.0, (float)0.0 },
new[] { (float)0.0, (float)1.0, (float)0.0 },
new[] { (float)0.0, (float)0.0, (float)1.0 },
};
builder.AddColumn("F1V", NumberDataViewType.Single, data);
var srcView = builder.GetDataView();
var est = ml.Transforms.Text.LatentDirichletAllocation("F1V");
TestEstimatorCore(est, srcView);
}

Is there a change your changes triggered this tests' failure? I'm not entirely sure how they're related in this case.


In reply to: 356336770 [](ancestors = 356336770)

@@ -2086,7 +2086,7 @@ public sealed class NoArgumentsInput : CalibrateInputBase
public sealed class FixedPlattInput : CalibrateInputBase
{
[Argument(ArgumentType.AtMostOnce, ShortName = "slope", HelpText = "The slope parameter of the calibration function 1 / (1 + exp(-slope * x + offset)", SortOrder = 1)]
Copy link

@yaeldekel yaeldekel Feb 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-slope [](start = 141, length = 6)

Can you change this to exp(slope * x + offset)? (and in the Offset argument as well). #Resolved

@@ -2086,7 +2086,7 @@ public sealed class NoArgumentsInput : CalibrateInputBase
public sealed class FixedPlattInput : CalibrateInputBase
{
[Argument(ArgumentType.AtMostOnce, ShortName = "slope", HelpText = "The slope parameter of the calibration function 1 / (1 + exp(-slope * x + offset)", SortOrder = 1)]
public Double Slope = 1;
public Double Slope = -1;
Copy link

@yaeldekel yaeldekel Feb 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slope [](start = 26, length = 5)

FixedPlattCalibratorTrainer.Arguments also has a Slope field that defaults to 1. #Resolved

@yaeldekel
Copy link

yaeldekel commented Feb 11, 2020

/// P(x) = 1 / (1 + exp(-<see cref="PlattCalibrator.Slope"/> * x + <see cref="PlattCalibrator.Offset"/>)

This should also be fixed. #Resolved


Refers to: src/Microsoft.ML.Data/Prediction/Calibrator.cs:1621 in f86faac. [](commit_id = f86faac, deletion_comment = False)

@frank-dong-ms-zz
Copy link
Contributor

frank-dong-ms-zz commented Feb 11, 2020

    protected void Done()

In this new implementation, we can get rid of this Done() method and remove all usage. These assertions are done automatically in Cleanup method #Resolved


Refers to: test/Microsoft.ML.TestFramework/BaseTestBaseline.cs:145 in f86faac. [](commit_id = f86faac, deletion_comment = False)

@@ -948,7 +949,7 @@ public void TestMulticlassEnsembleCombiner()
predGetters[i](ref preds[i]);
}
if (scores.All(s => !float.IsNaN(s)))
CompareNumbersWithTolerance(score, scores.Sum() / predCount);
CompareNumbersWithTolerance(score, scores.Sum() / predCount, digitsOfPrecision: 5);
Copy link
Contributor

@frank-dong-ms-zz frank-dong-ms-zz Feb 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

digitsOfPrecision [](start = 89, length = 17)

why we use less precise here? #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default is 7 digits. All tests are not equally precise.


In reply to: 377899328 [](ancestors = 377899328)

@frank-dong-ms-zz
Copy link
Contributor

frank-dong-ms-zz commented Feb 11, 2020

This one test: LdaWorkoutEstimatorCore is failing on all configurations and seems like real fail #Resolved

@yaeldekel
Copy link

LDA has some non determinism in it (see issue #4143), most of the LDA tests are disabled (or they don't do any comparisons of the outputs), I guess this one slipped through the cracks because it doesn't call Done()...


In reply to: 584859078 [](ancestors = 584859078)

@harishsk
Copy link
Contributor Author

@yaeldekel Thanks for the info. I have added it to the list of skipped tests for now.


In reply to: 585197699 [](ancestors = 585197699,584859078)

@harishsk
Copy link
Contributor Author

    protected void Done()

There is a problem with the current implementation. The assertion happens in Cleanup() but when Cleanup is called, the test context is lost. I am still debating whether to remove Done completely or find a different way to force Done to be called in the context of the original function. For now, I am leaving it as is.


In reply to: 584851784 [](ancestors = 584851784)


Refers to: test/Microsoft.ML.TestFramework/BaseTestBaseline.cs:145 in f86faac. [](commit_id = f86faac, deletion_comment = False)

@@ -485,10 +485,14 @@ public void OneClassMatrixFactorizationInMemoryDataZeroBaseIndex()
var testPrediction = model.Transform(testDataView);

var testResults = mlContext.Data.CreateEnumerable<OneClassMatrixElementZeroBasedForScore>(testPrediction, false).ToList();

// REVIEW: We are seeing lower precision on non-Windows platforms
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REVIEW [](start = 15, length = 6)

maybe use below tags in comments so we can find this easily from task list:
TODO TEST_STABILITY:

@@ -714,6 +714,7 @@ public void LdaWorkout()
}

[Fact]
[Trait("Category", "SkipInCI")]
public void LdaWorkoutEstimatorCore()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LdaWorkoutEstimatorCore [](start = 20, length = 23)

I haven't see this fail in full build and CI build in last 14 days, this should be low fail rate tests and I think it is okay to left this one enabled, if we see this fails often later we can go back to this one

Copy link
Contributor

@frank-dong-ms-zz frank-dong-ms-zz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@harishsk harishsk merged commit 5449b91 into dotnet:master Mar 3, 2020
@harishsk harishsk deleted the testCleanup2 branch April 21, 2020 23:58
@ghost ghost locked as resolved and limited conversation to collaborators Mar 19, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants