Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] FieldSortIT.testTrackScores tests are flaky for concurrent segment search #11189

Closed
Gankris96 opened this issue Nov 13, 2023 · 2 comments · Fixed by #11267
Closed

[BUG] FieldSortIT.testTrackScores tests are flaky for concurrent segment search #11189

Gankris96 opened this issue Nov 13, 2023 · 2 comments · Fixed by #11267
Assignees
Labels
bug Something isn't working flaky-test Random test failure that succeeds on second run

Comments

@Gankris96
Copy link
Contributor

Describe the bug
The test testTrackScores is flaky for concurrent segment search

Stacktrace from local flakiness

REPRODUCE WITH: ./gradlew 'null' --tests "org.opensearch.search.sort.FieldSortIT" -Dtests.method="testTrackScores [p0={"search.concurrent_segment_search.enabled":"true"} seed=[B3D6766FB7DD2CD0:7A556B1607ED25C4]]" -Dtests.seed=B3D6766FB7DD2CD0 -Dtests.locale=zh-SG -Dtests.timezone=Australia/Tasmania -Druntime.java=17

java.lang.AssertionError: 
Expected: not <NaNF>
     but: was <NaNF>
Expected :not <NaNF>
Actual   :<NaNF>
<Click to see difference>


	at __randomizedtesting.SeedInfo.seed([B3D6766FB7DD2CD0:7A556B1607ED25C4]:0)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
	at org.junit.Assert.assertThat(Assert.java:964)
	at org.junit.Assert.assertThat(Assert.java:930)
	at org.opensearch.search.sort.FieldSortIT.testTrackScores(FieldSortIT.java:295)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
	at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
	at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
	at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
	at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
	at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
	at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
	at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
	at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
	at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at java.base/java.lang.Thread.run(Thread.java:840)

@Gankris96 Gankris96 added bug Something isn't working untriaged labels Nov 13, 2023
@jed326 jed326 added flaky-test Random test failure that succeeds on second run and removed untriaged labels Nov 13, 2023
@Gankris96 Gankris96 changed the title [BUG] FieldSortIt.testTrackScores tests are flaky for concurrent segment search [BUG] FieldSortIT.testTrackScores tests are flaky for concurrent segment search Nov 16, 2023
@jed326
Copy link
Collaborator

jed326 commented Nov 17, 2023

Was able to confirm with the provided test seed B3D6766FB7DD2CD0 that this test fails around 40% of the time with the above error.

Sample flaky response:

{
    "took": 5,
    "timed_out": false,
    "_shards": {
        "total": 3,
        "successful": 3,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 2,
            "relation": "eq"
        },
        "max_score": null,
        "hits": [{
            "_index": "test",
            "_id": "bYBn34sBfoF47yjduqGm",
            "_score": 1.0,
            "_source": {
                "id": "1",
                "svalue": "aaa",
                "ivalue": 100,
                "dvalue": 0.1
            },
            "sort": ["aaa"]
        }, {
            "_index": "test",
            "_id": "boBn34sBfoF47yjduqHo",
            "_score": 1.0,
            "_source": {
                "id": "2",
                "svalue": "bbb",
                "ivalue": 200,
                "dvalue": 0.2
            },
            "sort": ["bbb"]
        }]
    }
}

Correct response:

{
    "took": 8,
    "timed_out": false,
    "_shards": {
        "total": 3,
        "successful": 3,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 2,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [{
            "_index": "test",
            "_id": "a4Bn34sBfoF47yjdtKGl",
            "_score": 1.0,
            "_source": {
                "id": "1",
                "svalue": "aaa",
                "ivalue": 100,
                "dvalue": 0.1
            },
            "sort": ["aaa"]
        }, {
            "_index": "test",
            "_id": "bIBn34sBfoF47yjdtKHm",
            "_score": 1.0,
            "_source": {
                "id": "2",
                "svalue": "bbb",
                "ivalue": 200,
                "dvalue": 0.2
            },
            "sort": ["bbb"]
        }]
    }
}

Additionally, this test is flaky only when adding the indexRandomForConcurrentSearch method. So, at a glance it seems like there's an issue with setTrackScores(true) and multi slice case.

Perhaps it's also related to using maxDoc like #11233, but in this case we don't see any flakiness for non-concurrent search.

@jed326
Copy link
Collaborator

jed326 commented Nov 17, 2023

Ah this looks like the problem:

for (final MaxScoreCollector collector : maxScoreCollectors) {
float score = collector.getMaxScore();
if (Float.isNaN(maxScore)) {
maxScore = score;
} else {
maxScore = Math.max(maxScore, score);
}
}

Math.max(1.0, Float.NaN) will return Float.NaN which doesn't seem correct because if one collector returns NaN then maxScore will also be NaN. I think this can also happen in non-concurrent search case but I'm guessing we need to apply some special shard routing to trigger this case in the tests.

See: https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html#max-float-float-

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working flaky-test Random test failure that succeeds on second run
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants