[CI] Internal data frame indices can cause unrelated docs tests to fail #43271

Closed
droberts195 opened this issue Jun 17, 2019 · 15 comments · Fixed by #44123 or #47016
Assignees
Labels
:ml/Transform Transform >test-failure Triaged test failures from CI

Comments

@droberts195
Contributor

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.2+intake/84/console failed due to the .data-frame-notifications-1 index existing in a docs test that didn't expect it to exist:

org.elasticsearch.smoketest.DocsClientYamlTestSuiteIT > test {yaml=reference/api-conventions/line_345} FAILED
    java.lang.AssertionError: Failure at [reference/api-conventions:275]: $body didn't match expected value:
                             $body: 
                            metadata: 
                               indices: 
                                 index-1: 
                                     state: same [open]
                                 index-2: 
                                     state: same [open]
                                 index-3: 
                                     state: same [open]
             .data-frame-notifications-1: unexpected but found [{state=open}]

        Caused by:
        java.lang.AssertionError: $body didn't match expected value:
                                 $body: 
                                metadata: 
                                   indices: 
                                     index-1: 
                                         state: same [open]
                                     index-2: 
                                         state: same [open]
                                     index-3: 
                                         state: same [open]
                 .data-frame-notifications-1: unexpected but found [{state=open}]

I imagine we had similar problems previously with ML internal indices leaking into unrelated docs tests. We should look at how that was solved and implement something similar for data frame internal indices.

@droberts195 droberts195 added >test-failure Triaged test failures from CI :Docs :ml/Transform Transform labels Jun 17, 2019
@elasticmachine
Collaborator

Pinging @elastic/ml-core

@elasticmachine
Collaborator

Pinging @elastic/es-docs

talevy added a commit to talevy/elasticsearch that referenced this issue Jun 18, 2019
the geo-bounding-box and phrase-suggest docs were susceptible to
failing due to other indices in the cluster. This change restricts
the queries to the index that is set up for the test.

relates to elastic#43271.
@talevy
Contributor

talevy commented Jun 18, 2019

Assuming this is due to pending tasks, do you think it can be resolved the same way Rollup Jobs are in ESRestTestCase#wipeCluster?

@droberts195
Contributor Author

I had a look through the main data frame transforms API reference documentation and testing is skipped for all the console snippets, so the problem cannot be originating there. This means it must be coming from the data frame transform HLRC docs tests.

> do you think it can be resolved the same way Rollup Jobs are in ESRestTestCase#wipeCluster?

DataFrameTransformDocumentationIT.cleanUpTransforms() is supposed to be doing something very similar, albeit only for the HLRC docs tests rather than all REST tests. However, it requires that the names of transforms to clean up are registered by adding them to the transformsToClean list during each test. It looks like there are some tests in DataFrameTransformDocumentationIT that don't do this. @davidkyle it looks like you wrote a lot of the tests so please can you have a look and see whether it's a mistake that some tests don't add to transformsToClean or if it's really not necessary for those tests?
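
To illustrate why a missed registration matters, here is a minimal sketch of the registration-based cleanup pattern described above. It is a toy model, not the real DataFrameTransformDocumentationIT code: the class name, the `createTransform` helper and the in-memory `transformsInCluster` set are assumptions made for the example; only the `transformsToClean` list and the `cleanUpTransforms` name come from the test class mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: not the real DataFrameTransformDocumentationIT code.
class CleanupRegistrySketch {
    // Stand-in for the transforms that currently exist in the test cluster.
    final Set<String> transformsInCluster = new TreeSet<>();
    // Names that tests have registered for deletion once they finish.
    final List<String> transformsToClean = new ArrayList<>();

    // Hypothetical helper: create a transform and optionally register it.
    void createTransform(String id, boolean registerForCleanup) {
        transformsInCluster.add(id);
        if (registerForCleanup) {
            transformsToClean.add(id);
        }
    }

    // Runs after each test; it can only delete what was registered.
    void cleanUpTransforms() {
        for (String id : transformsToClean) {
            transformsInCluster.remove(id);
        }
        transformsToClean.clear();
    }

    public static void main(String[] args) {
        CleanupRegistrySketch sketch = new CleanupRegistrySketch();
        sketch.createTransform("registered-transform", true);
        sketch.createTransform("forgotten-transform", false);
        sketch.cleanUpTransforms();
        // The transform that was never registered is still present.
        System.out.println(sketch.transformsInCluster);
    }
}
```

In this model, any test that creates a transform without registering it leaves state behind for whichever test runs next.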

talevy added a commit that referenced this issue Jun 18, 2019
the geo-bounding-box and phrase-suggest docs were susceptible to
failing due to other indices in the cluster. This change restricts
the queries to the index that is set up for the test.

relates to #43271.
davidkyle pushed a commit to davidkyle/elasticsearch that referenced this issue Jun 20, 2019
…43307)

the geo-bounding-box and phrase-suggest docs were susceptible to
failing due to other indices in the cluster. This change restricts
the queries to the index that is set up for the test.

relates to elastic#43271.
@davidkyle
Member

A couple of recent failures:
https://scans.gradle.com/s/elkcdyg2v3smm
https://gradle.com/s/n3nomp3njwit6

In all cases the error occurs in the test following put-transform.asciidoc line 67.

Data Frames have a mechanism to audit notable events, such as the creation of a data frame, by indexing a document to .data-frame-notifications-1. PUT Data Frame Transform does this asynchronously after the data frame is created.

In the failures, this async indexing occurs after the test has finished and after the test teardown, which deletes the indices, so the index is re-created post-teardown and leaks into the next test.

XPackRestIT#cleanup uses ESRestTestCase#waitForPendingTasks to handle this problem but that may not be the best solution here.
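
As a rough illustration of the race and of the wait-before-wipe idea, here is a self-contained sketch. It is a simulation under stated assumptions, not Elasticsearch code: the in-memory `indices` set, `putTransform`, `wipeAllIndices` and this `waitForPendingTasks` are hypothetical stand-ins, and a `CountDownLatch` is used only to make the ordering deterministic.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Simulation only: models an async audit write racing with test teardown.
class PendingTaskTeardownSketch {
    static final String AUDIT_INDEX = ".data-frame-notifications-1";

    final Set<String> indices = ConcurrentHashMap.newKeySet();
    final List<CompletableFuture<Void>> pendingTasks = new CopyOnWriteArrayList<>();

    // Returns immediately; the audit write happens once `release` is counted down.
    void putTransform(CountDownLatch release) {
        pendingTasks.add(CompletableFuture.runAsync(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            indices.add(AUDIT_INDEX); // indexing a document auto-creates the index
        }));
    }

    // Teardown step: delete every index that exists right now.
    void wipeAllIndices() {
        indices.clear();
    }

    // Block until every outstanding async task has completed.
    void waitForPendingTasks() {
        CompletableFuture.allOf(pendingTasks.toArray(new CompletableFuture[0])).join();
        pendingTasks.clear();
    }

    // Teardown runs before the audit write lands, so the index reappears.
    static boolean leaksWithoutWaiting() {
        PendingTaskTeardownSketch cluster = new PendingTaskTeardownSketch();
        CountDownLatch release = new CountDownLatch(1);
        cluster.putTransform(release);
        cluster.wipeAllIndices();
        release.countDown();           // the late audit write now proceeds
        cluster.waitForPendingTasks(); // stands in for "the next test starts later"
        return cluster.indices.contains(AUDIT_INDEX);
    }

    // Teardown waits for pending tasks first, so the wipe removes the index.
    static boolean leaksWhenWaiting() {
        PendingTaskTeardownSketch cluster = new PendingTaskTeardownSketch();
        CountDownLatch release = new CountDownLatch(1);
        cluster.putTransform(release);
        release.countDown();
        cluster.waitForPendingTasks();
        cluster.wipeAllIndices();
        return cluster.indices.contains(AUDIT_INDEX);
    }

    public static void main(String[] args) {
        System.out.println("without waiting, leaked = " + leaksWithoutWaiting());
        System.out.println("with waiting, leaked = " + leaksWhenWaiting());
    }
}
```

The sketch only shows the ordering problem; whether waiting for tasks is the right fix for the docs tests is a separate question.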

I'll mute the put-transform.asciidoc line 67 test for now until we find a good fix.

davidkyle added a commit that referenced this issue Jun 20, 2019
davidkyle added a commit that referenced this issue Jun 20, 2019
@davidkyle
Member

Muted in master 2f9e8a8 and 7.x 12bc38d

@benwtrent
Member

It's weird that we have not run into this with the ML docs tests. I suppose we don't make any API calls in the docs tests that cause an audit message.

davidkyle added a commit that referenced this issue Jun 21, 2019
…43428)

the geo-bounding-box and phrase-suggest docs were susceptible to
failing due to other indices in the cluster. This change restricts
the queries to the index that is set up for the test.

relates to #43271.
@polyfractal
Contributor

Just as another data point, this failed a few days ago in #43287 on a geo sort documentation test.

       "failed_shards" : [
          {
            "shard" : 0,
            "index" : ".data-frame-notifications-1",
            "node" : "uhxxwHciSt-IDFhse57DIQ",
            "reason" : {
              "type" : "illegal_argument_exception",
              "reason" : "failed to find mapper for [pin.location] for geo distance based sort",

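A toy sketch of why this kind of failure appears when a search is not restricted to the test's own index, and why scoping the query (as in the geo-bounding-box and phrase-suggest change referenced earlier) avoids it. Everything here is an assumption for illustration: the index name `my_index`, the `message` field and the mapping model are made up, and the check is a simplification of how a sort on an unmapped field fails on a shard.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: index names map to the fields their mappings define.
class ScopedSearchSketch {
    static Map<String, Set<String>> exampleMappings() {
        Map<String, Set<String>> mappings = new LinkedHashMap<>();
        // Hypothetical index created by the docs test setup.
        mappings.put("my_index", Set.of("pin.location"));
        // Leaked internal index; the field name is made up.
        mappings.put(".data-frame-notifications-1", Set.of("message"));
        return mappings;
    }

    // Returns the targeted indices on which a sort by `sortField` would fail.
    static List<String> failedIndices(Map<String, Set<String>> mappings,
                                      List<String> targets,
                                      String sortField) {
        List<String> failed = new ArrayList<>();
        for (String index : targets) {
            if (!mappings.get(index).contains(sortField)) {
                failed.add(index);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> mappings = exampleMappings();
        // Unscoped search: every index in the cluster is a target.
        System.out.println(failedIndices(mappings, new ArrayList<>(mappings.keySet()), "pin.location"));
        // Scoped search: only the index the test set up.
        System.out.println(failedIndices(mappings, List.of("my_index"), "pin.location"));
    }
}
```
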
droberts195 pushed a commit that referenced this issue Jun 25, 2019
@droberts195
Contributor Author

This is also failing in 7.2 so I cherry-picked the mute back to the 7.2 branch in ee4e940

@henningandersen
Contributor

A slightly different but related test failure occurred here: https://scans.gradle.com/s/fto67xzs3f5na/tests/jbm2x6tnwtxku-yqtt2tcdvwk4m. This time the failing test is reference/search/request/sort, as in #43287, but the index is .ml-annotations-6:

            "shard" : 0,
            "index" : ".ml-annotations-6",
            "node" : "PV3AoyDpTtGFKaXMA0wxcQ",
            "reason" : {
              "type" : "query_shard_exception",
              "reason" : "failed to create query: {
...
              "index_uuid" : "VTBWPEdyShq3l4KCm4d0aQ",
              "index" : ".ml-annotations-6",
              "caused_by" : {
                "type" : "illegal_state_exception",
                "reason" : "[nested] failed to find nested object under path [parent]",
                "stack_trace" : "java.lang.IllegalStateException: [nested] failed to find nested object under path [parent]
	at org.elasticsearch.index.query.NestedQueryBuilder.doToQuery(NestedQueryBuilder.java:274)

@davidkyle
Member

The failure above was in the test following reference/ml/apis/put-job/line_98, recently enabled in #44022. PUT job also makes an async write to the .ml-annotations index, causing the index to be recreated post-teardown.

I'll make the ESRestTestCase#waitForPendingTasks fix, as used in XPackRestIT#cleanup, before muting these tests becomes my full-time job.

@davidkyle
Member

davidkyle commented Jul 9, 2019

I've muted the put-job test in master, 7.x, 7.3 and 7.2

@DaveCTurner
Contributor

Reopening this as I've seen a few failures recently that match the linked #43287, for instance:

https://gradle-enterprise.elastic.co/s/mbzws4mjvdbea/console-log?task=:docs:integTestRunner

@DaveCTurner DaveCTurner reopened this Sep 23, 2019
@romseygeek
Contributor

@droberts195 droberts195 self-assigned this Sep 24, 2019
droberts195 added a commit to droberts195/elasticsearch that referenced this issue Sep 24, 2019
The renaming of the tests in elastic#46760 caused the
cleanup between tests to be skipped.

Fixes elastic#43271
Fixes elastic#47012
@droberts195
Contributor Author

Hopefully #47016 will fix this.

droberts195 added a commit that referenced this issue Sep 24, 2019
The renaming of the tests in #46760 caused the
cleanup between tests to be skipped.

Fixes #43271
Fixes #47012
9 participants