Use 'node_selector' for type-less yaml-tests in mixed clusters #36717

cbuescher · 2018-12-17T15:12:37Z

Currently we skip yaml integration tests using the new typless API versions for
all clusters where any node has a version earlier than 7.0. This effectively
means we don't run this tests in mixed clusters or uphrade scenarios, which is
why we also copy each test to a version "*_with_type".

The integration test framework seems to have a feature where do-sections can
specify that a node it is run agains has a certain version or provides certain
features. We can use this "node_selector" feature to make sure we only hit nodes
in version >7.0 in the yaml tests that use type-less API calls. This way we
don't have to effectively disable mixed version or bwc tests and we might reduce
the number of tests we need to copy using the old typed APIs to the bare minimum
since we already cover most of the other testing with the new typless versions.

This PR shows this with the "_explain" API as an example.

Currently we skip yaml integration tests using the new typless API versions for all clusters where any node has a version earlier than 7.0. This effectively means we don't run this tests in mixed clusters or uphrade scenarios, which is why we also copy each test to a version "*_with_type". The integration test framework seems to have a feature where do-sections can specify that a node it is run agains has a certain version or provides certain features. We can use this "node_selector" feature to make sure we only hit nodes in version >7.0 in the yaml tests that use type-less API calls. This way we don't have to effectively disable mixed version or bwc tests and we might reduce the number of tests we need to copy using the old typed APIs to the bare minimum since we already cover most of the other testing with the new typless versions. This PR shows this with the "_explain" API as an example.

elasticmachine · 2018-12-17T15:12:39Z

Pinging @elastic/es-search

cbuescher · 2018-12-17T15:17:46Z

This can be checked e.g. with running ./gradlew qa:mixed-cluster:check -Dtests.class="*Yaml*IT" -Dtests.method="test {p0=explain/10_*}" with this changes which should pass. If you comment out the "node_selector" sections and run the above this will likely fail because we might be hitting a node without the new typeless endpount yet.

Mainly opening this as a suggestion for discussion. My main goal would be to not having to skip all "new" types of yaml tests in mixed cluster scenarios and to be able to remove a majority of the copied "*_with_types" tests. I think it would be enough to have only a few of those left for each API to e.g. check that the deprecated enpoints still work, but we should not have to copy all specialized cases. We have a slow build anyway, which is why I'm trying to keep us from having to needlessly duplicate all those integration tests.

jtibshirani · 2018-12-18T04:49:40Z

I'm wondering if I'm missing something -- I'm having trouble seeing what switching to node_selector accomplishes. The tests can now technically be run in a mixed-version cluster, but we aren't actually testing backwards compatibility or the mixed-cluster behavior, because we are excluding 6.x nodes for the majority of tests.

cbuescher · 2018-12-18T09:06:37Z

we aren't actually testing backwards compatibility or the mixed-cluster behavior, because we are excluding 6.x nodes for the majority of tests.

The way I see it, we are only excluding 6.x nodes from receiving rest requests with the new incompatible type-less format. But they are still part of the mixed cluster, so all other aspects that these tests cover, like internal serialization, responses etc... are still tested. Currently we are effectively disabling bwc and mixed cluster tests whenever we use a new typeless API, which is why we then need to add another test that still uses the typed version. With this change, I think, we can make the "new" tests the default and would need to add less of the "*_with_types" tests.

Lets take the explain tests as an example. We are adding "21_source_filtering_with_types" only so that this kind of test runs in a mixed scenario, I think. What is tested here (the source filtering) isn't really affected at all by types vs. no-types except for the rest request side (which is already covered if we have typed and type-less versions in "10_basic"). Same with "31_query_string_with_types". I think the core of these tests checks functionality that isn't affected by whether we test with or without types, we only run it so we have coverage where we currently skip testing in the type-less tests.

I think we are currently duplicating to much tests which is bad considering that we already have a slow build and we can do better using this approach.

mayya-sharipova · 2018-12-18T16:08:46Z

@cbuescher Thanks for the bringing up this feature.

I think we should start with asking ourselves what we want to test. I think we need to test the following (please add/correct here):

Old typed API works in a mixed cluster (6.x + 7.x)
Old typed API works on 7.x cluster
New typeless API works on 7.x cluster
New typeless API works in a mixed cluster ???

For #1 and #2 we already have our current yml tests.
For #3 we need to write some duplicates with typeless API. May be, Christoph is right here, and there is no need to duplicate all functionalities, just some core ones.
For #4, node_selector will allow to accomplish this in tests, but let's ask ourselves do we need these tests? Would a user in the real-life scenario, will bother to send new typeless request to particular nodes?

jtibshirani · 2018-12-18T17:03:12Z

But they are still part of the mixed cluster, so all other aspects that these tests cover, like internal serialization, responses etc... are still tested.

Thanks @cbuescher, I was missing this part in my thinking. It sounds like one piece we wouldn't be covering in those particular tests is REST BWC. Apart from keeping everything simple + consistent, I don't have very strong opinions here, and am interested to hear what others think.

One other question I had was around whether this would make it more difficult to other language clients to use these tests. I would guess that they already implement skip based on a version, but not node_selector?

cbuescher · 2018-12-18T17:04:42Z

Would a user in the real-life scenario, will bother to send new typeless request to particular nodes?

No, that's not what I would expect. The idea was to make the new typless API tests the default going forward, and to be able to do that we'd also make sure the things these tests cover (which usually is more than just how the rest endpoint works) is also tested in mixed cluster scenarios (like e.g. serialization, other functionality that doesn't have to do with types or rest endpoints). I was suggesting making the test covering scenarios 3 and 4 the default tests in 7.0 and only have a few special ones checking 1 and 2, since those mixed cluster scenarios should be happening on rolling upgrades and we want to move people away from the old typed APIs.

Unfortunately I just learned today that the language clients also use the rest-api-spec the yaml tests are part of for their tests and most of the client test runners don't support the "node_selector" feature, which makes this whole approach somehow void since they would essentially have to skip all tests using it. Just trying to summarize my thinking here.

As an aside, clicking on either #1, #2, #3 or #4 is worth a few minutes :-)

cbuescher · 2018-12-18T17:05:41Z

One other question I had was around whether this would make it more difficult to other language clients to use these tests.

Sorry, our comments just crossed. Exactly, I wasn't aware of these tests being used by the language clients too.

cbuescher · 2018-12-18T17:08:25Z

To summarize my current thinking around this: I'd be really glad if we didn't have to basically double the amount of yaml tests for the whole duration of 7.x, but unfortunately node_selector doesn't seem to completely solve the problem. Lets close this PR for now then, I will keep the main question (how can we potentially reduce the amount of tests necessary to a minimum) in the back of my head for now and see if I have another idea but not focus on it atm.

cbuescher · 2019-04-05T08:48:50Z

I think we can close this PR for now. I can dig it up again if needed.

cbuescher added >enhancement :Search/Search Search-related issues that do not fall into other categories v7.0.0 labels Dec 17, 2018

cbuescher added the WIP label Dec 17, 2018

Merge branch 'master' into use-node_selector-yaml

73a0625

jasontedor added v8.0.0 and removed v7.0.0 labels Feb 6, 2019

cbuescher closed this Apr 5, 2019

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use 'node_selector' for type-less yaml-tests in mixed clusters #36717

Use 'node_selector' for type-less yaml-tests in mixed clusters #36717

cbuescher commented Dec 17, 2018 •

edited

Loading

elasticmachine commented Dec 17, 2018

cbuescher commented Dec 17, 2018

jtibshirani commented Dec 18, 2018

cbuescher commented Dec 18, 2018

mayya-sharipova commented Dec 18, 2018

jtibshirani commented Dec 18, 2018

cbuescher commented Dec 18, 2018

cbuescher commented Dec 18, 2018

cbuescher commented Dec 18, 2018 •

edited

Loading

cbuescher commented Apr 5, 2019

Use 'node_selector' for type-less yaml-tests in mixed clusters #36717

Use 'node_selector' for type-less yaml-tests in mixed clusters #36717

Conversation

cbuescher commented Dec 17, 2018 • edited Loading

elasticmachine commented Dec 17, 2018

cbuescher commented Dec 17, 2018

jtibshirani commented Dec 18, 2018

cbuescher commented Dec 18, 2018

mayya-sharipova commented Dec 18, 2018

jtibshirani commented Dec 18, 2018

cbuescher commented Dec 18, 2018

cbuescher commented Dec 18, 2018

cbuescher commented Dec 18, 2018 • edited Loading

cbuescher commented Apr 5, 2019

cbuescher commented Dec 17, 2018 •

edited

Loading

cbuescher commented Dec 18, 2018 •

edited

Loading