Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use 'node_selector' for type-less yaml-tests in mixed clusters #36717

Closed
wants to merge 2 commits into from

Conversation

cbuescher
Copy link
Member

@cbuescher cbuescher commented Dec 17, 2018

Currently we skip yaml integration tests using the new typless API versions for
all clusters where any node has a version earlier than 7.0. This effectively
means we don't run this tests in mixed clusters or uphrade scenarios, which is
why we also copy each test to a version "*_with_type".

The integration test framework seems to have a feature where do-sections can
specify that a node it is run agains has a certain version or provides certain
features. We can use this "node_selector" feature to make sure we only hit nodes
in version >7.0 in the yaml tests that use type-less API calls. This way we
don't have to effectively disable mixed version or bwc tests and we might reduce
the number of tests we need to copy using the old typed APIs to the bare minimum
since we already cover most of the other testing with the new typless versions.

This PR shows this with the "_explain" API as an example.

Currently we skip yaml integration tests using the new typless API versions for
all clusters where any node has a version earlier than 7.0. This effectively
means we don't run this tests in mixed clusters or uphrade scenarios, which is
why we also copy each test to a version "*_with_type".

The integration test framework seems to have a feature where do-sections can
specify that a node it is run agains has a certain version or provides certain
features. We can use this "node_selector" feature to make sure we only hit nodes
in version >7.0 in the yaml tests that use type-less API calls. This way we
don't have to effectively disable mixed version or bwc tests and we might reduce
the number of tests we need to copy using the old typed APIs to the bare minimum
since we already cover most of the other testing with the new typless versions.

This PR shows this with the "_explain" API as an example.
@cbuescher cbuescher added >enhancement :Search/Search Search-related issues that do not fall into other categories v7.0.0 labels Dec 17, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

@cbuescher cbuescher added the WIP label Dec 17, 2018
@cbuescher
Copy link
Member Author

This can be checked e.g. with running ./gradlew qa:mixed-cluster:check -Dtests.class="*Yaml*IT" -Dtests.method="test {p0=explain/10_*}" with this changes which should pass. If you comment out the "node_selector" sections and run the above this will likely fail because we might be hitting a node without the new typeless endpount yet.

Mainly opening this as a suggestion for discussion. My main goal would be to not having to skip all "new" types of yaml tests in mixed cluster scenarios and to be able to remove a majority of the copied "*_with_types" tests. I think it would be enough to have only a few of those left for each API to e.g. check that the deprecated enpoints still work, but we should not have to copy all specialized cases. We have a slow build anyway, which is why I'm trying to keep us from having to needlessly duplicate all those integration tests.

@jtibshirani
Copy link
Contributor

I'm wondering if I'm missing something -- I'm having trouble seeing what switching to node_selector accomplishes. The tests can now technically be run in a mixed-version cluster, but we aren't actually testing backwards compatibility or the mixed-cluster behavior, because we are excluding 6.x nodes for the majority of tests.

@cbuescher
Copy link
Member Author

we aren't actually testing backwards compatibility or the mixed-cluster behavior, because we are excluding 6.x nodes for the majority of tests.

The way I see it, we are only excluding 6.x nodes from receiving rest requests with the new incompatible type-less format. But they are still part of the mixed cluster, so all other aspects that these tests cover, like internal serialization, responses etc... are still tested. Currently we are effectively disabling bwc and mixed cluster tests whenever we use a new typeless API, which is why we then need to add another test that still uses the typed version. With this change, I think, we can make the "new" tests the default and would need to add less of the "*_with_types" tests.

Lets take the explain tests as an example. We are adding "21_source_filtering_with_types" only so that this kind of test runs in a mixed scenario, I think. What is tested here (the source filtering) isn't really affected at all by types vs. no-types except for the rest request side (which is already covered if we have typed and type-less versions in "10_basic"). Same with "31_query_string_with_types". I think the core of these tests checks functionality that isn't affected by whether we test with or without types, we only run it so we have coverage where we currently skip testing in the type-less tests.

I think we are currently duplicating to much tests which is bad considering that we already have a slow build and we can do better using this approach.

@mayya-sharipova
Copy link
Contributor

@cbuescher Thanks for the bringing up this feature.

I think we should start with asking ourselves what we want to test. I think we need to test the following (please add/correct here):

  1. Old typed API works in a mixed cluster (6.x + 7.x)
  2. Old typed API works on 7.x cluster
  3. New typeless API works on 7.x cluster
  4. New typeless API works in a mixed cluster ???

For #1 and #2 we already have our current yml tests.
For #3 we need to write some duplicates with typeless API. May be, Christoph is right here, and there is no need to duplicate all functionalities, just some core ones.
For #4, node_selector will allow to accomplish this in tests, but let's ask ourselves do we need these tests? Would a user in the real-life scenario, will bother to send new typeless request to particular nodes?

@jtibshirani
Copy link
Contributor

But they are still part of the mixed cluster, so all other aspects that these tests cover, like internal serialization, responses etc... are still tested.

Thanks @cbuescher, I was missing this part in my thinking. It sounds like one piece we wouldn't be covering in those particular tests is REST BWC. Apart from keeping everything simple + consistent, I don't have very strong opinions here, and am interested to hear what others think.

One other question I had was around whether this would make it more difficult to other language clients to use these tests. I would guess that they already implement skip based on a version, but not node_selector?

@cbuescher
Copy link
Member Author

Would a user in the real-life scenario, will bother to send new typeless request to particular nodes?

No, that's not what I would expect. The idea was to make the new typless API tests the default going forward, and to be able to do that we'd also make sure the things these tests cover (which usually is more than just how the rest endpoint works) is also tested in mixed cluster scenarios (like e.g. serialization, other functionality that doesn't have to do with types or rest endpoints). I was suggesting making the test covering scenarios 3 and 4 the default tests in 7.0 and only have a few special ones checking 1 and 2, since those mixed cluster scenarios should be happening on rolling upgrades and we want to move people away from the old typed APIs.

Unfortunately I just learned today that the language clients also use the rest-api-spec the yaml tests are part of for their tests and most of the client test runners don't support the "node_selector" feature, which makes this whole approach somehow void since they would essentially have to skip all tests using it. Just trying to summarize my thinking here.

As an aside, clicking on either #1, #2, #3 or #4 is worth a few minutes :-)

@cbuescher
Copy link
Member Author

One other question I had was around whether this would make it more difficult to other language clients to use these tests.

Sorry, our comments just crossed. Exactly, I wasn't aware of these tests being used by the language clients too.

@cbuescher
Copy link
Member Author

cbuescher commented Dec 18, 2018

To summarize my current thinking around this: I'd be really glad if we didn't have to basically double the amount of yaml tests for the whole duration of 7.x, but unfortunately node_selector doesn't seem to completely solve the problem. Lets close this PR for now then, I will keep the main question (how can we potentially reduce the amount of tests necessary to a minimum) in the back of my head for now and see if I have another idea but not focus on it atm.

@jasontedor jasontedor added v8.0.0 and removed v7.0.0 labels Feb 6, 2019
@cbuescher
Copy link
Member Author

I think we can close this PR for now. I can dig it up again if needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Search/Search Search-related issues that do not fall into other categories v8.0.0-alpha1 WIP
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants