[Bug] Update logic on CheckpointState around serialization/deserialization #5808

Merged

Member

@dreamer-89 dreamer-89 commented Jan 10, 2023

Description

As part of #5282, we introduced a boolean flag in CheckpointState inside ReplicationTracker with incorrect serialization/deserialization logic.

The BWC tests that rely on CheckpointState for transport communication (e.g. RecoveryHandoffPrimaryContextRequest) fail on 2.x because the read is guarded by in.getVersion().onOrAfter(Version.CURRENT). The stream version (2.5.0) is lower than Version.CURRENT (2.6.0), so the else block executes and the boolean is never read. The write to the output stream, however, is unconditional, which means a node on the minor version (2.5.0, which contains the #5282 change) still writes the boolean to the stream.
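
To make the mismatch concrete, here is a minimal, self-contained sketch of the pre-fix behaviour. It does not use the OpenSearch classes: the class name AsymmetricGuardDemo, the integer version ids, and the placeholder long field are all assumptions made for illustration, with plain java.io streams standing in for StreamInput/StreamOutput.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class AsymmetricGuardDemo {
    // Hypothetical integer ids standing in for Version.V_2_5_0 and Version.V_2_6_0.
    static final int V_2_5_0 = 20500;
    static final int V_2_6_0 = 20600;

    // Sender side before the fix: the boolean is written with no version check.
    static byte[] write(boolean replicated) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeLong(7L);            // placeholder for fields every version understands
            out.writeBoolean(replicated); // unconditional write
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Receiver side before the fix: the boolean is read only when the stream
    // version is on or after the receiver's own CURRENT version.
    // Returns the number of bytes left unread in the payload.
    static int unreadBytes(byte[] payload, int streamVersion, int receiverCurrent) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload));
            in.readLong();
            if (streamVersion >= receiverCurrent) {
                in.readBoolean();
            }
            return in.available();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = write(false);
        // Stream version 2.5.0, receiver CURRENT 2.6.0: the boolean is skipped.
        System.out.println(unreadBytes(payload, V_2_5_0, V_2_6_0)); // prints 1
        // Stream version equal to CURRENT: the boolean is consumed.
        System.out.println(unreadBytes(payload, V_2_6_0, V_2_6_0)); // prints 0
    }
}
```

The leftover byte in the first case is the written-but-never-read boolean; in a real transport message it would be misinterpreted as the start of whatever field comes next.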

The correct fix here is:

  1. Replace Version.CURRENT with a specific version. Version.CURRENT is unreliable because it changes from branch to branch, which makes the code difficult to reason about.
  2. Make the write to the output stream conditional, so that the boolean is written only when the receiving version supports it.
if (in.getVersion().onOrAfter(Version.V_2_5_0)) {
    this.replicated = in.readBoolean();
} else {
    this.replicated = true;
}
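
The snippet above covers only the read side. As a rough sketch of how both halves of the fix fit together, the following self-contained example gates the write and the read on the same fixed version. It is not the code from this PR: the class name VersionGatedCheckpointState, the integer version ids, and the placeholder long field are hypothetical, and plain java.io streams stand in for StreamInput/StreamOutput.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class VersionGatedCheckpointState {
    // Hypothetical integer ids standing in for Version.V_2_4_0 and Version.V_2_5_0.
    static final int V_2_4_0 = 20400;
    static final int V_2_5_0 = 20500;

    // Write side: emit the boolean only when the peer version supports it.
    static byte[] write(boolean replicated, int peerVersion) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeLong(7L); // placeholder for fields every version understands
            if (peerVersion >= V_2_5_0) {
                out.writeBoolean(replicated);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read side: same guard as the write side, with the default used in the
    // description (true) when the peer is too old to send the flag.
    static boolean read(byte[] payload, int peerVersion) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload));
            in.readLong();
            if (peerVersion >= V_2_5_0) {
                return in.readBoolean();
            }
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // New peer: the flag round-trips.
        System.out.println(read(write(false, V_2_5_0), V_2_5_0)); // prints false
        // Old peer: nothing extra is written, and the reader falls back to true.
        System.out.println(write(false, V_2_4_0).length);         // prints 8
        System.out.println(read(write(false, V_2_4_0), V_2_4_0)); // prints true
    }
}
```

Because both sides compare against the same constant rather than a value that moves with the branch, the number of bytes written always matches the number of bytes read for a given peer version.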

Issues Resolved

#5801
#5766

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Suraj Singh <surajrider@gmail.com>
@dreamer-89
Member Author

Gradle Check (Jenkins) Run Completed with:

org.opensearch.action.support.replication.TransportReplicationActionTests > testClosedIndexOnReroute FAILED
    java.lang.IllegalStateException: No local node found. Is the node started?
        at __randomizedtesting.SeedInfo.seed([8903A3BA9D4750C:646E98B8CA46FC47]:0)
        at org.opensearch.cluster.service.ClusterService.localNode(ClusterService.java:156)
        at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.<init>(TransportReplicationAction.java:890)
        at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.<init>(TransportReplicationAction.java:883)
        at org.opensearch.action.support.replication.TransportReplicationActionTests.testClosedIndexOnReroute(TransportReplicationActionTests.java:640)

@dreamer-89 dreamer-89 marked this pull request as ready for review January 10, 2023 23:20
@mch2
Member

mch2 commented Jan 10, 2023

@dreamer-89 Can we PR this first to main?

@dreamer-89
Member Author

dreamer-89 commented Jan 10, 2023

Sure @mch2, #5809

@kotwanikunal
Member

The testClosedIndexOnReroute failure is waiting on #5806 to be merged.

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@codecov-commenter

Codecov Report

Merging #5808 (199543d) into 2.x (49fffcb) will decrease coverage by 0.16%.
The diff coverage is 73.55%.

@@             Coverage Diff              @@
##                2.x    #5808      +/-   ##
============================================
- Coverage     70.61%   70.45%   -0.17%     
- Complexity    58695    58824     +129     
============================================
  Files          4741     4764      +23     
  Lines        280699   282163    +1464     
  Branches      40908    41106     +198     
============================================
+ Hits         198222   198797     +575     
- Misses        66027    66782     +755     
- Partials      16450    16584     +134     
Impacted Files Coverage Δ
.../opensearch/gradle/plugin/PluginBuildPlugin.groovy 36.28% <0.00%> (ø)
...sion/awareness/put/DecommissionRequestBuilder.java 0.00% <0.00%> (ø)
...in/cluster/health/ClusterHealthRequestBuilder.java 29.41% <0.00%> (-3.93%) ⬇️
...n/cluster/health/TransportClusterHealthAction.java 45.94% <0.00%> (-1.81%) ⬇️
...te/ClusterDeleteWeightedRoutingRequestBuilder.java 0.00% <0.00%> (ø)
...d/put/ClusterPutWeightedRoutingRequestBuilder.java 66.66% <0.00%> (-33.34%) ⬇️
.../opensearch/action/search/SearchShardIterator.java 97.50% <ø> (ø)
...on/support/broadcast/TransportBroadcastAction.java 13.04% <0.00%> (ø)
...rg/opensearch/common/blobstore/fs/FsBlobStore.java 80.76% <ø> (ø)
...rg/opensearch/common/settings/ClusterSettings.java 91.89% <ø> (ø)
... and 644 more


@kotwanikunal kotwanikunal merged commit ef3a58f into opensearch-project:2.x Jan 11, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jan 11, 2023
Signed-off-by: Suraj Singh <surajrider@gmail.com>
(cherry picked from commit ef3a58f)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@dreamer-89 dreamer-89 changed the title from "[Bug] Send replicated boolean on supported versions" to "[Bug] Update logic on CheckpointState around serialization/deserialization" Jan 11, 2023
kotwanikunal pushed a commit that referenced this pull request Jan 11, 2023
Signed-off-by: Suraj Singh <surajrider@gmail.com>
(cherry picked from commit ef3a58f)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Signed-off-by: Suraj Singh <surajrider@gmail.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@ashking94
Member

Thanks @dreamer-89 for fixing this. During development, I had initially set this to the 2.5 version, but the same tests were failing at the time because CURRENT was 3.0. Should we first merge with CURRENT so that the PR build succeeds, and then change it to the specific version?

@dreamer-89
Member Author

Thank you @ashking94 for checking in. I think using a specific version is better. Version.CURRENT is misleading, as I mentioned in the description, and I don't understand how using CURRENT would help in this situation.
Do you have a link to the code where you were seeing errors when using the specific version? I can take a look.

kotwanikunal pushed a commit that referenced this pull request Jan 25, 2023
Signed-off-by: Suraj Singh <surajrider@gmail.com>
5 participants