Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure changes requests return the latest mapping version #37633

Merged
merged 9 commits into from
Jan 23, 2019

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Jan 19, 2019

Today we keep the mapping on the follower in sync with the leader's using the mapping version from changes requests. There are two rare cases where the mapping on the follower is not synced properly:

  1. The returned mapping version (from ClusterService) is outdated than the actual mapping. This happens because we expose the latest cluster state in ClusterService after applying it to IndexService.

  2. It's possible for the FollowTask to receive an outdated mapping than the min_required_mapping. In that case, it should fetch the mapping again; otherwise, the follower won't have the right mapping.

Relates #31140 (comment)

@dnhatn dnhatn added >bug v7.0.0 :Distributed Indexing/CCR Issues around the Cross Cluster State Replication features v6.7.0 labels Jan 19, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 lgtm, good stuff.

Index leaderIndex = params.getLeaderShardId().getIndex();
Index followIndex = params.getFollowShardId().getIndex();

ClusterStateRequest clusterStateRequest = CcrRequests.metaDataRequest(leaderIndex.getName());
CheckedConsumer<ClusterStateResponse, Exception> onResponse = clusterStateResponse -> {
IndexMetaData indexMetaData = clusterStateResponse.getState().metaData().getIndexSafe(leaderIndex);
// the returned mapping is outdated - retry again
if (indexMetaData.getMappingVersion() < minRequiredMappingVersion) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we had the metadata version (which is also updated whenever index metadata / mapping changes), we could just do a waitForMetaDataVersion? This would avoid a possible busyloop.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove this retry in 7.0 after backporting to 6.x

@martijnvg
Copy link
Member

Maybe also backport this to the 6.6 branch? As this fixes replication issues that also exist in that branch.

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR @dnhatn. I've suggested a small change (which might have BWC implications though) that will avoid a busy loop.

Index leaderIndex = params.getLeaderShardId().getIndex();
Index followIndex = params.getFollowShardId().getIndex();

ClusterStateRequest clusterStateRequest = CcrRequests.metaDataRequest(leaderIndex.getName());
CheckedConsumer<ClusterStateResponse, Exception> onResponse = clusterStateResponse -> {
IndexMetaData indexMetaData = clusterStateResponse.getState().metaData().getIndexSafe(leaderIndex);
// the returned mapping is outdated - retry again
if (indexMetaData.getMappingVersion() < minRequiredMappingVersion) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we had the metadata version (which is also updated whenever index metadata / mapping changes), we could just do a waitForMetaDataVersion? This would avoid a possible busyloop.

@dnhatn
Copy link
Member Author

dnhatn commented Jan 21, 2019

@martijnvg @ywelsch Thanks for looking. I have updated to use waitForMetaDataVersion. Would you please have another look?

@dnhatn dnhatn requested a review from ywelsch January 21, 2019 17:53
@dnhatn dnhatn added the v6.6.1 label Jan 21, 2019
@dnhatn
Copy link
Member Author

dnhatn commented Jan 21, 2019

@ywelsch I've addressed your comment. Can you please give this a go? Thank you!

@dnhatn dnhatn requested a review from ywelsch January 21, 2019 22:02
Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

remoteClient(params).admin().cluster().state(clusterStateRequest, ActionListener.wrap(
r -> {
// if wait_for_metadata_version timeout, the response is empty
if (r.getState() == null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what odd behavior

dnhatn added a commit that referenced this pull request Jan 23, 2019
If the indices of a ClusterStateRequest are specified, we fail to
include the cluster state metadata version in the response.

Relates  #37633
dnhatn added a commit that referenced this pull request Jan 23, 2019
If the indices of a ClusterStateRequest are specified, we fail to
include the cluster state metadata version in the response.

Relates  #37633
@dnhatn dnhatn merged commit 0096f1b into elastic:master Jan 23, 2019
@dnhatn dnhatn deleted the mapping-version branch January 23, 2019 18:41
dnhatn added a commit that referenced this pull request Jan 23, 2019
Today we keep the mapping on the follower in sync with the leader's
using the mapping version from changes requests. There are two rare
cases where the mapping on the follower is not synced properly:

1. The returned mapping version (from ClusterService) is outdated than
the actual mapping. This happens because we expose the latest cluster
state in ClusterService after applying it to IndexService.

2. It's possible for the FollowTask to receive an outdated mapping than
the min_required_mapping. In that case, it should fetch the mapping
again; otherwise, the follower won't have the right mapping.

Relates to #31140
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jan 24, 2019
…ead-de-duplication

* elastic/master:
  Use explicit version for build-tools in example plugin integ tests (elastic#37792)
  Change `rational` to `saturation` in script_score (elastic#37766)
  Deprecate types in get field mapping API (elastic#37667)
  Add ability to listen to group of affix settings (elastic#37679)
  Ensure changes requests return the latest mapping version (elastic#37633)
  Make Minio Setup more Reliable (elastic#37747)
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jan 24, 2019
* elastic/master: (85 commits)
  Use explicit version for build-tools in example plugin integ tests (elastic#37792)
  Change `rational` to `saturation` in script_score (elastic#37766)
  Deprecate types in get field mapping API (elastic#37667)
  Add ability to listen to group of affix settings (elastic#37679)
  Ensure changes requests return the latest mapping version (elastic#37633)
  Make Minio Setup more Reliable (elastic#37747)
  Liberalize StreamOutput#writeStringList (elastic#37768)
  Add PersistentTasksClusterService::unassignPersistentTask method (elastic#37576)
  Tests: disable testRandomGeoCollectionQuery on tiny polygons (elastic#37579)
  Use ILM for Watcher history deletion (elastic#37443)
  Make sure PutMappingRequest accepts content types other than JSON. (elastic#37720)
  Retry ILM steps that fail due to SnapshotInProgressException (elastic#37624)
  Use disassociate in preference to deassociate (elastic#37704)
  Delete Redundant RoutingServiceTests (elastic#37750)
  Always return metadata version if metadata is requested (elastic#37674)
  [TEST] Mute MlMappingsUpgradeIT testMappingsUpgrade
  Streamline skip_unavailable handling (elastic#37672)
  Only bootstrap and elect node in current voting configuration (elastic#37712)
  Ensure either success or failure path for SearchOperationListener is called (elastic#37467)
  Target only specific index in update settings test
  ...
dnhatn added a commit that referenced this pull request Jan 26, 2019
If the indices of a ClusterStateRequest are specified, we fail to
include the cluster state metadata version in the response.

Relates  #37633
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed Indexing/CCR Issues around the Cross Cluster State Replication features v6.7.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants