Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support point in time cross cluster search #61827

Merged
merged 13 commits into from
Sep 10, 2020
Merged

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Sep 2, 2020

This commit integrates point in time into cross cluster search.

0a99df6 -> Closes #61790
Relates #61062

@dnhatn dnhatn added :Search/Search Search-related issues that do not fall into other categories v8.0.0 v7.10.0 >non-issue labels Sep 2, 2020
@dnhatn dnhatn marked this pull request as ready for review September 2, 2020 12:25
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (:Search/Search)

@elasticmachine elasticmachine added the Team:Search Meta label for search team label Sep 2, 2020
@dnhatn dnhatn requested a review from jimczi September 2, 2020 12:25
Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -72,6 +72,7 @@ protected void doExecute(Task task, OpenPointInTimeRequest request, ActionListen
.preference(request.preference())
.routing(request.routing())
.allowPartialSearchResults(false);
searchRequest.setCcsMinimizeRoundtrips(false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it needed ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to disable ccs_minimize_roundtrips because we don't know how to merge a point in time from multiple sub-search requests.

@ywangd
Copy link
Member

ywangd commented Sep 3, 2020

First of all, I have little knowlege about PIT's design and implementation. So my comments may be completely off. But in the spirit of learning, I am just going to ask it away :)

To perform CCS with PIT, it requires the user to have read privileges on a local index even when the target indices are all remote. i.e. if user has no index privileges on the local cluster, the call POST remote:point_in_time_index/_pit?keep_alive=5m results in the following error:

action [indices:data/read/open_point_in_time] is unauthorized for user [joe] on indices [], this action is granted by the privileges [read,all]

This behaviour is different from CCS with scroll or async_search, both of them do not require user to have any index privileges on the local cluster (as suggested by our docs), i.e. both following API calls work:

GET remote:point_in_time_index/_search?scroll=1h 

POST remote:point_in_time_index/_async_search?keep_on_completion=true&keep_alive=1h

(Note ccs with scroll has an issue where it requires local privileges after the initial open scroll call. But we consider it as a bug #61828).

Unless PIT's design is intentionally to be different on this front, consistency may be a better choice.

Another thing I noticed is that if only remote indices are used for creating PIT, the initial creation will success (provide user has local index privileges). But the subsequent usage results in NEP, i.e:

POST remote:point_in_time_index/_pit?keep_alive=5m
POST _search
{
  'query': {
    'range': {
      'created_at': {
        'gte': '2020-01-03'
      }
    }
  },
  'pit': {
    'id': "PIT_FROM_ABOVE_REQUEST",
    'keep_alive': '5m',
  }
}

// The above call results in error:

"Cannot invoke "org.elasticsearch.action.OriginalIndices.indices()" because "localIndices" is null"

Ping @albertzaharovits @tvernum for awareness

@albertzaharovits
Copy link
Contributor

Thanks for testing and digging into things @ywangd ! These two are bugs for sure.

On the first point, about open-point-in-time requests that contain only remote indices, I think the fix should be simple add OpenPointInTimeRequest to the list in IndicesAndAliasesResolver#allowsRemoteIndices .

The other one should also be easy to fix, I would start from

OriginalIndices localIndices = remoteClusterIndices.remove(RemoteClusterAware.LOCAL_CLUSTER_GROUP_KEY);
. Regular searches can have an empty list for "local indices" .

Can you take it home and raise the fix PRs as well ?

@ywangd
Copy link
Member

ywangd commented Sep 3, 2020

Can you take it home and raise the fix PRs as well ?

I am happy to raise a following PR once this one is merged if that's what you are asking.

@dnhatn
Copy link
Member Author

dnhatn commented Sep 9, 2020

@ywangd Thank you for reviewing. I've pushed the fix for NPE in f83433b.

@dnhatn dnhatn merged commit 161ec69 into elastic:master Sep 10, 2020
@dnhatn dnhatn deleted the ccs-with-pit branch September 10, 2020 00:55
@ywangd
Copy link
Member

ywangd commented Sep 10, 2020

@ywangd Thank you for reviewing. I've pushed the fix for NPE in f83433b.

@dnhatn Thanks for fixing the NPE. I'll raise follow-up one to fix the permission issue when target indices are all remote.

dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Sep 10, 2020
This commit integrates point in time into cross cluster search.

Relates elastic#61062
Closes elastic#61790
dnhatn added a commit that referenced this pull request Sep 10, 2020
This commit integrates point in time into cross cluster search.

Relates #61062
Closes #61790
ywangd added a commit that referenced this pull request Sep 21, 2020
When target indices are remote only, CCS does not require user to have privileges on the local cluster. This PR ensure Point-In-Time reader follows the same pattern.

Relates: #61827
ywangd added a commit to ywangd/elasticsearch that referenced this pull request Sep 21, 2020
…62261)

When target indices are remote only, CCS does not require user to have privileges on the local cluster. This PR ensure Point-In-Time reader follows the same pattern.

Relates: elastic#61827
ywangd added a commit that referenced this pull request Sep 22, 2020
…62696)

When target indices are remote only, CCS does not require user to have privileges on the local cluster. This PR ensure Point-In-Time reader follows the same pattern.

Relates: #61827
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>non-issue :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v7.10.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CI failure: AsyncSearchActionIT#testMaxMinAggregation
6 participants