Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG - Segment Replication + Remote Store] - Support segment replication for system indices #8182

Closed
mch2 opened this issue Jun 20, 2023 · 13 comments
Assignees
Labels
bug Something isn't working Severity-Blocker v2.10.0

Comments

@mch2
Copy link
Member

mch2 commented Jun 20, 2023

Describe the bug
Segment Replication is currently not supported on system indices. This decision was made for segment replication after failing tests with CCR plugin (issues linked below).

Related issues:
#6602 (comment)
#3823
#8158

To Reproduce
Steps to reproduce the behavior:

  1. Update opensearch.yml to set cluster replication strategy as SEGMENT. cluster.indices.replication.strategy: SEGMENT
  2. create a cluster with security or ccr enabled.
  3. Fetch default setting values, replication strategy will be DOCUMENT.

Expected behavior
Segment Replication should be a supported strategy for system indices.

Additional context
This limitation also prevents system indices from being backed up with remote storage.

@sachinpkale
Copy link
Member

Supporting remote store for system indices involves creating the repository before we create the index. This means we also need to get repository details as part of cluster settings (today, we accept only repository name). I will create a separate issue for the same.

@sachinpkale
Copy link
Member

cc: @psychbot

@sachinpkale
Copy link
Member

Issue specific to remote store: #8188

@stephen-crawford
Copy link
Contributor

HI @mch2 and @sachinpkale, looking at this, it seems like we will want to engage from the security side of things. Currently, we restrict modification of the security system index. The concern is escalation of privileges and some other issues that can arise such as desync between copies.

Can you provide some more details on how you plan to interact with the system indices? Is there an issue outlining the flow you are working on?

Tagging @peternied for some additional context on concerns associated with this.

@peternied
Copy link
Member

@scrawfor99 I don't think there is an issue with the security plugin itself - its how plugins can interact with system indices that is breaking down with seg-rep enabled.

@mch2 @psychbot @sachinpkale Please let the security team know if there seems to be a 'governance change' related to these indices, which doesn't seem to be the case and I'd guess would break backward compatibility

@mch2
Copy link
Member Author

mch2 commented Jun 22, 2023

@scrawfor99 @peternied Thanks for taking a look here!

In short there will be no modification to the content of the system index, only the replication strategy that is used behind the scenes to copy it out to replicas. Segment Replication is now configurable at a cluster level so we would like any index including system indices to use segrep if enabled.

However there are perf implications here that need to be called out. SegRep will on average increase the amount of time that replica shards are inconsistent from their primaries due to the time required to copy out segments (time dependent on cluster config/hardware etc). System indices are generally small and should copy out quickly, so I wouldn't expect this to be of major concern. Are there are cases where stronger read/write consistency is desired on the system index? If this is the case plugins would need to prioritize primaries for searches.

There is no strong r/w guarantee today with docrep, unless you use _bulk API with wait_until preference, which won't ack the request until each shard has refreshed on those documents. This option is not supported with segrep. Looking through security I don't think this is used.

@peternied
Copy link
Member

System indices are generally small and should copy out quickly, so I wouldn't expect this to be of major concern. Are there are cases where stronger read/write consistency is desired on the system index?

@scrawfor99 What do you think about creating an issue to contemplate how we configure segment replication for the security index? Building up the architecture documentation of security index <-> in memory security configuration might be a good companion for that issue.

@stephen-crawford
Copy link
Contributor

stephen-crawford commented Jun 23, 2023

Hi @peternied, will do! Thanks for taking a look at this. You have a better understanding of the overall picture involved with these types of changes so I dragged you in. I will make an issue over in the Security repo and link it back here.

Update: Here is a link to the issue I created: opensearch-project/security#2898.

@cwperks
Copy link
Member

cwperks commented Jun 23, 2023

@mch2 When there is a change to the security configuration, the security plugin will run an transport action on every node of the cluster to instruct it to read the entire security index and refresh its local cache. See: https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/action/configupdate/TransportConfigUpdateAction.java#L126-L131

Tracing through the code this will ultimately do a MultiGetRequest to get the different CTypes (or configuration types meaning internalusers, roles, etc.). See https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/configuration/ConfigurationLoaderSecurity7.java#L210-L264

Would there be any risk of the MultiGetRequest getting stale data? i.e. admin calls on security API to create a user which under the hood instructs all nodes to refresh their cache from the security index after the handler for createUser writes a user to the security index?

Edit: I see your comment above about preferences now:

Are there are cases where stronger read/write consistency is desired on the system index? If this is the case plugins would need to https://github.com/opensearch-project/OpenSearch/pull/7375for searches.

Looks like the security codebase can take advantage of that.

@mch2
Copy link
Member Author

mch2 commented Jun 26, 2023

Thanks @cwperks, yes using prefer primary or primary can be used. If security is using RefreshPolicy.IMMEDIATE, this will only start the refresh & replication process after a write, so its possible there could be a stale read.

This issue is currently blocked by #8193.

Steps here.

  1. [Remote Store] Support repository registration during node bootstrap #8193 - Fix bug with remote store/segrep + system indices.
  2. merge fix for this issue to remove override of sys indices to docrep in core.
  3. Alert plugin teams of the impact of this change, namely that RefreshPolicy.IMMEDIATE will only refresh primary shards & start replication (eventual consistency) and that RefreshPolicy.WAIT_UNTIL is not supported with segrep. Instead, plugins will need to prioritize primary shards if r/w consistency guarantee is desired. - [Meta] Validate plugins compatibility with segment replication #8211.

@dbwiddis
Copy link
Member

dbwiddis commented Jul 6, 2023

Joining this conversation coming from Job Scheduler repo and I presume I have similar concerns/questions as @peternied and @cwperks discussed with security plugin. JS uses a system index for locking, which I do not think would be able to use the IMMEDIATE option; but JS locking is used in some performance-sensitive tasks so the impact called out by @mch2 here is a potential concern.

Is there any way to keep the current (2.8) situation as a configuration option (allow users to not use SEGREP for system indices) or are we forced to have these perf impacts if the user is using SEGREP at all?

@cwperks
Copy link
Member

cwperks commented Jul 6, 2023

Thank you @mch2 . The security plugin is using RefreshPolicy.IMMEDIATE after a change to the security index.

I have opened a PR on the security repo to add _primary preference for the mget request that security performs on a change to the security index.

@dreamer-89 dreamer-89 added v2.10.0 and removed v2.9.0 'Issues and PRs related to version v2.9.0' labels Jul 19, 2023
@anasalkouz
Copy link
Member

Closing this, since we are tracking the effort here: #8193

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Severity-Blocker v2.10.0
Projects
Status: Done
Development

No branches or pull requests

9 participants