Set ConfigurationLoader mget request preference to _primary for strong consistency #2903

Closed


cwperks
Member

@cwperks cwperks commented Jun 26, 2023

Description

Update ConfigurationLoaderSecurity7 to set the preference of the mget request to _primary, which instructs it to execute the request on the primary shard for maximum consistency.

See issue discussed here: #2898

With segment replication, replicas may lag behind the primary, so this change guarantees that the request to reload the security config is performed on the primary. See opensearch-project/OpenSearch#8182 (comment)
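
For illustration only, here is a minimal sketch of what the same preference looks like at the REST level. The PR itself sets the preference on the Java mget request inside the plugin, so this is not the PR's code. The index name `my-security-index`, the document IDs, and the helper name `build_mget_request` are hypothetical placeholders.

```python
import json
from urllib.parse import urlencode


def build_mget_request(index, doc_ids, preference="_primary"):
    """Return (method, path_with_query, json_body) for a multi-get call.

    The `preference` query parameter asks the cluster to serve the
    request from a particular shard copy; `_primary` targets the primary.
    """
    query = urlencode({"preference": preference})
    body = json.dumps({"ids": list(doc_ids)})
    return "GET", f"/{index}/_mget?{query}", body


# Hypothetical index name and document IDs, used only for this sketch.
method, path, body = build_mget_request(
    "my-security-index", ["config", "roles", "internalusers"]
)
print(method, path)
print(body)
```

The preference travels with each individual request, so it applies only to the call that sets it.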

  • Category (Enhancement, New feature, Bug fix, Test fix, Refactoring, Maintenance, Documentation)

Maintenance

Issues Resolved

#2898

Check List

  • New functionality includes testing
  • New functionality has been documented
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…g consistency

Signed-off-by: Craig Perkins <cwperx@amazon.com>
@codecov

codecov bot commented Jun 26, 2023

Codecov Report

Merging #2903 (f312b19) into main (7042464) will decrease coverage by 0.02%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main    #2903      +/-   ##
============================================
- Coverage     62.17%   62.15%   -0.02%     
+ Complexity     3401     3400       -1     
============================================
  Files           266      266              
  Lines         19653    19654       +1     
  Branches       3330     3330              
============================================
- Hits          12219    12216       -3     
- Misses         5824     5828       +4     
  Partials       1610     1610              
Impacted Files Coverage Δ
...ty/configuration/ConfigurationLoaderSecurity7.java 69.53% <100.00%> (+0.23%) ⬆️

... and 2 files with indirect coverage changes

@cwperks cwperks added the backport 2.x backport to 2.x branch label Jun 26, 2023
@peternied peternied requested a review from mch2 June 26, 2023 19:31
@peternied
Member

@mch2 could you review this PR, or recommend someone from the storage team to review it?

@stephen-crawford
Collaborator

Do we know if these settings are configurable like other settings? I.e. can you overwrite these?

Craig's change seems good (nice find @cwperks), but I don't like the idea of it being configurable. Maybe we want to say that if someone changes the configuration to overwrite this, that is on them, but as is, it seems like it would be pretty easy to swap this back to normal seg rep.

The issue there is that then we would have poor consistency with the replication and primary and no guarantee on which was being accessed.

@cwperks
Member Author

cwperks commented Jun 27, 2023

@scrawfor99 This PR sets the mget request's preference so that the mget is performed on the primary shard, since the security plugin requires strong consistency when reading the security index to re-populate the caches on all of the nodes after a change in the security index. Regardless of segrep or docrep, it's important that there is strong consistency for a security config update so that the freshest data is cached.

@willyborankin
Collaborator

LGTM, but AFAIK Seg Repl does not support refresh so far.

@cwperks
Member Author

cwperks commented Jun 30, 2023

@mch2 I drafted this PR after reading your comment on here: opensearch-project/OpenSearch#8182 (comment)

> Are there cases where stronger read/write consistency is desired on the system index?

Yes, and the security plugin is a good example: after a change has been made to the security index, it performs an mget request on all nodes of the cluster so that each node can refresh its local cache. For example, when a user gets created, every node of the cluster refreshes its cache from the index to get the newly created user.

I was reading the discussion here: opensearch-project/OpenSearch#6046. The security index is typically not very big; I would estimate its size at a few MB at most. Any idea what the latency is for an index of that size when using segrep? Do you have any concerns about adding the preference to the mget request here, and would it be possible to reproduce a situation where an inconsistency results in 2 different nodes having different values in their security caches?
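
The inconsistency asked about above can be pictured with a toy model. This is a deliberately simplified sketch, not OpenSearch's actual replication or routing logic: it assumes one primary and one replica, and that the replica only catches up when `replicate()` is called. All class and variable names are hypothetical.

```python
class ToyIndex:
    """One primary and one replica copy; the replica lags until replicate()."""

    def __init__(self):
        self.primary = {}
        self.replica = {}

    def write(self, doc_id, value):
        # Writes land on the primary first; the replica is not updated yet.
        self.primary[doc_id] = value

    def replicate(self):
        # Simulates the replica catching up with the primary.
        self.replica = dict(self.primary)

    def get(self, doc_id, routed_to, preference=None):
        # With preference="_primary" the read always goes to the primary,
        # regardless of which copy it would otherwise have been routed to.
        if preference == "_primary" or routed_to == "primary":
            return self.primary.get(doc_id)
        return self.replica.get(doc_id)


index = ToyIndex()
index.write("internalusers", ["admin"])
index.replicate()
index.write("internalusers", ["admin", "new_user"])  # replica now lags

# Without a preference, two nodes may be routed to different copies.
node_a_cache = index.get("internalusers", routed_to="primary")
node_b_cache = index.get("internalusers", routed_to="replica")

# With _primary, both nodes read the same, freshest copy.
node_a_strong = index.get("internalusers", routed_to="primary", preference="_primary")
node_b_strong = index.get("internalusers", routed_to="replica", preference="_primary")

print(node_a_cache, node_b_cache)
print(node_a_strong, node_b_strong)
```

In this model the two caches diverge only while the replica lags, and pinning the read to the primary removes that window.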

@stephen-crawford
Collaborator

> @scrawfor99 This PR sets the mget's request preference to perform the mget on the primary shard since the security plugin requires strong consistency when reading in the security index to re-populate the caches on all of the nodes after a change in the security index. Regardless of segrep or docrep it's important that there is strong consistency for a security config update so that the freshest data is cached.

Hi @cwperks, perhaps I was not clear in my initial comment. My concern is that I don't think we should have this configurable. This seems to be registering a setting which can be overwritten. If that is the case, I think we should look for an alternative that does not allow for this change or should be very explicit in the potential ramifications.

@cwperks
Member Author

cwperks commented Jul 6, 2023

@willyborankin, @mch2 explains refresh with segrep here: opensearch-project/OpenSearch#8182 (comment)

@scrawfor99 This is not a configurable setting. This sets the preference on the mget request to instruct it to execute the mget on the primary shard, which has the freshest data. This would ensure that the mget request that refreshes the security caches on all the nodes does not fetch stale data.

@stephen-crawford
Collaborator

stephen-crawford commented Jul 7, 2023

Hi @cwperks, are you certain that we are not inadvertently exposing a configuration? I see here: https://opensearch.org/docs/latest/api-reference/document-apis/multi-get/ that we have the ability to configure the realtime and refresh settings, which is where my concern arose.

I know the API configurations are not directly handled by the Security plugin, but I am not sure that this is a surefire solution if we do have the URL configuration accessible elsewhere. To me, it looks like we are setting a default for the handling as opposed to a locked setting.

If you know it works differently, that is fine, but I wanted to make sure.

@mch2
Member

mch2 commented Jul 11, 2023

Hi @scrawfor99 and @cwperks, apologies for the delay here.

I would not go forward with this change because it will impact performance for docrep indices. For 2.10, we are working to support strong reads for gets/mgets with opensearch-project/OpenSearch#8536, and this should be transparent to you. For 2.9, we are going to revert the override in core so that registered system indices continue to use docrep.

Where we would still have a concern for plugins is with writes that use the WAIT_UNTIL refresh policy followed by a search (not a get/mget) that expects a strong read. Those searches would need to provide this preference.
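
As a rough sketch of the write-then-search pattern described above, expressed as REST calls: `refresh=wait_for` is the REST form of the WAIT_UNTIL refresh policy, and the follow-up search carries `preference=_primary`. The index name, document ID, and helper name are hypothetical, and this only builds the request lines rather than sending them.

```python
from urllib.parse import urlencode


def build_write_then_search(index, doc_id):
    """Return the (method, path) pairs for a WAIT_UNTIL write and a
    follow-up search that is pinned to the primary shard."""
    write = ("PUT", f"/{index}/_doc/{doc_id}?" + urlencode({"refresh": "wait_for"}))
    search = ("GET", f"/{index}/_search?" + urlencode({"preference": "_primary"}))
    return write, search


# Hypothetical plugin index and document ID, used only for this sketch.
write_req, search_req = build_write_then_search("my-plugin-index", "doc-1")
print(*write_req)
print(*search_req)
```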

@cwperks
Member Author

cwperks commented Jul 11, 2023

Thank you for the context @mch2! I am closing this PR.

Labels
backport 2.x backport to 2.x branch
5 participants