CCR Rest API design #30102
Original comment by @Mpdreamz: cc @elastic/es-clients |
Original comment by @jasontedor: Relates LINK REDACTED |
Original comment by @clintongormley: I don't think we need two separate end points. Eventually, when we no longer need snapshot/restore for bootstrapping, then the logic would look something like this:
While we still need S&R, we'd do the following:
So this can all be handled by the single
The method should be POST, not PUT, because we're not storing the body with the URL.
I talked to @jasontedor about possibly doing something like this, without the need for a body:
or
or
but he wasn't keen on having to remember the order of parameters. Instead, we thought about making it consistent with the reindex API, e.g.:
This seems pretty easy to remember and understand. |
Original comment by @bleskes: Thanks @clintongormley
I personally don't like the automatic fallback in this case. I think that in this kind of admin-level API it can only hide mistakes: for example, I expect the index to be there, but it wasn't, and now we automatically create it. This is a common issue in our API. Sometimes it's very useful, like when we automatically create an index in the time series data use case, but in general I think it should be avoided.
See comment in the description - we discussed this and the group decided to not go with url only. The main argument was to make it easier to add parameters and options.
I personally find this less "tight" - although it's clear I'm biased. The words "source"/"dest" add confusion IMO in the context where we use "follower" and "leader". Can you elaborate on the reason for the suggestion? |
Original comment by @clintongormley:
What is the purpose of creating an empty follower index? Either:
This looks like a single process to me. When relying on snapshot restore for bootstrapping, we'll fail if we can't read data from the translog, and when we no longer rely on snapshot restore, then we'll do a segment copy. I don't understand why we need the two APIs. I also value consistency. Why should this API be different from all the others?
Consistency again. It's one less thing for users to remember. I think |
Original comment by @zuketo: I started walking through using this API design for various use cases. Overall, looks good, a few questions came up (I'll group them by use case): Data locality/replicating the same index to 5 different datacenters (to be close to the application server/user):
Logging/security events:
|
Original comment by @bleskes: @zuketo answers inline:
Yes. Following is a property of the target indices. The source index doesn't care how many followers read from it.
As far as we know now, there is no reason for this not to work. It is obviously a more complex setup, so we might need to drop the feature in the first iteration. Out of curiosity, what was your use case?
At this point we don't know yet.
That's a fair point. We may need to extend the api I suggested at the end. I lean towards the same rename options we have in the restore API. |
Original comment by @bleskes: @jasontedor do you mind updating the ticket about our discussion in Berlin? |
Original comment by @zuketo: Thanks! For chained replication use cases, I haven't come across any users specifically asking for this (I definitely wouldn't list it as a priority). I was mainly interested in solving the data locality use case if the source index couldn't have multiple followers (which isn't the case). For chained replication, thinking through some future scenarios, users may want to reduce load on the leading index (e.g. if replicating to 20 clusters, or maybe more, this can be fanned out with chaining). The other scenario is two or more clusters per DC/region, and replicating once over the WAN, then again locally, to reduce network traffic costs. |
Original comment by @jasontedor: @clintongormley Can you update this issue with the outcome of our discussion in Berlin? |
Original comment by @clintongormley: As discussed in Berlin, we're going to go with a single API for setting up index following, and we're going to use |
The TODOs in the rest actions were incorrect. The problem was that these rest actions used `follow_index` as the first named variable in the path under which the rest actions were registered. Other candidate rest actions that also have a named variable as the first element in the path (but with a different name) get resolved as rest parameters too and passed down to the rest action that actually ends up getting executed. In the case of the follow index API, an `index` parameter got passed down to `RestFollowExistingAction`, but that param was never used. This caused the follow index API call to fail because of unused HTTP parameters. This change doesn't fix that problem, but works around it by using `index` as the named variable for the follow index (instead of `follow_index`). Relates to elastic#30102
Tests shard follow task in the context of a leader and follower ReplicationGroup, in order to test how the shard follow logic reacts to certain shard related failure scenarios. More tests will need to be added, but this indicates what changes need to be made to have these tests. Relates to elastic#30102
We have implemented all these endpoints. Closing |
Original comment by @bleskes:
Top Level Overview
This is a meta issue to capture all the initial thoughts and design of the REST API for the LINK REDACTED. We currently see the need for the following APIs, each described in more detail below:
The API design assumes we will use the remote cluster configuration of Cross Cluster Search (which will require minor tweaks not described here).
Create a following index
This API is used to create a new index on the local cluster that immediately starts following an index on a remote cluster. The newly created index will have the same metadata as the remote index. The default name will be identical but can optionally be changed.
Notes:
Make an existing index become a follower
This API takes an existing index and adds the needed metadata to make it a follower. The API validates that the index is closed but doesn't close it nor open it. This needs to be done by explicit calls to the dedicated API.
This API also needs to validate that the remote following index is compatible with the local one. This includes the mapping and metadata, but also some kind of sanity check using history uuids. Caveats here include people restoring this index from a snapshot, which will destroy its history uuid. Sadly, this is a likely scenario, as we plan to use snapshot and restore as a way to bootstrap indices initially.
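The checks described above could be sketched roughly as follows. This is only an illustration: `IndexState`, `validate_follow_existing`, and the error strings are hypothetical names, and treating the history uuid check as a plain equality test is an assumption, since the issue only asks for "some kind of sanity check".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IndexState:
    """Hypothetical, simplified view of an index used only for this sketch."""
    name: str
    closed: bool
    mapping: dict = field(default_factory=dict)
    history_uuid: str = ""


def validate_follow_existing(local: IndexState, remote: IndexState) -> List[str]:
    """Return validation errors; an empty list means the local index may become a follower."""
    errors = []
    # The API only checks the state; it never closes or opens the index itself.
    if not local.closed:
        errors.append(f"index [{local.name}] must be closed")
    if local.mapping != remote.mapping:
        errors.append("mapping differs between local and remote index")
    # Assumption: the sanity check is an equality test. An index restored from
    # a snapshot gets a new history uuid, so it would be rejected here.
    if local.history_uuid != remote.history_uuid:
        errors.append("history uuid mismatch")
    return errors
```

Note that under this assumed equality check, an index bootstrapped via snapshot and restore would fail validation, which is exactly the caveat raised above.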
Disconnect a following index
Takes a following index and makes it a "normal" one. It should verify that the index is closed before doing so.
Monitor/Stats
The goal of the API is to give easy access to statistics that are relevant for CCR. This information is exposed by index stats and job status, but you'll need quite expert knowledge to figure out how to tie things together. To that end, we offer an API that does the heavy lifting.
returns a per index, per shard map of lag information
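To make the shape of such a response concrete, here is a minimal sketch of a per index, per shard lag map. The input shape, the `lag_by_shard` name, and the definition of lag as the difference between leader and follower checkpoints are all assumptions for illustration; the issue does not specify how lag is measured. The sketch also assumes the follower index has the same name as its leader.

```python
from typing import Dict

# Hypothetical input shape: index name -> shard id -> checkpoint
# (the highest sequence number processed on that shard).
Checkpoints = Dict[str, Dict[int, int]]


def lag_by_shard(leader: Checkpoints, follower: Checkpoints) -> Dict[str, Dict[int, int]]:
    """Build a per index, per shard map of how many operations each follower shard is behind."""
    lag: Dict[str, Dict[int, int]] = {}
    for index, shards in follower.items():
        lag[index] = {}
        for shard_id, follower_cp in shards.items():
            leader_cp = leader[index][shard_id]
            # Clamp at zero so a stale leader reading never yields negative lag.
            lag[index][shard_id] = max(0, leader_cp - follower_cp)
    return lag
```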
Register an auto follow pattern (phase 2)
Register an auto follow pattern to automatically create following indices for newly created indices on the remote cluster (phase 2)
This is a rough sketch. We don't plan to implement this in the first phase of the project. That said, it's good to get a discussion going on what this may look like and how it may or may not affect the components being built. This API will be important for the time-based data use case.
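As a way to picture the matching step, the following sketch selects which newly created remote indices a registered pattern would pick up. The function name `indices_to_follow`, the glob-style pattern syntax, and the `already_following` set are assumptions; the issue does not define the pattern language or how existing followers are tracked.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Set


def indices_to_follow(
    patterns: Iterable[str],
    remote_indices: Iterable[str],
    already_following: Set[str],
) -> List[str]:
    """Return remote index names that match any pattern and are not followed yet."""
    patterns = list(patterns)  # allow any iterable, including generators
    return [
        name
        for name in remote_indices
        if name not in already_following
        and any(fnmatchcase(name, pattern) for pattern in patterns)
    ]
```

For the time-based use case, a pattern such as `logs-*` would match each new daily index as it appears on the remote cluster.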