[BUG] Replication Failover Gap #1349

zalseryani · 2024-03-12T12:29:11Z

Replication Failover on Production Outage has Data Gap

When replication is configured between the Prod and DR OpenSearch sites and Production has an outage, some data will not yet have been synced to the DR OpenSearch cluster.
What is the proper solution for this case?

After failover, some documents or messages will be missing on the DR site. Would it be a solution to have the ETL restart from the point where it should resume?

Is there a better way to handle or configure synchronous replication between the OpenSearch Prod and DR sites, something like two-phase commit, where data is not written/committed on Prod unless it has also been written on the DR site?

I ask because I do not see anything under Replication settings to tune the speed of replication or the pull interval for the data itself (as opposed to metadata/settings, or new matching indices when an auto-follow rule is configured between the Prod and DR sites).

Kindly advise, and thanks in advance for your time and support.

ankitkala · 2024-04-11T05:26:13Z

We do not support synchronous replication.

During DR, the follower stats can give you the last tracked leaderCheckpoint and followerCheckpoint, but (1) they track changes at the shard level, whereas the user is concerned with the REST API level, and (2) a checkpoint does not tell you how much data has been replicated in terms of time; it is a monotonically increasing integer value.
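As an illustration of the checkpoint comparison described above, here is a minimal sketch that turns a leader/follower checkpoint pair into an operation-count lag. It assumes the payload looks like the response of the replication status API (`GET _plugins/_replication/<follower-index>/_status`), with a `syncing_details` object holding `leader_checkpoint` and `follower_checkpoint`. The function name `checkpoint_lag` and the sample values are hypothetical. As noted above, the result is a count of operations, not a duration.

```python
from typing import Any, Dict


def checkpoint_lag(status: Dict[str, Any]) -> int:
    """Return how many operations the follower trails the leader by.

    `status` is assumed to be the parsed JSON of a replication status
    response containing a `syncing_details` object (an assumption, not
    something confirmed in this thread).
    """
    details = status.get("syncing_details")
    if details is None:
        # Replication may be paused, failed, or not yet bootstrapped.
        raise ValueError("status has no syncing_details; replication may not be syncing")
    lag = int(details["leader_checkpoint"]) - int(details["follower_checkpoint"])
    # Checkpoints are sampled at slightly different moments, so clamp at zero.
    return max(lag, 0)


# Hypothetical sample payload, for illustration only.
sample_status = {
    "status": "SYNCING",
    "leader_index": "leader-01",
    "follower_index": "follower-01",
    "syncing_details": {
        "leader_checkpoint": 1250,
        "follower_checkpoint": 1200,
        "seq_no": 1201,
    },
}

print(checkpoint_lag(sample_status))
```

A non-zero lag at the moment of the outage only says that some operations are missing on the follower; it does not say which REST-level requests they correspond to.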

CCR provides a 1-minute SLA for replication, and the lag is usually under 20 seconds. But it's hard to guarantee this, as a lot depends on the workload and overall resource consumption.
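Since replication is asynchronous, the ETL replay proposed in the original question is one way to close the gap after failover. The sketch below is only an illustration under stated assumptions: the ETL source can be re-read by timestamp, each record carries a stable `id`, and the replay window (5 minutes here) is an arbitrary margin chosen to be larger than the lag figures quoted above. All names (`replay_start`, `records_to_replay`, the record fields) are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Iterable, List


def replay_start(outage_time: datetime,
                 safety_window: timedelta = timedelta(minutes=5)) -> datetime:
    """Return the point in time from which the ETL should re-send records."""
    return outage_time - safety_window


def records_to_replay(records: Iterable[Dict[str, Any]],
                      outage_time: datetime,
                      safety_window: timedelta = timedelta(minutes=5)) -> List[Dict[str, Any]]:
    """Select every record produced at or after the replay start."""
    start = replay_start(outage_time, safety_window)
    return [r for r in records if r["timestamp"] >= start]


# Hypothetical source records, for illustration only.
outage = datetime(2024, 3, 12, 12, 0, 0, tzinfo=timezone.utc)
source = [
    {"id": "a", "timestamp": outage - timedelta(minutes=10)},
    {"id": "b", "timestamp": outage - timedelta(minutes=3)},
    {"id": "c", "timestamp": outage - timedelta(seconds=15)},
]

replayed_ids = [r["id"] for r in records_to_replay(source, outage)]
print(replayed_ids)
```

Replaying a window wider than the actual lag re-sends some documents that were already replicated, so this approach relies on indexing with the record's stable `id` as the document ID, which makes a re-sent document overwrite the existing copy instead of creating a duplicate.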

zalseryani added the bug and untriaged labels Mar 12, 2024

ankitkala removed the untriaged label Apr 11, 2024
