RFC: incremental backup and point-in-time recovery #11227
Comments
Nicely written proposal. A few questions/comments:
Correct, and we need the logic to prevent it from auto-replicating.
👍
👍
Thinking more on this, I'm not sure which is the preferred way: use the same keyspace or create a new keyspace. Using the same keyspace leads to the risk of the server unintentionally getting attached to the replication stream. In fact, that's what's happening in my dev env right now: I make a point in time restore, and then Vitess auto-configures the restored server to replicate -- even though I skip the replication configuration in the restore process. Is there a way to forcefully prevent the restored server from joining the replication stream?
The current implementation now reuses the same keyspace and sets the tablet type to
Re-using the same keyspace seems more logical to me at first thought. You can prevent tablet repair in a number of standard ways:
@GuptaManan100 would know better
If we do want to disable the replication manager explicitly (even though in my opinion it shouldn't be required), then there is a new flag that was added recently -
I looked at the linked PR and I think that has all the changes that should be needed. There is already code to stop the

@shlomi-noach Do you know where the replication is fixed by Vitess in your tests? I don't think there is any other place, other than the 3 mentioned ☝️, that repairs replication. I can help debug if we are seeing that replication is being repaired after the restore.

EDIT: Looked at the test in the PR and it is using a replica tablet that is already running to run the recovery process, so the initialization code shouldn't matter either.
@GuptaManan100 I don't think they are? The PITR tests are all good and validate that replication does not get fixed.

@mattlord like @GuptaManan100 said, I think
@shlomi-noach Okay great! I was looking at
which made me think that replication was being repaired by something in Vitess, even though I wasn't expecting it to be. And I agree that
Sorry I wasn't clear. There was this problem and I found where it was that forced replication to start. It was as part of the
Oh! I see, I had added that in response to an issue wherein if there was a PRS while a tablet was in restore state, its semi-sync settings weren't set up correctly when it finally transitioned to
Link to a prior discussion
A few words about #13156: this PR supports incremental backup/recovery for
With #13156, it is possible to run full & incremental backups using
The incremental restore process is similar to that of
#13156 merges in
Note that this still only supports
Support for a point in time (as in, restore to a given
Supporting a point-in-time recovery: we want to be able to recover one or all shards up to a specified point in time, i.e. a timestamp. We want to be able to restore to any point in time in a

Whether we restore a single shard or multiple shards, the operation will take place independently on each shard. When restoring multiple shards to the same point in time, the user should be aware that the shards may not be in full sync with each other. Time granularity, clock skews, etc. can all mean that the restored shards may not be 100% consistent with an actual historical point in time.

As for the algorithm we will go by: it's a bit different from a restore-to-position because:
The most reliable information we have is the original committed timestamp value in the binary log. This event header remains true to the primary, even if read from a replica. The way to run a point-in-time recovery is a bit upside-down from restore-to-pos:
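Roughly: restore a full backup taken before the requested time, then replay binary log events in order, stopping at the first event whose original commit timestamp is later than the requested point in time. Below is a minimal Go sketch of that stopping condition; it is a simplified illustration, not the Vitess implementation, and the event type and the apply callback are made up for this example:

```go
package pitr

import "time"

// binlogEvent is a hypothetical, simplified view of a binary log event: just the
// original commit timestamp from the event header plus an opaque payload.
type binlogEvent struct {
	Timestamp time.Time // commit time as written by the primary, preserved on replicas
	Body      []byte
}

// applyUntil replays events in order and stops before the first event whose commit
// timestamp is later than the requested point in time. apply stands in for
// "replay this event against the server being restored".
func applyUntil(events []binlogEvent, pointInTime time.Time, apply func(binlogEvent) error) error {
	for _, ev := range events {
		if ev.Timestamp.After(pointInTime) {
			return nil // reached the requested point in time; stop here
		}
		if err := apply(ev); err != nil {
			return err
		}
	}
	return nil
}
```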
I tend to require the user to supply the point-in-time strictly in UTC, but we can work this out. Everything is possible, of course, but I wonder what is more correct UX-wise.
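If strict UTC input were required, the validation could look roughly like the following Go sketch (the timestamp format and the function name are assumptions for illustration, not an agreed-upon interface):

```go
package pitr

import (
	"errors"
	"time"
)

// parseUTCPointInTime parses a user-supplied point in time and insists on an
// explicit UTC offset (e.g. "2006-01-02T15:04:05Z"), rejecting other offsets.
func parseUTCPointInTime(s string) (time.Time, error) {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		return time.Time{}, err
	}
	if _, offset := t.Zone(); offset != 0 {
		return time.Time{}, errors.New("point in time must be specified in UTC")
	}
	return t, nil
}
```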
WIP for restore-to-time: #13270
It's been pointed out by the community (Vitess slack)

PR incoming.
Closing this RFC as the work was done. |
We wish to implement a native solution for (offline) incremental backup and compatible point-in-time recovery in Vitess. There is already a Work in Progress PR. But let's first describe the problem, the offered solution, and how it differs from an already existing prior implementation.
Background
Point-in-time recoveries make it possible to recover a database to a specific, or approximate, timestamp or position. The classic use case is a catastrophic change to the data, e.g. an unintentional DELETE FROM <table> or similar. Normally the damage only applies to a subset of the data: the database is generally still valid, and the app is still able to function. As such, we want to fix the specific damage inflicted. The flow is to restore the data on an offline/non-serving server, to a point in time immediately before the damage was done. It's then typically a manual process of salvaging the specific damaged records.

It's also possible to just throw away everything and roll back the entire database to that point in time, though that is an uncommon use case.
A point in time can be either an actual timestamp or, more accurately, a position. Specifically in MySQL 5.7 and above, this will be a GTID set: the @@gtid_executed just before the damage. Since every transaction gets its own GTID value, it should be possible to restore up to single-transaction granularity (whereas a timestamp is a coarser measurement).

A point-in-time recovery is possible by combining a full backup recovery, followed by an incremental stream of changes since that backup. There are two main techniques, in three different forms:
This RFC wishes to address (1). There is already prior work for (2). Right now we do not wish to address (3).
The existing prior work addresses (2), and specifically assumes:
Suggested solution, backup
We wish to implement a more general solution by actually backing up binary logs as part of the backup process. These can be stored on local disk, in S3, etc., the same way as any Vitess backup is stored. In fact, an incremental backup will be listed just like any other backup, and this listing is also the key to performing a restore.
The user will take an incremental backup similarly to how they take a full backup:
vtctlclient -- Backup zone1-0000000102
vtctlclient -- Backup --incremental_from_pos "MySQL56/16b1039f-22b6-11ed-b765-0a43f95f28a3:1-615" zone1-0000000102
vtctlclient -- Backup --incremental_from_pos "auto" zone1-0000000102
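The second and third commands above request an incremental backup. For a concrete sense of what such a backup must decide (namely, which binary log files to copy so that the requested from-position is covered without a gap), here is a simplified Go sketch; the GTIDSet interface and the Previous-GTIDs bookkeeping are assumptions for illustration, not Vitess APIs:

```go
package backup

import "errors"

// GTIDSet is a hypothetical minimal view of a MySQL GTID set.
type GTIDSet interface {
	ContainsAll(other GTIDSet) bool // true if every GTID in other is also in this set
}

// binlogFile describes one rotated (hence immutable) binary log file, together with
// the GTID set recorded in its Previous-GTIDs event.
type binlogFile struct {
	Name          string
	PreviousGTIDs GTIDSet
}

// filesToBackUp picks the binary logs an incremental backup starting at fromPos must
// copy. binlogs are ordered oldest to newest and exclude the currently active file.
func filesToBackUp(binlogs []binlogFile, fromPos GTIDSet) ([]binlogFile, error) {
	start := -1
	for i, b := range binlogs {
		// If everything before this file is already included in fromPos, this file is
		// a valid starting point; keep scanning to find the latest such file.
		if fromPos.ContainsAll(b.PreviousGTIDs) {
			start = i
		}
	}
	if start == -1 {
		// The requested position predates the oldest available binary log: the backup
		// cannot cover it without a gap.
		return nil, errors.New("no binary log covers the requested --incremental_from_pos")
	}
	return binlogs[start:], nil
}
```

Picking the latest file whose Previous-GTIDs set is still contained in the from-position keeps the copy set small while guaranteeing the requested position is covered without a gap.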
An incremental backup needs to have a starting point, given via the --incremental_from_pos flag. The incremental backup must cover that position, but does not have to start exactly at that position: it can start at an earlier position. See the diagram below. The backup ends at roughly the position current at the time the backup was requested: it will cover the exact point in time where the request was made, and possibly extend slightly beyond it.

An incremental backup is taken by copying binary logs. To do that, there is no need to shut down the MySQL server; it is free to be fully operational and serve traffic while the backup takes place. The backup process will rotate the binary logs (FLUSH BINARY LOGS) so as to ensure the files it is backing up are safely immutable.

A manifest of an incremental backup may look like so (excerpt):
"Incremental": true,
Notes:
- The manifest records the --incremental_from_pos value. This value is empty for a full backup.
- ServerUUID is new and self explanatory, added for convenience.
- TabletAlias is new and self explanatory, added for convenience.
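For illustration only, here is a Go sketch of how such a manifest could be modeled. Only Incremental, ServerUUID, and TabletAlias are named above; the remaining field names are assumptions, not the actual Vitess manifest schema:

```go
package backup

// incrementalBackupManifest sketches the metadata an incremental backup might carry.
// Field names other than Incremental, ServerUUID and TabletAlias are illustrative.
type incrementalBackupManifest struct {
	Incremental  bool   `json:"Incremental"`  // true for incremental backups
	FromPosition string `json:"FromPosition"` // assumed name: the --incremental_from_pos value; empty for a full backup
	Position     string `json:"Position"`     // assumed name: the GTID position the backup ends at
	ServerUUID   string `json:"ServerUUID"`   // UUID of the MySQL server the backup was taken from
	TabletAlias  string `json:"TabletAlias"`  // alias of the tablet the backup was taken from
}
```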
Suggested solution, restore/recovery
Again, riding the familiar Restore command. A restore looks like:

Vitess will attempt to find a path that recovers the database to that point in time. The path consists of exactly one full backup, followed by zero or more incremental restores. There could be exactly one such path, there could be multiple paths, or there could be no path. Consider the following scenarios:
Recovery scenario 1
This is the classic scenario. A full backup takes place at e.g. 12:10, then an incremental backup is taken from exactly that point and is valid up to 13:20, then the next one from exactly that point, valid up to 16:15, etc.

To restore the database to e.g. 20:00 (let's assume that's at position 16b1039f-22b6-11ed-b765-0a43f95f28a3:1-10000), we will restore the full backup, followed by incrementals 1 -> 2 -> 3 -> 4. Note that 4 exceeds 20:00, and Vitess will only apply changes up to 20:00, or to be more precise, up to 16b1039f-22b6-11ed-b765-0a43f95f28a3:1-10000.

Recovery scenario 2
The above is actually identical to the first scenario. Notice how the first incremental backup precedes the full backup, and how backups 2 and 3 overlap. This is fine! We take strong advantage of MySQL's GTIDs. Because the overlapping transactions in 2 and 3 are consistently identified by the same GTIDs, MySQL is able to ignore the duplicates as we apply both restores, one after the other.

Recovery scenario 3
In the above we have four different paths for recovery!
1 -> 2 -> 3 -> 4
1 -> 2 -> 6
1 -> 5 -> 3 -> 4
1 -> 5 -> 6
Any of these is valid; Vitess should choose whichever it pleases, ideally using as few backups as possible (hence preferring the 2nd or 4th option).
Recovery scenario 4
If we wanted to restore up to 22:15, then there's no incremental backup that can take us there, and the operation must fail before it even begins.

Finding paths
Vitess should be able to determine the recovery path before actually applying anything. It does so by reading the available manifests and finding the shortest valid path to the requested point in time. Using a greedy algorithm, it will seek the most recent full backup at or before the requested time, and then the shortest sequence of incremental backups that takes us to that point.
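Below is a minimal Go sketch of one such greedy selection; it is a simplified illustration, not the Vitess implementation, and the Manifest fields and GTIDSet interface are assumptions. It picks the most recent qualifying full backup, then repeatedly appends an applicable incremental backup that advances the covered position the furthest, until the requested position is covered:

```go
package restore

import "errors"

// GTIDSet is a hypothetical minimal view of a MySQL GTID set.
type GTIDSet interface {
	ContainsAll(other GTIDSet) bool // true if every GTID in other is also in this set
	Union(other GTIDSet) GTIDSet    // the combined set
}

// Manifest is a simplified backup manifest: a full backup has a nil FromPosition.
type Manifest struct {
	Incremental  bool
	FromPosition GTIDSet // position the incremental backup starts from; nil for full backups
	Position     GTIDSet // position the backup ends at
}

// findRestorePath returns one full backup followed by incremental backups that,
// applied in order, cover targetPos without gaps.
func findRestorePath(manifests []Manifest, targetPos GTIDSet) ([]Manifest, error) {
	var path []Manifest
	var covered GTIDSet
	// 1. Pick the most recent full backup taken at or before the target position.
	for _, m := range manifests {
		if !m.Incremental && targetPos.ContainsAll(m.Position) {
			if covered == nil || m.Position.ContainsAll(covered) {
				covered = m.Position
				path = []Manifest{m}
			}
		}
	}
	if covered == nil {
		return nil, errors.New("no full backup found at or before the requested position")
	}
	// 2. Greedily append incremental backups until the target position is covered.
	for !covered.ContainsAll(targetPos) {
		var next *Manifest
		for i := range manifests {
			m := &manifests[i]
			// Applicable: starts at a position we already cover (no gap) and actually
			// advances us beyond what is covered so far.
			if m.Incremental && covered.ContainsAll(m.FromPosition) && !covered.ContainsAll(m.Position) {
				// Prefer the applicable backup that reaches furthest ahead.
				if next == nil || m.Position.ContainsAll(next.Position) {
					next = m
				}
			}
		}
		if next == nil {
			return nil, errors.New("no sequence of incremental backups reaches the requested position")
		}
		path = append(path, *next)
		covered = covered.Union(next.Position)
	}
	return path, nil
}
```

Preferring the furthest-reaching applicable incremental at each step is one simple way to keep the restore path short, in line with the preference in scenario 3 for using as few backups as possible.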
Backups from multiple sources
Scenario (3) looks imaginary until you consider that backups may be taken from different tablets. These have different binary logs with different rotation times -- but all share the same sequence of GTIDs. Since an incremental backup consists of full binary log copies, there could be overlaps between binary logs backed up from different tablets/MySQL servers.
Vitess should not care about the identity of the sources, should not care about the binary log names (one server's binlog.0000289 may come before another server's binlog.0000101), and should not care about binary log count. It should only care about the GTID range an incremental backup covers: from (exclusive) and to (inclusive).

Restore time
It should be noted that an incremental restore based on binary logs means sequentially applying changes to a server. This may take minutes or hours, depending on how many binary log events we need to apply.
Testing
As usual, testing is to take place in:
endtoend (validate incremental backup, validate point-in-time restore)

Thoughts welcome. Please see #11097 for Work In Progress.