We have encountered a rare issue where a backup completed with only the MANIFEST written to S3, but not the actual backup file!
Looking at the logs, it seems we tried to complete the upload of backup.xbstream.gz after the MANIFEST was written, breaking the contract highlighted here.
Reproduction Steps
Not easy to reproduce, since it depends on S3 throttling us, but it might be possible to write a test that simulates this kind of behaviour.
Hello @rvrangel, thanks for reporting this. I spent some time investigating, and the issue is rooted in how we do AddFile on S3 and Ceph. In short, when backing up Vitess, both the builtin and xtrabackup engines call AddFile to upload each individual file (or stripe), and this function returns a writer to which we write. However, the S3 and Ceph AddFile implementations write to the remote storage asynchronously, meaning that we may return from backupFiles once everything has been written to the buffer, but before S3/Ceph has finished writing and processing it. To fix this, we must call bh.EndBackup() at the end of backupFiles to make sure S3/Ceph are done processing the files before moving on to the MANIFEST.
The refactor I did in #17271 will fix this issue for the builtin backup engine. The xtrabackup engine fix should be part of another PR.