Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark snapshot as failure if there is any issue in uploading a file. #679

Merged
merged 2 commits into from
Jun 12, 2018

Conversation

arunagrawal84
Copy link
Contributor

Mark snapshot as a failure if there is an issue with uploading a file. This is to ensure we fail-fast. This is in contrast to previous behavior where snapshot would "ignore" any failures in the upload of a file and mark snapshot as "success".

Since it was not truly a "success" marking that as "failure" is the right thing to do. Also, meta.json should really be uploaded in case of "success" and not in case of "failure" as the presence of "meta.json" marks the backup as successful.

The case for fail-fast: In a scenario where we had an issue say at the start of the backup, it makes more sense to fail-fast then to keep uploading other files (and waste bandwidth and use backup resources). The remediation step for backup failure is anyways to take a full snapshot again.

…his is to ensure we fail-fast. This is in contrast to previous behavior where snapshot would "ignore" any failures in upload of a file and mark snapshot as "success".

Since, it was not truly a "success" marking that as "failure" is right thing to do. Also, meta.json should really be uploaded in case of "success" and not in case of "failure" as presence of "meta.json" marks the backup as successful.

Case for fail-fast: In a scenario where we had issue say at the start of the backup, it makes more sense to fail-fast then to keep uploading other files (and waste bandwidth and use backup resources). The remediation step for backup failure is anyways to take a full snapshot again.
@@ -100,7 +100,9 @@ public AbstractBackupPath retriableCall() throws Exception {

addToRemotePath(abp.getRemotePath());
} catch (Exception e) {
logger.error("Failed to upload local file {} within CF {}. Ignoring to continue with rest of backup.", file.getCanonicalFile(), parent.getAbsolutePath(), e);
//Throw exception to the caller. This will allow them to take appropriate decision.
logger.error("Failed to upload local file {} within CF {}.", file.getCanonicalFile(), parent.getAbsolutePath(), e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we throw the exception or just collect all the failed files and throw one exception at the end after uploading all successful files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a discussion around this as to what is the strategy in case of failure. How will upload of other remaining objects as part of "snapshot" help because that snapshot will never be marked as "successful" and thus can not be used for "restore" path either?
The current way to come out of snapshot failure is to actually go and take a complete new snapshot. Throwing exception right away will ensure we fail-fast rather than fail and notify later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'm +1 if with resumable snapshot of course :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, fail-fast with resumable is the way to go!

@@ -100,7 +100,9 @@ public AbstractBackupPath retriableCall() throws Exception {

addToRemotePath(abp.getRemotePath());
} catch (Exception e) {
logger.error("Failed to upload local file {} within CF {}. Ignoring to continue with rest of backup.", file.getCanonicalFile(), parent.getAbsolutePath(), e);
//Throw exception to the caller. This will allow them to take appropriate decision.
logger.error("Failed to upload local file {} within CF {}.", file.getCanonicalFile(), parent.getAbsolutePath(), e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'm +1 if with resumable snapshot of course :-)

@arunagrawal84 arunagrawal84 merged commit 5436432 into Netflix:3.x Jun 12, 2018
@arunagrawal84 arunagrawal84 deleted the hotfix branch June 12, 2018 21:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants