Improve the implementation of backup in HelixAccountService #1246

Merged: 6 commits, Sep 5, 2019

justinlin-linkedin
Collaborator

Change the semantics of backup files in HelixAccountService:

  1. HelixAccountService creates backup files for all updates, regardless of whether the update was performed by the current instance.
  2. The backup filename pattern now includes the ZNRecord version and modified time, for example 123.20190819T121314. The first number is the version number and the second part is the modified time in a human-readable format.
  3. Since the version and modified time are retrieved from the ZNRecord, backup files have the same name across all instances.
  4. HelixAccountService creates a backup whenever it fetches new account metadata from Helix. It checks whether the current version is already backed up and creates the file only if it is not.
  5. HelixAccountService uses account metadata from the latest backup file when the HelixStore is not available.
  6. This implementation doesn't guarantee that every update is captured by a backup file, since an ambry-frontend might crash between updating the account metadata and broadcasting the update. That's why, on any I/O error, HelixAccountService simply gives up on the backup.
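
The naming scheme in item 2 can be sketched roughly as follows. This is an illustration only, not the code in this PR: the class `BackupFileName`, its methods, the `yyyyMMdd'T'HHmmss` pattern, the use of UTC, and the choice of epoch seconds as the time unit are all assumptions inferred from the example `123.20190819T121314`.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

/** Hypothetical helper for the "<version>.<modified time>" naming scheme; not the class from the PR. */
final class BackupFileName {
  // Assumed timestamp layout, inferred from the example "20190819T121314".
  private static final DateTimeFormatter TIME_FORMAT = DateTimeFormatter.ofPattern("yyyyMMdd'T'HHmmss");

  final int version;
  final long modifiedTimeInSecond;

  BackupFileName(int version, long modifiedTimeInSecond) {
    this.version = version;
    this.modifiedTimeInSecond = modifiedTimeInSecond;
  }

  /** Builds the filename, rendering the modified time in UTC (an assumption). */
  String format() {
    LocalDateTime time = LocalDateTime.ofEpochSecond(modifiedTimeInSecond, 0, ZoneOffset.UTC);
    return version + "." + TIME_FORMAT.format(time);
  }

  /** Parses a filename produced by {@link #format()}; rejects anything that does not match. */
  static BackupFileName parse(String filename) {
    int dot = filename.indexOf('.');
    if (dot <= 0) {
      throw new IllegalArgumentException("Not a backup filename: " + filename);
    }
    try {
      int version = Integer.parseInt(filename.substring(0, dot));
      LocalDateTime time = LocalDateTime.parse(filename.substring(dot + 1), TIME_FORMAT);
      return new BackupFileName(version, time.toEpochSecond(ZoneOffset.UTC));
    } catch (NumberFormatException | DateTimeParseException e) {
      throw new IllegalArgumentException("Not a backup filename: " + filename, e);
    }
  }
}
```

Because both parts of the name are derived from the record rather than from local state, every instance that backs up the same version would produce the same filename, which is the property item 3 relies on.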

codecov-io commented Aug 23, 2019

Codecov Report

Merging #1246 into master will increase coverage by 0.06%.
The diff coverage is 83.8%.

@@             Coverage Diff              @@
##             master    #1246      +/-   ##
============================================
+ Coverage     72.21%   72.27%   +0.06%     
- Complexity     6063     6109      +46     
============================================
  Files           439      440       +1     
  Lines         35002    35145     +143     
  Branches       4446     4465      +19     
============================================
+ Hits          25276    25402     +126     
- Misses         8571     8584      +13     
- Partials       1155     1159       +4
| Impacted Files | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| ...github.ambry/config/HelixAccountServiceConfig.java | 86.66% <100%> (+2.05%) | 2 <0> (ø) | ⬇️ |
| .../com/github/ambry/account/LegacyMetadataStore.java | 88.37% <100%> (+1.57%) | 5 <1> (ø) | ⬇️ |
| ...com/github/ambry/account/AccountMetadataStore.java | 100% <100%> (ø) | 6 <0> (+1) | ⬆️ |
| ...om/github/ambry/account/AccountServiceMetrics.java | 100% <100%> (ø) | 1 <0> (ø) | ⬇️ |
| ...ain/java/com/github/ambry/account/RouterStore.java | 77.53% <50%> (-0.61%) | 11 <1> (ø) | |
| ...va/com/github/ambry/account/BackupFileManager.java | 82.77% <82.77%> (ø) | 39 <39> (?) | |
| .../com/github/ambry/account/HelixAccountService.java | 88.09% <84.61%> (+0.9%) | 43 <1> (+2) | ⬆️ |
| ...in/java/com.github.ambry.clustermap/Partition.java | 78.16% <0%> (-3.45%) | 26% <0%> (-1%) | |
| ...java/com.github.ambry.network/SSLTransmission.java | 70.22% <0%> (-0.98%) | 69% <0%> (-2%) | |
| ...src/main/java/com.github.ambry.commons/BlobId.java | 93.52% <0%> (-0.36%) | 71% <0%> (-1%) | |

... and 8 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 21bb244...a3ed8f8.

backupFiles.put(version, currentBackup);
} else {
// remove the current file
currentBackup.tryRemove();
Contributor

Huh? Isn't the idea to remove the oldest saved file to make room for the new one?

Collaborator Author
Yes, but here currentBackup's version is lower than the smallest version number in the map, so the current backup file is the one that should be removed.
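
The rule discussed in this thread can be sketched as a bounded, version-ordered map. This is a minimal illustration under assumptions, not the PR's `BackupFileManager`: the class `BoundedBackups`, the `offer` method, the `removed` list (standing in for actual file deletion), and the use of a `TreeMap` keyed by version are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical sketch of keeping only the newest N backups, ordered by version. */
final class BoundedBackups {
  private final int maxBackups;
  // Version -> filename, sorted so that firstKey() is always the oldest tracked version.
  private final TreeMap<Integer, String> backupFiles = new TreeMap<>();
  // Stand-in for file deletion: records which files this sketch would remove.
  final List<String> removed = new ArrayList<>();

  BoundedBackups(int maxBackups) {
    if (maxBackups <= 0) {
      throw new IllegalArgumentException("maxBackups must be positive");
    }
    this.maxBackups = maxBackups;
  }

  /** Returns true if the new backup was retained, false if it was not added. */
  boolean offer(int version, String filename) {
    if (backupFiles.containsKey(version)) {
      return false; // this version is already backed up, nothing to do
    }
    if (backupFiles.size() < maxBackups) {
      backupFiles.put(version, filename);
      return true;
    }
    if (version > backupFiles.firstKey()) {
      // The map is full and the new backup is newer than the oldest one: evict the oldest.
      removed.add(backupFiles.pollFirstEntry().getValue());
      backupFiles.put(version, filename);
      return true;
    }
    // The map is full and the new backup is older than everything tracked: drop the new file.
    removed.add(filename);
    return false;
  }

  List<Integer> versions() {
    return new ArrayList<>(backupFiles.keySet());
  }
}
```

The last branch corresponds to the `else` case under review: when the incoming file is older than every retained backup, removing it (rather than the oldest retained file) keeps the newest versions on disk.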

* @param afterTimeInSecond The unix epoch time which the latest backup's modifiedTime must be greater than.
* @return The account map from the latest backup file.
*/
Map<String, String> getLatestState(long afterTimeInSecond) {
Contributor
Rename the variable, e.g. latestTimeAllowed.

* @param afterTimeInSecond The unix epoch time which the latest backup's modifiedTime must be greater than.
* @return The account map from the latest backup file.
*/
Map<String, String> getLatestState(long afterTimeInSecond) {
Contributor
Instead of returning null, how about Optional<Map<>>?

Collaborator Author
Not sure Optional is better than plain null here, since the caller only wants to check whether there is a result or not.
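
For comparison, the `Optional` variant suggested in this thread might look like the sketch below. It is illustrative only and does not reflect what was merged: the class `BackupLookup`, the nested `Backup` holder, and the `add` method are hypothetical, and only the `getLatestState(long afterTimeInSecond)` signature and its "modified time must be greater than" rule come from the snippet under review.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical lookup over in-memory backups, returning Optional instead of null. */
final class BackupLookup {
  /** Hypothetical holder for one backup's modified time and account map. */
  static final class Backup {
    final long modifiedTimeInSecond;
    final Map<String, String> state;

    Backup(long modifiedTimeInSecond, Map<String, String> state) {
      this.modifiedTimeInSecond = modifiedTimeInSecond;
      this.state = state;
    }
  }

  // Version -> backup, so lastEntry() is the latest version.
  private final TreeMap<Integer, Backup> backups = new TreeMap<>();

  void add(int version, long modifiedTimeInSecond, Map<String, String> state) {
    backups.put(version, new Backup(modifiedTimeInSecond, state));
  }

  /**
   * Returns the account map of the latest backup if its modified time is strictly greater
   * than afterTimeInSecond, or an empty Optional when no such backup exists.
   */
  Optional<Map<String, String>> getLatestState(long afterTimeInSecond) {
    Map.Entry<Integer, Backup> latest = backups.lastEntry();
    if (latest == null || latest.getValue().modifiedTimeInSecond <= afterTimeInSecond) {
      return Optional.empty();
    }
    return Optional.of(latest.getValue().state);
  }
}
```

A caller would then write `lookup.getLatestState(t).ifPresent(...)` instead of a null check. Whether that is clearer than `if (state != null)` for a single internal call site is the judgment call being debated here.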

* Delete file identified by the given {@link Path}.
* @param toDelete The path of file to be deleted.
*/
private void deleteFile(Path toDelete) {
Contributor
You can call this tryDeleteFile too.
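
A best-effort delete in the spirit of the suggested name could be sketched as follows. This is an assumption-laden illustration, not the PR's implementation: the holder class `BackupFiles` and the boolean return value are hypothetical, and the decision to swallow the `IOException` mirrors the PR description's statement that the service gives up on backup when it hits an I/O error.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical holder for best-effort file helpers. */
final class BackupFiles {
  private BackupFiles() {
  }

  /**
   * Tries to delete the given file. Returns true if a file was deleted, and false if it did
   * not exist or an I/O error occurred. The error is deliberately not propagated, because a
   * failed cleanup should not fail the caller in this sketch.
   */
  static boolean tryDeleteFile(Path toDelete) {
    try {
      return Files.deleteIfExists(toDelete);
    } catch (IOException e) {
      // A real implementation would likely log the failure and update a metric here.
      return false;
    }
  }
}
```

The `try` prefix signals to callers that failure is tolerated, which matches the existing `tryRemove()` naming seen elsewhere in this review.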

Contributor

@jsjtzyy left a comment
a few comments, will review when it is unblocked

Contributor

@jsjtzyy left a comment
Mostly looks good. Will approve after a few comments are addressed.

@jsjtzyy merged commit e406673 into linkedin:master on Sep 5, 2019
@justinlin-linkedin deleted the helixbackup branch on September 5, 2019 at 17:14