-
Notifications
You must be signed in to change notification settings - Fork 293
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow for in-place restores. #1094
base: 4.x
Are you sure you want to change the base?
Conversation
…hs when fetching incrementals. The iterators are always fully materialized. Also, remove the now-redundant method from BackupRestoreUtil that merely wrapped the MetaProxy call.
… false in practice. Remove final modifier on static method. Make static property an instance variable instead. Use parameterized constructor call in place of two lines.
…is not relevant to whether a restore should be attempted.
…intuitive and will be empty in practice in most cases anyway.
…re to cease requiring an IConfiguration but rather just take the data directory path as a String.
…y to restore to a staging area and then import or move the data depending on whether Cassandra is running.
6369b6a
to
9557275
Compare
String restoreFileName = | ||
PATH_JOINER.join(dataDir, keyspace, columnFamily, indexDir, fileName); | ||
return_ = new File(restoreFileName); | ||
return_ = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we are in here, consider renaming this variable, its not idiomatic java
String dataDirectory, String monitoringFolder) throws IOException { | ||
ImmutableSet.Builder<Path> backupPaths = ImmutableSet.builder(); | ||
Path dataPath = Paths.get(dataDirectory); | ||
if (Files.exists(dataPath) && Files.isDirectory(dataPath)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The File.exists
call is redundant here: https://docs.oracle.com/javase/6/docs/api/java/io/File.html#isDirectory()
Files.newDirectoryStream(keyspaceDirPath, Files::isDirectory)) { | ||
for (Path columnfamilyDirPath : keyspaceStream) { | ||
Path backupDirPath = | ||
Paths.get(columnfamilyDirPath.toString(), monitoringFolder); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you can use columnFamilyDirPath.resolve(monitoringFolder)
here instead of converting back and forth between string and path. Reference: https://docs.oracle.com/javase/8/docs/api/index.html?java/nio/file/Path.html
for (Path columnfamilyDirPath : keyspaceStream) { | ||
Path backupDirPath = | ||
Paths.get(columnfamilyDirPath.toString(), monitoringFolder); | ||
if (Files.exists(backupDirPath) && Files.isDirectory(backupDirPath)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment here that the File.exists
call is redundant.
if (instanceState.getRestoreStatus() != null | ||
&& instanceState.getRestoreStatus().getStatus() != null | ||
&& instanceState.getRestoreStatus().getStatus() == Status.STARTED) { | ||
if (instanceState.isRestoring()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this cleanup 🙌
failedImports.addAll(importData(keyspace, table, tableDir.toString())); | ||
} | ||
} else { | ||
recursiveMove(Paths.get(srcDir), Paths.get(configuration.getDataFileLocation())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the move is non-atomic I can see a weird edge case where we start Cassandra while the move is happening (and we have like 10,000+ files). Should we have some sort of "fence" that tells the subsystem that starts Cassandra it needs to wait while this happens?
if (Files.isRegularFile(path)) { | ||
Files.move(path, destination.resolve(path.getFileName())); | ||
} else if (Files.isDirectory(path)) { | ||
recursiveMove(path, destination.resolve(path.getFileName())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If am reading this right, if the source path we are currently iterating over is a directory we recurse by moving the files in that directory into [destination/directory name]
. Cant we just move the entire directory instead of recursing then?
} else if (Files.isDirectory(path)) { | ||
recursiveMove(path, destination.resolve(path.getFileName())); | ||
} else { | ||
throw new IOException("Failed determining type of inode is " + path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this error or just skip/log? Likely an edge case but do we really want to fail if thats the case?
|
||
// Cleanup local data | ||
File dataDir = new File(config.getDataFileLocation()); | ||
if (dataDir.exists() && dataDir.isDirectory()) FileUtils.cleanDirectory(dataDir); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question about this: I know we don't need to stop cass
anymore because we use import
but I was surprised to see we no longer clean the directory. Is this for the case where we want to restore a subset of the data but keep whats there? Do we still need the option to clean the directory for some restores?
"config.doesCassandraStartManually() is set to True, hence Cassandra needs to be started manually ..."); | ||
List<String> failedImports = cassOps.importAll(config.getRestoreDataLocation()); | ||
if (!failedImports.isEmpty()) { | ||
instanceState.endRestore(Status.FAILED, LocalDateTime.now()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to use the list of failed imports somewhere to log what failed and be able to resume? instead of starting over from the beginning?
Also if we only need the boolean why return the list? I would rather us use the list but if we are not going to I mean.
6369b6a 2024-05-02 | Cease stopping Cassandra in the restore process and reveal the ability to restore to a staging area and then import or move the data depending on whether Cassandra is running.
4ece8d3 2024-05-01 | Move getBackupDirectories to BackupRestoreUtil and change its signature to cease requiring an IConfiguration but rather just take the data directory path as a String.
6b3d45a 2024-04-10 | Cease wiping the data directory in advance of restore. It is counter-intuitive and will be empty in practice in most cases anyway.
f10289f 2024-04-10 | Consolidate restore status API.
4d6d07f 2024-03-09 | Remove backup racs as that configuration is not used in practice and is not relevant to whether a restore should be attempted.
aa3dffe 2024-03-09 | Remove redundant vertical whitespace.
8c23800 2024-03-09 | Remove redundant logging.
646973d 2024-03-09 | Remove all non-javadoc comments that don't provide warnings to future maintainers.
8d18f51 2024-03-09 | Remove waitForCompletion parameter to download method which is always false in practice. Remove final modifier on static method. Make static property an instance variable instead. Use parameterized constructor call in place of two lines.
ce6c7c1 2024-03-09 | Remove unused code from our restore tooling.