WriteRename: Create backup of original file
To make sure that updates to important files are performed correctly, we currently take the following steps:

1. Create a temporary file next to the file we want to update, write the new content to the temp file, call `fsync` on the descriptor, and close the descriptor
2. Execute the `rename` syscall to move temp_file -> orig_file

The only step missing here is:

3. Call `fsync` on the directory containing the file.

In theory, `rename` should guarantee that deleting the original file and renaming the temp file happen as one atomic operation. But we still observe that the original file is sometimes missing after a power cut or a crash. It may be that we are hitting a weird corner case in the ext4 code because of the missing step 3, and because `rename` is closely followed by `unlink`, but we do not have any proof yet.

To address this behavior, this patch changes the workflow to the following:

1. Create the temp file, write the new content, call `fsync` and close
2. Create a copy of the original file as "orig_file.bac"
3. Call `fsync` on the directory where the original file is located
4. Rename temp_file -> orig_file
5. Call `fsync` on the directory where the original file is located

This way, whenever the crash happens, we must have at least the *.bac file.

This patch only introduces the safety net. We still need to teach the consumers of these files to reach for the *.bac file in case the original file is missing.

A positive side effect of this feature is that the *.bac file helps with postmortem analysis of a broken node, and gives leverage for manual recovery after a configuration bug.

Signed-off-by: Yuri Volchkov <yuri@zededa.com>