BTRFS subvolumes treated as external filesystems #4009
First, I am sorry to hear that you lost data. I have been where you are and it is quite an awful experience. That being said, understand that btrfs subvolumes are, in fact, separate filesystems; for example, you'll notice that you can't remove a subvolume the way you would remove an ordinary directory. There was discussion on #2141 some time back to get the documentation to where it is currently. Do you have a recommendation for what would make it clearer?

Also, you should regularly perform test restores to make sure your backups are working properly. At a minimum, restore a few files from time to time and check that they are what you expect.
This is the line in the code deciding about the --one-file-system property: https://github.com/borgbackup/borg/blob/1.1.6/src/borg/archiver.py#L588

This is the doc string: https://github.com/borgbackup/borg/blob/1.1.6/src/borg/archiver.py#L3199

I don't think this is a borg bug; rather, it seems that btrfs subvolumes being different devices / filesystems was an unexpected property of them for you, and you accidentally excluded them. Feel free to do a PR if you have an idea about how to improve the docs or the code.
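For readers who don't want to dig through archiver.py: the decision boils down to comparing device numbers between the starting path and each item encountered. A minimal sketch of that kind of check (not borg's actual code; the paths are placeholders) looks like this:

```python
import os

def crosses_filesystem_boundary(start_path, item_path):
    """Rough model of what --one-file-system tests: compare st_dev values.

    On btrfs, every subvolume gets its own (anonymous) device number, so this
    reports a boundary at subvolume borders even though no separate block
    device or explicit mount is involved.
    """
    return os.lstat(start_path).st_dev != os.lstat(item_path).st_dev

# Hypothetical example: if /home is a btrfs subvolume of the same volume as /,
# this still prints True, which is exactly the surprise discussed here.
print(crosses_filesystem_boundary("/", "/home"))
```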
You are right, I should have done backups properly, which also means I should have done test restores from time to time. Unfortunately, I just judged the reliability of the backup by its total size, by manual checks in the mounts, and by the reliability of the drive the backups were stored on. So now I'm learning the hard way that I should have done exactly that check.

I still can't understand that --one-file-system excludes btrfs subvolumes. The reason I used --one-file-system was to skip external filesystems only, not parts of my own system. I think btrfs subvolumes should be included with --one-file-system.

I'm going to look into the code you provided links for and try to PR a change myself, hoping that it would be accepted.
I would oppose such a change. To my mind (and workflow) the way borg currently handles btrfs subvolumes is correct and useful. Further, if you're going to propose this change for btrfs, what about other snapshotting filesystems like ZFS?

Also, btrfs subvolumes are more flexible than you indicate. For example, subvolumes can be nested, and they can be nested at arbitrary positions in the filesystem tree. A fact that arguably provides a further clue that the current approach taken by the borg code is correct is that creating a snapshot of a subvol that itself contains other subvols will not snapshot those child subvols.

Numerous machines I administer run on (and boot from) a single btrfs partition with various subvols (e.g. os1, os2, os1-home, os2-home, user-documents, etc). Backing up multiple subvols in a single borg invocation is straightforward: just list each subvolume path explicitly.

Alternatively, for maximum data consistency when backing up a running system, one can snapshot the subvols to be backed up, have borg back up those snapshots, then delete the snapshots. In this case, the snapshots will not contain any mounted filesystems within them, so --one-file-system is not even needed.

All the best
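For concreteness, here is a rough sketch of that snapshot-then-backup workflow in Python using subprocess. The subvolume list, snapshot directory and repository path are placeholders, and it assumes btrfs-progs and borg are on PATH and that it runs with sufficient privileges; treat it as an illustration, not a hardened script.

```python
#!/usr/bin/env python3
"""Sketch of the snapshot -> borg create -> cleanup workflow described above."""
import subprocess
from datetime import datetime

SUBVOLS = ["/", "/home"]          # subvolumes to back up (placeholders)
SNAPDIR = "/snap"                 # must live on the same btrfs filesystem
REPO = "/mnt/backup/repo"         # placeholder borg repository
ARCHIVE = datetime.now().strftime("sys-%Y-%m-%d")

snapshots = []
try:
    for i, subvol in enumerate(SUBVOLS):
        snap = f"{SNAPDIR}/{ARCHIVE}-{i}"
        # read-only snapshots give a consistent view of a running system
        subprocess.run(["btrfs", "subvolume", "snapshot", "-r", subvol, snap],
                       check=True)
        snapshots.append(snap)
    # the snapshots contain no mounted filesystems, so borg can simply
    # archive them without needing --one-file-system at all
    subprocess.run(["borg", "create", f"{REPO}::{ARCHIVE}"] + snapshots,
                   check=True)
finally:
    for snap in snapshots:
        subprocess.run(["btrfs", "subvolume", "delete", snap], check=True)
```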
I don't think you should consider this a special case, to be handled by implementing specific code that treats it differently from how it's currently done. I think one should look at it as the whole btrfs partition being one filesystem: the subvolumes (hence "sub"volume) are part of that same filesystem (and partition), but are connected to it with different features for the reasons already mentioned, or deliberately separated from it behind the scenes to create scopes for snapshots, so that rollbacks don't cause loss of data that shouldn't be part of the rollback, such as logs or user data.

This latter use of snapshots is one of the most common. SUSE (the btrfs experts) does it and documents very well why specific subvolumes are created, and Ubuntu does something similar with its default subvolume layout.
It should have been clear that this is what I meant by what I said above. If you look at the …
Are you saying that you use snapshots as backups?
@IsaakGroepT
Either you misunderstood me or I was unclear, because I'm absolutely aware of the fundamental difference between snapshots and backups. I think this confusion started because you weren't clear in your original post that you were using nested snapshots. I assumed that you were using a snapshot structure like Ubuntu does (subvols not nested, hanging off the btrfs root directory) and that you were therefore backing up a live/running system that also had various mount points within the subtrees of each btrfs subvolume, and were using --one-file-system to avoid descending into those. Now that you've clarified that you're using nested subvolumes, I better understand the issue you want to discuss.

In any case, my central point stands: you can list each subvolume individually on the borg command line. So, via explicit paths, you have full control over which subvolumes end up in the archive.

In any event, if the present borg docs are unclear, then by all means let's clarify them so that no one else misses this important point.

All the best
Unfortunately, it doesn't seem to be so simple. I tested it, and my subvolume was detected as a mount point by that function.
An observation is that btrfs subvolumes show a device major number of 0. Whether this is guaranteed or just coincidence, I have no idea. Perhaps add a flag based on that: it would treat all subvolumes as if they were on the same filesystem as the starting path.
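That observation is easy to check on a given machine. Whether major 0 is guaranteed for btrfs subvolumes is, as said above, unclear, so the following is only a heuristic sketch with a placeholder path:

```python
import os

def has_anonymous_device_major(path):
    # btrfs subvolumes are reported with anonymous device numbers, which in
    # practice show up with major number 0; treat this as a hint, not a rule
    return os.major(os.lstat(path).st_dev) == 0

print(has_anonymous_device_major("/home"))   # e.g. True if /home is a subvolume
```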
Btrfs subvolumes are separate file trees; they aren't really a separate file system. Subvolumes share all the other trees: root, csum, extent, uuid, chunk, dev, etc. On the other hand, the inode numbering starts over in each subvolume, so a given inode number is not unique on a btrfs volume even though it is unique within a btrfs subvolume.

Anyway, it's an understandable point of confusion for the user as well as for development. You probably wouldn't really want a backup to, by default, consider all subvolumes on the file system (which means including snapshots, as they are just pre-populated subvolumes), as it would lead to a lot of unnecessary duplication of data. The reality is that btrfs volumes can be sufficiently more complicated that treating them like any other file system, without at least btrfs-specific warnings to the user for potentially ambiguous situations, is going to end up in misaligned expectations.
How about something else: instead of trying to figure out a way to handle this, add two new tools. a) A warning that prints the filesystem boundaries that borg will not traverse into.
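A sketch of what such a warning pass could look like, assuming the boundary test is the same st_dev comparison discussed above (this is illustrative, not existing borg functionality; the root path is a placeholder):

```python
import os

def report_boundaries(root):
    """Print every directory below root that --one-file-system would not
    descend into, i.e. where the device number differs from root's."""
    root_dev = os.lstat(root).st_dev
    for dirpath, dirnames, _files in os.walk(root):
        for name in list(dirnames):
            full = os.path.join(dirpath, name)
            try:
                if os.lstat(full).st_dev != root_dev:
                    print("boundary (mount point or subvolume):", full)
                    dirnames.remove(name)   # don't descend, mirroring borg
            except OSError:
                pass  # unreadable entries are simply skipped in this sketch

report_boundaries("/")
```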
@cmurf I don't think duplication is an issue to take into account, because borg has a proper deduplication system that handles this. I think a warning would be a satisfying way to improve this situation, next to improving the documentation to mention this behaviour specifically. Since the …
@IsaakGroepT: @cmurf's point about files in snapshots being assigned different inode numbers would mean that, for a btrfs volume that has multiple subvolume snapshots (as is the case for users who use snapshotting tools), …
@IsaakGroepT: I don't know what "proper" means in the context of deduplication, and I don't know how borg's deduplication works. If I point borg at a path omitting --one-file-system, does it have to read every copy of the data in every snapshot it finds? And therefore I think it would be entirely sane for the borg developers to keep the device-number boundary as the default.

@level323 You definitely cannot infer much of anything about files and their inode numbers on btrfs. The inode numbers in a snapshot created with btrfs subvolume snapshot are the same as in the source subvolume. With one exception, you can assume that an inode number is not used more than once in a given subvolume; the exception is that subvolumes themselves always have inode number 256.

Another gotcha pertains to mounting a subvolume with the -o subvol= or -o subvolid= mount options. Behind the scenes these are bind mounts. So, depending on how things are assembled, that might pose curious behaviours, not least of which is that it's possible for snapshots to be entirely invisible (not in any mount path).
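The inode-256 detail mentioned above gives a cheap heuristic for spotting subvolume roots; this is just a sketch based on that fact (not anything borg does), with a placeholder path:

```python
import os

BTRFS_SUBVOL_ROOT_INODE = 256   # every btrfs subvolume root has inode 256

def looks_like_subvolume_root(path):
    st = os.lstat(path)
    # on btrfs, a directory whose inode number is 256 is a subvolume root;
    # non-btrfs filesystems can of course also hand out inode 256, so real
    # code would combine this with a filesystem-type check
    return os.path.isdir(path) and st.st_ino == BTRFS_SUBVOL_ROOT_INODE

print(looks_like_subvolume_root("/home"))
```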
J/k. That was interesting information, and it also shows the limits of the use cases I'm used to. So, we have already discussed this issue thoroughly and want to avoid btrfs-specific tweaks being added to the general behaviour of borg. Shall we now propose pull requests with improvements to the documentation regarding the potential issues/confusion with the btrfs file system?
Afaik that is not how borg works. If it was, it would be much slower. |
Borg wouldn't know that the contents were the same, so it would have to read all 10 copies of the data. Subsequent runs would be fast, because of the files cache. But if a file changed in all 10 snapshots, then it would have to be read 10 times.
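To make the distinction concrete: deduplication saves space, not reads. A toy model with fixed-size chunks and an in-memory dict (purely illustrative; borg actually uses content-defined chunking and the files cache to skip unchanged files) shows why ten identical snapshots still mean ten reads on the first run:

```python
import hashlib

def backup_file(path, chunk_store, chunk_size=4 * 1024 * 1024):
    """Toy deduplication: identical chunks are stored once, keyed by hash,
    but every copy of the data still has to be read and hashed."""
    refs = []
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest = hashlib.sha256(chunk).hexdigest()
            chunk_store.setdefault(digest, chunk)  # stored only once
            refs.append(digest)
    return refs

store = {}
# backing up the "same" file from 10 snapshots reads it 10 times,
# yet the store grows only once (paths below are placeholders)
# for i in range(10):
#     backup_file(f"/snap/{i}/big.file", store)
```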
I agree with @cmurf and would like to add one thing:
I'm just familiarizing myself with btrfs and stumbled upon this issue. I sympathize with @aardbol in finding the status quo confusing, but am not sure how to really fix it. Why is it confusing? Because the option's name suggests that only other mounted filesystems are skipped, while on btrfs it also skips subvolumes of the very same filesystem. I think the best solution would be to change the behaviour so that subvolumes of the same filesystem are included. If that change of behaviour is not implementable, maybe the option should at least be renamed to make the actual behaviour obvious.
I also think a warning as suggested by @horihel would be good, although I have no idea how to do that.
For your information: …
I'd like to add my voice to the side claiming that the current behaviour of --one-file-system on btrfs is confusing.
I'd like to do automatic periodic backups of my system. The backups should contain the entire system, except for the contents of any USB keys or external disks that happen to be plugged in. If the root filesystem were, for example, ext4, I could accomplish this in a trivial way with --one-file-system.
None of the solutions seem ideal. Note that the subvolumes are not explicitly mounted (such as with a subvol= mount option or an fstab entry).
In the case of containers, restoring such a backup likely won't work, because it won't restore the subvolume/snapshot hierarchy. I think for this you need a separate, tar-based backup regime for containers; that's what podman/moby/docker all expect, and I see borg can already produce tar output (borg export-tar). While btrfs subvolumes aren't separate file systems, they are a dedicated btree, with their own pool of inodes, and statfs() reports them as separate devices.
I'm quite confident that restoring from such a backup would work just fine. The subvolumes that borg didn't preserve would simply come back as ordinary directories. But anyway, I think that's an entirely separate topic. I was trying to make another point, but got it mixed up with considerations about my ideal backup scheme. Sorry about that.

My main point was that the option --one-file-system does not behave the way its name leads one to expect. I understand that subvolumes are technically quite different from ordinary directories. As you said: their own pool of inodes, statfs() reports them as separate devices, and so on. I'm not saying that the current behaviour is wrong in itself.
Traditionally, it was quite obvious what "same file system" meant, and how it differed from "mount points of other file systems". But with newer filesystems such as btrfs coming into play, such a description is not so clear anymore. In my opinion, at least a few words in the documentation should be dedicated to this issue: users need to be warned that subvolumes will not be included when this option is used. The original author of this issue got burned by the misunderstanding, and I nearly did, too. For those of us who would only like to avoid pulling in files from external mounts, a separate option that skips real external mounts only would also be useful.
No idea. If it expects them to be subvolumes, snapshotting them will fail. Maybe there's a fallback, but then you've lost snapshotting.
It'd be surprising in any case. There isn't a single correct expectation. I guess one way of answering this is: which is the worse outcome? Data that the user thought was backed up, but isn't, because e.g. /home was excluded? Or data that the user thought was backed up once, but isn't: e.g. / was backed up 500 times because there are 500 snapshots of it? Possibly in either case there's data not backed up that was expected to be.

A possible refinement might be to default to backing up all subvolumes that are not snapshots. This information is exposed by the libbtrfsutil C API, and it should be possible to expose it via the libbtrfsutil Python API if it isn't already. This might match expectations more often, but there might still be edge cases leading to surprises. It's also not terribly discoverable what's going on: why are some subvolumes backed up and others aren't? Oh, because they're subvolume snapshots. Hmm, well, what if there is a "received" snapshot but it's a standalone with no other snapshots made from it? Back it up or not? If we think the safest path is to back it up when in doubt, that's flawed logic, because it might result in the destination (the backup) becoming full, preventing other things from being backed up. This sort of problem happens all the time in software design.

Still another possibility is to include mounted subvolumes, but not cross into nested subvolumes (or snapshots) that aren't mounted.
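For reference, a sketch of how that snapshot/subvolume distinction could be queried from Python. It assumes the btrfsutil module (the Python bindings shipped with libbtrfsutil) and that subvolume_info() exposes a parent_uuid field; that is my reading of the bindings rather than anything borg currently uses, so verify before relying on it:

```python
import btrfsutil  # Python bindings from libbtrfsutil (assumed available)

def is_snapshot(path):
    # a snapshot carries the UUID of the subvolume it was created from;
    # a plain subvolume has an all-zero parent_uuid
    info = btrfsutil.subvolume_info(path)
    return any(info.parent_uuid)

def backup_candidates(paths):
    # the refinement described above: subvolumes yes, snapshots no
    return [p for p in paths
            if btrfsutil.is_subvolume(p) and not is_snapshot(p)]
```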
True. A subvolume is a dedicated btree, but not a separate file system. In the Fedora "btrfs by default" case, the previous default layout was LVM+ext4 with /home on a separate ext4 volume. So if the backup source is set to / with --one-file-system, /home is excluded under either layout.
Well, that's a fatalistic claim! :) I don't think that things are so bad. Expectations are based on available information, and the main sources of information in this particular case are the name of the option and its description in the manual. The name --one-file-system suggests that only other file systems are skipped, and the current description doesn't say anything about subvolumes.

As a first step, I propose that the description is expanded with something like "also excludes filesystem subvolumes on systems that support them". The wording might be different; I'm just sharing the basic idea. This still requires the users to know what "subvolumes" are, and whether they have them on the system, so it does not fully cater to my example of a naïve user. But at least they have been pointed in the right direction, and in my opinion that's the game changer. After all, if people use a program's option without fully understanding what it does, they can't expect to fully understand the consequences of using it. A clarification of the matter in the FAQ might also be appropriate, just to reaffirm what the option does in conjunction with a btrfs system.

As a second step, I propose that we research the possibility of implementing an option that skips only genuinely external mounts.
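One way such an option could distinguish "real" external mounts from btrfs subvolume boundaries is to consult the kernel's mount table instead of (or in addition to) device numbers; unmounted subvolumes never appear there. The function names below are hypothetical and the sketch is Linux-specific (it reads /proc/self/mountinfo):

```python
import os

def real_mount_points():
    # field 5 of each /proc/self/mountinfo line is the mount point; special
    # characters are octal-escaped there (e.g. \040 for space), which this
    # sketch does not bother to decode
    points = set()
    with open("/proc/self/mountinfo") as f:
        for line in f:
            points.add(line.split()[4])
    return points

def is_real_mount(path, mounts=None):
    mounts = real_mount_points() if mounts is None else mounts
    return os.path.realpath(path) in mounts

# e.g. an unmounted btrfs subvolume /home/.snapshots would return False here,
# while a USB stick mounted at /media/usb would return True (placeholder paths)
```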
I think that people tend to be quite careful to explicitly exclude the data they don't want to have backed up. If they have large snapshots on the system, they will be aware of them, and they will exclude the snapshot path(s) from the backup. At least in my mind, a snapshot is still perceived as a copy, and if I had 500 copies of something large, I would be quick to ensure that they are not burdening my backups.
This addresses borgbackup#4009. The documentation now explicitly mentions btrfs subvolumes.
For your convenience, these are the changes in that PR: … And in the prose part below: …
Haha, @eike-fokken, my mail client showed only the beginning of the first line in the preview,
and I was already holding my breath... until I realized that you meant to type "thread" instead of "threat". So thanks for threatening us with a PR. :) I was planning on doing the same myself, but you beat me to it, so hey, less work for me! I think that your addition to the documentation clarifies the behavior of --one-file-system.
This only says that the device numbers are compared, but it might be unclear how that plays out for bind mounts and subvolumes.
So this means that bind mounts of the same filesystem are still descended into?
Is this really specific to Linux? I'm not familiar enough with other kernels to know whether it applies elsewhere. Anyway, this guideline of double checking the contents of the backups should hold universally. |
Oh, sorry... you're right. It does descend into bind mounts if and only if the device numbers are equal; I just checked.
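That check is easy to reproduce outside of borg. With placeholder paths: a plain bind mount keeps the source's device number, whereas a btrfs subvolume (or a real mount of another filesystem) gets a different one:

```python
import os

src = "/srv/data"          # original directory (placeholder)
bind = "/mnt/data-bind"    # created with: mount --bind /srv/data /mnt/data-bind
subvol = "/home"           # a btrfs subvolume on the same volume (placeholder)

print(os.lstat(src).st_dev == os.lstat(bind).st_dev)    # True: same device number
print(os.lstat("/").st_dev == os.lstat(subvol).st_dev)  # False at a subvolume boundary
```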
At least the BSDs don't have bind mounts. But of course it is plausible that someone somewhere has written a btrfs driver for some BSD. So maybe I should reword that.
Nice, I'll write that into the PR.
The feedback from issue borgbackup#4009 is now included.
docs: clarify borg create's '--one-file-system' option, #4009 The documentation now explicitly mentions btrfs subvolumes and explains how --one-file-system works. Co-authored-by: Eike <e.fokken+git@posteo.de>
It would be really great if Borg could treat btrfs subvolumes as being within one filesystem, or have another option that includes them and skips only genuinely different filesystems. As noted above, subvolumes are still the same filesystem (i.e. mkfs.btrfs creates one filesystem, and subvolume creation happens within that filesystem). I get all the points about what ismounted etc. return; it's just that btrfs was created after ismounted etc. were invented. Thanks!
I nearly stumbled upon this as well, but got skeptical when my backup was only 2 GB instead of some TB.

One of the few problems with borg is that it assumes the user actually knows what he is doing (which would be nice, but many users are probably not full-time sysadmins and just want to back up their home server). IMHO borg could warn the user a bit more when he is probably doing something questionable; combining --one-file-system with a btrfs root is likely such a case. In my case, I'm using borg on a Synology NAS, which uses magical btrfs stuff, and I just did not know that btrfs sub-volumes are different filesystems (and it took me quite some time to figure that out, because they are not shown as mount points in the output of mount).

Maybe it would also be a nice idea to include a "Check if I am doing the right things" flag in borg.
Have you checked borgbackup docs, FAQ, and open Github issues?
Yes
Is this a BUG / ISSUE report or a QUESTION?
BUG
System information. For client/server mode post info for both machines.
Desktop computer that backs up to a secondary HDD
Your borg version (borg -V).
1.1.6
Operating system (distribution) and version.
Antergos up-to-date, rolling release distro
Hardware / network configuration, and filesystems used.
BTRFS
How much data is handled by borg?
200GB
Full borg commandline that lead to the problem (leave away excludes and passwords)
borg create --one-file-system ...
I can't remember the full command exactly because it was in a script that I wrote, which I can't recover due to the data loss.
I remember that I used the above in combination with excludes for /tmp and /var/cache/pacman/pkg.
I also used encryption and zstd compression, level 6
Describe the problem you're observing.
Borg skipped all my btrfs subvolumes during backups due to the --one-file-system argument. I thought it was skipping external filesystems only, as described in the documentation. Well, it now seems that it skipped even btrfs subvolumes like my /home folder in all my backups.
I was surprised to find this out only now, because the total backup size, not taking compression and deduplication into account, was around the size of my partition, so I thought it was a complete backup. Unfortunately, I have now found out that in all those backups, the /home folder was ignored...
Note that btrfs subvolumes are also handled by fstab, so they are mounted at boot time
Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.
Yes, use --one-file-system