Home
Look at the README.md for the introduction.
zfs-autobackup creates ZFS snapshots on a "source" machine and then replicates those snapshots to a "target" machine via SSH.
zfs-autobackup may be installed on either the source machine or the target machine. (Installing on both is unnecessary.)
When installed on the source, zfs-autobackup will push snapshots to the target. When installed on the target, zfs-autobackup will pull snapshots from the source.
The recommended installation method on most machines is to use pip:
[root@server ~]# pip install --upgrade zfs-autobackup
The above command can also be used to upgrade zfs-autobackup to the newest stable version.
To install the latest beta version, add the --pre option.
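For example, installing or upgrading to the latest pre-release would look like this:
[root@server ~]# pip install --upgrade --pre zfs-autobackup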
On older machines you might have to use easy_install:
[root@server ~]# easy_install zfs-autobackup
If you don't want to install zfs-autobackup, or want to make some changes to the code, look at Development.
In this example, a machine called backup is going to create and pull backup snapshots from a machine called pve01.
As zfs-autobackup will perform numerous remote commands via ssh, we strongly recommend setting up passwordless login via ssh. This means generating an ssh key on the target machine (backup) and copying the public ssh key to the source machine (pve01).
NOTE: Most examples use root access on both the source and target. If you want to use a normal user it's a bit more complex: your user needs read/write access to /dev/zfs and you need to set up zfs permissions as well.
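A minimal sketch of what such a delegation could look like (the user name backupuser and the exact permission sets are only an illustration; which permissions you really need depends on the options you use):
[root@pve01 ~]# zfs allow backupuser send,snapshot,hold,release,mount rpool
[root@backup ~]# zfs allow backupuser create,receive,mount,canmount,mountpoint,readonly data/backup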
Create an SSH key on the backup machine that runs zfs-autobackup. You only need to do this once.
Use the ssh-keygen command and leave the passphrase empty:
root@backup:~# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
...
root@backup:~#
Now you need to copy the public part of the key to pve01. The ssh-copy-id command is a handy tool to automate this. It will just ask for your password.
root@backup:~# ssh-copy-id root@pve01
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
Password:
Number of key(s) added: 1
root@backup:~#
This allows the backup machine to log in to pve01 as root without a password.
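A quick way to check that this works is to run a harmless remote command; it should complete without asking for a password:
root@backup:~# ssh root@pve01 zfs list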
Next, we specify the filesystems we want to snapshot and replicate by assigning a unique group name to those filesystems.
It's important to choose a unique group name and to use the name consistently. (Advanced tip: if you have multiple sets of filesystems that you wish to back up differently, you can do this by creating multiple group names.)
In this example, we assign the group name offsite1 to the filesystems we want to back up.
On the source machine, we set the autobackup:offsite1 zfs property to true, as follows:
[root@pve01 ~]# zfs set autobackup:offsite1=true rpool
[root@pve01 ~]# zfs get -t filesystem,volume autobackup:offsite1
NAME PROPERTY VALUE SOURCE
rpool autobackup:offsite1 true local
rpool/ROOT autobackup:offsite1 true inherited from rpool
rpool/ROOT/pve-1 autobackup:offsite1 true inherited from rpool
rpool/data autobackup:offsite1 true inherited from rpool
rpool/data/vm-100-disk-0 autobackup:offsite1 true inherited from rpool
rpool/data/vm-101-disk-0 autobackup:offsite1 true inherited from rpool
rpool/tmp autobackup:offsite1 true inherited from rpool
ZFS properties are inherited by child datasets. Since we've set the property on the highest dataset, we're essentially backing up the whole pool.
If we don't want to back up everything, we can exclude certain filesystems by setting the property to false:
[root@pve01 ~]# zfs set autobackup:offsite1=false rpool/tmp
[root@pve01 ~]# zfs get -t filesystem,volume autobackup:offsite1
NAME PROPERTY VALUE SOURCE
rpool autobackup:offsite1 true local
rpool/ROOT autobackup:offsite1 true inherited from rpool
rpool/ROOT/pve-1 autobackup:offsite1 true inherited from rpool
rpool/data autobackup:offsite1 true inherited from rpool
rpool/data/vm-100-disk-0 autobackup:offsite1 true inherited from rpool
rpool/data/vm-101-disk-0 autobackup:offsite1 true inherited from rpool
rpool/tmp autobackup:offsite1 false local
The autobackup property can have these values:
- true: Back up the dataset and all its children.
- false: Don't back up the dataset and all its children. (Exclude the dataset.)
- child: Only back up the children of the dataset, not the dataset itself.
- parent: Only back up the dataset, but not the children. (Supported in version 3.2 or higher.)
(Note: Only use the zfs command to set these properties. Do not use the zpool command.)
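For example, if you wanted to skip the pool's root dataset itself but still back up everything below it, you could set the value to child instead (just an illustration of the values above):
[root@pve01 ~]# zfs set autobackup:offsite1=child rpool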
Run the script on the backup machine and pull the data from the source machine specified by --ssh-source.
[root@backup ~]# zfs-autobackup -v --clear-mountpoint --ssh-source pve01 offsite1 data/backup/pve01
zfs-autobackup v3.1.1 - (c)2021 E.H.Eefting (edwin@datux.nl)
Selecting dataset property : autobackup:offsite1
Snapshot format : offsite1-%Y%m%d%H%M%S
Hold name : zfs_autobackup:offsite1
#### Source settings
[Source] Datasets on: pve01
[Source] Keep the last 10 snapshots.
[Source] Keep every 1 day, delete after 1 week.
[Source] Keep every 1 week, delete after 1 month.
[Source] Keep every 1 month, delete after 1 year.
#### Selecting
[Source] rpool: Selected
[Source] rpool/ROOT: Selected
[Source] rpool/ROOT/pve-1: Selected
[Source] rpool/data: Selected
[Source] rpool/data/vm-100-disk-0: Selected
[Source] rpool/data/vm-101-disk-0: Selected
[Source] rpool/tmp: Excluded
#### Snapshotting
[Source] Creating snapshots offsite1-20220107131107 in pool rpool
#### Target settings
[Target] Datasets are local
[Target] Keep the last 10 snapshots.
[Target] Keep every 1 day, delete after 1 week.
[Target] Keep every 1 week, delete after 1 month.
[Target] Keep every 1 month, delete after 1 year.
[Target] Receive datasets under: data/backup/pve01
#### Synchronising
[Target] data/backup/pve01/rpool@offsite1-20220107131107: receiving full
[Target] data/backup/pve01/rpool/ROOT@offsite1-20220107131107: receiving full
[Target] data/backup/pve01/rpool/ROOT/pve-1@offsite1-20220107131107: receiving full
[Target] data/backup/pve01/rpool/data@offsite1-20220107131107: receiving full
[Target] data/backup/pve01/rpool/data/vm-100-disk-0@offsite1-20220107131107: receiving full
[Target] data/backup/pve01/rpool/data/vm-101-disk-0@offsite1-20220107131107: receiving full
#### All operations completed successfully
As you might notice, zfs-autobackup preserves the whole parent path of the source.
So rpool/data/vm-100-disk-0 ends up as data/backup/pve01/rpool/data/vm-100-disk-0. Since it's a backup, it's useful to preserve the original structure of the data like this.
If you think this is ugly, there is the --strip-path option. However, this can lead to collisions if two source datasets result in the same target path. Since version 3.1.2, zfs-autobackup will check for this and emit an error.
If you want your source and target structure to look exactly the same, you have to do the following:
- Select the whole source pool. In this case: zfs set autobackup:offsite1=true rpool
- Use --strip-path=1
- Specify the target pool as the target path. In this case: data
- Use the --force option the first time to overwrite the existing target pool. (New in v3.1.2)
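Putting those steps together, the first run on the backup machine could look roughly like this (a sketch; drop --force again on subsequent runs):
[root@backup ~]# zfs-autobackup -v --force --strip-path=1 --ssh-source pve01 offsite1 data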
Note that this is called a "pull" backup. The backup (target) machine pulls the backup from the source machine. This is usually the preferred way.
It is also possible to let a source machine push its backup to the target machine. There are security implications to both approaches, as follows:
- With a pull backup, the target machine will have ssh access to the source machine.
- With a push backup, the source machine will have ssh access to the target machine.
If you wish to do a push backup, then you would set up the SSH keys the other way around and use the --ssh-target parameter on the source machine.
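A push variant of the earlier example, run on pve01 instead of on the backup machine, could look something like this (assuming the SSH keys are set up in that direction):
[root@pve01 ~]# zfs-autobackup -v --clear-mountpoint --ssh-target backup offsite1 data/backup/pve01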
Note that you can always change the ssh source and target parameters at a later point without any problems.
It is also possible to use a third server that pulls backups from the source and pushes the data to the target server via one stream. This way the source and target server won't have to be able to reach each other. If one server gets hacked, it can't access the other server.
To do this, you only have to install zfs-autobackup on a third server and use both --ssh-source and --ssh-target to specify the source and target servers.
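Such a job, run on the third server, could look something like this (the hostnames are just placeholders):
[root@proxy ~]# zfs-autobackup -v --ssh-source pve01 --ssh-target backupserver offsite1 data/backup/pve01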
It is also possible to run zfs-autobackup locally, where you could back up snapshots to a different pool on the same server. This is done by simply omitting the --ssh-source and --ssh-target parameters.
For example, let's say you have an additional pool for local backups called backups, that's on separate device(s) from your data pools. In this pool, you have a dataset called autobackup. You could run the following command (assuming you set the zfs group name to autobackup:local on your data filesystems):
zfs-autobackup -v local backups/autobackup
Combining this with a remote push or pull backup, you could then set the zfs group name on your backup filesystems to something like autobackup:remote, then have a second zfs-autobackup job that backs up these snapshots to your remote storage like:
zfs-autobackup -v --ssh-target root@backupserver remote data/backup/pve01
Now every time you run the command, zfs-autobackup will create a new snapshot and replicate your data.
Older snapshots will eventually be deleted, depending on the --keep-source and --keep-target settings. The defaults are shown above under the 'Settings summary'. Look at Thinner for more info.
Once you've got the correct settings for your situation, you can just store the command in a cronjob.
Or just create a script and run it manually when you need it.
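For example, a nightly root crontab entry could look something like this (the schedule, path and log file are only an illustration; adjust the path to wherever pip installed the script):
0 3 * * * /usr/local/bin/zfs-autobackup --ssh-source pve01 offsite1 data/backup/pve01 >/var/log/zfs-autobackup.log 2>&1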
Don't forget to monitor the results of your backups; look at Monitoring for more info.
You might want to make snapshots during the week, and only transfer data during the weekends.
In this case you would run this each weekday:
zfs-autobackup -v --ssh-source pve01 offsite1 data/backup/pve01 --no-send
And this on weekend days:
zfs-autobackup -v --ssh-source pve01 offsite1 data/backup/pve01
You can also create the snapshots in offline mode by using zfs-autobackup as a snapshot tool on the source side. This way the snapshots will always be created, even if the backup server is offline or unreachable.
You can use zfs-autobackup as a standalone snapshot tool.
To do this, simply omit the target-path, as follows:
zfs-autobackup -v --ssh-source pve01 offsite1
Only use this if you don't want to make any backup at all, or if a target isn't reachable during the snapshotting phase.
If you have offline backups, check out Common-snapshots-and-holds.
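One possible split, sketched here, is to let the source create snapshots on its own schedule and let the backup machine transfer them with --no-snapshot whenever it is reachable (see Common-snapshots-and-holds for keeping a common snapshot around):
[root@pve01 ~]# zfs-autobackup offsite1
[root@backup ~]# zfs-autobackup -v --no-snapshot --ssh-source pve01 offsite1 data/backup/pve01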
If you need a custom SSH port or other SSH options, the correct way to do this is by creating ~/.ssh/config:
Host smartos04
Hostname 1.2.3.4
Port 1234
User root
This way you can just specify "smartos04" as the host. Look in man ssh_config for many more options.
You can use multiple zfs-autobackup jobs to transfer data to multiple targets. Just make sure that you use different backup names. This way the jobs should not interfere with each other: Each job only removes its own snapshots.
You CAN use the same backup name to transfer data to multiple targets. However in that case it's up to you to make sure that a common snapshot of one backup job isn't deleted by the other job.
One way to do this is to adjust the --keep-source option or to make sure the backups run at a close enough interval.
However, to prevent confusion and to be more flexible, I would advise always using different, clearly distinguishable names, e.g. autobackup:offsite and autobackup:local.
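For example, two jobs with their own group names, pulling to two different backup machines, could look like this (the hostnames and paths are only placeholders):
[root@backup1 ~]# zfs-autobackup -v --ssh-source pve01 offsite1 data/backup/pve01
[root@backup2 ~]# zfs-autobackup -v --ssh-source pve01 offsite2 data/backup/pve01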
- Use --clear-mountpoint to prevent all kinds of problems. See Mounting.
- Use --debug if something goes wrong and you want to see the commands that are executed. This will also stop at the first error.
- Use these only one time if needed: --force, --destroy-incompatible, --rollback. Don't add them to your script. Try to solve the underlying cause if you keep needing them.
- Set the readonly property of the target filesystem to on. This prevents changes on the target side. (Due to the nature of ZFS itself, if any changes are made to a dataset on the target machine, then the next backup to that target machine will probably fail. Such a failure can probably be resolved by performing a target-side zfs rollback of the affected dataset.) Note that readonly only prevents changes to the CONTENTS of the dataset directly. It's still possible to receive new datasets and manipulate properties etc. (See the example below this list.)
- Use --clear-refreservation to save space on your backup machine.
- zfs-autobackup uses holds by default, so you might get "dataset busy" if you try to destroy a snapshot. (Check zfs holds --help.)
Restoring can be done with simple zfs commands. For example:
root@fs1:/home/psy# zfs send fs1/zones/backup/zfsbackups/server01/vm01@offset1-20220110230003 | ssh root@2.2.2.2 "zfs recv rpool/restore"
- Performance tips (recommended)
- Common problems and errors
- Thinning out obsolete snapshots
- Handling ZFS encryption
- Transfer buffering, compression and rate limiting.
- Custom Pre- and post-snapshot commands
- Monitoring
- Proxmox Example
usage: ZfsAutobackup.py [--help] [--test] [--verbose] [--debug] [--debug-output] [--progress] [--utc] [--version] [--ssh-config CONFIG-FILE] [--ssh-source USER@HOST] [--ssh-target USER@HOST] [--property-format FORMAT] [--snapshot-format FORMAT] [--hold-format FORMAT] [--strip-path N] [--exclude-unchanged BYTES] [--exclude-received] [--no-snapshot] [--pre-snapshot-cmd COMMAND]
[--post-snapshot-cmd COMMAND] [--min-change BYTES] [--allow-empty] [--other-snapshots] [--set-snapshot-properties PROPERTY=VALUE,...] [--no-send] [--no-holds] [--clear-refreservation] [--clear-mountpoint] [--filter-properties PROPERTY,...] [--set-properties PROPERTY=VALUE,...] [--rollback] [--force] [--destroy-incompatible] [--ignore-transfer-errors]
[--decrypt] [--encrypt] [--zfs-compressed] [--compress [TYPE]] [--rate DATARATE] [--buffer SIZE] [--send-pipe COMMAND] [--recv-pipe COMMAND] [--no-thinning] [--keep-source SCHEDULE] [--keep-target SCHEDULE] [--destroy-missing SCHEDULE]
[BACKUP-NAME] [TARGET-PATH]
ZfsAutobackup.py v3.2 - (c)2022 E.H.Eefting (edwin@datux.nl)
positional arguments:
BACKUP-NAME Name of the backup to select
TARGET-PATH Target ZFS filesystem (optional)
Common options:
--help, -h show help
--test, --dry-run, -n
Dry run, dont change anything, just show what would be done (still does all read-only operations)
--verbose, -v verbose output
--debug, -d Show zfs commands that are executed, stops after an exception.
--debug-output Show zfs commands and their output/exit codes. (noisy)
--progress show zfs progress output. Enabled automaticly on ttys. (use --no-progress to disable)
--utc Use UTC instead of local time when dealing with timestamps for both formatting and parsing. To snapshot in an ISO 8601 compliant time format you may for example specify --snapshot-format "{}-%Y-%m-%dT%H:%M:%SZ". Changing this parameter after-the-fact (existing snapshots) will cause their timestamps to be interpreted as a different time than before.
--version Show version.
SSH options:
--ssh-config CONFIG-FILE
Custom ssh client config
--ssh-source USER@HOST
Source host to pull backup from.
--ssh-target USER@HOST
Target host to push backup to.
String formatting options:
--property-format FORMAT
Dataset selection string format. Default: autobackup:{}
--snapshot-format FORMAT
ZFS Snapshot string format. Default: {}-%Y%m%d%H%M%S
--hold-format FORMAT ZFS hold string format. Default: zfs_autobackup:{}
--strip-path N Number of directories to strip from target path.
Selection options:
--exclude-unchanged BYTES
Exclude datasets that have less than BYTES data changed since any last snapshot. (Use with proxmox HA replication)
--exclude-received Exclude datasets that have the origin of their autobackup: property as "received". This can avoid recursive replication between two backup partners.
Snapshot options:
--no-snapshot Don't create new snapshots (useful for finishing uncompleted backups, or cleanups)
--pre-snapshot-cmd COMMAND
Run COMMAND before snapshotting (can be used multiple times.
--post-snapshot-cmd COMMAND
Run COMMAND after snapshotting (can be used multiple times.
--min-change BYTES Only create snapshot if enough bytes are changed. (default 1)
--allow-empty If nothing has changed, still create empty snapshots. (Same as --min-change=0)
--other-snapshots Send over other snapshots as well, not just the ones created by this tool.
--set-snapshot-properties PROPERTY=VALUE,...
List of properties to set on the snapshot.
Transfer options:
--no-send Don't transfer snapshots (useful for cleanups, or if you want a separate send-cronjob)
--no-holds Don't hold snapshots. (Faster. Allows you to destroy common snapshot.)
--clear-refreservation
Filter "refreservation" property. (recommended, saves space. same as --filter-properties refreservation)
--clear-mountpoint Set property canmount=noauto for new datasets. (recommended, prevents mount conflicts. same as --set-properties canmount=noauto)
--filter-properties PROPERTY,...
List of properties to "filter" when receiving filesystems. (you can still restore them with zfs inherit -S)
--set-properties PROPERTY=VALUE,...
List of propererties to override when receiving filesystems. (you can still restore them with zfs inherit -S)
--rollback Rollback changes to the latest target snapshot before starting. (normally you can prevent changes by setting the readonly property on the target_path to on)
--force, -F Use zfs -F option to force overwrite/rollback. (Useful with --strip-path=1, but use with care)
--destroy-incompatible
Destroy incompatible snapshots on target. Use with care! (implies --rollback)
--ignore-transfer-errors
Ignore transfer errors (still checks if received filesystem exists. useful for acltype errors)
--decrypt Decrypt data before sending it over.
--encrypt Encrypt data after receiving it.
--zfs-compressed Transfer blocks that already have zfs-compression as-is.
Data transfer options:
--compress [TYPE] Use compression during transfer, defaults to zstd-fast if TYPE is not specified. (gzip, pigz-fast, pigz-slow, zstd-fast, zstd-slow, zstd-adapt, xz, lzo, lz4)
--rate DATARATE Limit data transfer rate in Bytes/sec (e.g. 128K. requires mbuffer.)
--buffer SIZE Add zfs send and recv buffers to smooth out IO bursts. (e.g. 128M. requires mbuffer)
--send-pipe COMMAND pipe zfs send output through COMMAND (can be used multiple times)
--recv-pipe COMMAND pipe zfs recv input through COMMAND (can be used multiple times)
Thinner options:
--no-thinning Do not destroy any snapshots.
--keep-source SCHEDULE
Thinning schedule for old source snapshots. Default: 10,1d1w,1w1m,1m1y
--keep-target SCHEDULE
Thinning schedule for old target snapshots. Default: 10,1d1w,1w1m,1m1y
--destroy-missing SCHEDULE
Destroy datasets on target that are missing on the source. Specify the time since the last snapshot, e.g: --destroy-missing 30d
Full manual at: https://github.com/psy0rz/zfs_autobackup