It occurred to me that there are scenarios where it would be useful to be
able to "zfs send -i A B" where B is a snapshot older than A. I am
trying to design an encrypted disk-based off-site backup solution on top
of ZFS, where budget is the primary constraint, and I wish zfs send/recv
would allow me to do that. Here is why.
I have a server with 12 hot-swap disk bays. An "onsite" pool has been
created on 6 disks, where snapshots of the data to be backed up are
periodically taken. Two other "offsite" pools have been created on two
other sets of 6 disks, let's give them the names offsite-blue and
offsite-red (for use on blue/red, or even/odd, weeks). At least one of
the offsite pools is always at the off-site location, while the other
one is either in transit or in the server. Every week a script
compresses and encrypts the last few snapshots (T-2, T-1, T-0) from
onsite and writes them to offsite-XXX. Here is an example:
$ rm /offsite-blue/*
$ zfs send onsite@T-2 | gzip | gpg -c >/offsite-blue/T-2.full.gz.gpg
$ zfs send -i T-2 onsite@T-1 | gzip | gpg -c >/offsite-blue/T-1.incr.gz.gpg
$ zfs send -i T-1 onsite@T-0 | gzip | gpg -c >/offsite-blue/T-0.incr.gz.gpg
Then offsite-blue is zfs export'ed, sent to the off-site location,
offsite-red is retrieved from the off-site location, sent back on-site,
ready to be used for the next week. My proof-of-concept tests show it
works OK, but 2 details are annoying:
o In order to restore the latest snapshot T-0, all the zfs streams,
T-2, T-1 and T-0, have to be decrypted, then zfs receive'd. It is
slow and inconvenient.
o My example only backs up the last 3 snapshots, but ideally I would
like to fit as many as possible in the offsite pool. However, because
of the unpredictable compression efficiency, I can't tell which
snapshot I should start from when creating the first full stream.
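To make the first point concrete, here is a minimal sketch of the restore path, assuming the stream files from the example above. The function name restore_chain and the target dataset restore/onsite are hypothetical, made up for illustration. Note that gpg -d will ask for the passphrase once per file unless an agent caches it.

```shell
#!/bin/sh
# Restore a full stream followed by its incrementals, oldest first.
# Usage: restore_chain <target-dataset> <stream-file>...
# (restore_chain is a hypothetical helper, not part of ZFS.)
restore_chain() {
    target=$1
    shift
    for stream in "$@"; do
        # Decrypt, decompress and apply each stream in order. Stop at the
        # first failure, because every incremental depends on the previous
        # stream having been received successfully.
        gpg -d "$stream" | gunzip | zfs receive "$target" || return 1
    done
}
```

It would be called as: restore_chain restore/onsite /offsite-blue/T-2.full.gz.gpg /offsite-blue/T-1.incr.gz.gpg /offsite-blue/T-0.incr.gz.gpg. Every stream in the chain has to be processed even if only T-0 is wanted.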
These 2 problems would be non-existent if one could "zfs send -i A B"
with B older than A:
$ zfs send onsite@T-0 | gzip | gpg -c >/offsite-blue/T-0.full.gz.gpg
$ zfs send -i T-0 onsite@T-1 | gzip | gpg -c >/offsite-blue/T-1.incr.gz.gpg
$ zfs send -i T-1 onsite@T-2 | gzip | gpg -c >/offsite-blue/T-2.incr.gz.gpg
$ ... # continue forever, kill zfs(1m) when offsite-blue is 90% full
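The loop hinted at above could be sketched roughly as follows. This is only an illustration under a big assumption: "zfs send -i NEWER fs@OLDER" does NOT work today, so the incremental branch is hypothetical. The function names, the optional output directory argument, and the choice to check the pool capacity between streams (instead of killing zfs mid-stream) are my own simplifications.

```shell
#!/bin/sh
# Print the used capacity of a pool as a bare integer (e.g. "45%" -> 45).
pool_pct_full() {
    zpool list -H -o capacity "$1" | tr -d '%'
}

# Write snapshots of <fs> newest-first until <pool> reaches <limit> percent.
# Usage: backup_newest_first <fs> <pool> <limit> [output-dir]
backup_newest_first() {
    fs=$1; pool=$2; limit=$3
    outdir=${4:-/$pool}    # assumes the pool is mounted at /<pool>
    newer=""
    # -S creation sorts by creation time, newest first.
    zfs list -H -t snapshot -o name -S creation -r "$fs" | grep "^$fs@" |
    while read -r snap; do
        # Stop before starting another stream once the pool is full enough.
        [ "$(pool_pct_full "$pool")" -ge "$limit" ] && break
        name=${snap#*@}
        if [ -z "$newer" ]; then
            # Newest snapshot: a regular full stream.
            zfs send "$snap" | gzip | gpg -c > "$outdir/$name.full.gz.gpg"
        else
            # HYPOTHETICAL: reversed incremental, from a newer snapshot to
            # an older one. Current zfs send rejects this.
            zfs send -i "$newer" "$snap" | gzip | gpg -c > "$outdir/$name.incr.gz.gpg"
        fi
        newer=$name
    done
}
```

For example: backup_newest_first onsite offsite-blue 90. Because the check happens between streams, the last stream written can push the pool past the limit, so the threshold should leave some headroom.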
I have looked at the code and the restriction "B must be earlier than A"
is enforced in dmu_send.c:dmu_sendbackup(). It looks like the code
could be reworked to remove it.
Of course, when zfs-crypto ships, it will simplify a lot of things.
I could just always send incremental streams and receive them directly
on the encrypted pool, and directly manage the snapshot rotation by
zfs destroy'ing the old ones, etc.
zfs-discuss mailing list