
Re: Backup systems



On 9/4/23 00:53, Michael Kjörling wrote:
On 3 Sep 2023 14:20 -0700, from dpchrist@holgerdanske.com (David Christensen):
Without seeing a console session, I am unsure what you mean by "physically
stored", "total logical (excluding effects of compression) data", and "hot
current logical data ... (excluding things like ZFS snapshots and
compression)".

"Physically stored" is how much data, after compression and including
file system metadata, is actually written to disk and necessary for
all data to be accessible; it's the relevant metric for whether I need
to add disk space.

"Logical" is the sum of all apparent file sizes as visible to userland
utilities e.g. through stat(2).

Something like `dd if=/dev/zero of=$(mktemp) bs=1M count=1M` would
result in a large logical size but, because of compression, a very
small amount of physically stored data.

"Hot" is perhaps better referred to as the "current" data set; since
snapshots (and earlier backups) can include data which has since been
deleted, and is thus no longer current but still exists on disk.
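
The logical-vs-physical distinction can be demonstrated even without compression by using a sparse file, whose apparent size is large but which allocates almost no blocks. A minimal sketch, assuming GNU coreutils (`truncate`, `du --apparent-size`):

```shell
# Create a 1 GiB sparse file: large apparent (logical) size,
# but almost no blocks actually allocated on disk.
f=$(mktemp)
truncate -s 1G "$f"

# Logical size, as visible through stat(2):
du -h --apparent-size "$f"   # ~1.0G

# Physical allocation (blocks actually written):
du -h "$f"                   # ~0

rm -f "$f"
```

The same pair of `du` invocations, run against a compressed ZFS dataset, shows the compression-driven gap Michael describes.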


What partitioning scheme, volume manager, file system, compression, etc., do
you use on your backup server?

ZFS within LUKS containers. If I recall correctly, the backup pool is
set to use zstd compression.


I had thought you were using rsnapshot / rsync --link-dest, but you also
mention ZFS snapshots.  Please clarify.

Mostly ZFS with a rotating snapshot schedule on the source (the root
file system is ext4); copied using rsync --link-dest (through
rsnapshot) to a ZFS file system which doesn't use snapshots on the
backup target. Most of the ZFS file systems are set up to use
compression; there are a few where I know _a priori_ that the data is
in effect completely incompressible so there's no point in using CPU
to even try to compress that data, so those have compression turned
off.

(In ZFS, creating a file system is barely any more involved than
creating a directory, and all file systems come out of the same "pool"
which is a collection of >=1 storage devices set up with some
particular method of redundancy, possibly none. In more traditional
*nix parlance, a *nix file system is conceptually closer to a ZFS
pool.)
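
In concrete terms, dataset creation and per-dataset compression tuning look roughly like this (a sketch only; `tank` and the dataset names are hypothetical, and these commands require root and an existing pool):

```shell
# Create a file system within an existing pool; properties such as
# compression are inherited from the parent unless overridden.
zfs create tank/backup

# Per-dataset compression control, e.g. off for known-incompressible data:
zfs set compression=off tank/backup/media
zfs set compression=zstd tank/backup/home
```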

Hopefully this is more clear.


So for backup storage:

* We are both using ZFS with compression enabled (lz4 and zstd, respectively).

* You are using 'rsync --link-dest' (via rsnapshot(1)) for file-level deduplication and I am using ZFS block-level deduplication.

Related:

* I am using zfs-auto-snapshot(8) for snapshots.  Are you using rsnapshot(1) for snapshots?


Here are the current backups for my current daily driver:

2023-09-04 13:26:15 toor@f3 ~
# zfs get -o property,value compression,compressratio,dedup,logicalreferenced,logicalused,refcompressratio,referenced,used,usedbydataset,usedbysnapshots p3/backup/taz.tracy.holgerdanske.com
PROPERTY           VALUE
compression        lz4
compressratio      2.14x
dedup              verify
logicalreferenced  6.59G
logicalused        48.7G
refcompressratio   1.83x
referenced         3.89G
used               23.4G
usedbydataset      3.89G
usedbysnapshots    19.5G
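
As a rough sanity check on the properties above, 'compressratio' should approximate logicalused / used (the small divergence from the reported 2.14x is presumably rounding in the displayed values plus metadata and dedup accounting):

```shell
# Cross-check of the reported ZFS properties (values from the session above):
# compressratio ~ logicalused / used.
awk 'BEGIN { printf "%.2fx\n", 48.7 / 23.4 }'   # ~2.08x
```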

2023-09-04 13:26:36 toor@f3 ~
# ls -1 /var/local/backup/taz.tracy.holgerdanske.com/.zfs/snapshot | wc -l
     186

2023-09-04 13:27:15 toor@f3 ~
# du -hs /var/local/backup/taz.tracy.holgerdanske.com/ /var/local/backup/taz.tracy.holgerdanske.com/.zfs
3.9G	/var/local/backup/taz.tracy.holgerdanske.com/
722G	/var/local/backup/taz.tracy.holgerdanske.com/.zfs

2023-09-04 13:28:02 toor@f3 ~
# crontab -l
 9 3 * * * /usr/local/sbin/zfs-auto-snapshot -k d 40
21 3 1 * * /usr/local/sbin/zfs-auto-snapshot -k m 99
27 3 1 1 * /usr/local/sbin/zfs-auto-snapshot -k y 99


Observations:

* du(1) of the backup file system matches ZFS properties 'referenced' and 'usedbydataset'.

* I am unable to correlate du(1) of the snapshots to any ZFS properties -- du(1) reports much more storage than ZFS 'usedbysnapshots', even when scaled by 'compressratio'.
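
One possible reconciliation, offered tentatively: each snapshot under .zfs/snapshot presents the full file tree, so du(1) counts shared blocks once per snapshot, whereas 'usedbysnapshots' counts only the space held exclusively by snapshots. If so, the figures above line up:

```shell
# 186 snapshots times ~3.89G 'referenced' per snapshot should approximate
# the 722G that du(1) reported for .zfs, if du counts each tree in full.
awk 'BEGIN { printf "%.0fG\n", 186 * 3.89 }'   # ~724G
```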


David

