
Do snapshots solve all consistency problems?



Hi,

Joerg Schilling wrote:
> I don't know whether, and if so how, Linux supports snapshots.
> Solaris and FreeBSD have done it in a very similar way since 2002.

From http://www.tldp.org/HOWTO/LVM-HOWTO/snapshots_backup.html :
  # lvcreate -L592M -s -n dbbackup /dev/ops/databases 
  ...
  lvcreate -- logical volume "/dev/ops/dbbackup" successfully created

Maybe one should contribute a paragraph about known
pitfalls to them. They _are_ overly optimistic, without doubt.
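
For completeness, here is the whole procedure the HOWTO has in
mind, sketched with its own device names (/dev/ops/databases and
the dbbackup snapshot); the mount point and archive path are my
invention:

  # create the snapshot, as in the HOWTO
  lvcreate -L592M -s -n dbbackup /dev/ops/databases
  # mount it read-only and back it up from there
  mkdir -p /mnt/ops/dbbackup
  mount -o ro /dev/ops/dbbackup /mnt/ops/dbbackup
  tar -cf /backup/dbbackup.tar -C /mnt/ops/dbbackup .
  # discard the snapshot when its job is done
  umount /mnt/ops/dbbackup
  lvremove -f /dev/ops/dbbackup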


> From the new star man page:
> 
>      Backups from live filesystems should be avoided.

... provided that constraint does not keep you from making
backups frequently enough.


>      On operating systems that support file system snapshots,
>      backups should be made from a read-only mount of a snapshot.

Agreed. A snapshot is preferable in any case.
(Until I learn how to detect the cars. Then it would
be counterproductive. Sic transit ...)


>      Be careful that all files that have been created between
>      setting up a snapshot and starting an incremental backup
>      may be missing from all backups unless the dumpdate=name
>      option is used.

Interesting pitfall. I'm still learning from star.
Such an option is now on my todo list.


>      If the system that is going to be backed up is not acting
>      as a file server, it makes sense to shut down all services
>      that may result in inconsistent file states before setting
>      up the filesystem snapshot.

Oh. You aren't as overly optimistic as I thought.
I read that paragraph a few days ago but did not see
the connection to my concerns. (Probably I perceived
"inconsistent" too much in the sense of fsck.)


>>  you can also run blindly into a standing car.
> No, definitely not.

But Joerg, it is written in your own man page.
  "result in inconsistent file states"
That's what I mean by
  "run blindly into a standing car".
The snapshot makes the cars stand still, but when
the backup truck crisscrosses the street it will
hit them anyway.


> Snapshots are acting transparently so you use them where they
> are available and you need to go to single user on other
> systems.

Shutting down all complex services does not seem
to be far from a short visit to run level 0.
The snapshot reduces the downtime substantially,
of course.
But there is still a downtime, with connection loss and all.

That's why I dreamt of a virtual single user mode.
It would have to represent the disk state after
an emergency shutdown which left the services
and applications enough time to end properly, or
at least to do whatever emergency handling they have,
before aborting. Yes, poor quality applications will
still produce bad files. Their fault then.

LVM offers writable snapshots. An important prerequisite
for such a stunt.
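
A rough sketch of how that stunt could look, reusing the volume
names from above (the recovery step is hypothetical and entirely
application dependent):

  # take the snapshot while the services are still running
  lvcreate -L1G -s -n virtsnap /dev/ops/databases
  mkdir -p /mnt/virtsnap
  # mount it read-write: journal replay and any application
  # level recovery happen on the snapshot, not on the origin
  mount /dev/ops/virtsnap /mnt/virtsnap
  run_application_recovery /mnt/virtsnap   # hypothetical helper
  tar -cf /backup/databases.tar -C /mnt/virtsnap .
  umount /mnt/virtsnap
  lvremove -f /dev/ops/virtsnap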


> problems described above with NFS servers. I am sure that
> most sysadmins don't even know about this theoretical
> backup consistency problem.

I am quite sure that there are potential problems of
which I never thought. One has to live with that fact.


>> Are there guidelines on how to achieve this quality
>> within general purpose programs ?
> 
> Programs that write (and fsync(2)) correctly aligned
> will see no problems.

That brings the consistency definition back down
to the filesystem level.

Probably that's why I thought you were too optimistic.

The semantics of the application (or service) often do
not allow atomic operations. I am not aware of a
widely available API which allows a well defined
tentative state of multiple write operations which
may then get committed altogether atomically.

If the snapshot catches me between one write() and
the next, which only together form a usable
file change ... then I'm doomed.
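
The only widely portable approximation I know covers the single
file case: prepare the complete new version in a scratch file,
then publish it with one rename, which is atomic within a
filesystem. A minimal sketch, with a hypothetical producer:

  new="datafile.new.$$"          # scratch file, same filesystem
  produce_new_content > "$new"   # hypothetical: writes the whole new state
  sync                           # crude stand-in for fsync(2) in plain sh
  mv "$new" datafile             # atomic commit: old or new, never a mix

A snapshot then sees either the old file or the complete new one.
For changes that span several files there is still no such
commit, which is exactly the problem.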


Your advice for a clean backup environment is valid,
no doubt. But it is also hard to follow frequently
enough.

From the user's point of view this is similar to the
hardships of level 0 backups. They are unavoidable but
also unbearable. A compromise is needed.


My own approach is to avoid the risky areas on the disk
when doing the everyday backup. For those tree parts which
host problem files and which need to be backed up frequently
I would advise specially adapted procedures.
(Like dumping a database portably and backing up the dump,
 because you can't expect to get the binary database running
 with another server version anyway.)
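
Sketched with a hypothetical database name; any dump tool that
produces a self-contained, portable file will do:

  # dump to a portable text form; the routine backup then
  # picks up the dump instead of the live binary files
  pg_dump ops > /backup/dumps/ops.sql       # PostgreSQL
  # mysqldump ops > /backup/dumps/ops.sql   # MySQL equivalent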

This largely reduces the effort of running a routine backup.
On the other hand, only the most backup-problematic parts
of a system remain without sufficient protection.
They will have to be covered in single user mode. If ever.

For normal data trees it is often sufficient to keep one's
hands off the application programs, or to wait until the
Windows users have left the office.


Such a less demanding strategy leaves more room for weak
but convenient file formats (e.g. ISO-9660) and makes it
possible to put the necessary social pressure on the person
who is responsible for doing the backup.
(To quote myself: "no excuse not to do the backup")

This approach depends on partial incremental backups,
or on an easy-to-do partial restore of full incremental
backups.
It would be awful if the user had to sort out potentially
bad files at restore time. At restore time the psychological
situation is fragile.
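
With plain archive formats the partial restore part is cheap;
e.g. pulling one subtree out of a full tar archive (the paths
are hypothetical):

  tar -xf /backup/full-level0.tar home/alice/project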


Have a nice day :)

Thomas


PS:
About atomicity of larger filesystem changes:
the journaled thingies _could_ be able to do it.
I googled and stumbled over file change logs, which look
cool for everybody who wants to do incremental backups.
But I found no actual commit/rollback, not to speak of a
standard API for that trick.


