Re: [ZODB-Dev] Restoring from repozo and reusing an index file?

From: Paul Winkler
Date: Fri, 11 Jun 2010 17:24:24 -0400
On Fri, Jun 11, 2010 at 11:48 AM, Paul Winkler wrote:

I tend to run rsync via "rsync -rP --rsh=ssh". The Data.fs is an
append-only file, so rsync is very efficient at handling it. Only
zeopack rewrites things all across the file and causes a subsequent
rsync to be slow again.

Thanks. I'll do a trial run of this today.

A second rsync isn't exactly blazing fast, even with only a few changes at the end of the 32 GB Data.fs. As near as I can tell, it spends a good 10 minutes or so just comparing the files to see whether it has any work to do.
Once that phase is done, it seems to spend much of its time in I/O, since by default it builds a new file and replaces the existing one when it's done. Total time: ~25 minutes.

The rsync man page paid off, though. The --append option (or --append-verify on recent enough versions of rsync) seems to reduce the I/O a lot. It's tailor-made for this use case: updating in place when the source file has only been appended to, and when potentially losing the target file on failure is OK. (We can manually make a pristine copy before starting our downtime, in case we need to redo it for any reason.)
FWIW, total time for the second `rsync -z --append Data.fs` was:
real    7m50.253s
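
For anyone wondering why appending is so much cheaper, here is a rough Python sketch of the idea as I understand it. It is not rsync's actual implementation, and the function name `append_tail` is made up for illustration. It assumes the target is an exact prefix of the source (true for a Data.fs that has not been packed in between) and, like plain --append, it does not check that assumption beyond comparing lengths.

```python
import os


def append_tail(source_path, target_path, chunk_size=1024 * 1024):
    """Copy only the bytes of source_path that lie past the end of target_path.

    Hypothetical helper: assumes target_path is an exact prefix of
    source_path (append-only growth). Returns the number of bytes appended.
    """
    source_size = os.path.getsize(source_path)
    target_size = os.path.getsize(target_path) if os.path.exists(target_path) else 0
    if target_size > source_size:
        # The source was rewritten or truncated (e.g. after a pack),
        # so an append-style update would produce a corrupt copy.
        raise ValueError("target is longer than source; not an append-only update")

    appended = 0
    with open(source_path, "rb") as src, open(target_path, "ab") as dst:
        src.seek(target_size)  # skip everything the target already has
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            appended += len(chunk)
    return appended
```

Nothing before the old end of the target is read or rewritten, so the comparison phase and the build-a-new-file phase both go away. The flip side is the one mentioned above: if the prefix assumption is wrong or the copy is interrupted, the target may be unusable, hence the pristine copy.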

Last time I had to rebuild the index file it took ~ 30 minutes, so this looks like a win. We'll go with rsync.

