I have now found a system that fails with your script, and have two that
do not. On the one that fails, I am running Python 2.3.4, and on the two
that work, one is running 2.4.1 and the other 2.5.1. The failing system
also has OpenSSH 3.9p1. The working systems have OpenSSH 4.1p1 and 4.5p1.
What versions of Python and SSH are you running?
Austin Clements wrote:
> The following bug is still present in rdiff-backup 1.1.15. For
> convenience, I've attached the test script that causes rdiff-backup to
> Quoth myself on Sep 21 at 5:50 pm:
>> After a great deal of pain and suffering, I've managed to reduce this
>> problem to an easily reproducible test case. Put 1999 files or more
>> in a directory and try to back that up, using a remote schema for both
>> the source and the destination. Below is a script that will do the
>> whole process.
>> More specifically, (Globals.pipeline_max_length*4-1) is the cut off.
>> This formula is based on DestinationStruct.set_rorp_cache, which
>> initializes the CacheCollatedPostProcess. I still really don't
>> understand what's going on here, but it has something to do with the
>> order that the CacheCollatedPostProcess cache is traversed on the
>> destination rdiff-backup server when the source is also an
>> rdiff-backup server.
>> I generated some simple traces of calls to important methods in the
>> cache and it looks like the order of traversal versus the order in
>> which methods like get_mirror_rorp are called on the destination is
>> radically different depending on whether or not the source uses a
>> remote schema. I'm not sure, but I think each file in the cache only
>> gets examined once in the correct case (an index is retrieved from the
>> cache, then a bunch of methods are called to get more information
>> about it from the cache, then that index is never looked at again).
>> In the broken case with a source server, I think the cache is iterated
>> through once, and _then_ asked for information about the indexes that
>> were retrieved from it. Thus, if the first traversal fills up the
>> cache and causes something to be evicted (i.e., if there are 1999 files
>> or more), the "second" traversal won't be able to retrieve the
>> additional information and will raise a KeyError.
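A minimal sketch of the failure mode described above, assuming a bounded
cache that evicts its oldest entry once it is full. This is not
rdiff-backup's actual CacheCollatedPostProcess: BoundedCache, interleaved
and two_pass are hypothetical names, and the small capacity used here does
not model the exact 1999-file cutoff.

```python
from collections import OrderedDict


class BoundedCache:
    """Hypothetical stand-in for a fixed-size cache with oldest-first eviction."""

    def __init__(self, size):
        self.size = size
        self.entries = OrderedDict()

    def add(self, index):
        # Store the entry, then evict the oldest one if we are over capacity.
        self.entries[index] = "info for %d" % index
        if len(self.entries) > self.size:
            self.entries.popitem(last=False)

    def get_info(self, index):
        # Raises KeyError if the entry has already been evicted.
        return self.entries[index]


def interleaved(num_files, cache_size):
    """Each index is added and queried before moving on to the next one."""
    cache = BoundedCache(cache_size)
    looked_up = 0
    for index in range(num_files):
        cache.add(index)
        cache.get_info(index)
        looked_up += 1
    return looked_up


def two_pass(num_files, cache_size):
    """All indexes are added first, and only then queried."""
    cache = BoundedCache(cache_size)
    for index in range(num_files):
        cache.add(index)
    looked_up = 0
    for index in range(num_files):
        cache.get_info(index)
        looked_up += 1
    return looked_up
```

With a capacity of 4, interleaved(10, 4) completes no matter how many
entries pass through, and two_pass(4, 4) also completes, but two_pass(5, 4)
raises KeyError on index 0 because the first pass has already evicted it.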
>> I'll keep poking at it, but at this point I'm in way, way over my head
>> on the details of how rdiff-backup works, so it's not likely I'll get
>> much further on my own.
>> # Cause rdiff-backup to throw a KeyError
>> set -e
>> mkdir /tmp/lotsoffiles
>> for (( N=0 ; N < 500*4-1 ; N++ )); do
>>     touch /tmp/lotsoffiles/`printf '%04d\n' $N`
>> done
>> rdiff-backup \
>> --terminal-verbosity 5 \
>> --include /tmp/lotsoffiles \
>> --exclude / \
>> localhost::/ localhost::/tmp/lotsoffiles-backup
>> > 5) Provide your kernel version and details of the source and dest
>> > filesystems (type, any mountpoints involved in the backup).
>> Kernel 2.6.18, Mostly Reiser (/) with a small dash of ext3 (/boot)
>> Kernel 2.6.21, Reiser mounted via a cryptsetup mapper of a USB drive
>> I ran my above test script on my source machine (and it didn't include
>> /boot, so it was just Reiser).
> # Cause rdiff-backup to throw a KeyError
> set -e
> # Create a directory to back up, containing
> # Globals.pipeline_max_length*4-1 (1999) files. This is the exact
> # cutoff; if you make it 1998, everything works.
> mkdir /tmp/lotsoffiles
> for (( N=0 ; N < 500*4-1 ; N++ )); do
>     touch /tmp/lotsoffiles/`printf '%04d\n' $N`
> done
> # Back up /tmp/lotsoffiles to /tmp/lotsoffiles-backup, using remote
> # access for both.
> rdiff-backup \
> --terminal-verbosity 5 \
> --include /tmp/lotsoffiles \
> --exclude / \
> localhost::/ localhost::/tmp/lotsoffiles-backup
> rdiff-backup-users mailing list at rdiff-backup-users@xxxxxxxxxx
> Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki
Andrew Ferguson - owsla@xxxxxxxxxxxxx