
Re: questions on draft-bryant-ipfrr-tunnels-01.txt

Subject: Re: questions on draft-bryant-ipfrr-tunnels-01.txt
From: Alia Atlas
Date: Mon, 22 Nov 2004 15:52:56 -0500

You may be right on the terminology.

What about unexpected simultaneous failures? As you said, one can design a mechanism that protects against a limited number of unexpected simultaneous failures, but not against a large or arbitrary number without considerable expense.

At 02:38 PM 11/22/2004, Curtis Villamizar wrote:

In message <[email protected]>
Alia Atlas writes:
> At 07:28 AM 11/19/2004, Stewart Bryant wrote:
> >However unknown SRLGs will affect all solutions, so if
> >SRLG is a MUST, then all solutions need text describing
> >their behavior under conditions of unknown SRLG. In
> >particular unknown SRLG can result in mutually looping
> >repairs (which could even be cyclic) and the solution
> >must describe how to detect and break these loops.
> Unknown SRLGs are uncorrelated failures.  The detection of
> uncorrelated failures is by the IGP, which can then install the new
> primaries as quickly as possible.  In the meantime, traffic loops
> until TTL expires.
> IPFRR is quite clearly not handling uncorrelated failures; no
> protection mechanism that I am aware of does.


I think we have a terminology problem here.

Unless I'm mistaken, correlated failures are simply ones that happen at
about the same time.  Just because someone didn't realize that two
logical links used a common physical resource (unknown SRLG) doesn't
mean the failures were not correlated.  For example, when the
earthquake outside San Diego took out three of four major fiber paths
in the mid 1990s, the failure was definitely correlated.  It was not
anticipated and would not have been listed in a set of SRLGs but after
it happened it didn't take much to figure out that the failures
shared a common cause leading to them occurring at about the same time
(correlated with known cause).  Failures can also be correlated
without a known cause of the correlation.  This is when, at least
initially, no one knows why, but for some reason two (or more) circuits
are observed failing at the same time.  Again there is no SRLG entry.

There is work in the literature on handling multiple failures.  One
simple way is to make path B disjoint with path A.  Then make path C
disjoint with path A and B.  Obviously I'm oversimplifying for
illustration.  If two correlated failures occur that are not
anticipated, then at most two of the three paths go down.  OTOH, a
third failure can take them all down.  I don't know that this is considered
practical for most data communication problems (perhaps NASA or the
military has a need for this - war has a way of causing correlated
failures that are not planned for so maybe there is an application,
though that is a reason that connectionless networking got started).
I think what this is leading to is whether it is a requirement that
techniques address SRLG and, if so, whether it is also a requirement that
the proposed SRLG solution (or overall solution) is computationally
practical.  The latter would seem to be consistent with routing area
scalability requirements.
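
To make the disjoint-path idea above concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions: the function names (greedy_disjoint_paths, survives) and the toy topology are made up for this example, paths are link-disjoint only, and the greedy search (take a shortest path, remove its links, repeat) is a simplification that can miss disjoint paths a flow-based computation would find. It says nothing about unknown SRLGs beyond counting failed links.

```python
from collections import deque


def shortest_path(adj, src, dst):
    """Breadth-first search; returns a fewest-hop node list or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        # sorted() only makes the result deterministic for the example
        for nbr in sorted(adj.get(node, ())):
            if nbr not in prev:
                prev[nbr] = node
                queue.append(nbr)
    return None


def greedy_disjoint_paths(links, src, dst, count):
    """Find up to `count` mutually link-disjoint paths (hypothetical helper).

    Greedy: path B avoids the links of path A, path C avoids the links of
    A and B, and so on. This is not guaranteed to find the maximum number.
    """
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    paths = []
    for _ in range(count):
        path = shortest_path(adj, src, dst)
        if path is None:
            break
        paths.append(path)
        # Remove the links just used so later paths cannot share them.
        for a, b in zip(path, path[1:]):
            adj[a].discard(b)
            adj[b].discard(a)
    return paths


def survives(paths, failed_links):
    """True if at least one path uses none of the failed links."""
    failed = {frozenset(link) for link in failed_links}
    return any(
        all(frozenset((a, b)) not in failed for a, b in zip(p, p[1:]))
        for p in paths
    )


# Toy topology (assumed for illustration): three two-hop routes from S to D.
LINKS = [("S", "A"), ("A", "D"), ("S", "B"), ("B", "D"), ("S", "C"), ("C", "D")]
PATHS = greedy_disjoint_paths(LINKS, "S", "D", 3)
```

With this topology, any two link failures leave at least one of the three paths intact, while a third failure placed on the remaining path takes them all down, which is the trade-off described above.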
