
Re: draft-zinin-microloop-analysis-01.txt - times

Subject: Re: draft-zinin-microloop-analysis-01.txt - times
From: mike shand
Date: Fri, 03 Jun 2005 17:05:44 +0100
At 11:36 03/06/2005 -0400, Alia Atlas wrote:

At 11:31 AM 6/3/2005, Stewart Bryant wrote:

          Periods of time used by the router to delay installation of
          new primary next-hops after a topology change when the router
          has (type-B) or has not (type-C) a safe neighbor to temporarily
          divert the traffic to in the meantime.

   While correctness and effectiveness of the algorithm described here
   does not depend on the actual values assigned to the architectural
   constants, it does depend on the relationship between them, and the
   assumption that all routers in the same network use the same values.

   To satisfy these constraints, and yet allow these delays to be
   decreased as implementations continue to improve towards faster
   convergence, this document defines the architectural constants as
   configurable, specifies the required relationship between the values,
   and the default values that should be used by the implementations.

SB> I wonder if we need to signal these, for example in the LSP/LSA.
SB> I am concerned that there is little chance that all routers
SB> in the network will be correctly configured. The trouble is
SB> that if there is a mis-config it will be very hard to detect.

AA> What would be done by the routers with this additional information?
AA> Why isn't this a management problem? These values could be in (yet
AA> another) MIB & then the values of the routers could be compared. I
AA> don't like the idea of adding signaling to check for inconsistency -
AA> when all the router could do on detecting this would be a log or, I
AA> guess, disabling the functionality in the case of mismatches.
I think I agree with Alia here. While at first sight it seems that there
might be something you could do with advertising these things in the
protocol, life can get very complicated when you start considering what
happens when various routers and/or regions of the network come and go.
There is a very real danger that an "automated" dynamic synchronization
scheme would result in more errors than a manual static one.
Simply using an advertisement in the protocol to give a warning that
some static misconfiguration has been made (as I think Stewart was
suggesting) is more workable, but seems like a poor use of the protocol,
especially since (as Alia points out) the information should be
available for a management application to check anyway.
I don't like the idea of requiring consistent configuration across the
network, because there is a strong chance that it will be wrong, and
it's always a problem when it has to be changed because of the inclusion
or removal of a particular router. There is also the issue that the
parameter is in some sense dynamic, being dependent on the size of the
network, so it's going to need to be tuned as time goes by.

Nor do I much like the idea of requiring an NMS to set the parameter,
because not all networks have an NMS that is responsible for the
configuration of the routers.

I suggest that each router includes its max required time in its
LSP/LSAs and the network uses the max of the max. The parameter
can be extracted as part of the SPF calculation. Since, for the
topology calculation to be consistent, all routers have to be
using the same set of active links in their calculation, we know that
they will all extract the same maximum value for delay. If the network
is unstable in such a way that not all routers are using the same links
to calculate the SPT, then we have bigger problems.
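
As a rough illustration of the max-of-max idea, here is a minimal Python
sketch. It assumes a toy link-state database in which every router
advertises one delay value; the names spf_with_max_delay, graph and
advertised_delay_ms are hypothetical and do not come from the draft or
from any implementation. The point it tries to show is that two routers
running SPF over the same set of active links arrive at the same
maximum, and that a router that is not reachable does not contribute.

```python
import heapq

def spf_with_max_delay(graph, advertised_delay_ms, root):
    """Dijkstra from root. Returns (distances, max_delay), where max_delay
    is the largest delay advertised by any router reachable from root.
    Hypothetical helper for illustration only."""
    dist = {root: 0}
    heap = [(0, root)]
    visited = set()
    max_delay = 0
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        # Collect the advertised value while the node is placed on the tree.
        max_delay = max(max_delay, advertised_delay_ms.get(node, 0))
        for nbr, cost in graph.get(node, {}).items():
            nd = d + cost
            if nbr not in dist or nd < dist[nbr]:
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist, max_delay

# Toy topology (an assumption): A - B - C are connected, D is isolated.
graph = {"A": {"B": 1}, "B": {"A": 1, "C": 1}, "C": {"B": 1}, "D": {}}
advertised_delay_ms = {"A": 200, "B": 500, "C": 300, "D": 900}

dist_a, max_from_a = spf_with_max_delay(graph, advertised_delay_ms, "A")
dist_c, max_from_c = spf_with_max_delay(graph, advertised_delay_ms, "C")
print(max_from_a, max_from_c)
```

In this toy case A and C both extract B's value, and D's larger value is
ignored because D is not part of the active topology.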

A standard concern is the inclusion or removal of large sections of the
network changing the parameter, but this comes out in the wash, provided
the algorithm (at least conceptually) is to find the max value during
the parsing of the active links.

A further standard concern is what happens when a value changes. When a
router is re-configured it issues a new set of LSPs with the new timer
value. However, this has a null effect on the topology, and we can
expressly prevent a transition taking place when a router issues an
LSP in which the only change is the epoch time request.
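
One way to picture the "timer-only change" rule is the sketch below. It
is only an assumption about how an implementation might compare two
versions of an LSP: the Lsp record, its fields and the function names
are hypothetical, and real LSPs/LSAs carry far more than this.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Lsp:
    """Toy LSP: only the fields needed for the comparison (hypothetical)."""
    origin: str
    links: frozenset      # set of (neighbor, cost) tuples
    max_delay_ms: int     # the advertised timer value

def topology_changed(old, new):
    """True if the new LSP alters links or costs, so a transition is needed."""
    return old.links != new.links

def timer_only_change(old, new):
    """True if the timer value is the only thing that differs."""
    return not topology_changed(old, new) and old.max_delay_ms != new.max_delay_ms

old_lsp = Lsp("A", frozenset({("B", 1)}), 200)
retimed_lsp = Lsp("A", frozenset({("B", 1)}), 400)   # operator changed the timer
recosted_lsp = Lsp("A", frozenset({("B", 5)}), 400)  # a link cost also changed

for new_lsp in (retimed_lsp, recosted_lsp):
    if timer_only_change(old_lsp, new_lsp):
        print("record new timer value, no transition")
    elif topology_changed(old_lsp, new_lsp):
        print("run SPF and start a transition")
```

The new timer value would still be stored and used for the next real
topology change; only the transition itself is suppressed.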

There is the issue concerning inconsistency during the window where one
router has issued a time change LSP and concurrently a failure occurs.
This is probably better than, and certainly no worse than, a failure
occurring during a time in which the NMS is issuing a timer value change
to the routers in the net.
I think that this could be a very nice solution to the problem. I really
like it :-)
If each router advertises its max compute-install time (which could be
configured or derived), then all routers could compute the max, possibly
add a small certainty factor and use that for the DELAY_TYPEC and DELAY_TYPEB.
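
A minimal sketch of that computation, assuming the advertised values
have already been collected into a mapping. The function name
derive_base_delay and the 50 ms margin are hypothetical, chosen only for
illustration. How DELAY_TYPEB and DELAY_TYPEC are then set from this
base value is governed by the relationship the draft requires, which is
not reproduced here.

```python
def derive_base_delay(advertised_ms, margin_ms=50):
    """Largest advertised compute-install time plus a small safety margin.
    Both the name and the default margin are illustrative assumptions."""
    if not advertised_ms:
        raise ValueError("no advertised values to derive a delay from")
    return max(advertised_ms.values()) + margin_ms

# Hypothetical values, one per router, in milliseconds.
advertised_ms = {"A": 200, "B": 500, "C": 300}
base_delay_ms = derive_base_delay(advertised_ms)
print(base_delay_ms)
```

Every router that sees the same set of advertisements and uses the same
margin derives the same base value, which is the property the scheme
relies on.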
Yes, I agree that such an automatic dynamic scheme would be very nice IF it
were unconditionally stable. Stewart's arguments seem convincing, but I
have been burnt too often in the past by such things to immediately accept
that it will never cause any problems:-)
I don't THINK it will, but I think it needs a bit more analysis of nasty
cases to be sure that there aren't any situations where it could make bad
things happen, or in particular make more bad things happen than if we
hadn't done it.
I do agree that SOMETHING needs to be done, since the consequences of
getting it wrong manually are clearly pretty severe, and changing a value
manually is a real pain.
I think it all boils down to the veracity (or otherwise) of Stewart's statement:

"If the network
is unstable in such a way that not all routers are using the same links
to calculate the SPT, then we have bigger problems."

If indeed we do have bigger problems, then this is probably all OK. But if
the problems introduced by an unsynchronized topology are actually much
LESS than the problems caused by using inconsistent delay values (long
persistence of loops), then maybe it is not such a good idea.

Rtgwg mailing list