
## Re: Updated: draft-zinin-microloop-analysis-01.txt

Subject: Re: Updated: draft-zinin-microloop-analysis-01.txt
From: mike shand
Date: Fri, 03 Jun 2005 10:56:34 +0100
```
At 16:43 31/05/2005 -0400, Alia Atlas wrote:

Stewart & Alex,

At 02:02 PM 5/27/2005, Stewart Bryant wrote:

    Primary neighbor
        Neighbor N of router S is considered S's primary neighbor for
        destination D, if N provides the shortest path to D according
        to the SPF calculation.

SB> Need to say something formal like selecting N such that
SB> Dopt(N.D) is minimised.

AA> There's always the possibility that a potential primary neighbor isn't selected. Consider the case
AA> where a router only selects up to 4 equal-cost paths, and there are 5 or more. The definition should
AA> handle this case as well.

    2.2 Next hop safety condition

    We start the analysis with the following observation:

        When router X learns about a topology change and starts using
        neighbor Y as its new primary neighbor for a given destination,
        a microloop between X and Y can only form if the topology
        before failure or topology after failure are such that Y uses
        X as its primary neighbor for the same destination.

SB> I don't think that this is quite right. You say that X uses
SB> Y as its new next hop, AND Y uses X as its new next hop. That
SB> would be a failure of the IGP, which is out of scope.

AA> Perhaps a better way to phrase it is:
AA> "... a microloop between X and Y can only form if the topology before the failure is such that Y used
AA> X as its primary neighbor for the same destination."
AA> And then need to clarify the opposite case as well - where the roles of X and Y are reversed. I think
AA> you were trying to - but I agree that it isn't clear.

    Routers SHOULD use the symmetric-link safety condition by default,
    MAY attempt to dynamically determine the method that needs to be
    applied based on the topological information from the routing

SB> I think that we need to discuss which algorithm should
SB> be the default.
SB> Given that many networks that are thought
SB> to be symmetric turn out to be asymmetric, it's not clear
SB> which we should choose and why.

AA> How many of the symmetric networks that actually turn out to be asymmetric have multi-hop loops in
AA> them? Couldn't this be something that was flagged by a MIB - to indicate that the "symmetric"
AA> network isn't really. Surely this is something that the network operators would want to know so that it
AA> can be corrected??

Yes, I wonder how much of a problem this really is. Given that the
algorithm doesn't prevent all loops anyway, a small increment in
the number of loops caused by incorrectly handled asymmetric cost cases
doesn't seem to be much of a price to pay, especially since using the
stronger condition to handle them correctly will result in the overall
coverage being less, i.e. the total number of loops may get WORSE by
using the asymmetric cost fixing algorithm.

I know.... something for me to simulate :-)

AA> Another related question is how does PLSN work with max-cost links? How should it work? Is it
AA> acceptable to use a max-cost link to reach a safe neighbor that isn't a potential primary neighbor on
AA> either the old or new topology? That seems potentially bad to me, since it could cause additional
AA> traffic loss, depending on why the link was set to max-cost.

I think a max-cost link should be treated as unreachable, since that is
probably why it was set to max cost.

------------------------

    3.3 IP Fast Reroute Considerations

    If the router implements [IPFRR] and performs local failure repair,
    procedures described in this document still need to be applied in
    order to prevent micro-loops while reconverging on the new topology.

SB> This is stricter than it should be. Say we implement basic [IPFRR]
SB> AND some other enhanced mechanism. We may wish to use some other
SB> mechanism in place of this.

AA> I think that the intention should be to say that PLSN is useful to avoid micro-loops during
AA> re-convergence and this benefit is not provided simply by using basic [IPFRR] or another repair
AA> mechanism. Both a repair mechanism and a convergence control mechanism are desirable.
AA> I do think it would be useful to specify the risks/undesirability of using PLSN without a repair
AA> mechanism when the topology change includes failures.

Yes.

AA> I do agree with Stewart that the phrasing should consider the possibility of future techniques being
AA> introduced.

Agreed.

    Another difference is when the router could not repair the failure,
    the new primary next-hops do not satisfy the safety condition, and
    there's no other neighbor that does, i.e. a type-C situation. Unlike
    other routers in the network, the router directly connected to the
    network does not have the old next-hop any more, and cannot continue
    using it. In this situation, the router MUST revert to the regular
    convergence procedures, and update the route with the new next-hops
    with no additional delay.

SB> We need to think about this some more. When we have an imperfect
SB> repair we need to consider the "greater good" and that might
SB> be to control the convergence of the rest of the network.

AA> Given that no other router in the network is aware that the router (S) doesn't have an alternate, I'm
AA> not sure what better option can exist. I think that the convergence of the rest of the network is being
AA> controlled. The micro-loops related to S are not being handled. The worst-case that I see is that S
AA> uses a neighbor N where that neighbor is type-B and is using S as its safe neighbor. I do agree that
AA> we need to think about this more.

    3.4 Architectural Constants

    The following architectural constants have been used in the
    description of the algorithm above:

    DELAY_SPF
        The delay between the moment the router receives a topology
SB> s/a/the first/
        update after a period of stability and the moment it starts its
        routing table recalculation. This delay is necessary to collect
        multiple updates originated by different routers that relate to
        the same topological event.

SB> We might want to more formally state the start/inhibit criteria

AA> I agree.

    DELAY_TYPEB and DELAY_TYPEC
        Periods of time used by the router to delay installation of new
        primary next-hops after a topology change when the router has
        (type-B) or has not (type-C) a safe neighbor to temporarily
        divert the traffic to in the meantime.

    While correctness and effectiveness of the algorithm described here
    does not depend on the actual values assigned to the architectural
    constants, it does depend on the relationship between them, and the
    assumption that all routers in the same network use the same values.
    To satisfy these constraints, and yet allow these delays to be
    decreased as implementations continue to improve towards faster
    convergence, this document defines the architectural constants as
    configurable, specifies the required relationship between the
    values, and the default values that should be used by the
    implementations.

SB> I wonder if we need to signal these, for example in the LSP/LSA
SB> I am concerned that there is little chance that all routers
SB> in the network will be correctly configured. The trouble is
SB> that if there is a mis-config it will be very hard to detect.

AA> What would be done by the routers with this additional information? Why isn't this a management
AA> problem? These values could be in (yet another) MIB & then the values of the routers could be
AA> compared.
AA> I don't like the idea of adding signaling to check for inconsistency - when all the router
AA> could do on detecting this would be a log or, I guess, disabling the functionality in the case of
AA> mismatches.

I think I agree with Alia here. While at first sight it seems that there
might be something you could do with advertising these things in the
protocol, life can get very complicated when you start considering what
happens when various routers and/or regions of the network come and go.
There is a very real danger that an "automated" dynamic synchronization
scheme would result in more errors than a manual static one.

Simply using an advertisement in the protocol to give a warning that
some static misconfiguration has been made (as I think Stewart was
suggesting) is more workable, but seems like a poor use of the protocol,
especially since (as Alia points out) the information should be
available for a management application to check anyway.

Mike

Alia

_______________________________________________
Rtgwg mailing list
[email protected]
https://www1.ietf.org/mailman/listinfo/rtgwg
```
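
The definitions argued over in this thread can be made concrete with a small sketch. It is not taken from the draft: the function names, the `max_paths` parameter, and the `MAX_COST` value are assumptions made for illustration. It computes primary neighbors from an SPF run, caps the number of equal-cost paths (Alia's point), skips max-cost links (Mike's suggestion), and tests the next-hop condition in the form Alia proposed, where only the topology before the failure is consulted.

```python
import heapq

MAX_COST = 0xFFFFFF  # assumed "max-cost" sentinel; the real value is protocol-specific


def distances_to(graph, dest):
    """Shortest distance from every node to dest (Dijkstra over reversed links).

    graph[u][v] is the integer cost of the directed link u -> v.
    Links at MAX_COST are skipped, i.e. treated as unreachable.
    """
    rev = {}
    for u, nbrs in graph.items():
        rev.setdefault(u, {})
        for v, cost in nbrs.items():
            if cost < MAX_COST:
                rev.setdefault(v, {})[u] = cost
    dist = {dest: 0}
    heap = [(0, dest)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry
        for u, cost in rev.get(v, {}).items():
            if d + cost < dist.get(u, float("inf")):
                dist[u] = d + cost
                heapq.heappush(heap, (d + cost, u))
    return dist


def primary_neighbors(graph, s, dest, max_paths=None):
    """Neighbors of s on a shortest path to dest, optionally capped.

    With max_paths set, some potential primary neighbors are not selected.
    """
    dist = distances_to(graph, dest)
    if s == dest or s not in dist:
        return []
    potential = sorted(
        n for n, cost in graph[s].items()
        if cost < MAX_COST and n in dist and cost + dist[n] == dist[s]
    )
    return potential if max_paths is None else potential[:max_paths]


def may_microloop(old, x, y, dest):
    """True if Y used X as a primary neighbor for dest before the change."""
    return x in primary_neighbors(old, y, dest)


# Triangle A-B-D with symmetric costs; the A-D link then fails.
old = {"A": {"B": 1, "D": 1}, "B": {"A": 1, "D": 5}, "D": {"A": 1, "B": 5}}
new = {"A": {"B": 1}, "B": {"A": 1, "D": 5}, "D": {"B": 5}}
```

In this example A's new primary neighbor for D is B, and B reached D through A before the failure, so `may_microloop(old, "A", "B", "D")` is true: if A installs its new next hop before B does, traffic for D bounces between them. The reversed-roles case Alia mentions would need a second check with X and Y swapped.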