Re: Comments on draft-ietf-rtgwg-ipfrr-spec-base-09

Subject: Re: Comments on draft-ietf-rtgwg-ipfrr-spec-base-09
From: mike shand
Date: Fri, 09 Nov 2007 10:15:36 +0000
    Sorry about the delay in replying. It fell off my to do stack.

    Responses in line below. I'll address your algorithm comment responses in another email.


Alia Atlas wrote:
Hi Mike,

On 10/12/07, mike shand <[email protected]> wrote:
I support the progression of this document, but would like to see the
comments Stewart and I made earlier on the algorithm addressed and also
the few points below.

I addressed the comments on the algorithm in a separate email.  Here are
responses to your points below.

In addition, I wonder whether there should be something up front which
makes clear that this may only provide a partial solution (depending on
the topology). I know this is implicit in the body of the text, but I
think it would be useful to avoid any misunderstandings if the abstract
were a little more explicit on this point.

How about adding after "...The goal of this technology is to reduce the
   micro-looping and packet loss that happens while routers converge
   after a topology change due to a failure. ..."

"The extent to which this goal can be met by this specification is
dependent on the topology of the network."

Agreed.  I've added this.


    1.1.  Failure Scenarios

       The alternate next-hop can protect against a single link failure, a
       single node failure, one or more shared risk link group failures, or
       a combination of these.

It might be better to say "failure of one or more links within a shared
risk link group".

yes - changed

          Figure 5: Example where Continued Use of Alternate is Desirable

       This is an example of a case where the new primary is not a loop-free
       alternate before the failure and therefore may have been forwarding
       traffic through S. This will occur when the path via a previously
       upstream node is shorter than the path via a loop-free alternate
       neighbor.  In these cases, it is useful to give sufficient time to
       ensure that the new primary neighbor and other nodes on the new
       primary path have switched to the new route.
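A minimal sketch of the situation described above, assuming the usual loop-free condition Distance(N, D) < Distance(N, S) + Distance(S, D). The topology, link costs, and all function names are hypothetical and are not taken from Figure 5 of the draft. U is an upstream neighbor of S (its shortest path to D runs through S), N is a distant loop-free alternate, and after the S-P link fails the new primary next-hop becomes U.

```python
import heapq

def make_graph(links):
    """Build a symmetric adjacency map from (a, b, cost) tuples."""
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph

def shortest_distances(graph, source):
    """Plain Dijkstra; returns the shortest distance from source to each reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

def is_loop_free_alternate(graph, s, neighbor, dest):
    """Loop-free condition: the neighbor's shortest path to dest avoids s."""
    from_n = shortest_distances(graph, neighbor)
    from_s = shortest_distances(graph, s)
    return from_n[dest] < from_n[s] + from_s[dest]

def remove_link(graph, a, b):
    """Return a copy of the graph without the a-b link (both directions)."""
    return {n: {m: c for m, c in nbrs.items() if {n, m} != {a, b}}
            for n, nbrs in graph.items()}

def primary_next_hop(graph, s, dest):
    """Neighbor of s on a shortest path to dest (assumes dest is reachable from every neighbor)."""
    return min(graph[s],
               key=lambda n: graph[s][n] + shortest_distances(graph, n)[dest])

# Hypothetical topology: S-P-D is the short primary path, U is upstream
# of S, and N is a loop-free alternate with a much longer path to D.
before = make_graph([("S", "P", 1), ("P", "D", 1), ("U", "S", 1),
                     ("U", "X", 2), ("X", "D", 2),
                     ("S", "N", 10), ("N", "D", 10)])
after = remove_link(before, "S", "P")

old_primary = primary_next_hop(before, "S", "D")
new_primary = primary_next_hop(after, "S", "D")
new_primary_was_lfa = is_loop_free_alternate(before, "S", new_primary, "D")
```

In this sketch S would repair via N at the moment of failure, while its new primary U was still forwarding through S beforehand. That is the case where holding the alternate for a while gives U time to switch.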

I wonder if it should be pointed out that while this is a good strategy
to minimize the occurrence of microloops, it does nothing to prevent any
microloops which may occur more than one hop away.

Immediately before that figure, the draft discusses how the techniques given
in the micro-loop prevention drafts should dictate the convergence rules. 

I've prefaced the references to those drafts with

"There are techniques available to handle the micro-forwarding loops
that can occur in a network during convergence."
to give it a bit more context.

Yes. I think that covers it. The references talk about general loop prevention, and it is now clear (although it already said so) that this particular section is just talking about loops involving S.

       based on the new network topology.  The use of the alternate next-
       hops for packet forwarding SHOULD terminate:

       a.  if the new primary next-hop was loop-free prior to the topology
           change, or

       b.  if a configured hold-down, which represents a worst-case bound on
           the length of the network convergence transition, has expired, or

       c.  if notification of an unrelated topological change in the network
           is received.
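The three termination conditions quoted above can be read as a simple disjunction. The following is only an illustrative sketch: the function name, parameter names, and the use of seconds for the hold-down are assumptions, not part of the draft.

```python
def should_stop_using_alternate(new_primary_was_loop_free,
                                seconds_since_failure,
                                hold_down_seconds,
                                unrelated_change_received):
    """Return True when use of the alternate next-hop SHOULD terminate.

    Mirrors conditions a, b and c of the quoted text; any one suffices.
    """
    # a. the new primary next-hop was loop-free before the topology change
    if new_primary_was_loop_free:
        return True
    # b. the configured hold-down (worst-case convergence bound) has expired
    if seconds_since_failure >= hold_down_seconds:
        return True
    # c. notification of an unrelated topology change has been received
    if unrelated_change_received:
        return True
    return False
```

For example, with a hypothetical 5 second hold-down, a router whose new primary was not loop-free and which has seen no other change would keep using the alternate until 5 seconds have elapsed.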

We should probably add that if the primary link comes back before any of
this has happened then you can just go back to using the primary link as
if nothing had happened. That of course pre-supposes that the failure
hadn't yet been advertised. If it HAS been advertised, then it requires
another advertisement to put it back how it was, but in any case (I
think) the old next hop can safely be used.

Once the failure has been advertised, I don't think we can just go back to using
the old next hop.  It would depend on what else in the network has already been
updated.  Can you explain why you don't think that additional micro-loops might be
the result?

In the old topology P was downstream of S (obviously, since P was S's next hop). In order to get a loop S would need to be downstream of P in the new topology. Removing the link S-P from the topology cannot now result in S being downstream of P.

It IS possible that, in the absence of controlled convergence, there will be a micro-loop somewhere between P and D. But it is equally possible that there will be a micro-loop on the path from S's alternate to D.
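The downstream argument can be checked numerically on a small example. This is a sketch under stated assumptions: positive symmetric link costs, a made-up topology, and hypothetical function names. It tests whether S lies on any shortest path from P to D, both with and without the S-P link.

```python
import heapq

def shortest_distances(graph, source):
    """Plain Dijkstra; returns the shortest distance to each reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

def on_shortest_path(graph, via, node, dest):
    """True if some shortest path from node to dest passes through via."""
    inf = float("inf")
    from_node = shortest_distances(graph, node)
    from_via = shortest_distances(graph, via)
    if from_node.get(dest, inf) == inf:
        return False  # dest unreachable, so there is no path at all
    return from_node.get(via, inf) + from_via.get(dest, inf) == from_node[dest]

def remove_link(graph, a, b):
    """Return a copy of the graph without the a-b link (both directions)."""
    return {n: {m: c for m, c in nbrs.items() if {n, m} != {a, b}}
            for n, nbrs in graph.items()}

# Hypothetical topology: S reaches D via P; N is S's alternate neighbor.
old_topology = {
    "S": {"P": 1, "N": 2},
    "P": {"S": 1, "D": 1, "N": 5},
    "N": {"S": 2, "D": 2, "P": 5},
    "D": {"P": 1, "N": 2},
}
new_topology = remove_link(old_topology, "S", "P")

p_is_downstream_of_s_old = on_shortest_path(old_topology, "P", "S", "D")
s_is_downstream_of_p_old = on_shortest_path(old_topology, "S", "P", "D")
s_is_downstream_of_p_new = on_shortest_path(new_topology, "S", "P", "D")
```

In this example P never forwards towards D through S in either topology, so traffic that S hands to P cannot come straight back to S. The check says nothing about micro-loops further along the path between P and D.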

Also, generally, there are hold-down times on links so that they can't just bounce
back up before the link change has been advertised.  I don't think this is a very useful
optimization and I'm a bit concerned about putting it in now without some more thought.

In the general context of an IPFRR solution with 100% protection, we have been thinking that it may be sensible to delay the advertisement of the change for a short time after the repair has been put in place, so that if the link DOES come back in that time the rest of the network is not disturbed at all. If you don't have 100% protection this may not be such a good strategy, since those destinations which are not being protected will suffer a prolonged outage.

So in this context IF the link comes back before anything else has happened, then just going back makes sense. But I take your point that this is unlikely, given the hold downs and the desire to respond reasonably quickly.

In any case, I don't feel strongly about all this, so let's just leave it as it was.



rtgwg mailing list
[email protected]

_______________________________________________ rtgwg mailing list [email protected] https://www1.ietf.org/mailman/listinfo/rtgwg
