Re: ops reqs on IP FRR

Subject: Re: ops reqs on IP FRR
From: mike shand
Date: Mon, 14 Nov 2005 14:30:22 +0000
Thanks for this contribution. I strongly agree with all your points. These are pretty much the design principles we have been working to.
At 16:10 11/11/2005, Pekka Savola wrote:

After watching the IP FRR discussion at rtgwg, I thought I should try to highlight a few operational perspectives:
 1) the solution should not require any configuration of topology or backup
paths, or any changes in the topology; more specifically,
Yes. In particular it should not be necessary to modify the topology so
that the IPFRR works better (or works at all).

1.a) the solution must not require configuration of any parts of topology which are not adjacent to the router.
By "configuration" I assume you mean something other than deploying the
software and turning it on. Or are you saying that ONLY the routers
adjacent to the protected link should need updated software? That is fairly

Discussion: topological/routing changes happen frequently. Having to reconfigure the otherwise-unaffected routers so that IPFRR doesn't get messed up is a non-starter.
Exactly. If the IPFRR coverage is topology dependent it is easy for a
"good" topology to be degraded to a "bad" topology by one or more changes.

2) the solution should "just work" without any config (apart from toggling it on if need be)

3) the solution, when enabled, must not have a failure mode where the result would be blackholing or duplicating traffic for a noticeably longer period than non-IPFRR would ("optimizations must not make corner cases worse")
Yes. This is critical. "Do no harm". This is why I have a problem with IPFRR
solutions which have less than 100% coverage. IPFRR and loop prevention
must go together. One without the other is not useful (for failure
protection.... loop prevention on its own is useful for management invoked
topology changes). ALL known loop prevention schemes delay convergence
(some more than others). This is not intrinsically a problem, since the
fast repair will be carrying the traffic for this time. However, that is
only true if the repair is able to repair 100% of the traffic. For any
fraction of the traffic which cannot be repaired, the delay introduced by
the loop prevention mechanism will result in that traffic receiving worse
service than for normal routing convergence.
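
As a rough illustration of the argument above, here is a minimal sketch of the outage seen by repairable and unrepairable traffic. The model is deliberately simplistic, and the function name, parameters and millisecond figures are all hypothetical; nothing here comes from a real implementation or from measurements.

```python
def outage_ms(frr_enabled: bool, repairable: bool,
              detect_ms: int, converge_ms: int, loop_delay_ms: int) -> int:
    """Approximate time (ms) that traffic to one destination is dropped
    after a single failure, under a simplified model (an assumption)."""
    if not frr_enabled:
        # Plain routing: traffic is lost until normal convergence completes.
        return detect_ms + converge_ms
    if repairable:
        # The repair path carries traffic while convergence is delayed.
        return detect_ms
    # No repair available, yet convergence is still delayed by the
    # loop prevention mechanism: worse than plain routing.
    return detect_ms + converge_ms + loop_delay_ms


# Hypothetical figures: 50 ms detection, 500 ms convergence,
# 1000 ms added by loop prevention.
baseline = outage_ms(False, False, 50, 500, 1000)
repaired = outage_ms(True, True, 50, 500, 1000)
unrepaired = outage_ms(True, False, 50, 500, 1000)
```

With these made-up numbers the repaired traffic does far better than the baseline, while the unrepaired fraction does worse than it would have with IPFRR switched off, which is the "do no harm" concern.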
Multiple uncorrelated failures may be difficult or impossible to deal with.
For those cases, the proposed action is to abandon the IPFRR/loop
prevention and fall back to normal convergence. This can be actioned when
multiple conflicting LSPs/LSAs are received. This may take marginally
longer than normal convergence, owing to propagation delays, but it is a
small difference.
This "fall back" seems an important safety property.
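
A minimal sketch of how such a fall back might be triggered follows. It assumes, purely for illustration, that the repair only covers a single link failure and that "conflicting" means updates reporting more than one distinct failed link; the class and method names are hypothetical and do not describe any particular router implementation.

```python
from enum import Enum


class Mode(Enum):
    FRR = "fast reroute with loop prevention"
    NORMAL = "normal convergence"


class RepairState:
    """Tracks failed links reported by received LSPs/LSAs (hypothetical)."""

    def __init__(self) -> None:
        self.failed_links: set[frozenset[str]] = set()
        self.mode = Mode.FRR

    def on_update(self, node_a: str, node_b: str) -> Mode:
        """Record a link reported down and decide whether to fall back."""
        # Both ends of a link may report the same failure, so the link
        # is stored as an unordered pair.
        self.failed_links.add(frozenset((node_a, node_b)))
        if len(self.failed_links) > 1:
            # Assumption: a second distinct failed link means the updates
            # conflict with the single-failure repair, so abandon it.
            self.mode = Mode.NORMAL
        return self.mode
```

For example, reports of A-B from both A and B keep the router in `Mode.FRR`, while a subsequent report of C-D switches it to `Mode.NORMAL`, and it stays there for the rest of the event.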


HTH in choosing a deployable and manageable solution,

Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings

