Basic Specification for IP Fast-Reroute: Loop-free Alternates
The goal of this technology is to reduce the
micro-looping and packet loss that happens while routers converge
after a topology change due to a failure.
SB> This draft does not address microlooping - our other drafts do that.
SB> There really ought to be a health warning saying that there are
SB> other solutions with greater fault coverage, but this one
SB> is simple and does not need any co-operation from any other
SB> node in the network.
I will remove the comment about micro-looping (though part of that is for
the micro-loops involving S).
I will also add
"The extent to which this goal can be met by this
specification is dependent on the topology of the network."
to the end of the abstract.
As discussed in [I-D.ietf-rtgwg-ipfrr-framework],
SB> We need to ask the question of whether we need to LC the framework
SB> draft and whether we need to publish that first to raise the
SB> awareness level before we send this to IETF/IESG review?
As John asked, is it ready? I'd like to see that progress as well - though it
doesn't discuss multicast IPFRR and that would be good to see in a framework
eventually - even if we haven't nailed down many details yet.
The mechanism also
assumes that both the primary path and the alternate path are in the
same routing area.
SB> I am trying to remember how fundamental to the solution that is.
SB> Unlike NV, LFA only uses cost information and we know the costs.
SB> There is discussion regarding this restriction for OSPF, but does
SB> the constraint also apply to ISIS?
Even in ISIS, the SPF computation is done per area. If one didn't want this
restriction, then that would have to change.
/------| S |--\
/ +-----+ \
/ 5 8 \
| E | | N_1 |
\ \ 4 3 / /
\| \ / |/
-+ \ +-----+ / +-
\---| D |---/
Figure 1: Basic Topology
SB> This picture is not too bad, others are worse - particularly
SB> the figure in the appendix. Would we help
SB> the reader if we published a pdf version of the RFC?
I am happy to take others' offerings for ASCII art if they are better
looking. I don't have the diagrams drawn other than in ASCII at the moment.
If you'd like a pdf version (and color can definitely help), perhaps we could
work on creating the diagrams.
network with this feature experiences less traffic loss and less
micro-looping of packets than a network without IPFRR. There are
cases where micro-looping is still a possibility since IPFRR coverage
varies but in the worst possible situation a network with IPFRR is
equivalent with respect to traffic convergence to a network without
SB> Surely we are talking repair not uloop here?
Replaced the micro-looping in "There are cases where micro-looping is still a possibility..."
with "traffic loss".
1.1. Failure Scenarios
The alternate next-hop can protect against a single link failure, a
single node failure, one or more shared risk link group failures, or
a combination of these. Whenever a failure occurs that is more
extensive than what the alternate was intended to protect, there is
the possibility of temporarily looping traffic (note again, that such
a loop would only last until the next complete SPF calculation).
SB> which may be delayed by the loop prevention mechanism.
But we're not describing the loop prevention mechanism here.
If there are not other protection
mechanisms a node failure is still a concern when only using link
SB> This sentence seems in the wrong place - or provides too little
SB> discussion on the issues concerning node failure.
"If there are no other protection mechanisms to handle node failure, a node failure is still
a concern when only using link protecting LFAs."
It's just pointing out that if your LFA only protects against link failure, then node failures will
still cause problems.
In Figure 2, S
would be able to use N as an alternate, but N could not use S;
SB> s/alternate/downstream alternate/
therefore N would have no alternate and would discard the traffic,
thus avoiding the micro-loop. A micro-loop due to the use of
alternates can be avoided by using downstream paths because each
succeeding router in the path to the destination must be closer to
the destination than its predecessor (according to the topology prior
to the failures). Although use of downstream paths ensures that the
micro-looping via alternates does not occur, such a restriction can
severely limit the coverage of alternates.
SB> Might be better to reorder the discussion here a little for
"Micro-looping of traffic via the alternates caused when a more
extensive failure than planned for occurs can be prevented via
selection of only downstream paths as alternates. A micro-loop due to
the use of alternates can be avoided by using downstream paths because
each succeeding router in the path to the destination must be closer
to the destination than its predecessor (according to the topology
prior to the failures). Although use of downstream paths ensures that
the micro-looping via alternates does not occur, such a restriction
can severely limit the coverage of alternates. In Figure 2, S would be able to use N as a downstream
alternate, but N could not use S; therefore N would have no alternate
and would discard the traffic, thus avoiding the micro-loop. "
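To make the downstream condition concrete, here is a minimal Python sketch. It assumes the usual reading of "closer to the destination than its predecessor", namely D_opt(N, D) < D_opt(S, D) on the pre-failure topology. The function name `is_downstream`, the `dist` table, and the costs are hypothetical and are not taken from Figure 2.

```python
def is_downstream(dist, s, n, d):
    """Return True if neighbor n is a downstream alternate of s for destination d.

    dist[(x, y)] is assumed to hold the pre-failure shortest-path cost D_opt(x, y).
    """
    return dist[(n, d)] < dist[(s, d)]


# Hypothetical pre-failure costs (illustrative only, not read off Figure 2).
dist = {("S", "D"): 10, ("N", "D"): 7}

s_can_use_n = is_downstream(dist, "S", "N", "D")  # N is closer to D than S is
n_can_use_s = is_downstream(dist, "N", "S", "D")  # S is farther from D than N is

print("S may use N:", s_can_use_n)
print("N may use S:", n_can_use_s)
```

Because the relation is a strict inequality on distance to D, it can never hold in both directions for the same pair, which is why traffic handed from S to N cannot be handed back.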
As shown above, the use of either a node protecting LFA or a
downstream path provides protection against micro-looping in the
event of node failure.
SB> I don't think that you have explained NP LFA yet
What needs explanation? The details for node-protecting LFA are given later, it is true,
but I think the term is clear enough.
There are topologies where there may be
either a node protecting LFA, a downstream path, both or neither. A
node may select either a node protecting LFA or a downstream path
without risk of causing micro-loops in the event of node failure.
SB> neighbor node failure?
clearer - changed
Since the functionality of link and node protecting LFAs is greater
than that of downstream paths, a router SHOULD select a link and node
protecting LFA over a downstream path.
SB> Should we just say NP LFA as L & NP LFA sounds like we have two
SB> protection mechanisms.
No, because a node-protecting LFA may not be link-protecting...
3.3. Broadcast and NBMA Links
| S |--------
| 5 |
| 0 |
/----\ 0 5 +-----+
| PN |-----| N |
| 0 |
| | 8
| 5 |
+-----+ 5 +-----+
| E |----| D |
Figure 3: Loop-Free Alternate that is Link-Protecting
In Figure 3, N offers a loop-free alternate which is link-protecting.
If the primary next-hop uses a broadcast link, then an alternate
SHOULD be loop-free with respect to that link's pseudo-node to
provide link protection. This requirement is described in
Inequality 4 below.
D_opt(N, D) < D_opt(N, pseudo) + D_opt(pseudo, D)
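As a rough illustration of Inequality 4, the check can be sketched in Python as below. The function name `is_link_protecting_on_broadcast` and the sample costs are assumptions made for this example; the costs are not read off Figure 3.

```python
def is_link_protecting_on_broadcast(d_n_d, d_n_pseudo, d_pseudo_d):
    """Evaluate Inequality 4 for a candidate alternate neighbor N.

    d_n_d       -- D_opt(N, D)
    d_n_pseudo  -- D_opt(N, pseudo-node of the primary broadcast link)
    d_pseudo_d  -- D_opt(pseudo-node, D)

    True means N's shortest path to D does not go through the pseudo-node.
    """
    return d_n_d < d_n_pseudo + d_pseudo_d


# Hypothetical costs, for illustration only.
ok = is_link_protecting_on_broadcast(d_n_d=8, d_n_pseudo=5, d_pseudo_d=10)
# Equal cost means a shortest path via the pseudo-node exists, so the check fails.
not_ok = is_link_protecting_on_broadcast(d_n_d=15, d_n_pseudo=5, d_pseudo_d=10)

print(ok, not_ok)
```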
SB> The diagram used the term PN, maybe the equation should as well.
Sure - changed - and added the clarification in the text associating PN
"...with respect to that link's pseudo-node (PN) to provide link protection"
4.1. Terminating Use of Alternate
An implementation SHOULD continue to use the alternate next-hops for
packet forwarding even after the new routing information is available
based on the new network topology. The use of the alternate next-
hops for packet forwarding SHOULD terminate:
a. if the new primary next-hop was loop-free prior to the topology
b. if a configured hold-down, which represents a worst-case bound on
the length of the network convergence transition, has expired, or
c. if notification of an unrelated topological change in the network
SB> I think that list needs to include the case where the network
SB> has converged (with or without the uloop prevention as applicable)
How, other than the configured hold-down, does a router know that the
network has converged?
6. Routing Aspects
6.1. Multi-Homed Prefixes
5 +---+ 4 +---+ 5 +---+
------| S |------| A |-----| B |
| +---+ +---+ +---+
| | |
| 5 | 5 |
| | |
+---+ 5 +---+ 5 7 +---+
| C |---| E |------ p -------| F |
+---+ +---+ +---+
Figure 6: Multi-homed prefix
If there exist multiple multi-homed prefixes that share the same
connectivity and the difference in metrics to those routers, then a
single node can be used to represent the set. For instance, if in
Figure 6 there were another prefix X that was connected to E with a
metric of 1 and to F with a metric of 3, then that prefix X could use
the same alternate next-hop as was computed for prefix p.
SB> I do not understand this simplification. I think that you are
SB> saying use p as a proxy for p', p'' etc. But if they have
SB> different costs don't you have to always use a different proxy?
SB> Say p' was connected to E at cost 1, but F at cost 1000, surely
SB> the packet would never get to F, or if it did, F would prefer
SB> to send it via E and hence loop. Is there a must send external
SB> assumption (which we considered for the tunnel solutions) that
SB> is assumed?
Let me try and clarify.
Say we have the following defined:
D_opt(S, E) and D_opt(S, F)
Now, for simplicity, assume that p', p", etc. connect more cheaply to E than to F.
So, have diff_p'(E,F) = D(p', F) - D(p', E)
So, the cost to a proxy node p would be:
D_opt(S,p) = min (D_opt(S,E) + 0, D_opt(S, F) + diff_p'(E,F))
To get the distance to p', you would add D(p', E) to that min D_opt(S, p) - but that wouldn't change the
path computation at all because the same constant is being added to both potential paths at the end.
----[ E ] ------- < p > ------ [ F ]
( p') ( p" )
So, if you have a set of prefixes with the same D(x, F) - D(x, E), they can take the same path to p.
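A small Python sketch of the argument above may help. It compares the exit router chosen for each prefix directly with the one chosen for a proxy node attached to E at cost 0 and to F at cost diff. The function names, the distances `D_S_E` and `D_S_F`, and the extra prefix "Y" are hypothetical; only the attachment metrics for p (5 and 7) and X (1 and 3) come from the quoted draft text.

```python
def exit_router(d_s_e, d_s_f, cost_e, cost_f):
    """Pick the router (E or F) through which S reaches a multi-homed prefix.

    cost_e / cost_f are the prefix's attachment costs to E and F.
    Ties go to E; that tie-break is an arbitrary choice for this sketch.
    """
    return "E" if d_s_e + cost_e <= d_s_f + cost_f else "F"


def proxy_exit_router(d_s_e, d_s_f, diff):
    """Same decision made against a proxy node attached to E at 0 and F at diff."""
    return exit_router(d_s_e, d_s_f, 0, diff)


# Hypothetical D_opt(S, E) and D_opt(S, F); not derived from Figure 6.
D_S_E, D_S_F = 10, 9

# Attachment costs as (to E, to F). "Y" is a made-up prefix with a different diff.
prefixes = {"p": (5, 7), "X": (1, 3), "Y": (5, 5)}

for name, (cost_e, cost_f) in prefixes.items():
    diff = cost_f - cost_e
    direct = exit_router(D_S_E, D_S_F, cost_e, cost_f)
    via_proxy = proxy_exit_router(D_S_E, D_S_F, diff)
    print(name, "diff:", diff, "direct:", direct, "proxy:", via_proxy)
```

With these numbers p and X have the same diff and so share one proxy and one exit router, while Y has a different diff and needs its own proxy. The same reasoning applies to the p' example with costs 1 and 1000: its diff is 999, so it would not share a proxy with p.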
Does that make sense? Any suggestions on how to phrase it better for the draft?