I've been thinking about this approach and have a number of comments on it
(sorry for the length) gathered from various discussions.
In general, this is a straightforward approach that seems quite promising,
but it is still very preliminary. I am concerned about a number of issues
that are either not sufficiently addressed in the draft or where I feel the
complexity of the approach in the draft is a problem. If all such issues
can be adequately resolved such that IPFRR with Notvia Addresses can be
relatively simple to configure, manage and understand while handling all
the problem cases of interest, then this approach could be a likely
candidate for an advanced method.
As we've been discussing, the key question is what is the correct trade-off
between mechanism complexity and network coverage. While I think that this
approach has the possibility of being a reasonable trade-off, whether that
is actually the case will depend on whether and how the various issues can
At a high level, the Notvia addresses approach gives forwarding paths very
similar to those of TE FRR. The difference is that rather than the
head-end doing the computation for each tunnel and signaling the ERO, the
computation is distributed to all nodes in the network.
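
To make the distributed computation concrete, here is a minimal sketch of
how a node might find its path toward a Notvia address such as B_!E: run an
SPF on the topology with E removed. The topology, costs and function names
are hypothetical and are not taken from the draft.

```python
import heapq

def spf(graph, source, excluded=frozenset()):
    """Dijkstra over graph ({node: {neighbor: cost}}), skipping excluded nodes."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph[u].items():
            if v in excluded:
                continue
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev

def notvia_path(graph, s, target, avoid):
    """Path from s to target with node `avoid` removed, or None if unreachable."""
    dist, prev = spf(graph, s, excluded={avoid})
    if target not in dist:
        return None
    path = [target]
    while path[-1] != s:
        path.append(prev[path[-1]])
    return path[::-1]

# Hypothetical topology: S-E-B is the primary path, S-A-B the detour.
topology = {
    "S": {"E": 1, "A": 2},
    "E": {"S": 1, "B": 1},
    "A": {"S": 2, "B": 2},
    "B": {"E": 1, "A": 2},
}
print(notvia_path(topology, "S", "B", avoid="E"))  # ['S', 'A', 'B']
```

Every node would run the same computation for every Notvia address, which
is why incremental SPF and early termination matter for feasibility.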
First, I'm going to go through what I believe the benefits are.
1. The idea is conceptually simple; it is easy to understand the path
the alternate will take. This is useful for network operations.
2. No more than a single level of encapsulation is ever
required. Although it does suffer from requiring an explicit tunnel, at
least the forwarding complexity overhead is constant and well understood.
3. For node and link failure, if the topology isn't disjoint after the
failure, then an alternate will be found. There are definitely still
issues in handling the broadcast link case, but I'll address those later.
4. Although computationally expensive, the necessary computation does
appear to be feasible by using incremental SPFs and early
termination. These bring the required computation for pt2pt-link and node
failure into a reasonable range. My assumption is that this time can be
further improved with some work.
5. There is no possibility of looping via the alternates in the event
that a worse failure, which the alternate can't protect against, has
occurred. This is both because the alternate in the Notvia addressed
tunnel is not repaired and because the Notvia-addressed tunnel rejoins the
SPT at a point that is always downstream of S.
6. There are no issues with multi-area OSPF because traffic is sent
through tunnels to Notvia addresses that are always intra-area for at most
one area. Of course, the multi-area OSPF applicability restrictions are
not very restrictive.
7. It is clear that SRLG failures and broadcast link failures can be
handled. The complexity required depends on the desired/required
coverage. I'll address this later in the concerns section.
8. Because a tunnel is used, it is possible to use the same mechanism
for multicast traffic, when we determine how to provide IPFRR for
multicast. There is also the advantage that the router advertising the
Notvia address can know the SPT with respect to that Notvia address; this
allows an RPF check for traffic considering the alternate.
Second is the list of downsides with the approach. The main concern is
that the mechanism becomes too complex such that the trade-off between its
complexity and the full coverage is not desirable.
1. This requires a large number of additional IP addresses in the
IGP. The same number of additional FECs is required to support LDP.
2. Explicit tunnels are needed, which means that targeted LDP sessions
are necessary to have this support LDP traffic. This is a particular
concern for multi-homed prefixes; I'll describe my concerns on this later.
3. Substantial IGP changes are required to handle the additional
4. A more complex algorithm is required to make the computation feasible.
5. The management of the Notvia addresses & of the tunnels can create
longer time periods where protection isn't available for a part of the
network (the new link or node, etc.).
Third, there are a number of issues that I feel need considerable
discussion to try and resolve. I will try to go through each in turn and
explain what I think the various aspects of each are. Each of these issues
has the possibility to resolve in such a way that the Notvia Addresses
approach becomes overly complex.
1. Notvia Addresses: The first issue is how the Notvia addresses are
allocated, distributed and withdrawn. An initial idea of Stewart & Mike is
that these addresses are not global addresses (i.e. are 10.x.x.x or such)
and are configured in blocks on each router so that the router can manage
the bindings itself.
a. The routing extensions to the IGP will have to associate a network
resource (node or link) that an address should be Notvia. This is probably
b. It is desirable to have some dampening on the withdrawal of Notvia
addresses to minimize thrashing.
c. If configured in blocks, it would be extremely desirable to have
the same Notvia address mean the same thing through multiple reboots,
etc. It'd be good to have some means of consistent association. This is
for easy manageability.
d. When a new link or neighbor comes up, there will be a longer period
of time when an alternate isn't available because the Notvia address hasn't
been advertised yet. These periods without protection need to be clearly
understood and minimized.
e. There may be scalability concerns based on the number of Notvia
addresses and LDP FECs required. For instance, as described in the draft,
it is basically the number of uni-directional links in the topology. This
ignores the extras for broadcast links. Fully and certainly providing
SRLG protection, if at all feasible, would require that each router
advertise a Notvia address for every uni-directional link into every
neighbor of that router. This would result in K*L additional addresses,
where K is the average number of neighbors & L is the number of
uni-directional links in the topology.
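
As a rough illustration of the K*L growth in item 1.e, the following sketch
compares the two counts. It assumes a network of point-to-point links only,
with no parallel links, so that the average number of neighbors K equals
L divided by the number of nodes; the function name and the example sizes
are hypothetical.

```python
def notvia_address_counts(num_nodes, num_bidirectional_links):
    """Estimate Notvia address counts for a point-to-point-only topology."""
    # L: each bidirectional link counts as two uni-directional links.
    L = 2 * num_bidirectional_links
    # K: average neighbors per node (assumes no parallel links).
    K = L / num_nodes
    return {
        "basic": L,               # one per uni-directional link, as in the draft
        "full_srlg": int(K * L),  # one per link into every neighbor
    }

# Hypothetical network: 100 nodes, 200 bidirectional links.
print(notvia_address_counts(100, 200))  # {'basic': 400, 'full_srlg': 1600}
```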
2. Insufficiently diverse topology: It is possible that a network
topology cannot provide an alternate that suffices for link, node and SRLG
protection. It isn't clear to me how to compute a "best-available"
alternate using this approach. For instance, if one can get link
protection, but not node protection, how would that be determined, computed
and assigned? This becomes much more of a concern for SRLG protection &
for topologies where failures have already occurred and the network has
converged for those & needs protection in the event of an additional failure.
3. Failure Diagnosis versus Pessimism: As written, the draft
discusses the idea of doing failure diagnosis using BFD. As Stewart, Mike
& I have discussed, this isn't possible for SRLG failures, although it is
possible for broadcast links.
a. I am concerned about adding the failure diagnosis. This is yet
another level of complexity for implementation. It also has ramifications
for the forwarding plane, because of the need to store multiple alternates
to use & have multiple states to check to decide what to use.
b. An example of a concern with the BFD diagnosis is that all
interfaces on a node that has failed are not certain to fail exactly
simultaneously or even within a sub-50ms bounded window. It is entirely
possible that BFD sessions are terminated on different line-cards that
detect the router failure at slightly different times and therefore stop
forwarding traffic at slightly different times.
c. The other approach is to pessimistically eliminate all routers
connected to the broadcast link as well as the broadcast link; this may not
provide an alternate. It also needs to be thought through what issues
might exist if the topologies used for the SPF vary slightly for each
router that is on the broadcast link, since each will, as described, not
prune itself out when doing the computation; of course, there could be an
approach where the same topology can be used everywhere. It isn't clear to
me what Notvia addresses would be needed to express "don't go through this
pseudo-node or any nodes attached to it"; I don't think that it is simply
the Notvia address for avoiding a particular node.
4. Multi-homed Prefixes: I am quite concerned about the mechanisms
suggested in the draft.
a. First, I really do not like the idea of having separate forwarding
for "local" prefixes that come out of a tunnel. What is a local
prefix? For instance, does this mean that an ABR has to forward traffic
differently depending on which area traffic from the tunnel has come from? I
am concerned about how this would scale; maybe only 2 FIBs are needed (one
for backbone & one for other), but it may be worse to handle AS external
routes. I know that Stewart, Mike, Joel, Albert and I had discussed/agreed
to put this idea out of scope at least for the moment.
b. I am quite concerned about having tunnels to the advertisers of the
i. There needs to be a mechanism to determine whether the advertiser
of a prefix will forward the packet in a loop-free fashion to avoid the
failure point. The separate forwarding for "local" prefixes avoided the
need for this determination, but at more substantial cost.
ii. To support LDP, every tunnel requires a targeted LDP session. If
multi-homed prefixes are common, then this becomes a full mesh for
LDP. That isn't acceptable. Of course, multi-homed prefixes may be much
more infrequent for LDP than for IP; for example, there is no reason to
advertise a separate FEC for the subnet of a link. However, multi-homed
prefixes are a concern for LDP for at least the inter-area, AS External,
and BGP routes.
iii. If traffic is encapsulated to a node's regular address, because
that traffic is destined to a prefix advertised by the node, how does the
receiving node know to remove the encapsulation and forward the packet
inside, all in the fast path? Is this just a question of different
handling based on the header type inside the outer encapsulation (for GRE)?
iv. Perhaps these issues could be handled by determining a
next-next-hop that avoids the failure to reach an appropriate
advertiser. Of course, this is a different set/type of computation.
5. SRLGs and Broadcast Links: There seem to be a number of possible
ways to handle SRLGs and broadcast links, each of which provides a
different trade-off in terms of coverage, computation, and extra Notvia
addresses. There are basically 4 approaches at this point.
a. First, in order to compute a Notvia alternate that avoids a link,
the primary neighbor, and all SRLGs that the link is part of, it is
necessary to have a separate topology and associated SPF computation for
each link that is a member of an SRLG or a broadcast link. This also
requires a substantially larger number of Notvia addresses and the
corresponding mechanisms to determine how and when to allocate and
b. Second, one could use a topology that removed the primary neighbor
and see whether SRLG protection can be obtained either along S's path or
along any path of a neighbor of S that is also loop-free.
c. Third, when a Notvia address indicates to avoid a node, one could
remove not merely the node & the uni-directional links to and from that
node, but also any other links that are in a common SRLG with any of the
links to or from the removed node. This is pessimistic but allows some
SRLG protection without increased computation or Notvia addresses.
d. Fourth, one could simply track the SRLGs encountered along the
Notvia path; this just reports whether the alternate provides SRLG
protection without any effort to obtain it.
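
The third approach (5.c) can be sketched as a pruning step applied before
the SPF. This is only an illustration under assumed data structures:
uni-directional links as a dict keyed by (from, to), and a hypothetical
srlg_of map from link to the set of SRLGs it belongs to. The small topology
is made up, apart from S->E and H->F sharing SRLG R.

```python
def prune_for_notvia(links, srlg_of, avoid):
    """Remove `avoid`, its links, and every link sharing an SRLG with them.

    links:   {(u, v): cost} for uni-directional links
    srlg_of: {(u, v): set of SRLG ids}
    """
    # SRLGs that any link to or from the avoided node belongs to.
    tainted = set()
    for (u, v) in links:
        if avoid in (u, v):
            tainted |= srlg_of.get((u, v), set())
    # Keep links that neither touch the node nor share a tainted SRLG.
    return {
        (u, v): cost
        for (u, v), cost in links.items()
        if avoid not in (u, v) and not (srlg_of.get((u, v), set()) & tainted)
    }

links = {
    ("S", "E"): 1, ("E", "S"): 1,
    ("H", "F"): 1, ("F", "H"): 1,
    ("S", "H"): 5, ("H", "S"): 5,
}
srlg_of = {("S", "E"): {"R"}, ("H", "F"): {"R"}}

# Avoiding E also drops H->F, because it shares SRLG R with S->E.
print(sorted(prune_for_notvia(links, srlg_of, "E")))
# [('F', 'H'), ('H', 'S'), ('S', 'H')]
```

The pessimism is visible here: H->F is removed even though it may not fail
together with node E, so an alternate that exists in practice can be missed.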
6. Implementability: Clearly, the draft describes the basic idea for
Notvia addresses, but there are a fair number of implementation/protocol
decisions that need to be made before this can become anything more than an
7. There is a definite need to describe the convergence case
better. This is how the transition from using the alternate to the
network being converged happens, such that the alternate remains
a. For instance, if the node E fails, then the Notvia address E_!S
will no longer be advertised. If S was getting link protection (because
that was all that was possible, for instance) by tunneling traffic to E_!S,
it is important that this traffic be properly discarded when E's addresses
go away. This implies that there needs to be a default blackhole for
b. Another example is when node E fails, the next-next-hop B must
continue to advertise the Notvia address B_!E until the network converges
so that S can continue to tunnel traffic to B_!E as the alternate.
c. It is possible to get a micro-forwarding loop affecting a Notvia
address as a result of a less severe failure than anticipated. For
instance, consider the following topology.
| | \ 10
1 |R 1 |R \
| 5 | \
Link S->E and Link H->F are in SRLG R
When node E fails, if I converges before H, there will be a loop affecting
the Notvia address being used to reach F without going through any of Link
S->E, E or SRLG R.
d. How do exceptions work? Particularly in regards to an IP-in-IP
encapsulation such as GRE, it doesn't seem like MTU exceeded cases can be
handled cleanly either by use of DF or by doing IP fragmentation and then
reassembly at the end of the tunnel. This seems like a problem for all
ICMP packets; how could a source understand the header inside for a TTL
expired message, for instance?
e. For IP-in-IP tunnels, another concern is flow diversity. The IP
source and destination addresses are used to determine a flow; this flow
identification may then be used for a variety of purposes, including
ECMP. By putting all the traffic to a variety of destinations inside the
same header, the ability to take advantage of flow diversity appears to
have disappeared. Could this be solved by putting the original source
address into the encapsulating header? Are there other approaches?
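
The flow-diversity concern in 7.e can be illustrated with a toy ECMP
selection. This is a sketch only: CRC32 stands in for whatever hash a real
forwarding plane uses, and the addresses, the repairing node "S" and the
Notvia address "B_!E" are hypothetical labels.

```python
import zlib

def ecmp_member(src, dst, n_paths):
    """Pick an ECMP member from a (source, destination) pair."""
    return zlib.crc32(f"{src}>{dst}".encode()) % n_paths

def outer_header(inner_src, repair_node, notvia_addr, copy_inner_src=False):
    """Outer (source, destination) of the repair tunnel.

    With copy_inner_src=True the original source address is carried in the
    encapsulating header, as suggested above.
    """
    src = inner_src if copy_inner_src else repair_node
    return src, notvia_addr

# 32 hypothetical flows that all get repaired through the same tunnel.
flows = [(f"10.0.0.{i}", f"192.0.2.{i}") for i in range(1, 33)]

plain = {ecmp_member(*outer_header(s, "S", "B_!E"), 4) for s, _ in flows}
copied = {
    ecmp_member(*outer_header(s, "S", "B_!E", copy_inner_src=True), 4)
    for s, _ in flows
}

# Identical outer headers always hash to one member; copying the inner
# source lets the hash vary per flow.
print(len(plain), len(copied))
```

With identical outer headers every repaired flow lands on a single ECMP
member, whereas carrying the inner source gives the hash something to
vary on.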
Hopefully, this will spark some discussion on the issues.