
Composite Link Requirements, A Network Operator Perspective

Subject: Composite Link Requirements, A Network Operator Perspective
From: "Mcdysan, David E"
Date: Tue, 6 Apr 2010 09:14:57 -0400
Hi Everyone,

As I mentioned last Wednesday, I had been drafting another version of a
rewrite that is written from the network operator perspective of the
problems to be solved.

A major difference is an attempt to answer Alex's question of why
network operators need these additional functions, striving to avoid
stating these in terms of what changes could be made to existing
protocols.

I have tried to include the major requirements that Curtis proposed and
points raised in the discussion thread, in particular, an attempt to
remove any solution specific orientation. 

In my opinion, we need to write the requirements in such a way that the
IETF can use them to decide between different solution approaches that
will be proposed. Based upon the discussion on the list, there are
several solution ideas that people have in mind, and what we first need
to agree on are things that are "decidable" in order to choose between
them.

In order to keep to the 7 page limit (including boilerplate), the
proposed document structure would put much supporting detail in two
appendices (service provider use case and descriptions with references
to prior techniques (which may well be the basis on which a solution
would define extensions)). I recall Alex indicating that material on
"acknowledgement of Prior Work" could be in an Appendix, and I am
proposing that Service Provider "Story Telling" would also be in an
Appendix.

Your candid feedback, comments, questions and suggestions would be much
appreciated.

Thanks,

Dave
------------------------------------------------------------------------
1. Introduction

   The purpose of section 4 is to describe why network operators require
   certain functions in order to solve certain business problems. The
   intent is to first describe why things need to be done in terms of
   functional requirements that are as independent as possible of
   protocol specifications. For certain functional requirements this
   document describes a set of derived protocol requirements in
   section 5. Two appendices provide supporting details as a summary of
   existing/prior operator approaches and implementation techniques and
   relevant protocol standards.

2. Assumptions

   The services supported include L3VPN, L2VPN (VPWS and VPLS), Internet
   traffic encapsulated by at least one MPLS label, and dynamically
   signaled MPLS-TP LSPs and pseudowires. The MPLS LSPs supporting these
   services may be pt-pt, pt-mpt, or mpt-mpt.

   The locations in a network where these requirements apply are a Label
   Edge Router (LER) or a Label Switch Router (LSR) as defined in RFC
   3031.

   The IP DSCP cannot be used for flow identification since L3VPN
   requires Diffserv transparency [RFC 4031, 5.5.2], and in general
   network operators do not rely on the DSCP of Internet packets.

3. Definitions

   Composite Link: Section 6.9.2 of ITU-G.800 defines this in terms of
   three cases, of which the following two are relevant (the one
   describing inverse (TDM) multiplexing does not apply). Note that
   these definitions are from section 6.9, Layer Relationships.

   Case 1: "Multiple parallel links between the same subnetworks can be
   bundled together into a single composite link. Each component of the
   composite link is independent in the sense that each component link
   is supported by a separate server layer trail. The composite link
   conveys communication information using different server layer trails
   thus the sequence of symbols crossing this link may not be preserved.
   This is illustrated in Figure 14."

   Case 3: "A link can also be constructed by a concatenation of
   component links and configured channel forwarding relationships. The
   forwarding relationships must have a 1:1 correspondence to the link
   connections that will be provided by the client link. In this case,
   it is not possible to fully infer the status of the link by observing
   the server layer trails visible at the ends of the link. This is
   illustrated in Figure 16."

   Subnetwork: A set of one or more nodes (i.e., LER or LSR) and links.
   It can represent a site comprised of multiple nodes.

   Forwarding Relationship: Configured forwarding between ports on a
   subnetwork. It may be connectionless (e.g., IP), or connection
   oriented (e.g., MPLS signaled or configured).

   Component Link: A topological relationship between subnetworks
   (i.e., a connection between nodes), which may be a wavelength,
   circuit, virtual circuit or an MPLS LSP.

   Atomic Flow: A set of packets that must be transferred on one
   component link.

   Flow identification: The label stack and other information that
   uniquely identifies an atomic flow. Other information may include an
   IP header, PW control word, Ethernet MAC address, etc.

   Note that an LSP may contain one or more Atomic Flows.
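
   To make the Atomic Flow and Flow identification definitions concrete,
   the following Python sketch maps a flow identification to a single
   component link, so that every packet of an atomic flow stays on one
   link. The function name, the use of SHA-256, and the example label
   and address values are assumptions made for illustration only; they
   are not taken from any proposed solution.

```python
import hashlib


def select_component_link(label_stack, other_fields, num_links):
    """Return the index of the component link for one atomic flow.

    label_stack  -- sequence of MPLS label values, top of stack first
    other_fields -- sequence of further identifying values (for example
                    IP addresses or a MAC address); may be empty
    num_links    -- number of component links in the composite link
    """
    if num_links < 1:
        raise ValueError("a composite link needs at least one component link")
    # Serialize the flow identification deterministically, then hash it.
    key = repr((tuple(label_stack), tuple(other_fields))).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_links


# Hypothetical flow identification: two labels plus an address pair.
flow = ([16004, 299776], ["192.0.2.1", "198.51.100.7"])
# Packets with the same flow identification always map to the same link.
assert select_component_link(*flow, 4) == select_component_link(*flow, 4)
```

   A static mapping of this kind keeps an atomic flow on one component
   link but does not take measured load or latency into account.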

4. Network Operator Functional Requirements (FR)

   The Functional Requirements in this section are grouped into
   subsections, starting with the highest priority.

4.1. Availability, Stability and Transient Response

   Limiting the period of unavailability in response to failures or
   transient events is extremely important, as is maintaining
   stability. Keeping the transient period between a service disrupting
   event and the convergence of the routing and/or signaling protocols
   within a time frame specified by SLA objectives is a key operational
   requirement. The time frames range from rapid restoration, on the
   order of 100 ms or less (e.g., for VPWS), to several minutes (e.g.,
   for L3VPN), and may differ by set of customers within a single
   service.

   FR: Provide a means to summarize routing advertisements regarding the
   characteristics of a composite link such that the routing protocol
   converges within O(Foo) to meet the SLA objective.

   FR: Provide a means for aggregating signaling such that, in response
   to a failure in the worst case cross section of the network, MPLS
   LSPs are restored within O(Bar) to meet the SLA objective.

   FR: If extensions to existing protocols are specified and/or new
   protocols are defined, then the solution should provide a means for a
   network operator to migrate an existing deployment in a minimally
   disruptive manner.

   FR: Any automatic LSP routing and/or load balancing solutions must
   not oscillate in a way that changes the performance observed by users
   such that an SLA is violated.

   FR: Management and diagnostic protocols must be able to operate over
   composite links.

4.2. Component Links Provided by Lower Layer Networks

   Case 3 as defined in G.800 involves a component link supporting an
   MPLS layer network over another lower layer network (e.g., circuit
   switched or another MPLS network (e.g., MPLS-TP)). The lower layer
   network may change the latency (and/or other performance parameters)
   seen by the MPLS layer network. Network Operators have SLAs of which
   some components are based on performance parameters. Currently, there
   is no protocol for the lower layer network to inform the higher layer
   network of a change in a performance parameter. Communication of the
   latency performance parameter is a very important requirement.
   Communication of other performance parameters (e.g., delay variation)
   is desirable.

   FR: In order to support network SLAs and provide acceptable user
   experience, a protocol needs to be specified that allows a lower
   layer server network to communicate latency to the higher layer
   client network.

   FR: The precision of latency reporting should be at least 10% of the
   one way latency for latency of 1 ms or more.
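
   One possible reading of this precision requirement is that the
   reported value should differ from the actual one way latency by no
   more than 10% whenever the actual latency is 1 ms or more. The Python
   sketch below encodes that reading. The function and parameter names
   are hypothetical, and the interpretation itself is an assumption that
   the working group would need to confirm.

```python
def latency_report_within_precision(actual_ms, reported_ms,
                                    rel_tol=0.10, floor_ms=1.0):
    """Check a reported one-way latency against the assumed reading of the FR.

    actual_ms   -- true one-way latency in milliseconds
    reported_ms -- latency communicated by the lower layer network
    rel_tol     -- allowed relative error (10% under the assumed reading)
    floor_ms    -- latency below which the FR states no precision
    """
    if actual_ms < floor_ms:
        # The FR only states a precision for latencies of 1 ms or more.
        return True
    return abs(reported_ms - actual_ms) <= rel_tol * actual_ms
```

   For example, under this reading a 20 ms link reported as 21.5 ms
   would be acceptable, while the same link reported as 23 ms would not.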

   FR: Provide a means to limit the latency on a per LSP basis between
   nodes within a network to meet an SLA target when the path between
   these nodes contains one or more pairs of nodes (or sites) connected
   via a composite link.
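
   As a rough illustration of why component link selection matters for
   a per LSP latency limit, the Python sketch below compares the best
   and worst case path latency against an SLA target when each hop is a
   composite link whose component links have different latencies. The
   function name, the return strings, and the example values are
   assumptions for illustration only.

```python
def latency_verdict(composite_hops_ms, target_ms):
    """Classify an LSP path against a latency target.

    composite_hops_ms -- one list per hop, holding the latency in ms of
                         each component link of that hop's composite link
    target_ms         -- SLA latency target for the LSP
    """
    # Best case: the lowest latency component link is used at every hop.
    best = sum(min(hop) for hop in composite_hops_ms)
    # Worst case: the highest latency component link is used at every hop.
    worst = sum(max(hop) for hop in composite_hops_ms)
    if worst <= target_ms:
        return "met on any component link"
    if best <= target_ms:
        return "met only with constrained component link selection"
    return "cannot be met on this path"


# Hypothetical path with three hops; the middle hop has one component link.
hops = [[5, 12], [8], [3, 4]]
```

   With these example values the best case is 16 ms and the worst case
   is 24 ms, so a 20 ms target can only be met if the LSP is kept off
   the higher latency component links.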

   The SLAs differ across the services, and some services have different
   SLAs for different QoS classes; for example, one QoS class may have a
   much larger latency bound than another. Overload can occur, which
   would violate an SLA parameter (e.g., loss), and some remedy is
   needed to handle this case for a composite link.

   FR: If the total demand offered by traffic flows exceeds the capacity
   of the composite link, the solution should define a means to cause
   the LSPs for some traffic flows to move to some other point in the
   network that is not congested. These "preempted LSPs" may not be
   restored if there is no uncongested path in the network.
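
   The requirement does not say which LSPs should be moved. As one
   hypothetical policy, the Python sketch below selects LSPs for
   preemption in order of importance until the remaining demand fits
   the composite link. The priority field (larger number meaning less
   important), the tie-break on demand, and all names are assumptions
   for illustration, not part of the requirement.

```python
def select_lsps_to_preempt(lsps, capacity_mbps):
    """Pick LSPs to move off an overloaded composite link.

    lsps          -- list of (name, demand_mbps, priority) tuples; a larger
                     priority number means a less important LSP (assumption)
    capacity_mbps -- total capacity of the composite link
    """
    excess = sum(demand for _, demand, _ in lsps) - capacity_mbps
    preempted = []
    # Consider the least important LSPs first; break ties by larger demand.
    for name, demand, _prio in sorted(lsps, key=lambda l: (-l[2], -l[1])):
        if excess <= 0:
            break
        preempted.append(name)
        excess -= demand
    return preempted
```

   Whether a preempted LSP can then be restored depends on whether an
   uncongested path exists elsewhere in the network, which this sketch
   does not model.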

4.3. Parallel Component Links with Different Characteristics

   Corresponding to Case 1 of G.800, as one means to provide high
   availability, network operators deploy multiple nodes for the same
   MPLS layer network in a site, which is connected via multiple
   component links. In many cases, multiple component links connect a
   pair of nodes. Many techniques have been developed to balance the
   distribution of atomic flows across component links that connect the
   same pair of nodes (See Appendix XX.1.1). When the component links
   of the composite link do not connect a pair of nodes, but connect a
   pair of sites (subnetworks) other techniques have been developed (See
   Appendix XX.1.2). The following sections break the requirements into
   three cases determined by the connectivity of the component links: a)
   same pair of nodes or sites, b) same pair of nodes only, c) component
   links connecting multiple pairs of nodes in a pair of sites.

   a) Additional protocol is required to provide the following
   additional functions when component links connect a pair of nodes (or
   sites):

   FR: Measure traffic on a labeled traffic flow and dynamically select
   the component link on which to place this flow in order to balance
   the load so that no component link in the composite link between a
   pair of nodes (or sites) is overloaded.
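
   A minimal Python sketch of measurement-based placement follows. It
   assumes that a measured rate is already available for each flow and
   uses a simple greedy rule (largest flow first, onto the component
   link with the most spare capacity). The greedy rule, the function
   name, and the data layout are assumptions for illustration; the
   requirement does not prescribe any particular algorithm.

```python
def place_flows(flow_rates, link_capacities):
    """Assign each measured flow to one component link.

    flow_rates      -- dict mapping flow name to measured rate
    link_capacities -- list of component link capacities, same units
    Returns (placement, loads): flow name -> link index, and per-link load.
    """
    loads = [0] * len(link_capacities)
    placement = {}
    # Place the largest flows first so they get the emptiest links.
    for flow, rate in sorted(flow_rates.items(),
                             key=lambda kv: kv[1], reverse=True):
        spare = [cap - load for cap, load in zip(link_capacities, loads)]
        best = spare.index(max(spare))
        if rate > spare[best]:
            # No component link can take this flow without overload.
            raise ValueError(f"flow {flow!r} does not fit on any link")
        placement[flow] = best
        loads[best] += rate
    return placement, loads
```

   A flow that is larger than every component link cannot be placed by
   this sketch, which is the case the requirement in b) on distributing
   atomic flows from a single LSP is intended to address.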

   FR: When a traffic flow is moved from one component link to another
   in the same composite link between a set of nodes (or sites), this
   must be done in a minimally disruptive manner.

   When a flow is moved from a current link to a target link with
   different latency, reordering can occur if the target link latency is
   less than that of the current link, or clumping can occur if the
   target link latency is greater than that of the current link. Some
   flows (e.g., timing distribution, PW circuit emulation) are quite
   sensitive to these effects, and limits on them may be specified in an
   SLA or needed to meet a user experience objective (e.g., avoiding
   jitter buffer under/overrun).



   FR: Provide a means to identify flows whose rearrangement frequency
   needs to be bounded by a configured value.

   FR: Shall provide a means that communicates whether the flows within
   an LSP can be split across multiple component links. Should provide a
   means to indicate the flow identification field(s) that can be used
   to do this.

   FR: Provide a means to indicate that a traffic flow shall select a
   component link with the minimum latency value.

   b) Additional protocol is required to provide the following
   additional functions when component links connect a pair of nodes:

   FR: Provide a means local to the node connected via a composite link
   to automatically distribute the load between the component links in
   the composite link that connects to the other node.

   FR: Provide a means to distribute atomic flows from a single LSP
   across multiple component links to handle at least the case where the
   traffic carried in an LSP exceeds the capacity of any component link
   in the composite link.

   c) Additional protocol is required to provide the following
   additional functions when component links connect different sites:

   FR: Provide a means upstream of the sites connected via a composite
   link to automatically distribute the load between the composite links
   that connect the individual nodes in the sites.

5. Derived Requirements (DR)

   This section takes the next step and derives high-level requirements
   on protocol specification from the functional requirements.

   DR: Attempt to extend existing protocols wherever possible,
   developing a new protocol only if this adds a significant set of
   capabilities.

   The vast majority of network operators have provisioned L3VPN
   services over LDP. Many have deployed L2VPN services over LDP as
   well. TE extensions to IGP and RSVP-TE are viewed as being too
   complex.

   DR: Solutions which extend LDP capabilities to meet functional
   requirements (without using TE methods as decided in RFC 3468) are
   highly desirable.

   DR: Coexistence of LDP and RSVP-TE signaled LSPs must be supported on
   a composite link. Other functional requirements should be supported
   as independently of signaling protocol as possible.

   DR: When the nodes in a subnetwork connected via a composite link are
   in the same MPLS network, the solution can define extensions to the
   IGP.

   DR: When the nodes in a subnetwork connected via a composite link are
   in different MPLS networks, the solution cannot rely on extensions to
   the IGP.

   DR: The number of links advertised in the IGP and a worst case
   scenario of the volume of change for such advertisements affect IGP
   convergence, potentially causing a period of unavailability as
   perceived by users. NEED TO AGREE ON SOME WAY TO QUANTIFY THIS IN
   ORDER TO DECIDE BETWEEN SOLUTION APPROACHES.

   DR: The number of RSVP-TE LSPs to be resignaled in response to a
   catastrophic failure event affects restoration time, potentially
   causing a period of unavailability as perceived by users. NEED TO
   AGREE ON SOME WAY TO QUANTIFY THIS IN ORDER TO DECIDE BETWEEN
   SOLUTION APPROACHES.

6. References

   [TE Rqmts]

   [RFC 2702] Awduche, Malcolm, Agobua, O'Dell, McManus, "Requirements
             for Traffic Engineering Over MPLS"

   [RFC 3809] Nagarajan, et al, "Generic Requirements for Provider
             Provisioned Virtual Private Networks (PPVPN)"

   [RFC 4665] Augustyn, Serbest et al, "Service Requirements for
             Layer 2 Provider-Provisioned Virtual Private Networks"

   [RFC 4031] Carugi, McDysan et al, "Service Requirements for
             Layer 3 Provider Provisioned Virtual Private Networks
             (PPVPNs)"

   [RFC 5254] Bitar, Bocci, Martini et al, "Requirements for
             Multi-Segment Pseudowire Emulation Edge-to-Edge (PWE3)"

   [RFC 3031]

   [G.800]

   [RFC 3468] L. Andersson, G. Swallow, "The Multiprotocol Label
             Switching (MPLS) Working Group decision on MPLS signaling
             protocols."

7. Appendix A: More Details on Existing Network Operator Practices and
   Protocol Usage

   Network operators have SLAs for services that are comprised of
   numerical values for performance measures, principally availability,
   latency, and delay variation. See [Y.1541], [RFC 3809, 4.9] for
   examples of the form of such SLAs. Note that the numerical values of
   Y.1541 span multiple networks and may be looser than network operator
   SLAs. Applications and acceptable user experience have a relationship
   to these performance parameters.

   Consider latency as an example. In some cases, minimizing latency
   relates directly to the best customer experience (e.g., in TCP closer
   is faster). In other cases, user experience is relatively insensitive
   to latency, up to a specific limit at which point user perception of
   quality degrades significantly (e.g., interactive human voice and
   multimedia conferencing). A number of SLAs have a bound on
   point-point latency, and as long as this bound is met, the SLA is met
   -- decreasing the latency is not necessary. In some SLAs, if the
   specified latency is not met, the user considers the service as
   unavailable. An unprotected LSP can be manually provisioned on a set
   of links to meet this type of SLA, but this lowers availability since
   an alternate route that meets the latency SLA cannot be determined.

   Historically, when an IP/MPLS network was operated over a lower layer
   circuit switched network (e.g., SONET rings), a change in latency
   caused by the lower layer network (e.g., due to a maintenance action
   or failure) was not known to the MPLS network. This resulted in
   latency affecting end user experience, sometimes violating SLAs or
   resulting in user complaints.

   A response to this problem was to provision IP/MPLS networks over
   unprotected circuits and set the metric and/or TE-metric proportional
   to latency. This resulted in traffic being directed over the least
   latency path, even if this was not needed to meet an SLA or meet user
   experience objectives. This results in reduced flexibility and
   increased cost for network operators. Using lower layer networks to
   provide restoration and grooming is expected to be more efficient,
   but the inability to communicate performance parameters, in
   particular latency, from the lower layer network to the higher layer
   network is an important problem to be solved before this can be
   done.

   Latency SLAs for pt-pt services are often tied closely to geographic
   site locations, while latency for mpt services may be based upon a
   worst case within a region.

   The presence of only three Traffic Class (TC) bits (previously known
   as EXP bits) in the MPLS shim header is limiting when a network
   operator needs to support QoS classes for multiple services (e.g.,
   L2VPN VPWS, VPLS, L3VPN and Internet), each of which has a set of QoS
   classes that need to be supported. In some cases one bit is used to
   indicate conformance to some ingress traffic classification, leaving
   only two bits for indicating the service QoS classes. The approach
   that has been taken is to aggregate these QoS classes into similar
   sets on LER-LSR and LSR-LSR links.
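
   The arithmetic behind this limitation can be sketched in a few lines
   of Python: three bits give eight code points, and reserving one bit
   for conformance leaves two bits, i.e. four aggregated QoS classes.
   The bit layout used here (class in the upper two bits, conformance in
   the lowest bit, with 0 meaning conformant) is purely an assumption
   for illustration; operators choose their own mappings.

```python
def encode_tc(qos_class, conformant):
    """Pack an aggregated QoS class and a conformance flag into 3 TC bits.

    qos_class  -- aggregated class, 0..3 (only two bits are available)
    conformant -- True if the packet conformed to the ingress classification
    """
    if not 0 <= qos_class <= 3:
        raise ValueError("only four aggregated QoS classes fit in two bits")
    # Assumed layout: class in bits 2..1, conformance in bit 0.
    return (qos_class << 1) | (0 if conformant else 1)


def decode_tc(tc):
    """Inverse of encode_tc: return (qos_class, conformant)."""
    if not 0 <= tc <= 7:
        raise ValueError("the TC field is three bits wide")
    return tc >> 1, (tc & 1) == 0
```

   Any fifth service class has to share a code point with an existing
   one, which is why the classes are aggregated into similar sets.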

   Labeled LSPs and the use of link layer encapsulation have been
   standardized in order to provide a means to meet these needs.

   The IP DSCP cannot be used for flow identification since [RFC 4031,
   5.5] requires Diffserv transparency, and in general network operators
   do not rely on the DSCP of Internet packets.

   A label is pushed onto Internet packets when they are carried along
   with L2/L3VPN packets on the same link or lower layer network, which
   provides a means to distinguish between the QoS classes for these
   packets.

   Operating an MPLS-TE network involves a different paradigm from
   operating an IGP metric-based LDP signaled MPLS network. The mpt-pt
   LDP signaled MPLS LSPs occur automatically, and balancing across
   parallel links occurs if the IGP metrics are set "equally" (with
   equality a locally definable relation).

   Traffic is typically comprised of a few large (some very large) flows
   and many small flows. In some cases, separate LSPs are established
   for very large flows. This can occur even if the IP header
   information is inspected by a router, for example an IPsec tunnel
   that carries a large amount of traffic.

   Appendix A References

   [Y.1541]

8. Appendix B: More Details on Existing Standards and Techniques

   AUGMENT WITH PROPOSED TEXT FROM CURTIS, MAILING LIST DISCUSSION,
   PRIOR DRAFT

8.1. Techniques for Load Balancing across Component Links

8.1.1. Techniques for Component Links Connecting a Pair of Nodes

   * LAG

   * Hashing

   * ECMP

   * LSP Pinning



8.1.2. Techniques for Component Links Connecting a Pair of Sites

   * OMP

   * RSVP-TE signaled LSPs and metric setting

   * ECMP

8.2. Techniques for Minimizing Periods of Unavailability

8.2.1. Routing Protocol Based

   * Link Bundling

   * LFA, Fast IGP Convergence

8.2.2. Signaling Protocol Based

   * RSVP-TE FRR

8.3. Techniques for Handling Congestion

8.3.1. Routing Protocol Based

   * TE extensions to IGP

8.3.2. Signaling Protocol Based

   * MPLS TE for Diffserv

   References for Appendix B 
_______________________________________________
rtgwg mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/rtgwg
