Subject: RE: composite link - candidate for respin, maybe
From: "Mcdysan, David E"
Date: Fri, 26 Mar 2010 14:12:16 -0400
Hi Curtis,

Thanks for taking the initiative to write this up.

Also thanks to the rtgwg participants who provided many insightful
questions and suggestions at the mike as well as the helpful direction
from the chairs.

The definition of "flow" has not been agreed. The current scope is MPLS
only, which excludes the Diffserv definition you propose unless the WG
agrees to extend the scope. I have asked the WG chairs to call for a
resolution of this question on the list, and hopefully we can settle it
soon.

A few detailed comments in line below.

It would help if the other folks who took notes could publish their
lists of things we identified as requirements and/or problem statements.

Regards,

Dave 

> -----Original Message-----
> From: [email protected] [mailto:[email protected]] 
> On Behalf Of Curtis Villamizar
> Sent: Friday, March 26, 2010 3:02 AM
> To: [email protected]
> Subject: composite link - candidate for respin, maybe
> 
> 
> This email is to the authors of the CL drafts and the WG as a whole.
> 
> I started out writing a quick few bullets on what we talked 
> about in today's WG meeting that we seemed to agree on and 
> that made some sense to me.  The latter part may be a filter 
> that introduces artifacts to the signal (or noise).
> 
> That didn't seem to stand alone so I added some terms up 
> front and then some introductory text, trying to keep it concise.
> 
> What I would like to ask the WG is whether I'm on a 
> reasonable track here.  [hopefully we won't have continued 
> dead silence on the list.]
> 
> What I would like to ask the authors is whether they would 
> consider this a reasonable restart point.
> 
> I've tried to keep in mind the chair's instructions and what 
> I sensed to be the feelings in the audience (though we didn't 
> formally hum - is formal WG hum an oxymoron?).
> 
>   keep it short - 5 pages including boilerplate if possible
> 

I heard 7 pages, including boilerplate, and Alex's response to my
question regarding "Acknowledgement of Prior work" material was that it
could be moved to an Appendix and not count against this quota. Alex,
please confirm.

>   include clear requirements
> 
>   do not mandate implementation details
> 
> I'm not sure if the usage scenarios at the end are needed or 
> if this could stand alone.
> 
> There was a lot of chatter in the room so I apologize if I 
> missed something.  This is based on my recollection which is 
> known to have a fairly high bit error rate even on better days.
> 
> Curtis
> 
> 
> 
> btw - this is obviously not an I-D as is for more reasons 
> than missing boilerplate.  It's also late and I'm tired so I 
> made no attempt to get citation right.
> 
> 
> 
> 
> Key terms:
> 
>   flow - A flow in the context of this document is an aggregate of
>     traffic for which packets should not be reordered.  A flow is
>     similar to a microflow or ordered aggregate as defined in
>     [diffserv framework].  The term "flow" is used here for brevity.
>     This definition of flow should not be interpreted to have broader
>     scope than this document.

Note that a Diffserv microflow is identified by source address, source
port, destination address, destination port and protocol id. See the
scope decision above.

> 
>   flow identification - The means of identifying a flow or a group of
>     flows may be specific to a type of payload.
> 
>   top label entry - In MPLS the top label entry contains the label on
>     which an initial forwarding decision is made.  This label may be
>     popped and the forwarding decision may involve further labels but
>     that is immaterial to this discussion.
> 
>   label stack - In MPLS the label stack includes all of the MPLS
>     labels from the top of the stack to the label marked with the
>     S-bit (Bottom of Stack bit) set.
> 
>   outer and inner LSP - The LSP associated with labels in the outer
>     encapsulation are called outer LSP.  Those LSP which are
>     associated with inner encapsulation (closer to the label entry
>     containing the S-bit) are called inner LSP.  These are not called
>     top and bottom LSP since MPLS and PWE draw the label stack in
>     opposite directions with PWE putting the outermost label on the
>     bottom of diagrams (and confusing people in doing so).

Need to cover the case with more than two LSPs. Would using "inner
LSPs" (plural) work? (For brevity, is the last sentence necessary?)
> 
>   component link - See RFC4201.
Not clear: a component link is not the bundled link. Use the definition
from the existing I-D?

> 
>   composite link - [pull from existing I-D]
> 
> Introduction:
> 
>   There is often a need to provide large aggregates of bandwidth that
>   is best provided using parallel links between routers or MPLS LSR.
>   In core networks there is often no alternative since the aggregate
>   capacities of core networks today far exceed the capacity of a
>   single physical link or single packet processing element.

Move the following to an Appendix summarizing existing approaches, per
direction from Alex, and merge it with the existing material and with
your comments and inputs to the list in the thread "Acknowledgement of
Prior work".

> 
>   Today this requirement can be handled by Ethernet Link Aggregation
>   [IEEE802.1AX], link bundling [RFC4201], or other aggregation
>   techniques some of which may be vendor specific.  Each has strengths
>   and weaknesses.
> 
>   The term composite link is more general than terms such as link
>   aggregate which is generally considered to be specific to Ethernet
>   and its use here is consistent with the broad definition in [ITU
>   8xxx].
> 
>   Large aggregates of IP traffic do not provide explicit signaling to
>   indicate the expected traffic loads.  Large aggregates of MPLS
>   traffic are carried in MPLS tunnels supported by MPLS LSP.  LSP
>   which are signaled using RSVP-TE extensions do provide explicit
>   signaling which includes the expected traffic load for the
>   aggregate.  LSP which are signaled using LDP do not provide an
>   expected traffic load.
> 
>   MPLS LSP may contain other MPLS LSP arranged hierarchically.  When
>   an MPLS LSR serves as a midpoint LSR in an LSP carrying other LSP as
>   payload, there is no signaling associated with these inner LSP.
>   Therefore even when using RSVP-TE signaling there may be
>   insufficient information provided by signaling to adequately
>   distribute load across a composite link.
> 
>   Generally a set of label stack entries that is unique across the
>   ordered set of label numbers can safely be assumed to contain a
>   group of flows.  The reordering of traffic can therefore be
>   considered to be acceptable unless reordering occurs within traffic
>   containing a common unique set of label stack entries.  Existing
>   load splitting techniques take advantage of this property in
>   addition to looking beyond the bottom of the label stack and
>   determining if the payload is IPv4 or IPv6 to load balance traffic
>   accordingly.
> 
>   For example a large aggregate of IP traffic may be subdivided into a
>   large number of groups of flows using a hash on the IP source and
>   destination addresses.  This is as described in [diffserv
>   framework].  For MPLS traffic carrying IP, a similar hash can be
>   performed on the set of labels in the label stack.  These techniques
>   are both examples of means to subdivide traffic into groups of flows
>   for the purpose of load balancing traffic across aggregated link
>   capacity.  The means of identifying a flow should not be confused
>   with the definition of a flow.
> 
>   Discussion of whether a hash based approach provides a sufficiently
>   even load balance using any particular hashing algorithm or method
>   of distributing traffic across a set of component links is outside
>   of the scope of this document.
> 
>   The use of three hash based approaches is defined in RFCxxxx.  The
>   use of hash based approaches is mentioned as an example of an
>   existing set of techniques to distribute traffic over a set of
>   component links.  Other techniques are not precluded.
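
To make the hash based splitting described above concrete, here is a
minimal sketch in Python. It walks the label stack down to the entry
with the S-bit set and hashes the label values to pick a component
link. The function names and the choice of CRC-32 are assumptions made
for illustration only; neither the draft nor existing implementations
are claimed to work this way, and other techniques are not precluded.

```python
import zlib
from typing import List, Sequence


def parse_label_stack(data: bytes) -> List[int]:
    """Return the 20-bit label values from the top of the stack down to
    and including the entry with the S-bit (Bottom of Stack) set.

    Each label stack entry is 32 bits:
    label (20) | TC (3) | S (1) | TTL (8).
    """
    labels: List[int] = []
    for off in range(0, len(data) - 3, 4):
        entry = int.from_bytes(data[off:off + 4], "big")
        labels.append(entry >> 12)   # label value: upper 20 bits
        if (entry >> 8) & 1:         # S-bit set: bottom of stack reached
            return labels
    raise ValueError("no entry with the S-bit set")


def pick_component(labels: Sequence[int], n_links: int) -> int:
    """Map a label stack to a component link index (hypothetical helper).

    Packets with the same ordered set of labels always hash to the same
    index, so traffic sharing a label stack is not reordered.
    CRC-32 is an arbitrary stand-in for whatever hash is really used.
    """
    key = b"".join(label.to_bytes(3, "big") for label in labels)
    return zlib.crc32(key) % n_links
```

Note that this only identifies groups of flows; it says nothing about
how evenly the load ends up spread, which the text above leaves out of
scope.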

End of Material Moved to Appendix. 

I had envisioned a section summarizing network operator problems to be
solved preceding the requirements. I plan to draft some text and send it
out for comment on the list.


> 
> Requirements:
> 
>   These requirements refer to link bundling solely to provide a frame
>   of reference.  This requirements document does not intend to
>   constrain a solution to build upon link bundling.  Meeting these
>   requirements using extensions to link bundling is not precluded, if
>   doing so is determined by later IETF work to be the best solution.
> 
>   The first few requirements listed here are met or partially met by
>   existing link bundling behavior including common behaviour that is
>   implemented when the all ones address (for example 0xFFFFFFFF for
>   IPv4) is used.  This common behaviour today makes use of a hashing
[In the next line, replace "introduction" with "Appendix".]
>   technique as described in the introduction, though other behaviours
>   are not precluded.
> 
>   1.  Aggregated control information which summarizes multiple
>       parallel links into a single advertisement is required to reduce
>       information load and improve scalability.
> 

Based upon Tony's comments, I think that the objective is to convey the
same information as a set of parallel links with different
characteristics more efficiently and in a more scalable manner than
making an advertisement for each parallel link. As discussed,
summarization (e.g., more than one value for latency) is a specific
solution. Comments, other recollections?

>   2.  A means to support very large LSP is needed, including LSP whose
>       total bandwidth exceeds the size of a single component link but
>       whose traffic has no single flow greater the component link.

Suggested wording for the last clause: "no single flow greater than the
largest component link."

Above is an important requirement you mentioned. 
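
As a rough illustration of requirement 2, the sketch below checks the
two conditions stated there. The function and parameter names are
hypothetical. The check is necessary but not sufficient: whether the
individual flows can actually be packed onto the component links
depends on the distribution method, which is out of scope here.

```python
from typing import Sequence


def lsp_may_fit(lsp_bw: float, largest_flow: float,
                component_free_bw: Sequence[float]) -> bool:
    """Necessary conditions for carrying a large LSP on a composite link.

    lsp_bw            total bandwidth of the LSP
    largest_flow      largest single flow expected within the LSP
    component_free_bw available bandwidth on each component link
    """
    if not component_free_bw:
        return False
    # The LSP as a whole may exceed any one component link ...
    fits_in_aggregate = lsp_bw <= sum(component_free_bw)
    # ... but no single flow may exceed the largest component link.
    flow_fits = largest_flow <= max(component_free_bw)
    return fits_in_aggregate and flow_fits
```

For example, a 15G LSP whose largest flow is 4G may fit on two 10G
component links, while a 15G LSP containing a 12G flow cannot.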

Move following to Appendix
>       In link bundling this is supported by many implementations using
>       the all ones address component addressing and hash based
>       techniques.
> 
>       Note: some implementations impose further restrictions regarding
>       the distribution of traffic across the set of identifiers used
>       in flow identification.  Discussion of algorithms and
>       limitations of existing implementations is out of scope for this
>       requirements document.
End Move to Appendix.

> 
>   The remaining requirements are not met by existing link bundling.
> 
>   3.  In some cases more than one set of metrics is needed to accommodate a
>       mix of capacity with different characteristics, particularly a
>       bundle where a subset of component links have shorter delay.

I would avoid the term "metric" here to prevent confusion with current
solutions. Proposed rewording as follows:

A means in control and data plane protocols is needed to accommodate a
composite link composed of component links with different
characteristics, including at least: capacity, current latency,
indication of whether latency can change, ... others?

> 
>   4.  A mechanism is needed to signal an LSP such that a component
>       link with specific characteristics is chosen, if a preference
>       exists.  For example, the shortest delay may be required for
>       some LSP, but not required for others.

As discussed in the meeting, picking the shortest delay per composite
link is one requirement as you state above. 

We need to add the other service provider requirement described in the
meeting, where certain LSPs must have a latency that is less than a
specified end-to-end value.

In my view, these are separate requirements, which may have different
solutions.
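
To illustrate the per-composite-link half of this (requirements 3 and
4), here is a small Python sketch. The record fields mirror the
characteristics proposed above (capacity, current latency, whether
latency can change); the names and the tie-breaking policy are
assumptions for illustration, not a proposed solution.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ComponentLink:
    name: str
    free_bw: float        # unreserved capacity
    latency_ms: float     # current latency
    latency_fixed: bool   # False if the latency can change


def select_component(links: Sequence[ComponentLink], bw: float,
                     prefer_low_latency: bool = False
                     ) -> Optional[ComponentLink]:
    """Pick a component link for an LSP needing `bw` of capacity.

    With a latency preference, the lowest-latency link with room wins.
    Without one, the link with the most free capacity is used; that
    default is an arbitrary choice made for this sketch.
    """
    candidates = [link for link in links if link.free_bw >= bw]
    if not candidates:
        return None
    if prefer_low_latency:
        return min(candidates, key=lambda link: link.latency_ms)
    return max(candidates, key=lambda link: link.free_bw)
```

The end-to-end latency bound is a different problem, since it spans
several composite links and cannot be decided at one of them alone.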

> 
>   5.  LSP signaling is needed to indicate a preference for placement
>       on a single component link and to specifically forbid spreading
>       that LSP over multiple component links based on flow
>       identification beyond the outermost label entry.

Need to clarify whether this applies to the outer LSP and/or inner
LSP(s). As discussed, we need to add a description stating that
Composite Link end point routers participate in outer LSP signaling, may
"snoop" signaling for inner LSP(s), or may be able to determine that a
label may be used for component link assignment decisions (e.g., an
entropy label).

> 
>   6.  A means to support non-disruptive reallocation of an existing
>       LSP to another component link is needed.

Need to include the control of change frequency from section 4.1.1.2.3
of the existing I-D in this requirement.

In the current draft, the use of MPLS TC (aka EXP) and DSCP bits is
specified only for this purpose. Do we want to describe their use in
other cases? We should also describe the fact that most operators do not
modify DSCP, and instead map it to EXP bits, so that DSCP is transparent
to customers.

> 
>   7.  A means to populate the TE-LSDB with information regarding which
>       links (per end) can support distribution of large LSP across
>       multiple component links based on the component flows and the
>       characteristics of this capability.  Key characteristics are:
> 
>       a.  The largest single flow that can be supported.  This may
>           or may not be related to the size of component links.
> 
>       b.  Characteristics of the flow identification method.  [These
>           can be enumerated in this document or a later document. ]

Would this be a place where MPLS and IP requirements would be
differentiated?

> 
>       c.  Other characteristics?  [ Not sure if I got everything
>           mentioned in the WG meeting. ]

We ran out of time in the WG, but support for LDP is an important
operator requirement. The vast majority of L3VPNs run over LDP
"tunnels." I think we need to restate the material on this subject from
section 4.2.3 in terms of the why instead of the how.

Also need to cover case mentioned by Ning that both LDP and RSVP-TE will
be present on the same composite link. I think you are proposing to add
unlabeled IP traffic to this (data plane) set as well. 

Also, traffic measurement based support at the composite link is
important.

Need to describe the operator requirements on what needs to happen in
the event of "Bandwidth Shortage Events". See section 4.1.4.1.

I was going to describe "auto-bandwidth" for RSVP-TE as a current method
used by operators that the composite link should still support. 

We need to state how control (routing, signaling) and management (OAM)
packets are directed to component links, and how this must be done so
that there is no impact on liveness, adjacency and/or OAM.

Add dynamic signaling and advertisement of lower layer component links,
and feedback from the lower layer regarding latency.  See 4.2.1.1.

Backward compatibility as you proposed on previous thread (merge with
text in 4.2.1.3)

Automatic derivation of routing metrics based upon signaled (or
measured) latency changes (4.2.1.3).


Would look to others who were keeping notes to add to the above list.

> 
>   8.  Some means is needed for an LSP which allows distribution of
>       flows across member links to indicate characteristics of flow
>       distribution.  These characteristics include:
> 
>       a.  The largest flow expected.

Please say more, not sure I understand this point. 

> 
>       b.  Other? [ Did we identify any other?  There was some
>           chatter about distribution of flows but no specific
>           characteristic was called for - AFAIK ]

See above: control of change frequency.

Also, "pinning" in some way (see 4.1.1.1) based upon snooped signaling
for inner LSPs and configuration (e.g., of FEC (ranges)).

> 
>   9.  In some cases it may be useful to measure link parameters
>       and reflect these in metrics.  Link delay is an example.

I think this is one of the most important requirements. Editorially, it
should be toward the top of the list.

As commented by Dimitri, we need to state requirements on the frequency
of latency measurements and the precision required.

> 
>   10. Some uses require an ability to bound the sum of delay metrics
>       along a path while otherwise taking the shortest path related to
>       another metric.  [This was mentioned but seems a bit orthogonal
>       to all but #3.]

See 4 above as well. We think the characteristics of 4.2.1.3 and 4.2.2.1
are important, but we need to focus the text on the semantics of why
they are important. The current descriptions are a "how" that assumes
crankback style signaling as the solution approach. A discussion of the
alternative "how" approaches could be moved to the framework document.
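
To show what requirement 10 asks for without assuming crankback or any
other signaling approach, here is a minimal centralized sketch in
Python: find the least-cost path whose summed delay stays within a
bound. The graph representation and function name are hypothetical,
and costs and delays are assumed to be non-negative.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# node -> list of (neighbor, cost, delay); a hypothetical representation
Graph = Dict[str, List[Tuple[str, float, float]]]


def bounded_delay_path(graph: Graph, src: str, dst: str, max_delay: float
                       ) -> Optional[Tuple[float, float, List[str]]]:
    """Return (cost, delay, path) for the least-cost path from src to
    dst whose total delay does not exceed max_delay, or None."""
    heap: List[Tuple[float, float, str, List[str]]] = [(0.0, 0.0, src, [src])]
    best_delay: Dict[str, float] = {}   # lowest delay settled per node
    while heap:
        cost, delay, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, delay, path
        # Entries pop in order of increasing cost, so an earlier visit
        # with delay <= ours dominates this one; skip it.
        if best_delay.get(node, float("inf")) <= delay:
            continue
        best_delay[node] = delay
        for nxt, link_cost, link_delay in graph.get(node, []):
            if delay + link_delay <= max_delay:
                heapq.heappush(heap, (cost + link_cost, delay + link_delay,
                                      nxt, path + [nxt]))
    return None
```

With a loose bound this returns the ordinary shortest path; as the
bound tightens it moves to costlier but faster paths, and finally to no
path at all.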



Also as discussed in the meeting, a means to implement Diffserv for
MPLS-TE when composite links are present in the network is also
required. 

> 
> Purpose:
> 
>   [ A set of example scenarios were discussed.  We may want to capture
>     them here (and maybe refine the examples). ]

IMO, example scenarios or solution sketches may be more appropriate for
a framework document.

> 
> _______________________________________________
> rtgwg mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/rtgwg
> 