
RE: Acknowledgement of Prior Work (Was: Composite Link Requirements)

Subject: RE: Acknowledgement of Prior Work (Was: Composite Link Requirements)
From: "Mcdysan, David E"
Date: Thu, 4 Mar 2010 07:40:47 -0500

Hi Curtis,

When Ning, Andy, Lucy and I first submitted this draft in October 2008,
there was discussion about including IP traffic. The discussion resulted
in a decision to address the MPLS-focused case first.

You are asking the wg to re-evaluate this decision. No one else has
responded to this thread and I would like to wait and hear from others
before making more detailed replies.

A few responses in line below.

Dave 

> -----Original Message-----
> From: [email protected] [mailto:[email protected]] 
> On Behalf Of Curtis Villamizar
> Sent: Wednesday, March 03, 2010 1:27 AM
> To: Mcdysan, David E
> Cc: [email protected]
> Subject: Re: Acknowledgement of Prior Work (Was: Composite 
> Link Requirements)
> 
> 
> In message 
> <[email protected]
> .verizon.com>
> "Mcdysan, David E" writes:
> >  
> > Hi Curtis,
> >  
> > The co-authors of this draft reviewed your comments and decided to 
> > respond with three separate messages, to separate the threads as 
> > follows so that all of the issues you raise can be resolved 
> > efficiently.
> >  
> >     #1. Composite Link Trademark Issue (Was: Composite Link
> >         Requirements)
> >     #2. Acknowledgement of Prior Work (Was: Composite Link
> >         Requirements)
> >     #3. Proposed Resolution of Comments (Was: Composite Link
> >         Requirements)
> >  
> > This is thread #2.
> >  
> > Dave
> >  
> > > -----Original Message-----
> > > From: [email protected] [mailto:[email protected]] On 
> > > Behalf Of Curtis Villamizar
> > > Sent: Saturday, February 27, 2010 4:00 AM
> > > To: [email protected]
> > > Subject: Composite Link Requirements
> > > 
> > > 
> > > Hi there good people of RTGWG,
> > > 
> > > This is in regards to the goals that are embodied in the RTGWG 
> > > acceptance of a draft to deal with requirements for composite link, 
> > > currently named draft-ietf-rtgwg-cl-requirement-00.txt
> > > 
> > > 
> > > I'm bringing up two issues in this email.  One is prior composite 
> > > link work and the
> >  
> > This is the subject of this thread: 
> > > other is prior methods of
> > > handling composite link, which should be acknowledged.  
> >  
> >  
> > Snipped
> >  
> > > 
> > > Note that ITU's G.800 does not define what a composite link is and 
> > > only mentions composite four times in the document, including use of 
> > > composite link and composite trail.  The figure indicates that a 
> > > composite link is "inverse multiplexing".  For this reason, I don't 
> > > think G.800 should be referenced because it's a big load of **** with 
> > > only slight mention of CL.
> > > 
> >  
> > Part of the rtgwg acceptance of this as a wg draft was to include a 
> > reference to G.800. Lucy pointed out that the current text in section
> > 2.2 is a paraphrase. We could replace it with the following quote from 
> > section 6.9.2 of G.800, if that is wg consensus:
> >  
> > "Multiple parallel links between the same subnetworks can be bundled 
> > together into a single composite link. Each component link of the 
> > composite link is independent in the sense that each component link is 
> > supported by a separated service layer trail. The composite link 
> > conveys communication information using different server layer trails 
> > thus the sequence of symbols cross these links may not be preserved."
> >  
> > The text related to Inverse multiplexing is one of three cases in 
> > section 6.9.2. The text above is the first case. G.800 states that 
> > these are separate cases.
> >  
> > The text above Figure 16 in G.800 related to concatenated server 
> > trails may also be relevant (at least in the framework).
> >  
> > So, the choices are to replace the current text with the direct quote, 
> > or to remove the reference to G.800 and replace it with some other text.
> >  
> > WG comments?
> 
> If I understand correctly, this definition of CL encompasses all 
> existing LAG (or non-Ethernet LAG-like aggregation), 
> existing ECMP, and existing unequal multipath techniques.

In G.800, yes. In the draft, the intent was to limit the scope to
MPLS-labelled flows.

> 
> If so, then the requirements here define yet another instance 
> of CL as defined by ITU.
> 
> You might want to acknowledge that there was a very similar 
> prior registered trademark meaning of CL that is now 
> abandoned, if for no other reason than to say "that's not what we mean".
> 
> > Text Snipped
> >  
> > > 
> > > Second issue is how CL has been handled in the past.
> > > 
> > > Whether it was two links to two places that took completely 
> > > different paths (trails in ITU speak but this is IETF where we say 
> > > path), or two parallel links, this has been called ECMP in IETF (and 
> > > elsewhere) for two decades or more.  Both ISIS and OSPF use the term 
> > > ECMP.  The techniques used for ECMP load balance were discussed on 
> > > IETF lists quite a bit in the early to mid-1990s.  The three 
> > > techniques applied to IP networks (in the terminology of that time) 
> > > were:
> > > 
> > >   1.  per packet load balance
> > >   2.  per bit or byte load balance aka bit striping or inverse-mux
> > >   3.  IP src/dst hash
> > > 
> > > The second is applicable only to parallel links.  Using larger 
> > > chunks it is also the technique used in MPPP (multilink PPP).  MPPP 
> > > is also sometimes abbreviated PPP-ML, though not in the RFC.  MPPP 
> > > is no longer of much interest as it was only applied to low speed 
> > > links.
> > > 
> > > The per packet load balance caused packet reorder and a great deal 
> > > of grief for service providers, hence the abundance of discussion 
> > > within IETF at the time.  The use of IP src/dst hash, while 
> > > widespread and widely discussed, did not get documented in an RFC 
> > > until Chris Hopps and Dave Thaler wrote RFC 2991 "Multipath Issues 
> > > in Unicast and Multicast Next-Hop Selection" and RFC 2992 "Analysis 
> > > of an Equal-Cost Multi-Path Algorithm" in November 2000 (at least 
> > > AFAIK).
> > > 
> > > The IP src/dst technique itself is believed to have originated in 
> > > the T1-NSFNET, which puts its use back to circa 1987.
> > > 
> > > The OMP work predates RFC 2991 and RFC 2992 but never made it past 
> > > the internet-draft stage.  In that work the use of src/dst hash and 
> > > the use of adaptive algorithms with src/dst hash is discussed.  On 
> > > the IETF mailing lists even methods of implementation were 
> > > discussed, table based and parallel sets of comparator pairs (TCAM 
> > > like).
> > > 
> > > Circa 2000 there was a lot of discussion of the use of the MPLS 
> > > label stack to provide the entropy for ECMP vs looking past the 
> > > label stack at the IP payload.  Today's PW control word acknowledges 
> > > this common practice and avoids it for PW, but the fat-pw aka 
> > > entropy label puts better entropy back into PW.
> > > 
> > > In practice today, all core hardware uses the same IP src/dst hash 
> > > to provide a load balance for ECMP and LAG.
> > > 
> > > 
> > > The existing internet-draft acknowledges link bundling, but does not 
> > > accurately characterize ECMP and LAG and the src/dst hash technique 
> > > used by both, nor does it acknowledge the prior OMP work.
> > > 
> >  
> > The existing draft states some of these points already, but you 
> > certainly provide more background information. It seems that 
> > you have more specific suggestions on the text in the draft, and that 
> > is where we propose specific changes to address your comments. Another 
> > approach could be to add more text on IP-related load balancing as 
> > compared with the MPLS-based load balancing which is the focus of the 
> > draft.
> 
> There are two things I'd like to see changed, though I was not 
> clear in that I didn't provide suggested changes to text.  
> If you agree in principle, then I can provide some text.
> 
> The two are:
> 
>   1.  Accurately characterize what exists today, what existing CL
>       techniques have come before this, in use or not, and accurately
>       characterize the common use cases of existing CL.

The amount of detail needed here depends on the scope decision. Accuracy
is of course important.

> 
>   2.  State as a requirement (we are at the requirement stage) that to
>       the extent possible new CL capability will:
> 
>       1.  Continue to accommodate common use cases today, including an
>           ability to carry IP traffic which MAY BE omitted in an
>           implementation but MUST be accommodated, at least as an
>           option, by any proposed solution.

Dependent upon a wg agreement to increase scope to include IP.

> 
>       2.  Retain backward compatibility with existing MPLS/GMPLS LSR
>           with no loss of existing capability, but possibly no gain in
>           functionality if the legacy LSR is anywhere on the LSP path,
>           including as an LER.

IMHO this is desirable (SHOULD) but not mandatory (SHALL).

> 
> If the characterization of existing CL gets too long it could 
> be a separate informational internet-draft that is referenced 
> but I don't think it will get that long.

Editorially, I like this suggestion. The current draft is on the verge
of being too long already. If an agreement to increase the scope is made
by the wg then I think this should be done.

> 
> > Could you provide a URL for the prior OMP work that we can add as 
> > informative reference.
> 
> Did you want one that still works?  :-)
> 
> BTW- It's a part of your current employer that very abruptly 
> shut down the web site that had this and some other work on 
> it (UUNET after I left shut down engr.ans.net).  Some content 
> was never recovered, but this is just an aside.

The world is ever changing. Many things are not as they once were. :)

> 
> Data tracker is probably the best reference.
> 
> https://datatracker.ietf.org/doc/draft-ietf-ospf-omp/
> https://datatracker.ietf.org/doc/draft-ietf-isis-omp/
> https://datatracker.ietf.org/doc/draft-villamizar-mpls-omp/
> 
> There are also IETF WG meeting minutes and lots of mailing 
> list archive discussion but no need to reference them.

Thanks. For the draft could you also please provide a few sentences as
to why this was never standardized by the IETF.

> 
> > Also, the scope of the draft is MPLS, which is not covered 
> > specifically in the points above.
> >  
> > Remainder of original message snipped.
> 
> The scope of the draft being only MPLS is one of the big 
> issues.  The existing use cases for LAG/ECMP are:
> 
>   IP traffic:
>     provider core networks:
>       hash based on IP source and IP destination address
>     provider non-core:
>       hash based on IP source and IP destination address
>       and sometimes UDP and TCP ports (but rarely)
>     enterprise:
>       hash based on IP source and IP destination and UDP or TCP port
> 
>   MPLS traffic:
>     hash based on a subset of the MPLS stack
>     size of the subset varies with vendors
>     below some total stack depth (typically 8) the BOS label is included.
>     number of labels included typically varies from 3-8
>     less than 3 labels doesn't work well due to inadequate diversity
>     using the top label very rarely (if ever) works well
>     (VPN label, LDP label typically are not diverse enough)
> 
> What an implementation does with the hashed value varies:
> 
>   At one extreme is the simplicity of just doing a modulo.  This
>   doesn't work if component links are not all the same bandwidth.
> 
>   A little better is a table lookup based on the hash which allows a
>   more even split across component links that are not the same.
> 
>   Still better is a table lookup (or other implementation) that uses
>   feedback to adjust load balance.  This includes two subcategories:
> 
>     Feedback is internal to a single NE (Avici's CL is an example), is
>     transparent to other NE, and requires no signaling changes.  Avici
>     calls this CL but I've also heard the term Adaptive LAG somewhere.
> 
>     Feedback is external and requires signaling extensions (OMP is an
>     example but was never implemented by an equipment vendor).
> 
> The existing use of Link Bundling for MPLS is:
> 
>   Maximum LSP Bandwidth is advertised.
> 
>   Accounting is on a per component basis.
> 
>   The RRO identifies both the component link and label and therefore
>   cannot be changed.
> 
> This is very rough text (an outline really).  If you agree in 
> principle to add something I can clean it up.

The above seems to be text that would be better suited for a separate
informational draft that summarizes prior approaches.
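
As a rough illustration of the treatments of the hashed value outlined
above (plain modulo, a table lookup weighted by component link bandwidth,
and a table adjusted by feedback internal to one NE), here is a minimal
Python sketch. All names, the use of CRC32 as the hash, the 64-slot table,
and the one-slot-at-a-time rebalancing rule are assumptions made for
illustration; they are not taken from the draft or from any vendor
implementation.

```python
import zlib

def flow_hash(src_ip: str, dst_ip: str) -> int:
    """Hash an IP src/dst pair. CRC32 is an arbitrary stand-in hash."""
    return zlib.crc32(f"{src_ip}->{dst_ip}".encode())

def pick_modulo(h: int, n_links: int) -> int:
    """Simplest case: modulo. Splits evenly only if all links are equal."""
    return h % n_links

def build_table(bandwidths: list[int], size: int = 64) -> list[int]:
    """Give each component link a share of slots proportional to bandwidth."""
    total = sum(bandwidths)
    table: list[int] = []
    cumulative = 0
    for link, bw in enumerate(bandwidths):
        cumulative += bw
        # Fill slots up to this link's cumulative share of the table.
        while len(table) < size * cumulative // total:
            table.append(link)
    return table

def pick_table(h: int, table: list[int]) -> int:
    """Table lookup: the hash selects a slot, the slot names the link."""
    return table[h % len(table)]

def rebalance(table: list[int], utilization: list[float]) -> list[int]:
    """Hypothetical internal feedback step: move one slot from the most
    utilized component link to the least utilized one."""
    links = range(len(utilization))
    hot = max(links, key=lambda i: utilization[i])
    cold = min(links, key=lambda i: utilization[i])
    if hot != cold and hot in table:
        table[table.index(hot)] = cold
    return table

if __name__ == "__main__":
    h = flow_hash("192.0.2.1", "198.51.100.7")
    table = build_table([10, 10, 40])      # two 10G links and one 40G link
    print(pick_modulo(h, 3), pick_table(h, table))
    rebalance(table, [0.2, 0.3, 0.9])      # third link is running hot
    print([table.count(i) for i in range(3)])
```

Because the slot is chosen from the flow hash, all packets of one src/dst
pair stay on one component link, which avoids the reordering seen with per
packet load balance. Moving a slot does move the flows hashed to that
slot, so a real adaptive scheme would presumably limit how often it
rebalances.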

> 
> The CL requirements should specify a delta based on what 
> exists.  We should also acknowledge a problem with LAG that 
> there is no way to signal characteristics of the LAG.  Having 
> this as a requirement makes it easier to propose a document to 
> fix this (though not necessarily the same document that 
> proposes a new, possibly MPLS-only CL).  All that is really 
> needed for LAG is a maximum allowed microflow size (size of 
> component link minus epsilon for adaptive LAG, a configured 
> fraction of component link size for simple LAG).  An LSP can 
> then signal a largest expected microflow.  For example, an 
> LSP carrying nothing but a set of 1GbE PW is not going to 
> have a microflow larger than 1G.

This appears to be a good example of where there could be commonality of
solution between MPLS and IP. 
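
To make the microflow limit concrete, a small Python sketch of the check
described above follows. The function names, the choice of Gb/s as the
unit, and the example values for epsilon and the configured fraction are
assumptions for illustration only; nothing here is defined by the draft
or by any signaling extension.

```python
def max_microflow(component_bw: float, adaptive: bool,
                  epsilon: float = 0.0, fraction: float = 0.5) -> float:
    """Maximum allowed microflow size for a LAG, in the units of component_bw.

    Adaptive LAG: component link size minus a small epsilon.
    Simple LAG:   a configured fraction of the component link size.
    """
    if adaptive:
        return component_bw - epsilon
    return component_bw * fraction

def lsp_fits(largest_expected_microflow: float, limit: float) -> bool:
    """An LSP is acceptable if its largest expected microflow fits the limit."""
    return largest_expected_microflow <= limit

if __name__ == "__main__":
    # Simple LAG of 10G component links, configured fraction 0.25 (assumed).
    limit = max_microflow(10.0, adaptive=False, fraction=0.25)
    # An LSP carrying only 1GbE PWs has no microflow larger than 1G.
    print(limit, lsp_fits(1.0, limit))
```

The same comparison would apply whether the microflows are identified by
MPLS labels or by IP src/dst, which is why this looks like a candidate for
a common solution.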

> 
> The requirement document need not preclude proposing a CL 
> type that is MPLS only, but it should definitely not mandate it.

This is related to the scope decision.

> 
> Do you agree in principle to these sorts of changes?  If you 
> at least partially agree I'll write something more concrete 
> that you could consider.
> 
> Curtis
> _______________________________________________
> rtgwg mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/rtgwg
> 