<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc SYSTEM "rfc2629-xhtml.ent">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc strict="no"?>
<?rfc rfcedstyle="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="std" docName="draft-ietf-rift-sr-03" ipr="trust200902" obsoletes="" updates="" submissionType="IETF" xml:lang="en" tocInclude="true" tocDepth="3" symRefs="true" sortRefs="true" version="3">
  <!-- xml2rfc v2v3 conversion 2.45.3 -->
  <front>
    <title abbrev="SRIFT">SRIFT: Segment Routing in Fat Trees</title>
    <seriesInfo name="Internet-Draft" value="draft-ietf-rift-sr-03"/>
    <author fullname="Zhaohui Zhang" initials="Z." surname="Zhang">
      <organization>Juniper Networks</organization>
      <address>
        <email>zzhang@juniper.net</email>
      </address>
    </author>
    <author fullname="Jeff Tantsura" initials="J." surname="Tantsura">
      <organization>Nvidia</organization>
      <address>
        <email>jefftant.ietf@gmail.com</email>
      </address>
    </author>
    <author fullname="Jordan Head" initials="J." surname="Head">
      <organization>Juniper Networks</organization>
      <address>
        <email>jhead@juniper.net</email>
      </address>
    </author>    
    <workgroup>RIFT</workgroup>
    <abstract>
      <t>
<!-- This document specifies signaling procedures for Segment Routing <xref target="RFC8402" format="default">RFC8402</xref> 
using <xref target="RIFT" format="default">RIFT</xref>'s Key-Value distribution mechanism.
  Top-of-Fabric (ToF) nodes will use KV S-TIEs to distribute loopback 
address, System ID, Segment Routing Global Block (SRGB) information, 
and Node Segment Identifiers (Node-SID) for all nodes in a SRIFT domain so that 
the appropriate SR reachability information is established. -->
   This document specifies signaling procedures for Segment Routing (SR)
   in RIFT. Each node's loopback address, Segment Routing
   Global Block (SRGB), and Node Segment Identifier (Node-SID),
   which are typically assigned by a configuration management system and distributed by routing protocols,
   are distributed southbound from the Top-of-Fabric (ToF) nodes via
   RIFT's Key-Value distribution mechanism, so that
   each node can compute how to reach the segment represented by the active
   SID in a packet. An SR controller signals SR policies to ingress
   nodes so that they can send packets with a desired segment list to steer
   traffic.
      </t>
    </abstract>
    <note>
      <name>Requirements Language</name>
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they
appear in all capitals, as shown here.</t>
    </note>
  </front>
  <middle>
    <section numbered="true" toc="default">
      <name>Introduction</name>
      <t>
Before we discuss the SR procedures for RIFT, let us first review how SR works
with OSPF <xref target="RFC8665" format="default"/> and IS-IS <xref target="RFC8667" format="default"/>.
      </t>
      <t>
Each node is provisioned with a loopback address as well as SRGB and Node-SID 
values.  The loopback address and Node-SID are centrally coordinated and are 
unique per-node within the SR network.  These values are then communicated to 
each node out-of-band and stored as configuration information.  Communication 
could be done manually or by a configuration management system (e.g., via NETCONF/YANG).
      </t>
      <t>
The SRGB represents the range of "global" labels that can be allocated on a 
particular node for SR.  An SRGB may consist of more than one contiguous range of labels. 
Each range is described by its first available label value and the 
total number of available labels.  While in modern networks it is common for 
each node to have identical SRGB values so that a Node-SID corresponds to 
the same label on each node, this is not required, which allows for flexible label 
allocation.  In either scenario, the SRGB is part of each node's configuration.  
In today's networks, it is likely pushed to nodes by a configuration management system.
      </t>
      <t>
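Each node then signals its SRGB and Node-SID to the other nodes.  A Node-SID 
is an index value assigned to a node (say node X).  Another node (say node Y)
uses X's Node-SID as an index into an SRGB to derive a label that represents node X:
Y's own SRGB yields the incoming label that Y expects for X, and a next hop's SRGB yields
the outgoing label that Y uses when sending traffic toward node X via that next hop.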
      </t>
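      <t>
The derivation can be sketched in a few lines of Python.  This is a minimal,
non-normative sketch that assumes the usual SR-MPLS rule (label = SRGB label base
+ Node-SID index, valid only while the index is smaller than the SRGB label range)
and a single contiguous SRGB range.  The function name derive_label is
hypothetical and used only for illustration.
      </t>
      <sourcecode type="python"><![CDATA[
```python
def derive_label(srgb_base: int, srgb_range: int, node_sid: int) -> int:
    """Return the label that the SRGB owner associates with a Node-SID index.

    Assumes a single contiguous SRGB of `srgb_range` labels starting at
    `srgb_base`, and a zero-based Node-SID index.
    """
    if not 0 <= node_sid < srgb_range:
        raise ValueError("Node-SID index falls outside the SRGB")
    return srgb_base + node_sid


# Node Y owns an SRGB of 50 labels starting at 200; node X has index 3.
label_for_x_at_y = derive_label(srgb_base=200, srgb_range=50, node_sid=3)
print(label_for_x_at_y)  # 203
```
]]></sourcecode>
      <t>
An index that does not fit in the SRGB is rejected rather than mapped to a label
outside the block, since such a label would not belong to the SR label space of
that node.
      </t>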
      <t>
Consider the following example illustrating Node A's computed IP route and label values.
      </t>
      <artwork name="" type="" align="left" alt=""><![CDATA[

                    B
                  *   *
                *       *
              *           *
            A               D
              *           *
                *       *
                  *   *
                    C                

Node Name   Loopback   Node SID   SRGB Label Base   SRGB Label Range
---------   --------   --------   ---------------   ----------------

A           10.1.1.1   0          100               50
B           10.1.1.2   1          100               50
C           10.1.1.3   2          200               50
D           10.1.1.4   3          100               50

	]]></artwork>

      <artwork name="" type="" align="left" alt=""><![CDATA[

   Destination    Next Hop
   -----------    --------
   10.1.1.1       local
   10.1.1.2       if_ab
   10.1.1.3       if_ac
   10.1.1.4       if_ab, if_ac
   
   Label           Next Hop
   -----           --------
   100 (La_a)      pop and look up next header
   101 (Lb_a)      swap to 101 (Lb_b), via if_ab
   102 (Lc_a)      swap to 202 (Lc_c), via if_ac
   103 (Ld_a)      swap to 103 (Ld_b), via if_ab
                   swap to 203 (Ld_c), via if_ac

	]]></artwork>
	  <t>
  The notation Lb_a refers to the label derived for Node B,
  using B's Node-SID as an index into A's SRGB. Similarly, Ld_c refers to the
  label derived for Node D, using D's Node-SID as an index into C's SRGB.
	  </t>
      <t>
Node A computes the route to Node D's loopback address. The next hops 
are Node B (via if_ab) and Node C (via if_ac).  Node A uses Node D's 
Node-SID (which was advertised along with the loopback address) to index 
into its local SRGB to obtain a label value of 103 (Ld_a).  Furthermore, 
Node A also uses Node D's Node-SID to derive the labels that Node B and 
Node C expect, 103 (Ld_b) and 203 (Ld_c) respectively, using D's Node-SID
as an index into B's and C's SRGBs.  Notice that Node C's SRGB 
is different from that of the other nodes.  Node A can now program its label forwarding 
state with (Ld_a --&gt; (via if_ab swap to Ld_b, via if_ac swap to Ld_c)).
      </t>
      <t>
Similarly, Node B computes the route to Node D's loopback address, but this time 
finds that the next hop is Node D itself (via if_bd).  Node B also uses 
Node D's Node-SID (again, advertised with the loopback address) to index into 
its local SRGB and obtain a label value of 103 (Ld_b), and to index into Node D's 
SRGB and obtain a label value of 103 (Ld_d).  The label forwarding 
state can be programmed with (Ld_b --&gt; via if_bd swap to Ld_d).  Finally, 
Node D programs its label forwarding state with (Ld_d --&gt; pop and look up next
header).
      </t>
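      <t>
The computation of Node A's label forwarding state can be reproduced with the
following non-normative Python sketch.  It assumes that a label is the SRGB label
base plus the Node-SID index, and it uses Node-SID index values 0 through 3 for
Nodes A through D so that the derived labels match the label table of the
example.  The names SR_INFO, NEXT_HOPS_A, label_at, and build_lfib are
hypothetical.
      </t>
      <sourcecode type="python"><![CDATA[
```python
# SR information for the example: name -> (SRGB base, SRGB range, Node-SID).
# Node-SID indices 0..3 are an assumption chosen so that the derived labels
# match the labels in the example (100..103 at A, 202 and 203 at C).
SR_INFO = {
    "A": (100, 50, 0),
    "B": (100, 50, 1),
    "C": (200, 50, 2),
    "D": (100, 50, 3),
}

# Node A's IP next hops per destination: name -> [(neighbor, interface)].
NEXT_HOPS_A = {
    "B": [("B", "if_ab")],
    "C": [("C", "if_ac")],
    "D": [("B", "if_ab"), ("C", "if_ac")],
}


def label_at(owner: str, target: str) -> int:
    """Label for `target` derived from `owner`'s SRGB (L<target>_<owner>)."""
    base, size, _ = SR_INFO[owner]
    index = SR_INFO[target][2]
    if not 0 <= index < size:
        raise ValueError("Node-SID index falls outside the SRGB")
    return base + index


def build_lfib(local: str, next_hops: dict) -> dict:
    """Map each incoming label at `local` to its list of forwarding actions."""
    lfib = {label_at(local, local): ["pop and look up next header"]}
    for dest, hops in next_hops.items():
        lfib[label_at(local, dest)] = [
            f"swap to {label_at(neighbor, dest)}, via {interface}"
            for neighbor, interface in hops
        ]
    return lfib


for label, actions in sorted(build_lfib("A", NEXT_HOPS_A).items()):
    print(label, actions)
```
]]></sourcecode>
      <t>
The incoming label is always taken from the local node's SRGB, while each
outgoing label is taken from the SRGB of the corresponding next hop, which is
why the entry for Node D swaps to 103 via if_ab but to 203 via if_ac.
      </t>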
    </section>
    <section numbered="true" toc="default">
      <name>SR in RIFT (SRIFT)</name>
      <t>  
From the previous section, it is clear that each RIFT node 
participating in an SR domain requires the following information:
      </t>
      <ul spacing="normal">
        <li>
SRGB values of all adjacent nodes
    </li>
        <li>
Node-SID values of all nodes participating in the routing domain
    </li>
        <li>
Loopback addresses or System IDs of all other nodes
    </li>
      </ul>
      <t>
In OSPF and IS-IS, each node's SR information is simply flooded.  <!--In RIFT, Node North TIEs 
are flooded to all north nodes, whereas Node South TIEs are only flooded to 
nodes one hop south (and then reflected one hop north). -->With RIFT, Node TIEs could be used to flood SR information, but each node would first have to learn its own SR information. With RIFT's Key-Value mechanism, ToF nodes can instead use KV TIEs to flood the SR information of all nodes, which they learn from an SR controller, thereby accommodating both provisioning and signaling of SR. The non-ToF nodes do not need any SR-related
provisioning, which fits well with RIFT's ZTP concept.
      </t>
      <t>
ToF nodes in an SR domain MUST 
populate KV South TIEs with the minimum required SR information for each node, namely the 
SRGB Label Base, SRGB Label Range, Node-SID, RIFT System ID, and Loopback Address. 
While the Loopback Address must be included, it MAY be set to an empty value if loopbacks are not configured for nodes.
      </t>
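      <t>
As an illustration, the per-node SR information that a ToF node holds before
populating its KV South TIEs could be modeled as follows.  This is a
non-normative Python sketch: the class SriftNodeInfo, the function
index_by_system_id, and the example System ID and address values are
hypothetical, and None is used here to stand in for an empty Loopback Address.
The uniqueness check reflects the requirement that Node-SIDs are unique per node
within the SR network.
      </t>
      <sourcecode type="python"><![CDATA[
```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class SriftNodeInfo:
    system_id: int            # 64-bit RIFT System ID
    loopback: Optional[str]   # None stands in for an empty Loopback Address
    srgb_base: int            # first label of the node's SRGB
    srgb_range: int           # number of labels in the node's SRGB
    node_sid: int             # Node-SID index


def index_by_system_id(
    entries: Iterable[SriftNodeInfo],
) -> Dict[int, SriftNodeInfo]:
    """Index SR information by System ID, rejecting duplicate IDs and SIDs."""
    table: Dict[int, SriftNodeInfo] = {}
    seen_sids = set()
    for entry in entries:
        if entry.system_id in table:
            raise ValueError(f"duplicate System ID {entry.system_id:#x}")
        if entry.node_sid in seen_sids:
            raise ValueError(f"duplicate Node-SID {entry.node_sid}")
        seen_sids.add(entry.node_sid)
        table[entry.system_id] = entry
    return table


# Hypothetical example values for two nodes.
sr_table = index_by_system_id([
    SriftNodeInfo(0x0A, "10.1.1.1", 100, 50, 0),
    SriftNodeInfo(0x0B, None, 100, 50, 1),
])
print(len(sr_table))  # 2
```
]]></sourcecode>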
      <t>
Traffic forwarding in an SR network is typically done in two ways.
      </t>
      <t>
The first option is to use Prefix-SIDs and allow traffic to follow the shortest
paths for the prefixes. Prefix-SIDs for node prefixes, i.e. Node-SIDs (for loopback addresses),
can be used both for encapsulating service traffic to service nodes (e.g. VPN
PEs) and for SR-TE traffic steering purposes (see below), but the benefits
of other Prefix-SIDs are not clear, so currently only Node-SIDs are supported
with RIFT.
	  </t>
      <t>
The second option is to use SR-TE and follow a specific segment list in the 
packet header. Each node in the path steers the packet to the currently
active segment in the list, following the natural path for that segment
(see above). Since a node only has the full topology south of it,
and a leaf node does not have any south topology, the traffic steering
information (i.e. the segment list) must be programmed by controllers into
ingress nodes via SR policies.</t>
<t>
  Support for Adjacency SIDs will be considered in future revisions.
</t>
      <t>
Consider the following 4-level topology:
      </t>
      <artwork name="" type="" align="left" alt=""><![CDATA[

                      ToF1                      ToF2

             Spine1_11    Spine1_12  |  Spine1_21    Spine1_22
                                     |
             Spine2_11    Spine2_12  |  Spine2_21    Spine2_22
                                     |
               Leaf11       Leaf12   |   Leaf21       Leaf22

	]]></artwork>
      <t>
   Suppose the TE controller instructs Leaf11 to send a packet to Spine2_11
   with the label stack (Label_ToF2, Label_Spine2_21, Label_Leaf21).
   Spine2_11 recognizes that Label_ToF2 maps to node ToF2 and that it
   should not simply follow the default route, because the default route
   could lead to an unintended path via ToF1.  In other words, each
   node needs to have a specific route to every node that may appear
   in a segment list.  For RIFT, this
   means that the southbound distance-vector routing needs to
   additionally advertise routes for the nodes in the north, and these routes
   must be propagated all the way down. Each node originates a route for
   its own loopback address
   and advertises it southbound, with a special marking that allows a
   south node to re-advertise it further south.
	  </t>
	  <t>
   If loopback addresses are not used, similar "routes" for System IDs
   must be used. It is RECOMMENDED to use loopback addresses to
   reuse existing mechanisms.
	  </t>
    </section>
    <section numbered="true" toc="default">
      <name>SRIFT-Node Key-Type</name>
      <t>
This section requests an entry from the RIFT Key-Types Registry for RIFT
networks that use SR along with suggested values in accordance with <xref target="I-D.ietf-rift-kv-tie-structure-and-processing" format="default"/>.
      </t>
      <table align="left">
        <name>Requested Entries</name>
        <thead>
          <tr>
            <th align="left">Name</th>
            <th align="left">Value</th>
            <th align="left">Description</th>
          </tr>
        </thead>
        <tbody>
          <tr>
            <td align="left">SRIFT-Node</td>
            <td align="left">TBD</td>
            <td align="left">Key-Type describing a SRIFT node</td>
          </tr>
        </tbody>
      </table>
      <section numbered="true" toc="default">
        <name>SRIFT Node Key-Type</name>
        <artwork align="left" name="" type="" alt=""><![CDATA[
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      TBD      |      Key Identifier for a SRIFT Node          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     (System ID,                                               |
   |      Loopback Address,                                        | 
   |      SRGB Label Base,                                         |
   |      SRGB Label Range,                                        |
   |      Node-SID)                                                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ]]></artwork>
        <t>where:</t>
        <ul empty="true" spacing="normal">
          <li>
            <dl newline="true" spacing="normal">
              <dt>System ID:</dt>
              <dd>A node's 64-bit RIFT System ID.</dd>
              <dt>Loopback Address:</dt>
              <dd>A node's loopback address.  This MAY be set to 0 if loopback addresses are not used.</dd>
              <dt>SRGB Label Base:</dt>
              <dd>The first valid label within the corresponding node's SRGB.</dd>
              <dt>SRGB Label Range:</dt>
              <dd>The total number of valid labels in the corresponding node's SRGB.</dd>
              <dt>Node-SID:</dt>
              <dd>The corresponding node's Node-SID value.</dd>
            </dl>
          </li>
        </ul>
		<t>The Key Identifier for a SRIFT node is assigned by an orchestrator,
		which distributes the KVs to all ToF nodes. In the simplest scenario,
		the Key Identifier could be the SID index into the SRGB.
		The ToF nodes all have
		the same KVs learned from the orchestrator and distribute them
		southbound. A node may learn the same KV from multiple north nodes,
		and the tie-breaking will lead to the same information being selected
		and distributed further south.
		</t>
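		<t>
		The 32-bit key shown in the figure above (an 8-bit Key-Type followed by a
		24-bit Key Identifier) can be packed and unpacked as in the following
		non-normative Python sketch.  The Key-Type value is still TBD, so the constant
		0xFF is only a hypothetical placeholder; network byte order and the function
		names are likewise assumptions made for illustration.  The encoding of the
		value fields is not covered, since this document does not specify their widths.
		</t>
		<sourcecode type="python"><![CDATA[
```python
import struct
from typing import Tuple

# Hypothetical placeholder: the real SRIFT-Node Key-Type value is TBD.
SRIFT_NODE_KEY_TYPE = 0xFF


def pack_srift_key(key_identifier: int,
                   key_type: int = SRIFT_NODE_KEY_TYPE) -> bytes:
    """Pack an 8-bit Key-Type and a 24-bit Key Identifier into 4 bytes."""
    if not 0 <= key_type <= 0xFF:
        raise ValueError("Key-Type must fit in 8 bits")
    if not 0 <= key_identifier <= 0xFFFFFF:
        raise ValueError("Key Identifier must fit in 24 bits")
    # Network byte order is assumed here.
    return struct.pack("!I", (key_type << 24) | key_identifier)


def unpack_srift_key(key: bytes) -> Tuple[int, int]:
    """Return (Key-Type, Key Identifier) from a 4-byte key."""
    (word,) = struct.unpack("!I", key)
    return word >> 24, word & 0xFFFFFF


# Simplest scenario from the text: the Key Identifier is the SID index.
key = pack_srift_key(key_identifier=3)
print(key.hex())              # ff000003
print(unpack_srift_key(key))  # (255, 3)
```
]]></sourcecode>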
      </section>
    </section>
    <section numbered="true" toc="default">
      <name>Security Considerations</name>
      <t>This document does not introduce any new security concerns for RIFT or 
      any other referenced protocols.  RIFT KV TIEs are already 
      secured by the mechanisms defined in the RIFT specification.
      </t>
    </section>
    <section anchor="Acknowledgements" numbered="true" toc="default">
      <name>Acknowledgements</name>
      <t>
The authors thank Bruno Rijsman and Antoni Przygienda for their review 
and suggestions.
      </t>
    </section>
  </middle>
  <back>
    <references title="Normative References">
      <?rfc include='reference.RFC.2119'?>
      <?rfc include='reference.RFC.8174'?>
      <?rfc include='reference.I-D.ietf-rift-kv-tie-structure-and-processing'?>
	  <?rfc include='reference.I-D.ietf-rift-rift'?>
    </references>

    <references title="Informative References">
      <?rfc include='reference.RFC.8665'?>
      <?rfc include='reference.RFC.8667'?>
    </references>
  </back>
</rfc>
