Internet-Draft | CATS BGP | September 2024 |
Li & Liu | Expires 28 March 2025 | [Page] |
CATS (Computing-Aware Traffic Steering) is a traffic engineering approach that takes into account the dynamic nature of computing resources and network state to optimize service-specific traffic forwarding towards a service contact instance. This document defines the BGP extensions for the CATS control plane.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 28 March 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
When edge clouds are deployed, multiple service instances might be placed in multiple edge sites to provide equivalent functions to end users. However, the resources of each edge site are limited, because only a limited number of end users access it locally. When emergent events generate burst traffic, the traffic should be balanced across multiple edge sites to provide better service to end users. To provide better traffic steering among edge sites, a framework called CATS (Computing-Aware Traffic Steering) [I-D.ietf-cats-framework] has been proposed.¶
CATS (Computing-Aware Traffic Steering) [I-D.ietf-cats-framework] is a traffic engineering approach that takes into account the dynamic nature of computing resources and network state to optimize service-specific traffic forwarding. Various relevant metrics are defined in [I-D.ysl-cats-metric-definition], which can be used in CATS.¶
A general BGP extension for conveying computing metrics is defined in [I-D.ietf-idr-5g-edge-service-metadata]. It proposes a new Path Attribute, called the Metadata Path Attribute, and related sub-TLVs to carry computing-related metadata.¶
This document defines the BGP extension for the CATS control plane. Depending on how CATS is deployed, the control plane operates in one of two modes: 1) basic mode and 2) sharing mode. For the basic mode, CATS can reuse the BGP extension defined in [I-D.ietf-idr-5g-edge-service-metadata]. For the sharing mode, this document defines a new mechanism based on the basic extension defined in [I-D.ietf-idr-5g-edge-service-metadata].¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This document uses terms defined in [I-D.ietf-cats-framework]. Some terms are listed below for clarification.¶
Computing-Aware Traffic Steering (CATS): An architecture that takes into account the dynamic nature of computing resources and network state to steer service traffic to a service instance. This dynamicity is expressed by means of relevant metrics.¶
Service: An offering that is made available by a provider by orchestrating a set of resources (networking, compute, storage, etc.).¶
Service instance: An instance of running resources according to a given service logic.¶
Generally, in an edge cloud, each service exclusively occupies some computing resources. Therefore, in the CATS framework, when a service needs to synchronize the computing status of its occupied computing resources to the network, the computing status is transferred in a per-service manner. In other words, each service has its own computing status advertisement and control plane processing. This mainstream deployment mode is called the basic mode. Because each service exclusively occupies computing resources in this mode, the price of cloud resources and CATS scheduling may be high, so the mode is better suited to large-scale content/service providers such as OTT providers. For small and medium-sized content/service providers, however, the price may be too high, which prevents many of them from choosing CATS services.¶
To allow more small and medium-sized content/service providers to choose edge computing and CATS scheduling services, this document proposes a new mode, called sharing mode. In sharing mode, multiple services can use the same shared resource and share the CATS scheduling service in the BGP control plane, which reduces the price of purchasing edge cloud services and CATS scheduling services. This deployment mode also enables carriers to attract more service providers.¶
The detailed BGP extension of basic and sharing modes will be described in the following sections.¶
In the basic mode, each service is deployed independently and occupies its own computing resources in the edge site. Computing resource updates are triggered by the service itself and are independent of each other, for example, by using an independent BGP update message per service. In this mode, the computing metrics update and the CATS steering decision are implemented in a service-oriented way.¶
To support 5G services in the edge cloud, a general mechanism with BGP extensions is defined in [I-D.ietf-idr-5g-edge-service-metadata]. The CATS framework basic mode can reuse those extensions. However, a real implementation only needs the basic extensions rather than all the extensions defined there. This document specifies the mandatory and optional parts of the extensions for CATS implementation, to support easier implementation and interoperability testing.¶
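The per-service update behaviour described above can be sketched as follows. This is a minimal illustration, not a BGP implementation: the `ServiceMetrics` class, the `build_updates` function, and the dictionary-based update representation are all hypothetical names introduced here, and the sketch assumes that a service triggers an update only when its own metrics change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceMetrics:
    """Computing metrics reported for one service (hypothetical model)."""
    capability: int  # total resources usable by the service
    available: int   # resources currently available to the service


def build_updates(previous, current):
    """Return one update per service whose metrics changed.

    Both arguments map a service identifier to its ServiceMetrics.
    """
    updates = []
    for service_id, metrics in current.items():
        if previous.get(service_id) != metrics:
            # Each changed service gets its own, independent update.
            updates.append({"service": service_id, "metrics": metrics})
    return updates
```

In this sketch, a change in one service's available resource produces an update for that service only; other services on the same edge site are unaffected.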
This section specifies the mandatory extensions for CATS in implementation.¶
In CATS, the fundamental feature is reporting the computing metrics from the edge site to the network devices, such as the egress router of the CATS overlay path that is connected to the edge site. Therefore, the basic and mandatory features are reporting the service-related capability and available resource to the network devices. When the egress CATS forwarder [I-D.ietf-cats-framework] receives the computing metrics, it can reuse the Service-Oriented Capability and Service-Oriented Available Resource sub-TLVs to distribute the metrics to the ingress CATS forwarder.¶
Capability information describes the total resources that can be used by a service, which is important in selecting the best service contact instance, so the distribution of this information is mandatory.¶
This document reuses the Service-Oriented Capability sub-TLV in Metadata Path Attribute [I-D.ietf-idr-5g-edge-service-metadata] to distribute the capability information from an egress CATS forwarder to an ingress CATS forwarder. The format of Service-Oriented Capability sub-TLV is shown below for reference.¶
The received capability information of a service is encoded by the egress CATS forwarder into the SO-CapValue field, and the sub-TLV will be encoded into the Metadata Path Attribute. The whole mechanism of encoding and processing the capability reuses the mechanism defined in [I-D.ietf-idr-5g-edge-service-metadata].¶
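The general idea of encoding a capability value into a sub-TLV can be sketched as below. The layout here is an assumption made for illustration only: the 2-octet type, 2-octet length, 4-octet value, and the code point `SUBTLV_SO_CAPABILITY` are placeholders, not the format or values from [I-D.ietf-idr-5g-edge-service-metadata], which remains the authoritative source. The function names are likewise hypothetical.

```python
import struct

# Placeholder code point, NOT the value assigned in
# [I-D.ietf-idr-5g-edge-service-metadata].
SUBTLV_SO_CAPABILITY = 0xFF01


def encode_sub_tlv(sub_type: int, value: bytes) -> bytes:
    """Encode a sub-TLV with an assumed 2-octet type and 2-octet length."""
    return struct.pack("!HH", sub_type, len(value)) + value


def encode_capability(cap_value: int) -> bytes:
    """Encode a capability value as an assumed 4-octet unsigned integer."""
    return encode_sub_tlv(SUBTLV_SO_CAPABILITY, struct.pack("!I", cap_value))


def decode_sub_tlvs(data: bytes):
    """Walk a sequence of sub-TLVs and return (type, value) pairs."""
    result = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated sub-TLV header")
        sub_type, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        if len(data) - offset < length:
            raise ValueError("truncated sub-TLV value")
        result.append((sub_type, data[offset:offset + length]))
        offset += length
    return result
```

An egress CATS forwarder would place the encoded sub-TLV inside the Metadata Path Attribute; an ingress CATS forwarder would walk the sub-TLVs in a similar way to recover the value.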
The Service-Oriented Available Resource sub-TLV [I-D.ietf-idr-5g-edge-service-metadata] is defined for distributing the real-time available resource of a service. The real-time available resource is vital for low-latency services and applications that need the dynamic resource status to achieve timely load balancing. Therefore, the distribution of available resource for a service is mandatory in CATS implementation.¶
This document reuses the Service-Oriented Available Resource sub-TLV in Metadata Path Attribute [I-D.ietf-idr-5g-edge-service-metadata] to distribute the available resource information from an egress CATS forwarder to an ingress CATS forwarder. The format of Service-Oriented Available Resource sub-TLV is shown below for reference.¶
The received available resource information of a service is encoded by the egress CATS forwarder into the SO-AvailRes field, and the sub-TLV will be encoded into the Metadata Path Attribute. The whole mechanism of encoding and processing the available resource reuses the mechanism defined in [I-D.ietf-idr-5g-edge-service-metadata].¶
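How an ingress CATS forwarder might use the two mandatory metrics can be sketched as follows. This document does not define a selection algorithm, so the policy below is only an assumption: it treats both metrics as normalized values where larger is better, skips instances with no available resource, prefers the highest available resource, and breaks ties by capability. The `select_instance` function and the dictionary keys are hypothetical.

```python
def select_instance(candidates):
    """Pick a service contact instance from advertised metrics.

    candidates: list of dicts with the keys 'egress', 'capability',
    and 'available' (all names assumed for this sketch).
    Returns the chosen dict, or None if no instance has free resources.
    """
    usable = [c for c in candidates if c["available"] > 0]
    if not usable:
        return None
    # Highest available resource wins; capability breaks ties.
    return max(usable, key=lambda c: (c["available"], c["capability"]))
```

A real deployment would typically also weigh network metrics toward each egress CATS forwarder; that part is omitted here.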
The above two sub-TLVs and the Metadata Path Attribute are required in CATS implementation.¶
To support some enhanced features, the following optional extensions can be implemented when implementing CATS.¶
Site preference indicates the preference of each site when comparing service contact instances from different sites. The preference value may be derived from several factors, such as the price of energy or the rental fee of a DC rack. However, this is a site-level selection rather than a service-level one. Apart from shared site-level conditions such as energy price, service-specific factors may be more important when selecting a better service contact instance among sites.¶
In addition, factors such as energy price can be included when generating the normalized metric values of Service-Oriented Capability and Service-Oriented Available Resource. Therefore, Site Preference is an optional metric in CATS implementation, and its extension and processing follow the Site Preference Index sub-TLV defined in [I-D.ietf-idr-5g-edge-service-metadata].¶
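One way a site-level factor such as energy price could be folded into a normalized service-level metric is sketched below. The weighted-sum formula, the `price_weight` parameter, the 0 to 100 output scale, and the function name are all assumptions for illustration; neither this document nor [I-D.ietf-idr-5g-edge-service-metadata] prescribes a normalization method.

```python
def normalized_available(raw_available, energy_price, max_energy_price,
                         price_weight=0.2):
    """Blend available resource with energy price into a 0..100 score.

    raw_available: fraction of resources currently free, in [0.0, 1.0].
    energy_price / max_energy_price: site energy price and the upper
    bound used for scaling (hypothetical inputs).
    price_weight: share of the score driven by energy price (assumed).
    """
    if not 0.0 <= raw_available <= 1.0:
        raise ValueError("raw_available must be within [0.0, 1.0]")
    if max_energy_price <= 0:
        raise ValueError("max_energy_price must be positive")
    # Clamp so that a price above the bound cannot make the score negative.
    price_factor = min(max(energy_price / max_energy_price, 0.0), 1.0)
    score = (1.0 - price_weight) * raw_available \
        + price_weight * (1.0 - price_factor)
    return round(score * 100)
```

With such a blend, two sites with the same free resources advertise different values when their energy prices differ, so a separate site-level preference may not be needed.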
Similar to Site Preference, Site Physical Availability Index is a site-level metric, which is useful in batch processing of BGP updates. This is an enhancement of the CATS implementation, which can be implemented optionally.¶
End-to-end delay information is vital for time-sensitive services, such as 5G URLLC services. However, real-time delay is not easy to measure in deployment. [I-D.ietf-idr-5g-edge-service-metadata] defines the Service Delay Prediction Index sub-TLV for encoding delay information. It supports two delay formats: a delay prediction index, and raw delay information in NTP format. CATS implementations can reuse this mechanism.¶
Distributing real-time delay information may place too heavy a burden on the BGP control plane, so this document defines it as an optional feature in CATS implementation. Furthermore, delay can be predicted from raw metrics, such as GPU/CPU capability, which can be considered as factors when generating the normalized Service-Oriented Capability and Service-Oriented Available Resource metrics.¶
Raw metadata is too complicated to standardize and implement. Therefore, this document does not recommend implementing the Raw Metadata mechanism defined in [I-D.ietf-idr-5g-edge-service-metadata], although an implementation might choose to do so.¶
In the basic mode, all the extensions and mechanisms are reused from [I-D.ietf-idr-5g-edge-service-metadata], so this document does not explain the details of processing each sub-TLV.¶
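The requirement levels described for the basic mode can be summarized in a small conformance check. The sub-TLV names come from the text above; the `check_advertisement` function, the string-based representation of sub-TLVs, and the shape of the returned report are hypothetical and serve only as a sketch.

```python
# Requirement levels for the basic mode, as described in this document.
MANDATORY = {
    "Service-Oriented Capability",
    "Service-Oriented Available Resource",
}
OPTIONAL = {
    "Site Preference Index",
    "Site Physical Availability Index",
    "Service Delay Prediction Index",
}
NOT_RECOMMENDED = {"Raw Metadata"}


def check_advertisement(present):
    """Classify the sub-TLV names found in one Metadata Path Attribute.

    Returns a dict listing mandatory sub-TLVs that are missing,
    not-recommended sub-TLVs that are present, and unrecognized names.
    """
    present = set(present)
    known = MANDATORY | OPTIONAL | NOT_RECOMMENDED
    return {
        "missing": sorted(MANDATORY - present),
        "discouraged": sorted(NOT_RECOMMENDED & present),
        "unknown": sorted(present - known),
    }
```

An advertisement that carries both mandatory sub-TLVs and any subset of the optional ones yields an empty report in every category.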
This section will be updated in the future revision.¶
This section will be updated according to the progress of the sharing mode.¶
The authors would like to thank Zongpeng Du, Huijuan Yao, Kehan Yao, Guofeng Qian, Haibo Wang, Xia Chen, Jianwei Mao, Zhenbin Li, Xinyue Zhang, Weier Li, and Linda Dunbar for their comments and suggestions.¶