This document proposes a flow-level load balancing solution for CATS, designed to manage CS-ID traffic effectively by addressing issues such as frequent control plane operations and uneven use of computing resources. The approach concurrently identifies multiple next-hop choices, taking into account both network paths and service instances. Traffic is then distributed among these service instances using flow-based load balancing, which relies on the five-tuple of packets.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 25 January 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Computing-Aware Traffic Steering (CATS) [I-D.ldbc-cats-framework] targets efficient routing at the network edge, directing traffic between service clients and providers. It relies on real-time computing and network status data for informed decisions. CATS operates as an overlay system, choosing optimal service instances for requests; however, the CATS framework does not assume any specific data plane or control plane solution.¶
This proposal suggests deploying a flow-level load balancing mechanism for CATS to tackle issues related to frequent control plane activities and imbalanced resource utilization. The approach focuses on CS-ID traffic and involves determining multiple next-hop alternatives by considering both network routes and service instance identifiers. Traffic is then distributed based on the five-tuple of packets, ensuring efficient workload allocation. The control plane concurrently identifies multiple paths and service instances that meet Service Level Agreement (SLA) requirements, while the data plane improves forwarding efficiency through equal-cost multi-path routing techniques.¶
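As an informal illustration only, the following Python sketch (with hypothetical names, not part of this specification) shows how a packet's five-tuple might be hashed to select one of several candidate next-hops, so that all packets of the same flow follow the same path and service instance:¶

   import hashlib

   def select_next_hop(five_tuple, next_hops):
       """Pick a next-hop for a flow from a list of candidates.

       five_tuple: (src_ip, dst_ip, src_port, dst_port, protocol)
       next_hops:  candidate (SR-Policy, Service SID) pairs
       """
       key = "|".join(str(field) for field in five_tuple).encode()
       digest = hashlib.sha256(key).digest()
       index = int.from_bytes(digest[:4], "big") % len(next_hops)
       return next_hops[index]

Because the hash depends only on the five-tuple, subsequent packets of the same flow are steered consistently.¶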
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This document makes use of the terms defined in [I-D.ldbc-cats-framework].¶
The current CATS network technologies utilize periodic or threshold-triggered resource status reports to optimize the selection of service instances and paths to meet quality of service requirements. However, this approach faces two primary challenges:¶
* Firstly, computing resources are utilized unevenly. When longer reporting intervals or threshold triggers are employed, the same service instance may receive multiple requests before the next status update, causing a temporary imbalance in resource distribution.¶
* Secondly, correcting these imbalances requires frequent control plane operations, which increases the calculation and update load on the control plane. While incremental calculation and policy delivery may provide some relief, they do not address the underlying issue.¶
To address these obstacles, it is crucial to tackle uneven resource usage and excessive control plane load so that computing resources are distributed more fairly and effectively while service level agreement requirements are still met.¶
To address the aforementioned challenges, a flow-level load balancing solution has been developed based on the following guiding principles:¶
1. Minimizing the impact of state changes on individual calculation instances and reducing the frequency of calculations and updates in the control plane.¶
2. Extending the update intervals for CATS routing table entries to ensure load balancing on the data plane.¶
The proposed solution, as detailed in this document, involves the simultaneous calculation of multiple network paths and service instances that meet the SLA requirements. Each unique next-hop entry in the CATS routing table contains both a network path and a service instance, facilitating unequal-cost load balancing during service forwarding to optimize overall performance.¶
Furthermore, the Flow-Level Load Balancing of Computing-Aware Traffic Steering builds on the framework established in the CATS architecture [I-D.ldbc-cats-framework] (see Figure 1 for a visual representation).¶
Both documents, [I-D.lbdd-cats-dp-sr] and [I-D.fu-cats-muti-dp-solution], use anycast IP addresses for computing services as the CS-ID. When the egress CATS-Forwarder is connected to multiple service instances, traffic is steered to the appropriate instance via an END.DX4/6 Service SID. Conversely, with a single service instance, traffic is steered using the END.DT4/6 Service SID together with the anycast IP address.¶
To simplify the description, the selection result produced by the C-PS is called the CATS routing table, and the entry used for forwarding packets on the forwarding plane is called the CATS forwarding table.¶
The C-PS component is conventionally situated in the head node or in a central network controller. There, it collects service instance status information, such as the CS-ID, CIS-ID, and metrics, through the C-SMA component. Furthermore, it obtains network capacity and status information via the C-NMA component.¶
The C-PS component, considering the SLA requirements associated with the CS-ID, processes the collected data to determine viable network paths and service instances that conform to the SLA. Subsequently, it allocates traffic share ratios among these identified paths.¶
The outcome is translated into VRF-ID, CS-ID, and a set of multiple next-hop destinations (such as SR-Policy and service SID) which incorporate load sharing proportions to direct the forwarding of service packets within the data plane. It is crucial to limit the number of next-hops in accordance with hardware capabilities and opt for the most efficient paths that adhere to the SLA requirements.¶
Figure 2 shows an example of a multi-next-hop CATS routing table for a specific CS-ID (CS-ID1).¶
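The exact encoding of such an entry is implementation specific; the Python sketch below (with illustrative field names, not part of this specification) only captures the information described above: a VRF-ID, a CS-ID, and a set of next-hops, each combining a network path (e.g., an SR-Policy), a Service SID identifying the instance, and a load-sharing weight:¶

   from dataclasses import dataclass
   from typing import List

   @dataclass
   class CatsNextHop:
       sr_policy: str     # network path toward the egress CATS-Forwarder
       service_sid: str   # identifies the selected service instance
       weight: int        # load-sharing proportion assigned by the C-PS

   @dataclass
   class CatsRouteEntry:
       vrf_id: int
       cs_id: str
       next_hops: List[CatsNextHop]   # bounded by hardware capabilities

   # Example resembling Figure 2: one CS-ID with several candidate
   # next-hops and their load-sharing weights (values are made up).
   entry = CatsRouteEntry(
       vrf_id=1,
       cs_id="CS-ID1",
       next_hops=[
           CatsNextHop("SR-Policy-A", "SID-A", 2),
           CatsNextHop("SR-Policy-B", "SID-B", 3),
       ],
   )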
The C-PS component calculates the CATS routing table, which is subsequently translated into a data plane strategy. This strategy decomposes the Unequal-Cost Multipath (UCMP) routing used for traffic load balancing into multiple Equal-Cost Multipath (ECMP) entries, resembling the conventional conversion of IP UCMP to ECMP at the hardware level.¶
For instance, if the original CATS routing table contains four next-hops with a load-sharing ratio of 2:3:3:2, the conversion yields 10 ECMP routing entries: each next-hop is duplicated according to its UCMP weight, so that the standard ECMP load-balancing rule applied to the expanded entries reproduces the intended ratio. This ensures that packet forwarding remains efficient and aligns with the ECMP balancing principle.¶
Figure 3 shows an example of the CATS forwarding table after this conversion.¶
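The conversion can be sketched as follows (Python, illustrative only): each next-hop is replicated in proportion to its weight, so a 2:3:3:2 ratio over four next-hops yields 2+3+3+2 = 10 equal-cost entries:¶

   def ucmp_to_ecmp(weighted_next_hops):
       """Expand weighted (UCMP) next-hops into equal-cost (ECMP) entries.

       weighted_next_hops: list of (next_hop, weight) pairs.  Each
       next-hop is duplicated 'weight' times so that a uniform hash
       over the expanded list reproduces the intended ratio.
       """
       ecmp_entries = []
       for next_hop, weight in weighted_next_hops:
           ecmp_entries.extend([next_hop] * weight)
       return ecmp_entries

   # ucmp_to_ecmp([("A", 2), ("B", 3), ("C", 3), ("D", 2)])
   # returns a list of 10 entries, matching the 2:3:3:2 ratio.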
The following procedure describes how the solution works in general; a simplified sketch is provided after the list.¶
1) The ingress CATS-Forwarder receives the user's computing service request and extracts the VRF-ID, incoming interface, and CS-ID.¶
2) The ingress CATS-Forwarder looks up the forwarding table with these identifiers. If an entry is found, it proceeds to Step 3; otherwise, it discards the packet.¶
3) The ingress CATS-Forwarder searches the flow affinity table. On a match, it retrieves the SR-Policy and Service SID and forwards the packet as described in Step 5; otherwise, it goes to Step 4.¶
4) The ingress CATS-Forwarder hashes the packet's five-tuple, selects a next-hop from the forwarding table, obtains the corresponding SR-Policy and Service SID, and creates a flow affinity table entry for subsequent packets. This ensures consistent routing and load balancing for the flow.¶
5) The ingress CATS-Forwarder adds an SRH based on the gathered information and forwards the IPv6 packet using the SRH for underlay routing.¶
6) The egress CATS-Forwarder removes the SRH and sends the packet according to the Service SID: END.DX sends the packet over a tunnel, while END.DT forwards it based on the destination IP address according to the VRF-ID.¶
7) The service instance processes the request and sends a response.¶
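The following Python sketch (illustrative only, with hypothetical table representations) outlines Steps 2) to 4) at the ingress CATS-Forwarder; SRH encapsulation and the egress behavior are omitted:¶

   import hashlib

   flow_affinity_table = {}   # five-tuple -> (SR-Policy, Service SID)

   def lookup_next_hop(forwarding_table, vrf_id, cs_id, five_tuple):
       """Return the (SR-Policy, Service SID) pair for a packet, or None.

       forwarding_table maps (vrf_id, cs_id) to a list of ECMP entries
       produced by the UCMP-to-ECMP expansion described above.
       """
       ecmp_entries = forwarding_table.get((vrf_id, cs_id))
       if ecmp_entries is None:
           return None                      # Step 2: no entry, drop packet

       if five_tuple in flow_affinity_table:
           return flow_affinity_table[five_tuple]   # Step 3: affinity hit

       # Step 4: hash the five-tuple, pick an ECMP entry and record it
       key = "|".join(str(f) for f in five_tuple).encode()
       index = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
       next_hop = ecmp_entries[index % len(ecmp_entries)]
       flow_affinity_table[five_tuple] = next_hop
       return next_hop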
The C-SMA component uses multi-level gradient thresholds to monitor the performance of service instances, such as latency and bandwidth. It sets different levels for delay (x1, x2, ..., xM) and bandwidth (y1, y2, ..., yN). Once a service instance's delay or bandwidth reaches the critical level, the C-PS component immediately recalculates and selects the path to the service location and instance.¶
To enhance the process, it is suggested to combine threshold alerts with session-based load balancing. This can distribute user sessions evenly across networks and instances and reduce the number of instances exceeding their thresholds, creating a low-frequency feedback loop that lessens control plane overhead.¶
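As a minimal sketch, assuming illustrative delay thresholds x1 < x2 < x3 in milliseconds, the gradient levels could be evaluated as follows; only the highest (critical) level would trigger an immediate recalculation by the C-PS component:¶

   import bisect

   # Illustrative delay thresholds x1 < x2 < x3 (milliseconds); the
   # highest level is treated as the critical status.
   DELAY_THRESHOLDS_MS = [10, 50, 200]

   def delay_level(measured_delay_ms):
       """Map a measured delay onto a gradient level 0..M."""
       return bisect.bisect_right(DELAY_THRESHOLDS_MS, measured_delay_ms)

   def needs_recalculation(measured_delay_ms):
       """True only when the critical (highest) level is reached."""
       return delay_level(measured_delay_ms) == len(DELAY_THRESHOLDS_MS)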
It is important to highlight that load balancing operations are performed at the ingress CATS-Forwarder. Before a flow affinity entry is created, the CATS forwarding table can be used directly by the data plane or control plane to process the first packet of a flow, with the next hop determined by a hash of the packet's five-tuple.¶
TBD.¶
To be added upon contributions, comments and suggestions.¶