CCWG                                                        I. Johansson
Internet-Draft                                             M. Westerlund
Obsoletes: 8298 (if approved)                                   Ericsson
Intended status: Experimental                            21 October 2024
Expires: 24 April 2025

             Self-Clocked Rate Adaptation for Multimedia
             draft-johansson-ccwg-rfc8298bis-screamv2-02

Abstract

This memo describes a congestion control algorithm for conversational media services such as interactive video. The solution conforms to the packet conservation principle and is a hybrid loss- and delay-based congestion control that also supports ECN and L4S. The algorithm is evaluated over simulated Internet bottleneck scenarios as well as in a mobile system simulator using Long Term Evolution (LTE) and 5G, and is shown to achieve both low latency and high video throughput in these scenarios. This specification obsoletes RFC 8298. The algorithm supports handling of multiple media streams; typical use cases are streaming for remote control, AR, and 3D VR goggles.

About This Document

This note is to be removed before publishing as an RFC.

Status information for this document may be found at https://datatracker.ietf.org/doc/draft-johansson-ccwg-rfc8298bis-screamv2/.

Discussion of this document takes place on the Congestion Control Working Group (ccwg) mailing list (mailto:ccwg@ietf.org), which is archived at https://mailarchive.ietf.org/arch/browse/ccwg/. Subscribe at https://www.ietf.org/mailman/listinfo/ccwg/.

Source for this draft and an issue tracker can be found at https://github.com/gloinul/draft-johansson-ccwg-scream-bis.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Johansson & Westerlund Expires 24 April 2025 [Page 1]
Internet-Draft SCReAMv2 October 2024

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 24 April 2025.

Copyright Notice

Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.

Table of Contents

1.  Introduction
  1.1.  Wireless (LTE and 5G) Access Properties
  1.2.  Why is it a self-clocked algorithm?
  1.3.  Requirements on media and feedback protocol
  1.4.  Comparison with LEDBAT and TFWC in TCP
2.  Requirements Language
3.  Overview of SCReAMv2 Algorithm
  3.1.  Network Congestion Control
  3.2.  Sender Transmission Control
  3.3.  Media Rate Control
4.  Detailed Description of SCReAMv2 sender algorithm
  4.1.  Constants and variables
    4.1.1.  Constants
    4.1.2.  State Variables
  4.2.  Network Congestion Control
    4.2.1.  Reaction to Delay, Data unit Loss and ECN-CE
    4.2.2.  Reference Window Update
    4.2.3.  Lost Data Unit Detection
    4.2.4.  Send Window Calculation
    4.2.5.  Packet Pacing
    4.2.6.  Stream Prioritization
  4.3.  Media Rate Control
  4.4.  Competing Flows Compensation
  4.5.  Handling of systematic errors in video coders
5.  Receiver Requirements on Feedback Intensity
6.  Discussion
7.  Algorithm changes
8.  IANA Considerations
9.  Security Considerations
10. References
  10.1.  Normative References
  10.2.  Informative References
Appendix A.  Acknowledgments
Authors' Addresses

1.  Introduction

Congestion in the Internet occurs when the transmitted bitrate is higher than the available capacity over a given transmission path. Applications that are deployed in the Internet have to employ congestion control to achieve robust performance and to avoid congestion collapse in the Internet. Interactive real-time communication imposes a lot of requirements on the transport; therefore, a robust, efficient rate adaptation for all access types is an important part of interactive real-time communications, as the transmission channel bandwidth can vary over time.
Wireless access such as LTE and 5G, which is an integral part of the current Internet, increases the importance of rate adaptation, as the channel bandwidth of a default LTE bearer [QoS-3GPP] can change considerably in a very short time frame. Thus, a rate adaptation solution for interactive real-time media, such as WebRTC [RFC7478], should be both quick and able to operate over a large range of channel capacities.

This memo describes Self-Clocked Rate Adaptation for Multimedia version 2 (SCReAMv2), an update to the SCReAM congestion control for media streams such as RTP [RFC3550]. While SCReAM was originally devised for WebRTC, SCReAM as well as SCReAMv2 can also be used for other applications where congestion control of different types of real-time streams, especially media streams, is necessary. SCReAM is based on the self-clocking principle of TCP and estimates the forward queue delay in the same way as Low Extra Delay Background Transport (LEDBAT) [RFC6817].

SCReAMv2 is not entirely self-clocked, as it augments self-clocking with pacing and a minimum send rate. Further, SCReAMv2 can take advantage of Explicit Congestion Notification (ECN) [RFC3168] and Low Latency, Low Loss, and Scalable throughput (L4S) [RFC9330] in cases where ECN or L4S is supported by the network and the hosts. However, ECN or L4S is not required for the basic congestion control functionality in SCReAMv2.

This specification replaces the previous experimental version [RFC8298] of SCReAM with SCReAMv2. The algorithm in this memo differs significantly from the original SCReAM algorithm. The main differences are:

* L4S support is added. The L4S algorithm has many similarities with the DCTCP and Prague congestion controls but has a few extra modifications to make it work well with periodic sources such as video.
* The delay-based congestion control is changed to implement a pseudo-L4S approach; this simplifies the delay-based congestion control.
* The fast increase mode is removed. The reference window additive increase is replaced with an adaptive multiplicative increase to enhance convergence speed.
* The algorithm is more rate based than self-clocked. The calculated reference window is used mainly to calculate proper media bitrates. Bytes in flight is, however, allowed to exceed the reference window.
* The media bitrate calculation is dramatically changed and simplified.
* Additional compensation is added to make SCReAMv2 handle cases such as large and changing frame sizes.

1.1.  Wireless (LTE and 5G) Access Properties

[RFC8869] describes the complications that can be observed in wireless environments. Wireless access such as LTE and 5G typically cannot guarantee a given bandwidth; this is true especially for default bearers. The network throughput can vary considerably, for instance, in cases where the wireless terminal is moving around. Even though 5G can support bitrates above 1 Gbps, there are cases when the available bitrate can be much lower (less than 10 Mbps); examples are situations with high network load and poor coverage. An additional complication is that the network throughput can drop for short time intervals (e.g., at handover); these short glitches are initially very difficult to distinguish from more permanent reductions in throughput.

Unlike wireline bottlenecks with large statistical multiplexing, it is typically not possible to try to maintain a given bitrate when congestion is detected with the hope that other flows will yield. This is because there are generally few other flows competing for the same bottleneck.
Each user gets its own variable throughput bottleneck, where the throughput depends on factors like channel quality, network load, and historical throughput. The bottom line is that, if the throughput drops, the sender has no other option than to reduce the bitrate. Once the radio scheduler has reduced the resource allocation for a bearer, a flow in that bearer aims to reduce the sending rate quite quickly (within one RTT) in order to avoid excessive queuing delay or packet loss. This has the consequence that L4S-capable 5G radio bearers will build a queue unless these are prioritized over other bearers. This differs from, e.g., DualQ [RFC9332], which prioritizes L4S traffic in a weighted scheduler and achieves fairness with additional marking for the L4S flows.

1.2.  Why is it a self-clocked algorithm?

Self-clocked congestion control algorithms provide a benefit over their rate-based counterparts in that the former consist of two adaptation mechanisms:

* A reference window computation that evolves over a longer timescale (several RTTs), especially when the reference window evolution is dictated by estimated delay (to minimize vulnerability to, e.g., short-term delay variations). The term reference window is used instead of congestion window, as the reference window does not set an absolute limit on the bytes in flight.
* A fine-grained congestion control given by the self-clocking; it operates on a shorter time scale (1 RTT). The benefits of self-clocking are also elaborated upon in [TFWC]. The self-clocking, however, acts more like an emergency brake, as bytes in flight can exceed the reference window only to a certain degree. The rationale is to be able to transmit large video frames and avoid that they are unnecessarily queued up on the sender side, but still prevent a large network queue.

A rate-based congestion control algorithm typically adjusts the rate based on delay and loss.
The congestion detection needs to be done with a certain time lag to avoid overreaction to spurious congestion events such as delay spikes. Despite the fact that there are two or more congestion indications, the outcome is that there is still only one mechanism to adjust the sending rate. This makes it difficult to reach the goals of high throughput and prompt reaction to congestion.

1.3.  Requirements on media and feedback protocol

SCReAM was originally designed to work with RTP + RTCP, where [RFC8888] was used as the recommended feedback. RTP offers unique packet identification through the sequence number, and [RFC8888] offers timestamps of received packets and the status of the ECN bits. SCReAM is, however, not limited to RTP as long as some requirements are fulfilled:

* Media data is split into data units that, when encapsulated in IP packets, fit in the network MTU.
* Each data unit can be uniquely identified.
* Data units can be queued up in a packet queue before transmission.
* Feedback can indicate the reception time for each data unit, or for a group of data units.
* Feedback can indicate packets that are ECN-CE marked. A unique ECN bits indication for each packet is not necessary; an ECN-CE counter similar to what is defined in [RFC9000] is sufficient.

1.4.  Comparison with LEDBAT and TFWC in TCP

The core SCReAM algorithm, which is still similar in SCReAMv2, has similarities to the concepts of self-clocking used in TCP-friendly window-based congestion control [TFWC] and follows the packet conservation principle. The packet conservation principle is described as a key factor behind the protection of networks from congestion [Packet-conservation].

The reference window is determined in a way similar to the congestion window in LEDBAT [RFC6817]. LEDBAT is a congestion control algorithm that uses send and receive timestamps to estimate the queuing delay (from now on denoted "qdelay") along the transmission path. This information is used to adjust the congestion window.
The general problem described in [LEDBAT-delay-impact] is that the base delay is offset by LEDBAT's own queue buildup. The big difference with using LEDBAT in the SCReAM context lies in the facts that the source is rate limited and that the data unit queue must be kept short (preferably empty). In addition, the output from a video encoder is rarely constant bitrate; static content (talking heads, for instance) gives almost zero video bitrate. This yields two useful properties when LEDBAT's delay-based rate estimation techniques are used as part of SCReAM; they help to avoid the issues described in [LEDBAT-delay-impact]:

1. There is always a certain probability that SCReAM is short of data to transmit; this means that the network queue will become empty every once in a while.
2. The max video bitrate can be lower than the link capacity. If the max video bitrate is 5 Mbps and the capacity is 10 Mbps, then the network queue will become empty.

It is sufficient that either of the two conditions above is fulfilled to make the base delay update properly.

Furthermore, [LEDBAT-delay-impact] describes an issue with short-lived competing flows. In SCReAM, these short-lived flows will cause the self-clocking to slow down, thereby building up the data unit queue; in turn, this results in a reduced media bitrate. Thus, SCReAM reduces the bitrate more when there are competing short-lived flows than the traditional use of LEDBAT does.

The basic functionality in the use of LEDBAT in SCReAM is quite simple; however, a few additional steps are needed to make the concept work with conversational media:

* Addition of a media rate control function.
* Reference window validation techniques. The reference window is used as a basis for the target bitrate calculation. For that reason, various actions are taken to avoid that the reference window grows too much beyond the bytes in flight.
  Additional constraints are applied when in congested state and when the max target bitrate is reached.
* Use of inflection points in the reference window calculation to achieve reduced delay jitter.
* Adjustment of qdelay target for better performance when competing with other loss-based congestion-controlled flows.

The above-mentioned features will be described in more detail in Section 4.

The SCReAM/SCReAMv2 congestion control method uses techniques similar to LEDBAT [RFC6817] to measure the qdelay. As is the case with LEDBAT, it is not necessary to use synchronized clocks in the sender and receiver in order to compute the qdelay. However, it is necessary that they use the same clock frequency, or that the clock frequency at the receiver can be inferred reliably by the sender. Failure to meet this requirement leads to malfunction in the SCReAM/SCReAMv2 congestion control algorithm due to incorrect estimation of the network queue delay. Use of [RFC8888] as feedback ensures that the same time base is used in sender and receiver.

2.  Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

3.  Overview of SCReAMv2 Algorithm

SCReAMv2 still consists of three main parts: network congestion control, sender transmission control, and media rate control. All of these parts reside at the sender side. Figure 1 shows the functional overview of a SCReAMv2 sender. The receiver-side algorithm is very simple in comparison, as it only generates feedback containing acknowledgements of received data units and an indication of ECN-CE marking, either as an accumulated counter or individually per data unit.

3.1.  Network Congestion Control

The network congestion control sets an upper limit on how much data can be in the network (bytes in flight); this limit is called the reference window (ref_wnd) and is used in the sender transmission control.

The sender calculates the reference window based on the feedback from the data receiver. The feedback from the receiver consists of a timestamp and the identity of individual data units or groups of data units, together with the ECN status per data unit or an accumulated ECN-CE count.

The sender keeps a list of transmitted packets, their respective sizes, and the time they were transmitted. This information is used to determine the number of bytes that can be transmitted at any given time instant. A reference window puts an upper limit on how many bytes can be in flight, i.e., transmitted but not yet acknowledged. The reference window is, however, not an absolute limit, as a slack is given to efficiently transmit temporarily larger media objects, such as video frames.

The reference window seeks to increase by one segment per RTT, regardless of whether congestion occurs or not. The increase is restricted or relaxed based on the current value of the reference window relative to a previous max value and the time elapsed since the last congestion event. In other words, the reference window is increased by one MSS (maximum known data unit size) per RTT, with some variation based on the reference window size and the time elapsed since the last congestion event. Multiplicative increase allows the reference window to increase by a fraction of ref_wnd when congestion has not occurred for a while. The reference window increase is thus an adaptive multiplicative increase that is mainly an additive increase when steady state is reached but allows a faster convergence to a higher link speed.
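The growth behaviour described above can be illustrated with a small Python sketch. It is not the normative reference window update: the function name, the rtts_since_congestion argument, and in particular the hard switch between additive and multiplicative increase are assumptions made for illustration, whereas this memo describes a gradual adaptation. The constant values are the RECOMMENDED ones from Section 4.1.1.

```python
# Illustrative sketch only; the hard threshold below is an assumption,
# the memo adapts the increase gradually.
MSS = 1000                       # maximum data unit size [byte]
MUL_INCREASE_FACTOR = 0.02       # max growth per RTT as a fraction of ref_wnd
POST_CONGESTION_DELAY_RTT = 100  # RTTs of cautious growth after congestion


def ref_wnd_increase_per_rtt(ref_wnd, rtts_since_congestion):
    """Return a per-RTT reference window increment in bytes.

    Shortly after a congestion event the increase is additive (one MSS
    per RTT). Once congestion has been absent for a while, the increase
    may grow to a fraction of ref_wnd to speed up convergence.
    """
    additive = float(MSS)
    if rtts_since_congestion < POST_CONGESTION_DELAY_RTT:
        return additive
    # Never increase by less than the additive step.
    return max(additive, MUL_INCREASE_FACTOR * ref_wnd)
```

With a 100000-byte reference window, this sketch grows the window by one MSS per RTT shortly after congestion and by about 2% of ref_wnd per RTT once 100 RTTs have passed without congestion.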
Reference window reduction is triggered by:

* Detection of packet loss or Classic ECN marking: The reference window is reduced by a predetermined fraction.
* Estimated queue delay exceeding a given threshold: The reference window is reduced by an amount given by how much the delay exceeds the threshold.
* Detection of L4S ECN marking: The reference window is reduced in proportion to the fraction of packets that are marked (scalable congestion control).

3.2.  Sender Transmission Control

The sender transmission control limits the output of data based on the relation between the number of bytes in flight and the reference window. The reference window is, however, not a hard limit; additional slack is given to avoid that data units are queued up unnecessarily on the sender side. This means that the algorithm prefers to build up a queue in the network rather than on the sender side. Additional congestion that this causes will reflect back and cause a reduction of the reference window.

Packet pacing is used to mitigate issues with ACK compression that MAY cause increased jitter and/or packet loss in the media traffic. Packet pacing limits the packet transmission rate to a value given by the estimated link throughput. Even if the send window allows for the transmission of a number of packets, these packets are not transmitted immediately; rather, they are transmitted in intervals given by the packet size and the estimated link throughput. Packets are generally paced at a higher rate than the target bitrate; this makes it possible to transmit occasional larger video frames in a timely manner.

3.3.  Media Rate Control

The media rate control serves to adjust the media bitrate to ramp up quickly enough to get a fair share of the system resources when link throughput increases.

The reaction to reduced throughput must be prompt in order to avoid getting too much data queued in the data unit queue(s) in the sender.
The media rate is calculated based on the reference window and RTT. When multiple streams are enabled, the media rate is distributed among the streams according to their relative priorities.

In cases where the sender's frame queues increase rapidly, such as in the case of a Radio Access Type (RAT) handover, the SCReAMv2 sender MAY implement additional actions, such as discarding of encoded media frames or frame skipping, in order to ensure that the data unit queues are drained quickly. Frame skipping results in the frame rate being temporarily reduced. Which method to use is a design choice and is outside the scope of this algorithm description.

4.  Detailed Description of SCReAMv2 sender algorithm

This section describes the sender-side algorithm in more detail. It is split between the network congestion control, sender transmission control, and media rate control.

The sender implements media rate control and a data unit queue for each media type or source, where data units containing encoded media frames are temporarily stored for transmission. Figure 1 shows the details when a single media source (or stream) is used. A transmission scheduler (not shown in the figure) is added to support multiple streams. The transmission scheduler can enforce differing priorities between the streams and act like a coupled congestion controller for multiple flows. Support for multiple streams is implemented in [SCReAM-CPP-implementation].
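As a rough illustration of the media rate control summarized in Section 3.3, the Python sketch below derives a total target bitrate from the reference window and the smoothed RTT and then splits it between streams. The formula 8 * ref_wnd / s_rtt, the clamping to TARGET_BITRATE_MIN and TARGET_BITRATE_MAX, the function names, and the treatment of priorities as proportional weights are assumptions for illustration; the actual calculation includes further compensation that is not shown here.

```python
def target_bitrate(ref_wnd, s_rtt, bitrate_min, bitrate_max):
    """Return a target bitrate [bps] from ref_wnd [byte] and s_rtt [s].

    Assumption: one reference window is delivered per smoothed RTT,
    and the result is clamped to the configured min/max target bitrate.
    """
    rate = 8.0 * ref_wnd / s_rtt
    return min(bitrate_max, max(bitrate_min, rate))


def split_by_priority(total_bitrate, priorities):
    """Distribute total_bitrate [bps] in proportion to stream priorities.

    priorities maps a (hypothetical) stream name to a positive weight.
    """
    weight_sum = sum(priorities.values())
    return {name: total_bitrate * weight / weight_sum
            for name, weight in priorities.items()}


# Example: 25000 bytes per 50 ms corresponds to about 4 Mbps,
# shared 3:1 between two streams.
total = target_bitrate(25000, 0.05, 100e3, 10e6)
shares = split_by_priority(total, {"video": 3.0, "thumbnail": 1.0})
```

Treating priorities as weights keeps the sum of the per-stream rates equal to the total target bitrate, which is one simple way to read "distributed according to relative priorities".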
                +------------------------------+
                |         Media encoder        |
                +------------------------------+
                    ^                     |(1)
                    |                 Data unit
                    |(3)                  |
                    |                     V
                    |               +-----------+
               +---------+          |           |
               | Media   |          |   Queue   |
               | rate    |          |           |
               | control |          | Data units|
               +---------+          |           |
                    ^               +-----------+
                    |                     |
                    | (2)                 |(4)
                    |                 Data unit
                    |                     |
                    |                     v
             +------------+       +--------------+
             |  Network   |  (7)  |    Sender    |
         +-->| congestion |------>| Transmission |
         |   |  control   |       |   Control    |
         |   +------------+       +--------------+
         |                                |(5)
         +----------Feed back--------+ Data unit
                                 (6) |    |
                                     |    v
                                +---------------+
                                |     UDP       |
                                |   socket      |
                                +---------------+

                  Figure 1: Sender Functional View

Media frames are encoded and forwarded to the data unit queue (1) in Figure 1. The sender transmission controller picks the data units from the data unit queue (4); with multiple flows, it picks from each data unit queue based on some defined priority order or simply in a round-robin fashion. The sender transmission controller (in case of multiple flows, a transmission scheduler) sends the data units to the UDP socket (5). In the general case, all media SHOULD go through the sender transmission controller and is limited so that the number of bytes in flight is less than the reference window, albeit with a slack to avoid that packets are unnecessarily delayed in the data unit queue.

RTCP packets are received (6), and the information about the bytes in flight and the reference window is exchanged between the network congestion control and the sender transmission control (7).

The reference window and the estimated RTT are communicated to the media rate control (2) to compute the appropriate target bitrate. The target bitrate is updated whenever the reference window is updated. Additional parameters are also communicated to make the rate control more stable when the reference window is very small or when L4S is not active.
This is described in more detail below.

4.1.  Constants and variables

Constants and state variables are listed in this section. Temporary variables are not listed; instead, they are appended with '_t' in the pseudocode to indicate their local scope.

4.1.1.  Constants

The RECOMMENDED values, within parentheses "()", for the constants are deduced from experiments.

* QDELAY_TARGET_LO (0.06): Target value for the minimum qdelay [s].
* QDELAY_TARGET_HI (0.4): Target value for the maximum qdelay [s]. This parameter provides an upper limit to how much the target qdelay (qdelay_target) can be increased in order to cope with competing loss-based flows. However, the target qdelay does not have to be initialized to this high value, as it would increase end-to-end delay and also make the rate control and congestion control loops sluggish.
* MIN_REF_WND (3000): Minimum reference window [byte].
* MAX_BYTES_IN_FLIGHT_HEAD_ROOM (1.1): Headroom for the limitation of ref_wnd.
* BETA_LOSS (0.7): ref_wnd scale factor due to loss event.
* BETA_ECN (0.8): ref_wnd scale factor due to ECN event.
* MSS (1000): Maximum segment size = Max data unit size [byte].
* TARGET_BITRATE_MIN: Minimum target bitrate in [bps] (bits per second).
* TARGET_BITRATE_MAX: Maximum target bitrate in [bps].
* RATE_PACE_MIN (50000): Minimum pacing rate in [bps].
* REF_WND_OVERHEAD (1.5): Indicates how much bytes in flight is allowed to exceed ref_wnd.
* L4S_AVG_G (1.0/16): Exponentially Weighted Moving Average (EWMA) factor for l4s_alpha.
* QDELAY_AVG_G (1.0/4): Exponentially Weighted Moving Average (EWMA) factor for qdelay_avg.
* PACKET_OVERHEAD (20): Estimated packetization overhead [byte].
* POST_CONGESTION_DELAY_RTT (100): Determines how many RTTs after a congestion event the reference window growth should be cautious.
* MUL_INCREASE_FACTOR (0.02): Determines how much (as a fraction of ref_wnd) the ref_wnd can increase per RTT.
* IS_L4S (false): Congestion control operates in L4S mode.
* VIRTUAL_RTT (0.025): Virtual RTT [s]. This mimics Prague's RTT fairness such that flows with RTT below VIRTUAL_RTT should get a roughly equal share over an L4S path.
* PACKET_PACING_HEADROOM (1.5): Extra headroom for packet pacing.
* BYTES_IN_FLIGHT_HEAD_ROOM (2.0): Extra headroom for bytes in flight.

4.1.2.  State Variables

The values within parentheses "()" indicate initial values.

* qdelay_target (QDELAY_TARGET_LO): qdelay target [s]. A variable qdelay target is introduced to manage cases where a fixed qdelay target would otherwise starve the data flow (e.g., when FTP competes for the bandwidth over the same bottleneck). The qdelay target is allowed to vary between QDELAY_TARGET_LO and QDELAY_TARGET_HI.
* qdelay_fraction_avg (0.0): Fractional qdelay filtered by the Exponentially Weighted Moving Average (EWMA).
* ref_wnd (MIN_REF_WND): Reference window [byte].
* ref_wnd_i (1): Reference window inflection point [byte].
* bytes_newly_acked (0): The number of bytes that was acknowledged with the last received acknowledgement, i.e., bytes acknowledged since the last ref_wnd update.
* max_bytes_in_flight (0): The maximum number of bytes in flight in the last round trip [byte].
* max_bytes_in_flight_prev (0): The maximum number of bytes in flight in the previous round trip [byte].
* send_wnd (0): Upper limit to how many bytes can currently be transmitted. Updated when ref_wnd is updated and when a data unit is transmitted [byte].
* target_bitrate (0): Media target bitrate [bps].
* rate_media (0.0): Measured bitrate [bps] from the media encoder.
* s_rtt (0.0): Smoothed RTT [s], computed with a similar method to that described in [RFC6298].
* data_unit_size (0): Size [byte] of the last transmitted data unit.
* loss_event_rate (0.0): The estimated fraction of RTTs with lost data units detected.
* bytes_in_flight_ratio (0.0): Ratio between the bytes in flight and the reference window.
* ref_wnd_ratio (0.0): Ratio between MSS and ref_wnd.
* last_fraction_marked (0.0): Fraction of marked data units in the last update.
* l4s_alpha (0.0): Average fraction of marked data units per RTT.
* l4s_active (false): Indicates that L4S is enabled and data units are indeed marked.
* last_update_l4s_alpha_time (0): Last time l4s_alpha was updated [s].
* last_update_qdelay_avg_time (0): Last time qdelay_avg was updated [s].
* data_units_delivered_this_rtt (0): Counter for delivered data units.
* data_units_marked_this_rtt (0): Counter for delivered and ECN-CE-marked data units.
* last_congestion_detected_time (0): Last time a congestion event occurred [s].
* last_ref_wnd_i_update_time (0): Last time ref_wnd_i was updated [s].
* bytes_newly_acked (0): Number of bytes newly ACKed, reset to 0 when the reference window is updated [byte].
* bytes_newly_acked_ce (0): Number of bytes newly ACKed and CE marked, reset to 0 when the reference window is updated [byte].
* pace_bitrate (1e6): Data unit pacing rate [bps].
* t_pace (1e-6): Pacing interval between data units [s].
* rel_framesize_high (1.0): High percentile of frame size, normalized by the nominal frame size for the given target bitrate.
* frame_period (0.02): The frame period [s].

4.2.  Network Congestion Control

This section explains the network congestion control, which performs two main functions:

* Computation of the reference window at the sender: This gives an upper limit to the number of bytes in flight.
* Calculation of the send window at the sender: Data units are transmitted if allowed by the relation between the number of bytes in flight and the reference window. This is controlled by the send window.
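The relation between the reference window, the bytes in flight, and the send window can be sketched in Python as follows. This is a simplified reading of the REF_WND_OVERHEAD constant and the send_wnd state variable from Section 4.1, not the send window calculation of Section 4.2.4, which applies further adjustments. The function names are hypothetical.

```python
# Simplified sketch; the actual send window calculation is specified
# in Section 4.2.4 and contains additional terms.
REF_WND_OVERHEAD = 1.5  # how far bytes in flight may exceed ref_wnd


def send_window(ref_wnd, bytes_in_flight):
    """Return how many more bytes may be transmitted right now."""
    return ref_wnd * REF_WND_OVERHEAD - bytes_in_flight


def may_transmit(ref_wnd, bytes_in_flight, data_unit_size):
    """Return True if a data unit of the given size fits in the send window."""
    return data_unit_size <= send_window(ref_wnd, bytes_in_flight)
```

In this sketch, a 10000-byte reference window still admits a 1000-byte data unit with 12000 bytes in flight, but not with 14500 bytes in flight, which shows the reference window acting as a soft rather than an absolute limit.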
SCReAMv2 is a window-based and byte-oriented congestion control protocol, where the number of bytes transmitted is inferred from the size of the transmitted data units. Thus, a list of transmitted data units and their respective transmission times (wall-clock time) MUST be kept for further calculation.

The number of bytes in flight (bytes_in_flight) is computed as the sum of the sizes of the data units ranging from the data unit most recently transmitted, down to but not including the acknowledged data unit with the highest sequence number. This can be translated to the difference between the highest transmitted byte sequence number and the highest acknowledged byte sequence number. As an example: If a data unit with sequence number SN is transmitted and the last acknowledgement indicates SN-5 as the highest received sequence number, then bytes_in_flight is computed as the sum of the sizes of the data units with sequence numbers SN-4, SN-3, SN-2, SN-1, and SN. It does not matter if, for instance, the data unit with sequence number SN-3 was lost; the size of the data unit with sequence number SN-3 will still be considered in the computation of bytes_in_flight.

bytes_in_flight_ratio is calculated as the ratio between bytes_in_flight and ref_wnd. This value should be computed at the beginning of the ACK processing. ref_wnd_ratio is computed as the ratio between MSS and ref_wnd.

Furthermore, a variable bytes_newly_acked is incremented with a value corresponding to how much the highest sequence number has increased since the last feedback. As an example: If the previous acknowledgement indicated the highest sequence number N and the new acknowledgement indicated N+3, then bytes_newly_acked is incremented by a value equal to the sum of the sizes of the data units with sequence numbers N+1, N+2, and N+3.
Data units that are lost are also included, which means that even though, e.g., data unit N+2 was lost, its size is still included in the update of bytes_newly_acked. The bytes_newly_acked_ce is, similar to bytes_newly_acked, a counter of bytes newly acked with the extra condition that they are ECN-CE marked. The bytes_newly_acked and bytes_newly_acked_ce are reset to zero after a ref_wnd update.

The feedback from the receiver is assumed to consist of the following elements.

*  A list of received data units' sequence numbers, with an indication of whether data units are ECN-CE marked. The ECN status can be either per data unit or an accumulated count of ECN-CE marked data units.
*  The wall-clock timestamp corresponding to the received data unit with the highest sequence number.

When the sender receives RTCP feedback, the qdelay is calculated as outlined in [RFC6817]. A qdelay sample is obtained for each received acknowledgement. A number of variables are updated as illustrated by the pseudocode below; temporary variables are appended with '_t'. Division operation is always floating point unless otherwise noted.

l4s_alpha is calculated based on the number of data units delivered (and marked). This makes the calculation of L4S alpha more accurate at very low bitrates, given that the tail data unit in, e.g., a video frame is often smaller than MSS.

The smoothed RTT (s_rtt) is computed in a way similar to [RFC6298].
    data_units_delivered_this_rtt += data_units_acked
    data_units_marked_this_rtt += data_units_acked_ce
    # l4s_alpha is updated at least every 10ms
    if (now - last_update_l4s_alpha_time >= min(0.01, s_rtt))
      # l4s_alpha is calculated from data units marked
      # instead of bytes marked
      fraction_marked_t = data_units_marked_this_rtt /
                          data_units_delivered_this_rtt

      l4s_alpha = L4S_AVG_G * fraction_marked_t +
                  (1.0 - L4S_AVG_G) * l4s_alpha

      last_update_l4s_alpha_time = now
      data_units_delivered_this_rtt = 0
      data_units_marked_this_rtt = 0
      last_fraction_marked = fraction_marked_t
    end

    if (now - last_update_qdelay_avg_time >= s_rtt)
      # qdelay_avg is updated with a slow attack, fast decay EWMA filter
      if (qdelay < qdelay_avg)
        qdelay_avg = qdelay
      else
        qdelay_avg = QDELAY_AVG_G * qdelay +
                     (1.0 - QDELAY_AVG_G) * qdelay_avg
      end
      last_update_qdelay_avg_time = now
    end

4.2.1.  Reaction to Delay, Data unit Loss and ECN-CE

Congestion is detected based on three different indicators:

*  Lost data units detected. The loss detection is described in Section 4.2.3.
*  ECN-CE marked data units detected.
*  Estimated queue delay exceeds a threshold.

A congestion event occurs if any of the above indicators are true AND at least min(VIRTUAL_RTT, s_rtt) has elapsed since the last congestion event. This ensures that the reference window is reduced at most once per smoothed RTT.

4.2.1.1.  Lost data units

The reference window back-off due to loss events is deliberately a bit less than is the case with TCP Reno, for example. TCP is generally used to transmit whole files; the file is then like a source with an infinite bitrate until the whole file has been transmitted. SCReAMv2, on the other hand, has a source whose rate is limited to a value close to the available transmit rate and often below that value; the effect is that SCReAMv2 has less opportunity to grab free capacity than a TCP-based file transfer.
To compensate for this, it is RECOMMENDED to let SCReAMv2 reduce the reference window less than what is the case with TCP when loss events occur.

4.2.1.2.  ECN-CE and classic ECN

In classic ECN mode the ref_wnd is scaled by a fixed value (BETA_ECN).

The reference window back-off due to an ECN event MAY be smaller than if a loss event occurs. This is in line with the idea outlined in [RFC8511] to enable ECN marking thresholds lower than the corresponding data unit drop thresholds.

4.2.1.3.  ECN-CE and L4S

The ref_wnd is scaled down in proportion to the fraction of marked data units per RTT. The scale down proportion is given by l4s_alpha, which is an EWMA filtered version of the fraction of marked data units per RTT. This is in line with how DCTCP works [RFC8257]. Additional methods are applied to make the reference window reduction reasonably stable, especially when the reference window is only a few MSS. In addition, because SCReAMv2 can quite often be source limited, additional steps are taken to restore the reference window to a proper value after a long period without congestion.

4.2.1.4.  Increased queue delay

SCReAMv2 implements a delay based congestion control approach where it mimics L4S congestion marking when the averaged queue delay exceeds a target threshold. This threshold is set to qdelay_target/2 and the congestion backoff factor (l4s_alpha_v) increases linearly from 0 to 100% as qdelay_avg goes from qdelay_target/2 to qdelay_target. The averaged qdelay (qdelay_avg) is used so that the SCReAMv2 congestion control does not over-react to scheduling jitter or to sudden delay spikes due to, e.g., handover or link layer retransmissions.

Furthermore, the delay based congestion control is inactivated when it is reasonably certain that L4S is active, i.e., L4S is enabled and congested nodes apply L4S marking of data units.
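The linear ramp of the virtual marking fraction can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the function name is hypothetical, the 60 ms target is only an example value, and a single boolean flag stands in for the fuller "L4S seems active" test that Section 4.2.2 applies (which also compares l4s_alpha against a limit).

```python
def virtual_l4s_alpha(qdelay_avg: float, qdelay_target: float,
                      l4s_active: bool = False) -> float:
    """Return the virtual marking fraction l4s_alpha_v in [0, 1]."""
    if l4s_active:
        # Delay-based control is inactivated when L4S marking is trusted.
        return 0.0
    half_target = qdelay_target / 2.0
    # 0 at qdelay_target/2, rising linearly to 1 at qdelay_target.
    return min(1.0, max(0.0, (qdelay_avg - half_target) / half_target))


# Example with an assumed qdelay_target of 60 ms.
alpha_at_threshold = virtual_l4s_alpha(0.030, 0.060)  # on the threshold
alpha_midway = virtual_l4s_alpha(0.045, 0.060)        # halfway up the ramp
alpha_at_target = virtual_l4s_alpha(0.060, 0.060)     # full virtual marking
```

Per the back-off rule in Section 4.2.2, the reference window is then scaled by (1 - l4s_alpha_v/2), so a queue delay halfway up the ramp reduces ref_wnd by roughly 25%. When l4s_active is true the sketch returns 0.0, so no delay-based back-off is signalled while L4S marking is in effect.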
Inactivating the delay based control in this way reduces, whenever possible, the negative effects of clock drift that the delay based control can introduce.

4.2.2.  Reference Window Update

The reference window update contains two parts: one that reduces the reference window when congestion events (listed above) occur, and one that continuously increases the reference window. The target bitrate is updated whenever the reference window is updated.

Actions when congestion is detected:

    # The reference window is updated at least every VIRTUAL_RTT
    if (now - last_congestion_detected_time >= min(VIRTUAL_RTT, s_rtt))
      if (loss detected)
        is_loss_t = true
      end
      if (data units marked)
        is_ce_t = true
      end
      if (qdelay > qdelay_target/2)
        # It is expected that l4s_alpha is below a given value,
        l4_alpha_lim_t = 2 / target_bitrate * MSS * 8 / s_rtt
        if (l4s_alpha < l4_alpha_lim_t || !l4s_active)
          # L4S does not seem to be active
          l4s_alpha_v_t = min(1.0, max(0.0,
                 (qdelay_avg - qdelay_target / 2) /
                 (qdelay_target / 2)))
          is_virtual_ce_t = true
        end
      end
    end

    if (is_loss_t || is_ce_t || is_virtual_ce_t)
      if (now - last_ref_wnd_i_update_time > 10*s_rtt)
        # Update ref_wnd_i, no more often than every 10 RTTs
        # Additional median filtering over more congestion epochs
        # may improve accuracy of ref_wnd_i
        last_ref_wnd_i_update_time = now
        ref_wnd_i = ref_wnd
      end
    end

    # Either loss, ECN mark or increased qdelay is detected
    if (is_loss_t)
      # Loss is detected
      ref_wnd = ref_wnd * BETA_LOSS
    end
    if (is_ce_t)
      # ECN-CE detected
      if (IS_L4S)
        # L4S mode
        backoff_t = l4s_alpha / 2

        # Increase stability for very small ref_wnd
        backoff_t *= max(0.5, 1.0 - ref_wnd_ratio)

        if (now - last_congestion_detected_time >
            100*max(VIRTUAL_RTT, s_rtt))
          # A long time (>100 RTTs) since last congested because
          # link throughput exceeds max video bitrate.
          # There is a certain risk that ref_wnd has increased way above
          # bytes in flight, so we reduce it here to get it better on
          # track and thus the congestion episode is shortened
          ref_wnd = min(ref_wnd, max_bytes_in_flight_prev)

          # Also, we back off a little extra if needed
          # because alpha is quite likely very low
          # This can in some cases be an over-reaction
          # but as this function should kick in relatively seldom
          # it should not be too big a concern
          backoff_t = max(backoff_t, 0.25)

          # In addition, bump up l4s_alpha to a more credible value
          # This may over-react but it is better than
          # excessive queue delay
          l4s_alpha = 0.25
        end
        ref_wnd = (1.0 - backoff_t) * ref_wnd
      else
        # Classic ECN mode
        ref_wnd = ref_wnd * BETA_ECN
      end
    end
    if (is_virtual_ce_t)
      backoff_t = l4s_alpha_v_t / 2
      ref_wnd = (1.0 - backoff_t) * ref_wnd
    end
    ref_wnd = max(MIN_REF_WND, ref_wnd)

    if (is_loss_t || is_ce_t || is_virtual_ce_t)
      last_congestion_detected_time = now
    end

The variable max_bytes_in_flight_prev indicates the maximum bytes in flight in the previous round trip. The reason for using it is that bytes in flight can spike when congestion occurs; max_bytes_in_flight_prev makes it more likely that an uncongested bytes in flight value is used.

Reference window increase:

    # Delay factor for multiplicative reference window increase
    # after congestion
    post_congestion_scale_t = max(0.0, min(1.0,
      (now - last_congestion_detected_time) /
      (POST_CONGESTION_DELAY_RTTS * max(VIRTUAL_RTT, s_rtt))))

    # Scale factor for ref_wnd update
    ref_wnd_scale_factor_t = 1.0 + (MUL_INCREASE_FACTOR * ref_wnd) / MSS

    # Calculate bytes acked that are not CE marked
    # For the case that only accumulated number of CE marked packets is
    # reported by the feedback, it is necessary to make an approximation
    # of bytes_newly_acked_ce based on average data unit size.
    bytes_newly_acked_minus_ce_t = bytes_newly_acked -
                                   bytes_newly_acked_ce

    increment_t = bytes_newly_acked_minus_ce_t * ref_wnd_ratio

    # Reduce increment for small RTTs
    tmp_t = min(1.0, s_rtt / VIRTUAL_RTT)
    increment_t *= tmp_t * tmp_t

    # Apply limit to reference window growth when close to last
    # known max value before congestion
    scl_t = (ref_wnd - ref_wnd_i) / ref_wnd_i
    scl_t *= 4
    scl_t = scl_t * scl_t
    scl_t = max(0.1, min(1.0, scl_t))
    if (!l4s_active)
      increment_t *= scl_t
    end

    # Limit ref_wnd growth speed further for small ref_wnd
    # This is complemented with a corresponding restriction on ref_wnd
    # reduction
    increment_t *= max(0.5, 1.0 - ref_wnd_ratio)

    # Scale up increment with multiplicative increase
    # Limit multiplicative increase when congestion occurred
    # recently and when reference window is close to the last
    # known max value
    tmp_t = ref_wnd_scale_factor_t
    if (tmp_t > 1.0)
      tmp_t = 1.0 + (tmp_t - 1.0) * post_congestion_scale_t * scl_t
    end
    increment_t *= tmp_t

    # Increase ref_wnd only if bytes in flight is large enough
    # Quite a lot of slack is allowed here to avoid that bitrate
    # locks to low values.
    max_allowed_t = MSS +
      max(max_bytes_in_flight, max_bytes_in_flight_prev) *
      BYTES_IN_FLIGHT_HEAD_ROOM
    ref_wnd_t = ref_wnd + increment_t
    if (ref_wnd_t <= max_allowed_t)
      ref_wnd = ref_wnd_t
    end

The ref_wnd_scale_factor_t scales the reference window increase. The ref_wnd_scale_factor_t is increased with larger ref_wnd to allow for a multiplicative increase and thus a faster convergence when link capacity increases.

The variable max_bytes_in_flight indicates the max bytes in flight in the current round trip.

The multiplicative increase is restricted directly after a congestion event and the restriction is gradually relaxed as the time since the last congestion event increases.
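To make the two restrictions on the increase more concrete, the following Python sketch isolates the scale factors from the pseudocode above and evaluates them for a few inputs. The function names are hypothetical, and the numeric inputs (ref_wnd_i of 100000 bytes, VIRTUAL_RTT of 25 ms, POST_CONGESTION_DELAY_RTTS of 100) are assumptions chosen for illustration, not values taken from this section.

```python
def growth_scale(ref_wnd: float, ref_wnd_i: float) -> float:
    """scl_t: limits growth when ref_wnd is near the last known max."""
    scl = 4.0 * (ref_wnd - ref_wnd_i) / ref_wnd_i
    return max(0.1, min(1.0, scl * scl))


def post_congestion_scale(now: float, last_congestion_time: float,
                          s_rtt: float, virtual_rtt: float,
                          post_congestion_delay_rtts: float) -> float:
    """Ramps from 0 to 1 over post_congestion_delay_rtts round trips."""
    ramp = post_congestion_delay_rtts * max(virtual_rtt, s_rtt)
    return max(0.0, min(1.0, (now - last_congestion_time) / ramp))


def multiplicative_factor(ref_wnd_scale_factor: float,
                          post_scale: float, scl: float) -> float:
    """Effective multiplier applied to the increment."""
    if ref_wnd_scale_factor > 1.0:
        return 1.0 + (ref_wnd_scale_factor - 1.0) * post_scale * scl
    return ref_wnd_scale_factor


# At the last known max the growth is throttled to 10%; 25% away from
# it (in either direction) the throttle is fully released.
scl_at_max = growth_scale(100000.0, 100000.0)
scl_far = growth_scale(125000.0, 100000.0)

# One second after congestion with s_rtt = 50 ms and an assumed ramp of
# 100 RTTs, a fifth of the multiplicative increase is allowed.
post = post_congestion_scale(1.0, 0.0, 0.05, 0.025, 100)
factor = multiplicative_factor(3.0, post, scl_at_max)
```

Immediately after a congestion event post_congestion_scale returns 0, so multiplicative_factor collapses to 1.0 and only the additive part of the increase remains.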
The restriction makes the reference window growth no faster than additive increase when congestion occurs continuously. For L4S operation this means that the SCReAMv2 algorithm will adhere to the 2 marked data units per RTT equilibrium at steady state congestion, with the exception of the case below.

The reference window increase is restricted to values as small as 0.1MSS/RTT when the reference window is close to the last known max value (ref_wnd_i). This increases stability and reduces periodic overshoot. This restriction is applied in full only for small reference windows when in L4S operation.

It is particularly important that the reference window reflects the transmitted bitrate, especially in L4S mode operation. An inflated ref_wnd takes extra RTTs to bring down to a correct value upon congestion and thus causes unnecessary queue buildup. At the same time, the reference window must be allowed to be large enough that the SCReAMv2 algorithm does not begin to limit itself, given that the target bitrate is calculated based on the ref_wnd. Two mechanisms are used to manage this:

*  Restore correct value of ref_wnd upon congestion. This is done if a prolonged time has passed since the link was last congested. A typical example is that SCReAMv2 has been rate limited, i.e., the target bitrate has reached TARGET_BITRATE_MAX.
*  Limit ref_wnd when the target_bitrate has reached TARGET_BITRATE_MAX. The ref_wnd is restricted based on a history of the last max_bytes_in_flight values. See [SCReAM-CPP-implementation] for details.

The two mechanisms complement one another.

4.2.3.  Lost Data Unit Detection

Lost data unit detection is based on the received sequence number list. A reordering window SHOULD be applied to prevent data unit reordering from triggering loss events. The reordering window is specified as a time unit, similar to the ideas behind Recent ACKnowledgement (RACK) [RFC8985].
The computation of the reordering window is made possible by means of a lost flag in the list of transmitted data units. This flag is set if the received sequence number list indicates that the given data unit is missing. If later feedback indicates that a data unit previously marked as lost was indeed received, then the reordering window is updated to reflect the reordering delay. The reordering window is given by the difference in time between the event that the data unit was marked as lost and the event that it was indicated as successfully received.

Loss is detected if a given data unit is not acknowledged within a time window (indicated by the reordering window) after a data unit with a higher sequence number was acknowledged.

4.2.4.  Send Window Calculation

The basic design principle behind data unit transmission in SCReAM was to allow transmission only if the number of bytes in flight is less than the congestion window. There are, however, two reasons why this strict rule will not work optimally:

*  Bitrate variations: Media sources such as video encoders generally produce frames whose sizes vary to a larger or smaller extent. The data unit queue absorbs the natural variations in frame sizes. However, the data unit queue should be as short as possible to prevent the end-to-end delay from increasing. A strict 'send only when bytes in flight is less than the reference window' rule can cause the data unit queue to grow simply because the send window is limited. The consequence is that the reference window will not increase, or will increase very slowly, because the reference window is only allowed to increase when there is a sufficient amount of data in flight. The final effect is that the media bitrate increases very slowly or not at all.
*  Reverse (feedback) path congestion: Especially in transport over buffer-bloated networks, the one-way delay in the reverse direction can jump due to congestion. The effect is that the acknowledgements are delayed, and the self-clocking is temporarily halted, even though the forward path is not congested. The REF_WND_OVERHEAD allows for some degree of reverse path congestion as the bytes in flight is allowed to exceed ref_wnd.

In SCReAMv2, the send window is given by the relation between the adjusted reference window and the amount of bytes in flight according to the pseudocode below. The multiplication of ref_wnd with REF_WND_OVERHEAD and rel_framesize_high has the effect that bytes in flight is 'around' the ref_wnd rather than limited by the ref_wnd when the link is congested. The implementation allows the data unit queue to be small even when the frame sizes vary, and thus increased e2e delay can be avoided.

    send_wnd = ref_wnd * REF_WND_OVERHEAD * rel_framesize_high -
               bytes_in_flight

The send window is updated whenever a data unit is transmitted or a feedback message is received.

The variable rel_framesize_high is based on a calculation of the high percentile of the frame sizes. The calculation is based on a histogram of the frame sizes relative to the expected frame size given the target bitrate and frame period. The calculation of rel_framesize_high is done for every new video frame and is outlined roughly with the pseudocode below. For more detailed code, see [SCReAM-CPP-implementation].
    # frame_size is the size of the last encoded frame
    tmp_t = frame_size / (target_bitrate * frame_period / 8)

    if (tmp_t > 1.0)
      # Insert sample into histogram
      insert_into_histogram(tmp_t)
      # Get high percentile
      rel_framesize_high = get_histogram_high_percentile()
    end

A 75th percentile is used in [SCReAM-CPP-implementation]; the histogram can be made leaky such that old samples are gradually forgotten.

4.2.5.  Packet Pacing

Packet pacing is used in order to mitigate coalescing, i.e., when packets are transmitted in bursts, with the risks of increased jitter and potentially increased packet loss. Packet pacing is also recommended to be used with L4S and also mitigates possible issues with queue overflow due to key-frame generation in video coders.

The time interval between consecutive data unit transmissions is greater than or equal to t_pace, where t_pace is given by the equations below:

    pace_bitrate = max(RATE_PACE_MIN, target_bitrate) *
                   PACKET_PACING_HEADROOM
    t_pace = data_unit_size * 8 / pace_bitrate

data_unit_size is the size of the last transmitted data unit. RATE_PACE_MIN is the minimum pacing rate.

4.2.6.  Stream Prioritization

The SCReAM algorithm makes a distinction between network congestion control and media rate control. This is easily extended to many streams. Data units from two or more data unit queues are scheduled at the rate permitted by the network congestion control.

The scheduling can be done by means of a few different scheduling regimes. For example, the method for coupled congestion control specified in [RFC8699] can be used. One implementation of SCReAMv2 [SCReAM-CPP-implementation] uses credit-based scheduling. In credit-based scheduling, credit is accumulated by queues as they wait for service and is spent while the queues are being serviced.
For instance, if one queue is allowed to transmit 1000 bytes, then a credit of 1000 bytes is allocated to the other unscheduled queues. This principle can be extended to weighted scheduling, where the credit allocated to unscheduled queues depends on the relative weights. The latter is also implemented in [SCReAM-CPP-implementation], in which case the target bitrates for the streams are also scaled relative to the scheduling priority.

4.3.  Media Rate Control

The media rate control algorithm is executed whenever the reference window is updated and updates the target bitrate. The target bitrate is essentially based on the reference window and the (smoothed) RTT according to

    target_bitrate = 8 * ref_wnd / s_rtt

The role of the media rate control is to strike a reasonable balance between a low amount of queuing in the data unit queue(s) and a sufficient amount of data to send in order to keep the data path busy. Because the reference window is updated based on loss, ECN-CE and delay, the target bitrate is updated accordingly.

The code above, however, needs some modifications to work well in a number of scenarios:

*  L4S is inactive, i.e., L4S is either not enabled or congested bottlenecks do not apply L4S marking to data units
*  ref_wnd is very small, just a few MSS or smaller

The complete pseudocode for adjustment of the target bitrate is shown below:

    tmp_t = 1.0

    # Limit bitrate if bytes in flight is close to or
    # exceeds ref_wnd.
    # This helps to avoid large rate fluctuations and
    # variations in RTT
    # Only applied when L4S is inactive
    if (!l4s_active && bytes_in_flight_ratio > BYTES_IN_FLIGHT_LIMIT)
      tmp_t /= min(BYTES_IN_FLIGHT_LIMIT_COMPENSATION,
                   bytes_in_flight_ratio / BYTES_IN_FLIGHT_LIMIT)
    end

    # Scale down rate slightly when the reference window is very
    # small compared to MSS
    tmp_t *= 1.0 - min(0.2, max(0.0, ref_wnd_ratio - 0.1))

    # Additional compensation for packetization overhead,
    # important when MSS is small
    tmp_t *= MSS / (MSS + PACKET_OVERHEAD)

    # Calculate target bitrate and limit to min and max allowed
    # values
    target_bitrate = tmp_t * 8 * ref_wnd / s_rtt
    target_bitrate = min(TARGET_BITRATE_MAX,
                         max(TARGET_BITRATE_MIN, target_bitrate))

4.4.  Competing Flows Compensation

It is likely that a flow will have to share congested bottlenecks with other flows that use a more aggressive congestion control algorithm (for example, large FTP flows using loss-based congestion control). The worst condition occurs when the bottleneck queues are of tail-drop type with a large buffer size. SCReAMv2 takes care of such situations by adjusting the qdelay_target when loss-based flows are detected, as shown in the pseudocode below.
    adjust_qdelay_target(qdelay)
      qdelay_norm_t = qdelay / QDELAY_TARGET_LO
      update_qdelay_norm_history(qdelay_norm_t)
      # Compute variance
      qdelay_norm_var_t = VARIANCE(qdelay_norm_history(200))
      # Compensation for competing traffic
      # Compute average
      qdelay_norm_avg_t = AVERAGE(qdelay_norm_history(50))
      # Compute upper limit to target delay
      new_target_t = qdelay_norm_avg_t + sqrt(qdelay_norm_var_t)
      new_target_t *= QDELAY_TARGET_LO
      if (loss_event_rate > 0.002)
        # Data unit losses detected
        qdelay_target = 1.5 * new_target_t
      else
        if (qdelay_norm_var_t < 0.2)
          # Reasonably safe to set target qdelay
          qdelay_target = new_target_t
        else
          # Check if target delay can be reduced; this helps prevent
          # the target delay from being locked to high values forever
          if (new_target_t < QDELAY_TARGET_LO)
            # Decrease target delay quickly, as measured queuing
            # delay is lower than target
            qdelay_target = max(qdelay_target * 0.5, new_target_t)
          else
            # Decrease target delay slowly
            qdelay_target *= 0.9
          end
        end
      end

      # Apply limits
      qdelay_target = min(QDELAY_TARGET_HI, qdelay_target)
      qdelay_target = max(QDELAY_TARGET_LO, qdelay_target)

Two temporary variables are calculated. qdelay_norm_avg_t is the long-term average queue delay, and qdelay_norm_var_t is the long-term variance of the queue delay. A high qdelay_norm_var_t indicates that the queue delay changes; this can be an indication that bottleneck bandwidth is reduced or that a competing flow has just entered. Thus, it indicates that it is not safe to adjust the queue delay target.

A low qdelay_norm_var_t indicates that the queue delay is relatively stable.
The reason could be that the queue delay is low, but it could also be that a competing flow is causing the bottleneck to reach the point that data unit losses start to occur, in which case the queue delay will stay relatively high for a longer time.

The queue delay target is allowed to be increased if either the loss event rate is above a given threshold or qdelay_norm_var_t is low. Both these conditions indicate that a competing flow may be present. In all other cases, the queue delay target is decreased.

The function that adjusts the qdelay_target is simple and could produce false positives and false negatives. The case that self-inflicted congestion by the SCReAMv2 algorithm may be falsely interpreted as the presence of competing loss-based FTP flows is a false positive. The opposite case -- where the algorithm fails to detect the presence of a competing FTP flow -- is a false negative.

Extensive simulations have shown that the algorithm performs well in LTE and 5G test cases and that it also performs well in simple bandwidth-limited bottleneck test cases with competing FTP flows. However, the potential failure of the algorithm cannot be completely ruled out. A false positive (i.e., when self-inflicted congestion is mistakenly identified as competing flows) is especially problematic when it leads to increasing the target queue delay, which can cause the end-to-end delay to increase dramatically.

If it is deemed unlikely that competing flows occur over the same bottleneck, the algorithm described in this section MAY be turned off. One such case is QoS-enabled bearers in 3GPP-based access such as LTE and 5G. However, when sending over the Internet, often the network conditions are not known for sure, so in general it is not possible to make safe assumptions on how a network is used and whether or not competing flows share the same bottleneck.
Therefore, turning this algorithm off must be considered with caution, as it can lead to basically zero throughput if competing with loss-based traffic.

4.5.  Handling of systematic errors in video coders

Some video encoders are prone to systematically generate an output bitrate that is larger or smaller than the target bitrate. SCReAMv2 can handle some deviation inherently, but for larger deviations it becomes necessary to compensate for this. The algorithm for this is detailed in [SCReAM-CPP-implementation].

ToDo: A future draft version will describe this in more detail as it has been fully integrated into SCReAMv2.

5.  Receiver Requirements on Feedback Intensity

The simple task of the receiver is to feed back acknowledgements with timestamp and ECN bits indication for received data units to the sender. Upon reception of each data unit, the receiver MUST maintain enough information to send the aforementioned values to the sender via an RTCP transport-layer feedback message. The frequency of the feedback message depends on the available RTCP bandwidth. The requirements on the feedback elements and the feedback interval are described below.

SCReAMv2 benefits from relatively frequent feedback. It is RECOMMENDED that a SCReAMv2 implementation follows the guidelines below.

The feedback interval depends on the media bitrate. At low bitrates, feedback once per frame is sufficient, while at high bitrates, a feedback interval of roughly 5ms is preferred. At very high bitrates, even shorter feedback intervals MAY be needed in order to keep the self-clocking in SCReAMv2 working well. One indication that feedback is too sparse is that the SCReAMv2 implementation cannot reach high bitrates, even in uncongested links. More frequent feedback might solve this issue.
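As a numeric illustration of this guidance, the Python sketch below evaluates the feedback-rate expression used in this section (2% of the average received rate, 100-byte feedback packets, clamped to between 10 and 1000 feedback messages per second) for a few received bitrates. The function name and the chosen bitrates are assumptions made for the example.

```python
def feedback_interval(avg_received_rate_bps: float,
                      fb_packet_size_bytes: float = 100.0) -> float:
    """Return the feedback interval fb_int [s] for a received rate [bps]."""
    # Spend about 2% of the received rate on feedback packets.
    rate_fb = 0.02 * avg_received_rate_bps / (fb_packet_size_bytes * 8.0)
    # Clamp to between 10 and 1000 feedback messages per second.
    rate_fb = min(1000.0, max(10.0, rate_fb))
    return 1.0 / rate_fb


# 100 kbps, 8 Mbps and 100 Mbps received rate.
intervals = {rate: feedback_interval(rate) for rate in (100e3, 8e6, 100e6)}
```

Under these assumptions the interval is 100 ms at 100 kbps (the lower clamp), about 5 ms at 8 Mbps, and 1 ms at 100 Mbps (the upper clamp).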
The numbers above can be formulated as a feedback interval function that can be useful for the computation of the desired RTCP bandwidth. The following equation expresses the feedback rate:

    # Assume 100 byte feedback packets
    rate_fb = 0.02 * [average received rate] / (100.0 * 8.0)
    rate_fb = min(1000, max(10, rate_fb))

    # Calculate feedback intervals
    fb_int = 1.0 / rate_fb

Feedback should also forcibly be transmitted in any of these cases:

*  More than N data units have been received since the last feedback was transmitted
*  A data unit with the marker bit set, or another data unit that is the last of a media frame, is received

The transmission interval is not critical. So, in the case of multi-stream handling between two hosts, the feedback for two or more synchronization sources (SSRCs) can be bundled to save UDP/IP overhead. However, the final realized feedback interval SHOULD NOT exceed 2*fb_int in such cases, meaning that a scheduled feedback transmission event should not be delayed more than fb_int.

SCReAMv2 works with AVPF regular mode; immediate or early mode is not required by SCReAMv2 but can nonetheless be useful for RTCP messages not directly related to SCReAMv2, such as those specified in [RFC4585]. It is RECOMMENDED to use reduced-size RTCP [RFC5506], where regular full compound RTCP transmission is controlled by trr-int as described in [RFC4585].

While the guidelines above are somewhat RTCP specific, similar principles apply to, for instance, QUIC.

6.  Discussion

This section covers a few discussion points.

*  Clock drift: SCReAM/SCReAMv2 can suffer from the same issues with clock drift as is the case with LEDBAT [RFC6817]. However, Appendix A.2 in [RFC6817] describes ways to mitigate issues with clock drift. A clock drift compensation method is also implemented in [SCReAM-CPP-implementation].
   Furthermore, the SCReAM implementation resets base delay history when it is determined that clock drift becomes too large. This is achieved by reducing the target bitrate for a few RTTs.

*  Clock skipping: The sender or receiver clock can occasionally skip. Handling of this is implemented in [SCReAM-CPP-implementation].

*  The target bitrate given by SCReAMv2 is the bitrate including the data unit and Forward Error Correction (FEC) overhead. The media encoder SHOULD take this overhead into account when the media bitrate is set. This means that the media coder bitrate SHOULD be computed as:

      media_rate = target_bitrate - data_unit_plus_fec_overhead_bitrate

   It is not necessary to make a 100% perfect compensation for the overhead, as the SCReAM algorithm will inherently compensate for moderate errors. Under-compensating for the overhead has the effect of increasing jitter, while overcompensating will cause the bottleneck link to become underutilized.

*  The link utilization with SCReAMv2 can be lower than 100%. There are several possible reasons for this:

   -  Large variations in frame sizes: Large variations in frame size make SCReAMv2 push down the target_bitrate to give sufficient headroom and avoid queue buildup in the network. It is in general recommended to operate video coders in low latency mode and enable GDR (Gradual Decoding Refresh) if possible to minimize frame size variations.

   -  Link layer properties: Media transport in the 5G uplink typically requires transmitting a scheduling request (SR) to get permission to transmit data. Because transmission of video is frame based, there is a high likelihood that the channel becomes idle between frames (especially with L4S), in which case a new SR/grant exchange is needed. This potentially means that uplink transmission slots are unused, with a lower link utilization as a result.
*  Packet pacing is recommended; it is, however, possible to operate SCReAMv2 with packet pacing disabled. The code in [SCReAM-CPP-implementation] implements additional mechanisms to achieve a high link utilization when packet pacing is disabled.

*  Feedback issues: RTCP feedback packets [RFC8888] can be lost, which means that the loss detection in SCReAMv2 may trigger even though packets arrive safely on the receiver side. [SCReAM-CPP-implementation] solves this by using overlapping RTCP feedback, i.e., RTCP feedback is transmitted at least every 16th packet, and each RTCP feedback spans the last 32 received packets. This, however, creates unnecessary overhead. [RFC3550] RR (Receiver Reports) can possibly be another solution to achieve better robustness with less overhead. QUIC [RFC9000] overcomes this issue because of its inherent design.

*  SCReAM has over time been evaluated in a number of different experiments; a few examples are found in [SCReAM-evaluation-L4S].

7.  Algorithm changes

The algorithm has changed quite considerably since [RFC8298]. The main changes are:

*  L4S support added. The L4S algorithm has many similarities with the DCTCP and Prague congestion control but has a few extra modifications to make it work well with periodic sources such as video.

*  The delay based congestion control is changed to implement a pseudo-L4S approach; this simplifies the delay based congestion control.

*  The fast increase mode is removed. The reference window additive increase is replaced with an adaptive multiplicative increase to enhance convergence speed.

*  The algorithm is more rate based than self-clocked. The calculated reference window is used mainly to calculate proper media bitrates. Bytes in flight is, however, allowed to exceed the reference window.

*  The media bitrate calculation is dramatically changed and simplified.
  In practice, this manifests as a relatively simple relation between
  the reference window and the RTT.

* Additional compensation is added to make SCReAMv2 handle cases such
  as large changing frame sizes.

Algorithm changes since the last draft version -01 are:

* The slowdown of reference window growth when close to the last
  known max is disabled when L4S is active. This makes SCReAM adhere
  more closely to 2 marked packets per RTT at steady state.

* Reference window decrease and increase are reduced by up to 50%
  when ref_wnd/mss is small. This reduces rate oscillations.

* The downward target bitrate adjustment applied when ref_wnd/mss is
  small is modified so that it only helps to prevent the data unit
  queue from growing excessively in certain low bitrate cases.

* Timing is set in multiples of RTTs instead of seconds.

8. IANA Considerations

This document does not require any IANA actions.

9. Security Considerations

The feedback can be vulnerable to attacks similar to those that can
affect TCP. It is therefore RECOMMENDED that the RTCP feedback is at
least integrity protected. Furthermore, as SCReAM/SCReAMv2 is
self-clocked, a malicious middlebox can drop RTCP feedback packets and
thus cause the self-clocking to stall. However, this attack is
mitigated by the minimum send rate maintained by SCReAM/SCReAMv2 when
no feedback is received.

10. References

10.1. Normative References

[RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
           of Explicit Congestion Notification (ECN) to IP",
           RFC 3168, DOI 10.17487/RFC3168, September 2001.

[RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
           Jacobson, "RTP: A Transport Protocol for Real-Time
           Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
           July 2003.

[RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
           "Extended RTP Profile for Real-time Transport Control
           Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
           DOI 10.17487/RFC4585, July 2006.
[RFC5506]  Johansson, I. and M. Westerlund, "Support for Reduced-Size
           Real-Time Transport Control Protocol (RTCP): Opportunities
           and Consequences", RFC 5506, DOI 10.17487/RFC5506,
           April 2009.

[RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
           "Computing TCP's Retransmission Timer", RFC 6298,
           DOI 10.17487/RFC6298, June 2011.

[RFC6817]  Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind,
           "Low Extra Delay Background Transport (LEDBAT)", RFC 6817,
           DOI 10.17487/RFC6817, December 2012.

[RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
           2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
           May 2017.

[RFC9330]  Briscoe, B., Ed., De Schepper, K., Bagnulo, M., and G.
           White, "Low Latency, Low Loss, and Scalable Throughput
           (L4S) Internet Service: Architecture", RFC 9330,
           DOI 10.17487/RFC9330, January 2023.

10.2. Informative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119,
           DOI 10.17487/RFC2119, March 1997.

[RFC7478]  Holmberg, C., Hakansson, S., and G. Eriksson, "Web
           Real-Time Communication Use Cases and Requirements",
           RFC 7478, DOI 10.17487/RFC7478, March 2015.

[RFC8298]  Johansson, I. and Z. Sarker, "Self-Clocked Rate Adaptation
           for Multimedia", RFC 8298, DOI 10.17487/RFC8298,
           December 2017.

[RFC8511]  Khademi, N., Welzl, M., Armitage, G., and G. Fairhurst,
           "TCP Alternative Backoff with ECN (ABE)", RFC 8511,
           DOI 10.17487/RFC8511, December 2018.

[RFC8699]  Islam, S., Welzl, M., and S. Gjessing, "Coupled Congestion
           Control for RTP Media", RFC 8699, DOI 10.17487/RFC8699,
           January 2020.

[RFC8869]  Sarker, Z., Zhu, X., and J. Fu, "Evaluation Test Cases for
           Interactive Real-Time Media over Wireless Networks",
           RFC 8869, DOI 10.17487/RFC8869, January 2021.

[RFC8985]  Cheng, Y., Cardwell, N., Dukkipati, N., and P. Jha, "The
           RACK-TLP Loss Detection Algorithm for TCP", RFC 8985,
           DOI 10.17487/RFC8985, February 2021.

[RFC8257]  Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L.,
           and G. Judd, "Data Center TCP (DCTCP): TCP Congestion
           Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257,
           October 2017.

[RFC8888]  Sarker, Z., Perkins, C., Singh, V., and M. Ramalho, "RTP
           Control Protocol (RTCP) Feedback for Congestion Control",
           RFC 8888, DOI 10.17487/RFC8888, January 2021.

[RFC9000]  Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
           Multiplexed and Secure Transport", RFC 9000,
           DOI 10.17487/RFC9000, May 2021.

[RFC9332]  De Schepper, K., Briscoe, B., Ed., and G. White,
           "Dual-Queue Coupled Active Queue Management (AQM) for Low
           Latency, Low Loss, and Scalable Throughput (L4S)",
           RFC 9332, DOI 10.17487/RFC9332, January 2023.

[Packet-conservation]
           Jacobson, V., "Congestion Avoidance and Control", ACM
           SIGCOMM Computer Communication Review,
           DOI 10.1145/52325.52356, August 1988.

[LEDBAT-delay-impact]
           Ros, D. and M. Welzl, "Assessing LEDBAT's Delay Impact",
           IEEE Communications Letters, Vol. 17, No. 5,
           DOI 10.1109/LCOMM.2013.040213.130137, May 2013.

[QoS-3GPP]
           "Policy and charging control architecture", 3GPP TS
           23.203, July 2017.

[SCReAM-CPP-implementation]
           Ericsson Research, "SCReAM - Mobile optimised congestion
           control algorithm", n.d.

[SCReAM-evaluation-L4S]
           Ericsson Research, "SCReAM - evaluations with L4S", n.d.

[TFWC]     Choi, S. and M. Handley, "Fairer TCP-Friendly Congestion
           Control Protocol for Multimedia Streaming Applications",
           DOI 10.1145/1364654.1364717, December 2007.

Appendix A. Acknowledgments

Zaheduzzaman Sarker was a co-author of RFC 8298, the previous version
of SCReAM, on which this document is based. We would like to thank the
following people for their comments, questions, and support during the
work that led to this memo: Mirja Kuehlewind.
Authors' Addresses

Ingemar Johansson
Ericsson
Email: ingemar.s.johansson@ericsson.com

Magnus Westerlund
Ericsson
Email: magnus.westerlund@ericsson.com