Internet-Draft | BFD Encapsulated in Large Packets | January 2025 |
Haas & Fu | Expires 17 July 2025 |
The Bidirectional Forwarding Detection (BFD) protocol is commonly used to verify connectivity between two systems. BFD packets are typically very small. It is desirable in some circumstances to know that not only is the path between two systems reachable, but also that it is capable of carrying a payload of a particular size. This document specifies how to implement such a mechanism using BFD in Asynchronous mode.¶
YANG modules for managing this mechanism are also defined in this document. These YANG modules augment the existing BFD YANG modules defined in RFC 9314. The YANG modules in this document conform to the Network Management Datastore Architecture (NMDA) (RFC 8342).¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 17 July 2025.¶
Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The Bidirectional Forwarding Detection (BFD) [RFC5880] protocol is commonly used to verify connectivity between two systems. However, some applications may require that the Path MTU [RFC1191] between those two systems meets a certain minimum criterion. When the Path MTU decreases below the minimum threshold, those applications may wish to consider the path unusable.¶
BFD may be encapsulated in a number of transport protocols. An example of this is single-hop BFD [RFC5881]. In that case, the link MTU configuration is typically enough to guarantee communication between the two systems for that size MTU. BFD Echo mode (Section 6.4 of [RFC5880]) is sufficient to permit verification of the Path MTU of such directly connected systems. Previous proposals ([I-D.haas-xiao-bfd-echo-path-mtu]) have been made for testing Path MTU for such directly connected systems. However, in the case of multi-hop BFD [RFC5883], this guarantee does not hold.¶
The encapsulation of BFD in multi-hop sessions is a simple UDP packet. The BFD elements of procedure (Section 6.8.6 of [RFC5880]) cover validating the BFD payload. However, the specification is silent on the length of the encapsulation that is carrying the BFD PDU. While it is most common that the transport protocol payload (i.e., UDP) length is the exact size of the BFD PDU, this is not required by the elements of procedure. This leads to the possibility that the transport protocol length may be larger than the contained BFD PDU.¶
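As an illustration of a transport payload that is larger than the BFD PDU it carries, the following Python sketch packs the 24-byte mandatory section of a BFD Control packet (Section 4.1 of [RFC5880]) and appends zero padding. The function names, the discriminator and timer values, and the padded size of 1472 bytes are all hypothetical choices made for this example; authentication is not covered. It is a minimal sketch, not a reference implementation.

```python
import struct

BFD_VERSION = 1
BFD_MIN_LEN = 24  # mandatory section of a BFD Control packet, no authentication


def build_bfd_control(state, detect_mult, my_disc, your_disc,
                      min_tx_us, min_rx_us, min_echo_rx_us=0, diag=0):
    """Pack the 24-byte mandatory section of a BFD Control packet."""
    vers_diag = (BFD_VERSION << 5) | (diag & 0x1F)
    sta_flags = (state & 0x3) << 6  # P, F, C, A, D, M flags all clear
    return struct.pack("!BBBBIIIII", vers_diag, sta_flags, detect_mult,
                       BFD_MIN_LEN, my_disc, your_disc,
                       min_tx_us, min_rx_us, min_echo_rx_us)


def pad_transport_payload(bfd_pdu, padded_size):
    """Return the UDP payload: the BFD PDU followed by zero padding."""
    if padded_size < len(bfd_pdu):
        raise ValueError("padded size is smaller than the BFD PDU")
    return bfd_pdu + bytes(padded_size - len(bfd_pdu))


# Hypothetical session values; state 3 is Up.
pdu = build_bfd_control(state=3, detect_mult=3, my_disc=1, your_disc=2,
                        min_tx_us=300_000, min_rx_us=300_000)
payload = pad_transport_payload(pdu, 1472)
```

The Length field inside the BFD PDU remains 24 in this sketch; only the length of the enclosing transport payload grows.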
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Support for BFD between two systems is typically configured, even if the actual session may be dynamically created by a client protocol. A new BFD variable is defined in this document:¶
The Don't Fragment bit (Section 2.3 of [RFC0791]) of the IP payload, when using IPv4 encapsulation, MUST be set.¶
While this document proposes no change to the BFD protocol, implementations may not permit arbitrarily padded transport PDUs to carry BFD packets. While Section 6 of [RFC5880] warns against excessive pedantry, implementations may not work with this mechanism without additional support.¶
Section 6.8.6 of [RFC5880] discusses the procedures for receiving BFD Control packets. The length of the BFD Control packet is validated to be less than or equal to the payload of the encapsulating protocol. When a receiving implementation is incapable of processing Large BFD Packets, it could manifest in one of two possible ways:¶
In each of these cases, the BFD state machine would behave as if it were not receiving Control packets and the receiving implementation would follow normal BFD procedures regarding not having received control packets.¶
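The length checks discussed above can be sketched in Python as follows. The first function follows only the length-related checks of Section 6.8.6 of [RFC5880] (minimum Length of 24, or 26 when the A bit is set, and Length no greater than the encapsulating payload); the remaining reception checks are omitted. The second function is a hypothetical stricter receiver, shown as one assumed way an implementation might reject padded packets. Both function names are invented for this example.

```python
def accept_lenient(udp_payload):
    """Length checks in the spirit of RFC 5880, Section 6.8.6."""
    if len(udp_payload) < 24:
        return False
    auth_present = bool(udp_payload[1] & 0x04)  # A bit
    bfd_length = udp_payload[3]                 # Length field of the BFD PDU
    min_len = 26 if auth_present else 24
    if bfd_length < min_len:
        return False
    # The BFD PDU may be shorter than the payload that carries it.
    return bfd_length <= len(udp_payload)


def accept_strict(udp_payload):
    """Hypothetical receiver that insists on an exact-size payload."""
    return accept_lenient(udp_payload) and udp_payload[3] == len(udp_payload)


# Version 1, state Up, Detect Mult 3, Length 24, remaining fields zeroed.
exact = bytes([0x20, 0xC0, 3, 24]) + bytes(20)
padded = exact + bytes(1000)
```

Under these assumptions, the lenient receiver accepts both `exact` and `padded`, while the strict receiver discards `padded` and would therefore behave as if no Control packets were arriving.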
If Large BFD Packets is enabled on a session that is already in the Up state and the remote BFD system does not, or cannot, support receiving the padded BFD Control packets, the session will go Down.¶
Since the consideration is path MTU, BFD sessions using this feature only need to use a value of bfd.PaddedPduSize large enough to exercise the path MTU for the desired application. This may be significantly smaller than the system's link MTU; e.g., the desired path MTU is 1512 bytes while the MTU of the interface that BFD with Large Packets is running on is 9000 bytes.¶
In the case where multiple BFD clients desire to test the same BFD endpoints using different bfd.PaddedPduSize parameters, implementations SHOULD select the largest bfd.PaddedPduSize parameter from the configured sessions. This is similar to how implementations of BFD select the most aggressive timing parameters for multiple sessions to the same endpoint. Failure to select the largest size will result in BFD sessions going to the Up state while the MTU requirements of some dependent applications are not satisfied.¶
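The selection rule for multiple clients can be sketched as taking the maximum of the requested sizes. The client names, the sizes, and the use of None for a client that does not request padding are assumptions made for this example.

```python
def effective_padded_pdu_size(requested_sizes):
    """Pick the padded size for a shared session: the largest one requested.

    Clients that do not request padding are represented by None and ignored.
    """
    sizes = [size for size in requested_sizes if size is not None]
    return max(sizes) if sizes else None


# Hypothetical clients sharing one BFD session to the same endpoint.
clients = {"client-a": 1500, "client-b": 4000, "client-c": None}
selected = effective_padded_pdu_size(clients.values())
```

Here `selected` is 4000, so the session exercises the path at the size needed by the most demanding client.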
The accepted MTU for an interface is impacted by packet encapsulation considerations at a given layer; e.g., layer 2, layer 3, tunnel, etc. A common misconfiguration of interface parameters is inconsistent MTU. In the presence of inconsistent MTU, it is possible for applications to have unidirectional connectivity.¶
When an application using BFD with Large Packets needs to test the bi-directional Path MTU, the bfd.PaddedPduSize parameter needs to be configured on each side of the BFD session. For example, to verify a 1500-byte MTU in both directions on an Ethernet or point-to-point link, each side of the BFD session must have bfd.PaddedPduSize set to 1500. In the absence of such consistent configuration, BFD with Large Packets may correctly determine unidirectional connectivity at the tested MTU, but bi-directional MTU may not be properly validated.¶
It should be noted that some interfaces may intentionally have different MTUs. Setting the bfd.PaddedPduSize appropriately for each side of the BFD session supports such scenarios.¶
Once a BFD session using Large Packets has reached the Up state, connectivity at the tested MTU(s) for the session is being validated. If the path MTU falls below the MTU tested by the BFD with Large Packets session, the BFD session will go Down.¶
In the opposite circumstance where the path MTU increases, the BFD session will continue without being impacted. BFD for Large Packets only ensures that the minimally acceptable MTU for the session can be used.¶
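The behavior described above can be illustrated with a deliberately simplified Python model. It assumes that a padded packet is delivered only when its size does not exceed the path MTU at that moment (that is, it is dropped rather than fragmented), and it approximates the BFD detection time as a run of Detect Mult consecutive lost packets. The function names and all numeric values are hypothetical.

```python
def packets_delivered(path_mtus, padded_packet_size):
    """For each transmit interval, report whether the padded packet fits."""
    return [padded_packet_size <= mtu for mtu in path_mtus]


def session_goes_down(delivered, detect_mult):
    """True once detect_mult consecutive packets have been lost."""
    missed = 0
    for ok in delivered:
        missed = 0 if ok else missed + 1
        if missed >= detect_mult:
            return True
    return False


# Path MTU drops below the tested size and stays there: the session goes Down.
mtu_drop = packets_delivered([9000, 9000, 1400, 1400, 1400], 1500)

# Path MTU increases: every packet still fits, so the session is unaffected.
mtu_rise = packets_delivered([1500, 9000, 9000], 1500)
```

With a Detect Mult of 3, `session_goes_down(mtu_drop, 3)` is true and `session_goes_down(mtu_rise, 3)` is false in this model.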
Various mechanisms are utilized to increase throughput between two endpoints at various network layers. Such features include Link Aggregation Groups (LAGs) and Equal-Cost Multipath (ECMP) forwarding. Such mechanisms balance traffic across multiple physical links while hiding the details of that balancing from the higher networking layers. The details of that balancing are highly implementation specific.¶
In the presence of such load balancing mechanisms, it is possible to have member links that are not properly forwarding traffic. Traffic that is load balanced onto those member links will be dropped.¶
Such load balancing mechanisms may not permit all link members to be properly tested by BFD. This is because the BFD Control packets may be forwarded only along links that are up. BFD on LAG, [RFC7130], was developed to help cover one such scenario. However, for testing forwarding over multiple hops, there is no such specified general purpose BFD mechanism for exercising all links in an ECMP. This may result in a BFD session being in the Up state while some traffic may be dropped or otherwise negatively impacted along some component links.¶
Some BFD implementations utilize their internal understanding of the component links and their resultant forwarding to exercise BFD in such a way to better test the ECMP members and to tie the BFD session state to the health of that ECMP. Due to the implementation specific load balancing, it is not possible to standardize such additional mechanisms for BFD.¶
Misconfiguration of the MTU on some member links may result in a Path MTU that is inconsistent, depending on how the traffic is load balanced. While the intent of BFD with Large Packets is to verify path MTU, it is subject to the same considerations above.¶
The above text also applies to most, if not all, BFD techniques.¶
This mechanism also can be applied to other forms of BFD, including S-BFD [RFC7880].¶
This YANG module augments the "ietf-bfd" module to add a flag 'padding' to enable this feature. The feature statement 'padding' needs to be enabled to indicate that BFD Encapsulated in Large Packets is supported by the implementation.¶
Further, this YANG module augments the YANG modules for single-hop, multi-hop, LAG, and MPLS to add the "padded-pdu-size" parameter to those session types to configure Large BFD packets.¶
Finally, similar to the grouping "client-cfg-parms" defined in Section 2.1 of [RFC9314], this YANG module defines a grouping "bfd-large-common" that may be utilized by BFD clients using "client-cfg-parms" to uniformly add support for the feature defined in this document.¶
This YANG module imports A YANG Data Model for Routing [RFC8349] and YANG Data Model for Bidirectional Forwarding Detection (BFD) [RFC9314].¶
This document does not change the underlying security considerations of the BFD protocol or its encapsulations.¶
On-path attackers that can selectively drop BFD packets, including those padded to a large size, can cause BFD sessions to go Down.¶
The contents of the padding payload are set to zero. This avoids implementation issues where the local uninitialized data may be leaked.¶
The YANG module specified in this document defines a schema for data that is designed to be accessed via network management protocols such as NETCONF [RFC6241] or RESTCONF [RFC8040]. The lowest NETCONF layer is the secure transport layer, and the mandatory-to-implement secure transport is Secure Shell (SSH) [RFC6242]. The lowest RESTCONF layer is HTTPS, and the mandatory-to-implement secure transport is TLS [RFC8446]. The NETCONF Access Control Model (NACM) [RFC8341] provides the means to restrict access for particular NETCONF or RESTCONF users to a preconfigured subset of all available NETCONF or RESTCONF protocol operations and content.¶
There are a number of data nodes defined in this YANG module that are writable/creatable/deletable (i.e., config true, which is the default). These data nodes may be considered sensitive or vulnerable in some network environments. Write operations (e.g., edit-config) to these data nodes without proper protection can have a negative effect on network operations. Some of the subtrees and data nodes and their sensitivity/vulnerability are described here.¶
Some of the readable data nodes in this YANG module may be considered sensitive or vulnerable in some network environments. It is thus important to control read access (e.g., via get, get-config, or notification) to these data nodes.¶
There are no read-only data nodes defined in this model.¶
Some of the RPC operations in this YANG module may be considered sensitive or vulnerable in some network environments. It is thus important to control access to these operations.¶
There are no RPC operations defined in this model.¶
This document registers one URI in the "ns" subregistry of the "IETF XML" registry [RFC3688]. Following the format in [RFC3688], the following registration is requested:¶
This document registers one YANG module in the "YANG Module Names" registry [RFC6020]. Following the format in [RFC6020], the following registration is requested:¶
The authors would like to thank Les Ginsberg, Mahesh Jethanandani, Robert Raszuk, and Ketan Talaulikar for their valuable feedback on this proposal.¶