Workgroup: Network Working Group
Internet-Draft: draft-kerr-everybodys-shuffling-00
Published: November 2024
Intended Status: Informational
Expires: 8 May 2025
Author: Shane Kerr, IBM

DNS Servers MUST Shuffle Answers

Abstract

DNS Resource Records (RRs) are often used in the order in which they are returned. Therefore, if an RRset contains more than one RR, the RRs should be returned in random order to reduce bias in how the answers are used.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 8 May 2025.

1. Introduction

When DNS servers return answers in a consistent order, applications use the answers that appear first more often. To avoid this, the order of RRs within an RRset should be randomized.

1.1. Requirements Notation

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2. Recommendations

2.1. Authoritative Servers

Authoritative DNS servers SHOULD return the RRs within an RRset in random order. They MAY use a fixed order for records returned for delegations, such as DS records, NS records, and associated glue records.

2.2. Recursive Resolvers

DNS recursive resolvers MUST return the RRs within an RRset in random order.

2.3. Stub Resolvers

DNS stub resolvers SHOULD return the RRs within an RRset in random order.

2.4. Applications

Applications SHOULD use the RRs within an RRset in random order.

3. Discussion

RRsets are sets in the mathematical sense, meaning that they are unordered and that a given value can appear in the set only once. However, when an RRset is represented in presentation format (how it appears in zone files) or in message format (also known as "wire format"), the RRs have an order.

For example, a DNS answer may have:

foo.example.    60  IN  AAAA 2001:db8::aaaa
foo.example.    60  IN  AAAA 2001:db8::bbbb

This is exactly the same RRset as:

foo.example.    60  IN  AAAA 2001:db8::bbbb
foo.example.    60  IN  AAAA 2001:db8::aaaa

However, an application may treat the second differently than the first, usually by sending traffic to 2001:db8::bbbb instead of 2001:db8::aaaa.

A common pattern is to publish multiple addresses in the DNS to provide simple load balancing for a service. However, this only works if the addresses are all used at similar rates.
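As a rough illustration of this point, the following Python sketch models a naive client that always uses the first address in an answer, and counts how often each address is chosen with and without shuffling. The names `RRSET`, `naive_pick`, and `simulate` are invented for this example and do not correspond to any DNS implementation.

```python
import random
from collections import Counter

# The example RRset from above, modelled as a list of address strings.
RRSET = ["2001:db8::aaaa", "2001:db8::bbbb"]


def naive_pick(answer):
    """Model a naive application: always use the first address returned."""
    return answer[0]


def simulate(queries, shuffle, rng):
    """Count how often each address is picked over a number of queries."""
    counts = Counter()
    for _ in range(queries):
        answer = list(RRSET)  # copy, so the stored order is never modified
        if shuffle:
            rng.shuffle(answer)
        counts[naive_pick(answer)] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random()
    print("fixed order:   ", dict(simulate(10000, False, rng)))
    print("shuffled order:", dict(simulate(10000, True, rng)))
```

With a fixed order, every query in this model lands on the first address. With shuffling, the two addresses are each picked roughly half of the time.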

3.1. Caching

The most important place to randomize results is in components that cache RRsets. Cached RRsets are used many times, so any bias in their order is amplified if they are not randomized. Resolvers cache RRsets, so it is very important that they randomize the order of RRs in replies.

RRsets may also be cached in other places, such as in stub resolvers. Stub resolvers can reside in several places, including the operating system or software libraries. If they cache RRsets, then they should also randomize.
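One way a caching component might do this is to keep the stored RRset untouched and shuffle a copy for every reply. The sketch below is only an assumption about how such a cache could look; the `ShufflingCache` class, its method names, and the injectable `clock` parameter are hypothetical and are not taken from any existing resolver.

```python
import random
import time


class ShufflingCache:
    """Toy RRset cache that returns a freshly shuffled copy on every hit."""

    def __init__(self, rng=None, clock=time.monotonic):
        self._entries = {}  # (name, rrtype) -> (expiry time, list of rdata)
        self._rng = rng if rng is not None else random.Random()
        self._clock = clock

    def store(self, name, rrtype, ttl, rdatas):
        """Cache an RRset for ttl seconds."""
        self._entries[(name, rrtype)] = (self._clock() + ttl, list(rdatas))

    def lookup(self, name, rrtype):
        """Return the cached RRset in random order, or None on a miss."""
        entry = self._entries.get((name, rrtype))
        if entry is None:
            return None
        expiry, rdatas = entry
        if self._clock() >= expiry:
            # The TTL has run out, so drop the entry and report a miss.
            del self._entries[(name, rrtype)]
            return None
        answer = list(rdatas)      # never reorder the stored copy in place
        self._rng.shuffle(answer)  # full shuffle, not a rotation
        return answer
```

Shuffling a copy at lookup time, rather than once at store time, means that each reply served from the same cache entry can carry a different order.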

Applications themselves may also "cache" RRs. This may be actual caching of RRsets using the TTL from the DNS protocol, or it may be as simple as the application doing a DNS lookup once and then using the result until it exits. Whether or not an application stores results is less important than how the application uses the results. A sophisticated application may use several of the IP addresses for a given name concurrently, for example using Happy Eyeballs [RFC8305]. However, a naive application may simply pick the first IP address that it gets back from the DNS; such applications will benefit from using a resolver that randomizes answers.

3.2. Response Chains

DNS responses mostly originate at authoritative servers, and then proceed to recursive resolvers, and then make their way to applications.

If applications use results in the order returned, then resolvers can help by randomizing their results. Since they return answers from cache, resolvers are probably the most critical place to do so.

However, not all resolvers randomize the order of results, so it is helpful for authoritative servers to randomize the order as well.

3.3. Possible Issues

3.3.1. Shuffling versus Rotating

One approach to changing the order is to rotate through the answers. For example, with these answers:

bar.example.    60  CNAME a0.example.
bar.example.    60  CNAME b1.example.
bar.example.    60  CNAME c2.example.

A server could return a0/b1/c2, then b1/c2/a0, then c2/a0/b1. The problem with this is that if one of the values does not work, a resolver going in order will just go to the next one, so the failed server's load shifts entirely to its successor in the rotation rather than being split across the working servers.

Any software returning in a random order SHOULD fully shuffle the results rather than just rotating through them.
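The difference can be made concrete with a small model. The Python sketch below assumes a client that walks the returned list in order until it finds a working target, and compares the resulting load under rotation with the load averaged over every possible permutation (the expected outcome of a uniform shuffle). The function names are invented for this illustration.

```python
import itertools
from collections import Counter

SERVERS = ["a0", "b1", "c2"]


def first_working(order, down):
    """Model a client that walks the list in order until a target works."""
    for server in order:
        if server not in down:
            return server
    return None


def rotated_load(down):
    """Load per server when answers are rotated through each offset in turn."""
    counts = Counter()
    for offset in range(len(SERVERS)):
        order = SERVERS[offset:] + SERVERS[:offset]
        counts[first_working(order, down)] += 1
    return counts


def shuffled_load(down):
    """Load per server over every permutation, as a full shuffle would give."""
    counts = Counter()
    for order in itertools.permutations(SERVERS):
        counts[first_working(order, down)] += 1
    return counts


if __name__ == "__main__":
    print("rotated: ", dict(rotated_load({"a0"})))
    print("shuffled:", dict(shuffled_load({"a0"})))
```

In this model, with a0 unavailable, rotation sends two out of every three clients to b1 and only one to c2, whereas the full shuffle splits clients evenly between b1 and c2.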

3.4. Answers from Authoritative Servers versus Answers from Cache

One approach is for a recursive resolver to shuffle answers only when they are taken from cache. This is mostly reasonable, since we expect that the majority of answers are read from cache. However, for RRsets with a low TTL or a low query rate, the resolver will often return answers that could not be read from the cache. This may result in a significant portion of answers being returned in the same order.
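A resolver could avoid this gap by shuffling at the point where the reply is built, regardless of where the RRset came from. The following Python sketch shows the idea under simplifying assumptions: the cache is a plain dictionary with no TTL handling, and `resolve` and `fetch_upstream` are hypothetical names standing in for a resolver's real lookup path.

```python
import random


def resolve(name, rrtype, cache, fetch_upstream, rng):
    """Answer a query, shuffling whether or not the RRset came from cache."""
    rdatas = cache.get((name, rrtype))
    if rdatas is None:
        # Cache miss: ask the authoritative side, then remember the result.
        rdatas = list(fetch_upstream(name, rrtype))
        cache[(name, rrtype)] = rdatas
    answer = list(rdatas)  # leave the cached order untouched
    rng.shuffle(answer)    # applied on the miss path as well as the hit path
    return answer
```

Because the shuffle happens after the hit-or-miss decision, an RRset that is fetched fresh for every query is still returned in a varying order.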

3.5. Motivations for Ordered Results

The main motivation for returning ordered results is probably to save CPU time on the server. A server may pre-build or remember the answer and have most of it ready for replies, whereas shuffling answers means having to build the response for every query.

In some cases, a server may deliberately wish to bias traffic. This is possible when supported, for example by using the "weight" field of the SRV record type [RFC2782].

However, if using a record type that does not have something like "weight", there is no guarantee that traffic will prefer the first answers returned. In other words, sending results in a specific order cannot help, but it can do harm.
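For comparison, the sketch below shows roughly how a client might honor the SRV "weight" field when ordering targets of equal priority. It loosely follows the running-sum selection described in [RFC2782], but it is an illustration only: records are simplified to (weight, target) pairs, priority handling is omitted, and the function name `order_by_weight` is invented here.

```python
import random


def order_by_weight(records, rng):
    """Order (weight, target) pairs of equal priority by weighted selection."""
    # Place weight-0 records first, keeping the others in their given order.
    remaining = sorted(records, key=lambda record: record[0] != 0)
    ordered = []
    while remaining:
        total = sum(weight for weight, _ in remaining)
        pick = rng.randint(0, total)  # uniform, inclusive of both ends
        running = 0
        for index, (weight, _) in enumerate(remaining):
            running += weight
            if running >= pick:
                # First record whose running sum reaches the random pick.
                ordered.append(remaining.pop(index))
                break
    return ordered
```

Here the bias is expressed in the record data and applied by the client, so it does not depend on the order in which the server happened to return the records.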

4. Security Considerations

The algorithm chosen to randomize the order of results does not have to be cryptographically secure.

5. IANA Considerations

None.

6. References

6.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/rfc/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/rfc/rfc8174>.

6.2. Informative References

[RFC2782]
Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for specifying the location of services (DNS SRV)", RFC 2782, DOI 10.17487/RFC2782, <https://www.rfc-editor.org/rfc/rfc2782>.
[RFC8305]
Schinazi, D. and T. Pauly, "Happy Eyeballs Version 2: Better Connectivity Using Concurrency", RFC 8305, DOI 10.17487/RFC8305, <https://www.rfc-editor.org/rfc/rfc8305>.

Author's Address

Shane Kerr
IBM
Johan Huizingalaan 765
1066 VH Amsterdam
Netherlands