DNS 101: Round Robin (Or Back When I Was Young And Foolish Part II)

I learned something today. It's something that made me feel stupid for not knowing. Something that seemed elemental and trivial - yet, I did not know it. So please, allow me to relay my somewhat embarrassing learning experience in the hopes that it will save someone else from the same embarrassment.

I did know what DNS round robin was. Or at least, I would have said that I did.

Imagine you configure DNS1, as a DNS server, to use round robin. Then, you create 3 host (A or AAAA) records for the same host name, using different IPs. Let's say we create the following A records on DNS1:

server01 - A 10.0.0.4
server01 - A 10.0.0.5
server01 - A 10.0.0.6

Then on a workstation which is configured to use DNS1 as its DNS server, you ping server01. You receive a reply from 10.0.0.4. You ping server01 again. With no hesitation, you get a reply from 10.0.0.4 again. We assume that the workstation has cached 10.0.0.4 locally and will reuse that IP for server01 until the entry expires or we flush the DNS cache on the workstation with a command like ipconfig /flushdns.

I run ipconfig /flushdns. Then I ping server01 again.

This time I receive a response from 10.0.0.5. Now I assume DNS round robin is working perfectly. I go home for the day feeling like I know everything there is to know about DNS.

But is the DNS server really answering each DNS query it receives with only the single next A/AAAA record it has on file, in sequential round-robin fashion? That is what I assumed.

But the fact of the matter is that DNS servers, when queried for a host name, actually return a list of all A/AAAA records associated with that host name, every time that host name is queried. (To a point - the list must fit within a UDP packet. Classic DNS caps a UDP message at 512 bytes, and although EDNS0 allows larger messages, some firewalls/filters still don't let DNS UDP packets longer than 512 bytes through. That's changing though. Our idea of how big data is and should be allowed to be is always growing.)
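To put a rough number on that 512-byte ceiling, here is a minimal Python sketch of the arithmetic. It assumes a plain response with one question, only A records in the answer section, name compression on every answer (so each A record costs 16 bytes), and no authority, additional or EDNS records. The function names are made up for illustration.

```python
HEADER_BYTES = 12      # fixed DNS message header
A_RECORD_BYTES = 16    # 2 (compressed name pointer) + 2 (type) + 2 (class)
                       # + 4 (TTL) + 2 (rdlength) + 4 (IPv4 address)


def encoded_name_length(name):
    """Bytes needed for a name in wire format: a length byte per label, plus the root byte."""
    labels = name.rstrip(".").split(".")
    return sum(1 + len(label) for label in labels) + 1


def response_size(name, a_record_count):
    """Estimated size of a response carrying a_record_count A records for name."""
    question = encoded_name_length(name) + 4  # name + type + class
    return HEADER_BYTES + question + A_RECORD_BYTES * a_record_count


def max_a_records(name, limit=512):
    """How many A records fit under the given message size limit."""
    fixed = HEADER_BYTES + encoded_name_length(name) + 4
    return (limit - fixed) // A_RECORD_BYTES
```

Under those assumptions, response_size("www.l.google.com", 5) comes out to 114 bytes, which agrees with the "MSG SIZE rcvd: 114" line in the dig output quoted in the comments below, and max_a_records("www.l.google.com") comes out to 29.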

I assume that www.google.com, being one of the busiest websites in the world, has not only some global load balancing and other advanced load balancing techniques employed, but probably also has more than one host record associated with it. To test my theory, I fire up Wireshark and start a packet capture. I then flush my local DNS cache with ipconfig /flushdns and ping www.google.com.

Notice how I pinged it, got one IP address in response (.148), then flushed my DNS cache, pinged it again and got a different IP address (.144)? But despite what it may look like, that name server is not returning just one A/AAAA record each time I query it:


My workstation is ::9. My workstation's DNS server is ::1. The DNS server is configured to forward DNS requests for zones for which it is not authoritative on to yet another DNS server. So I ask for www.google.com, my DNS server doesn't know, so it forwards the request. The forwardee finally finds out and reports back to my DNS server, which in turn relays back to me a list of all the A records for www.google.com. I get a long list containing not only a mess of A records, but a CNAME thrown in there too, all from a single DNS query! (We're not worried about the subsequent query made for an AAAA record right now. Another post perhaps.)

I was able to replicate this behavior in a sanitary lab environment running a Windows DNS server, using the server01 example I mentioned earlier.

Where round robin comes in is that it rotates the order of the list given to each subsequent client who requests it. Keep in mind that while round robin-ing the A records in your DNS replies does supply a primitive form of load distribution, it's a pretty poor substitute for real load balancing, since if one of the nodes in the list goes down, the DNS server will be none the wiser and will continue handing out the list with the downed node's IP address on it.
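The rotation can be pictured with a small Python simulation. This is only a sketch of the idea, not how any particular DNS server implements it: the class name is hypothetical, and real servers may rotate from a different starting point or randomize the order instead.

```python
from collections import deque


class RoundRobinRecordSet:
    """Hands out the full list of addresses, rotated by one position per query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        # Every query gets the complete list, never a single record.
        response = list(self._addresses)
        # Shift the front address to the back so the next client
        # sees a different address in first position.
        self._addresses.rotate(-1)
        return response


server01 = RoundRobinRecordSet(["10.0.0.4", "10.0.0.5", "10.0.0.6"])
first_client = server01.answer()   # 10.0.0.4 listed first
second_client = server01.answer()  # 10.0.0.5 listed first
```

Note that nothing in this loop knows whether 10.0.0.5 is actually up, which is exactly the weakness described above.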

Lastly, since we know that our client is receiving an entire list of A records for host names which have many IP addresses, what does it actually do with the list? Well, the ping utility doesn't do much. If the first IP address on the list is down, you get a destination unreachable message and that's it. (Leading to a lot of people not realizing they have a whole list of IPs they could try.)

Web browsers, however, have a nifty feature known as "browser retry" or "client retry," where they will continue trying the other IPs in the list until they find a working one. Then they cache the working IP address so that the user does not keep experiencing the same delay in web page loading as they did the first time. Yes, there are exploits concerning this feature, and yes, it's probably a bad idea to rely on it, since browser retry is implemented differently across browsers and operating systems.

It's a relatively new mechanism actually, and people may not believe you if you tell them. To prove it to them, find (or create) a host name which has several bad IPs and one or two good ones. Now telnet to that host name. Even telnet (a modern version from a modern operating system) will use getaddrinfo() instead of gethostbyname(), and if it fails to connect to the first IP, you can watch it continue trying the next IPs in the list.
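For the curious, the retry loop itself is short. Below is a minimal Python sketch of the general idea, not the code of any actual browser or telnet client; the two function names are hypothetical, and the 3-second timeout is an arbitrary choice.

```python
import socket


def connect_first_reachable(addrinfos, timeout=3.0):
    """Try each getaddrinfo() entry in order; return the first connected socket."""
    last_error = None
    for family, socktype, proto, _canonname, sockaddr in addrinfos:
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
            return sock  # the first address that accepts the connection wins
        except OSError as exc:
            last_error = exc  # remember the failure and move on to the next IP
            sock.close()
    if last_error is None:
        raise OSError("no addresses to try")
    raise last_error


def connect_by_name(host, port, timeout=3.0):
    """Resolve host to its full address list, then walk the list."""
    addrinfos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return connect_first_reachable(addrinfos, timeout)
```

Python's own socket.create_connection() already walks the address list in much the same way, so in real code you would likely just call that. The sketch only spells the loop out.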

More info here, here and here. That last link is an MSDN doc on getaddrinfo(). Notice that it does talk about different implementations on different operating systems, and that ppResult is "a pointer to a linked list of one or more addrinfo structures that contains response information about the host."
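If you would rather see that list without a packet capture, Python exposes the same call as socket.getaddrinfo(), which returns a list of tuples instead of a linked list. Here is a minimal sketch; resolve_all is a name I made up, and keep in mind that the operating system's resolver may cache or re-sort the addresses before your program ever sees them.

```python
import socket


def resolve_all(hostname):
    """Return every distinct IP address the resolver hands back for hostname."""
    infos = socket.getaddrinfo(hostname, None, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # sockaddr is (ip, port) for IPv4, (ip, port, flow, scope) for IPv6
        if ip not in addresses:
            addresses.append(ip)
    return addresses


# Example (needs a working resolver, so it is left commented out):
# print(resolve_all("server01"))
```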

Comments (4) -

Colm MacCarthaigh 1/20/2012 2:24:43 PM

In reality, DNS providers use a mixture of round-robin DNS and multi-RR responses. Google uses both;

colmmacc% dig www.l.google.com @ns1.google.com.

; <<>> DiG 9.4.2 <<>> www.l.google.com @ns1.google.com.
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7668
;; flags: qr aa rd; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;www.l.google.com.    IN  A

;; ANSWER SECTION:
www.l.google.com.  300  IN  A  74.125.127.99
www.l.google.com.  300  IN  A  74.125.127.105
www.l.google.com.  300  IN  A  74.125.127.106
www.l.google.com.  300  IN  A  74.125.127.147
www.l.google.com.  300  IN  A  74.125.127.103
www.l.google.com.  300  IN  A  74.125.127.104


That response is an answer containing multiple A records. The order of these may be random, and in general a good recursive name-server will also rotate the order in which it serves this response to clients and stub resolvers. But another query to Google, for the same name:

colmmacc% dig www.l.google.com @ns1.google.com.

; <<>> DiG 9.4.2 <<>> www.l.google.com @ns1.google.com.
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29133
;; flags: qr aa rd; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;www.l.google.com.    IN  A

;; ANSWER SECTION:
www.l.google.com.  300  IN  A  173.194.33.20
www.l.google.com.  300  IN  A  173.194.33.16
www.l.google.com.  300  IN  A  173.194.33.17
www.l.google.com.  300  IN  A  173.194.33.18
www.l.google.com.  300  IN  A  173.194.33.19

;; Query time: 74 msec
;; SERVER: 216.239.32.10#53(216.239.32.10)
;; WHEN: Fri Jan 20 12:20:03 2012
;; MSG SIZE  rcvd: 114


Reveals that Google are also using round-robin responses. A DNS query doesn't necessarily give you all of the potential A records for the name, just a subset.

So it looks like Google return you multiple answers for just the reasons you suggest - fault tolerance via retries - but that Google also have far more endpoints than would fit in a single response, and so also use dns-server-side rotation of answer sets.

Awesome info -- thank you for not only the information, but for taking the time to stop by and comment. Good comments like this encourage me to keep the content frequent and high-quality.

It is important to note that the round-robin happens mostly in the caching DNS server (the one requesting the data on behalf of a client), not on the authoritative server (the DNS server hosting the data), as clients usually do not talk directly to an authoritative DNS server.

The owner of a domain name has very little control over this; she/he can only put multiple records in the DNS zone and hope for the best. Some caching DNS servers can be configured not to do round-robin (like the BIND DNS server with the rrset-order statement), and some DNS servers do not support round robin at all and might sort the responses according to the network topology (to present the "nearest" IP on top).

DNS records that share the same domain name, the same network class (IN for Internet) and the same record type (A - IPv4 address record) form a so-called resource record set (RRSET), which a DNS server will always return complete. A DNS server will not answer with a subset of the RRSET. In the case of Google, there are custom DNS servers that change the content "view" of the zone depending on the query (location ...).

Awesome addition. I like your blog too. It looks deeply technical and like you keep it fairly updated. Hope you don't mind if I add it to my blog roll.
