DNS is one of the most critical pieces of internet infrastructure, quietly translating human-readable domain names into IP addresses billions of times per day.
Most resources online claim DNS uses UDP for resolution, but that’s only part of the story. It uses both UDP and TCP, depending on the situation. Let’s dig deeper…
Understanding DNS Query Patterns
Before exploring the transport protocols, let’s establish the context. DNS operates on a simple request-response model where clients (resolvers) query servers for resource records. A typical DNS query follows this flow:
- Application requests IP for example.com
- Stub resolver checks local cache
- If not cached, query sent to recursive resolver
- Recursive resolver may query root servers, TLD servers, and authoritative servers
- Response travels back through the chain
By the way, if you want to dig deeper, here’s a video of me explaining - how DNS really works.
This seemingly simple process involves multiple network hops, each with different performance and reliability requirements. The choice of transport protocol directly impacts latency, reliability, and resource utilization at each step.
Where UDP Comes In
DNS primarily uses UDP on port 53, and this choice is fundamental to its performance: UDP is connectionless, so a query avoids the three-way handshake that TCP requires. For DNS queries, this means:
- Reduced Latency: A DNS query over UDP requires only two packets (query + response), versus at least seven for TCP (3 for handshake + query + response + at least 2 for connection teardown), and usually more once ACKs are counted
- Lower Resource Consumption: No connection state to maintain on servers (each packet is independent)
Consider a busy DNS server handling 100,000 queries per second. With UDP, each query is stateless and independent. With TCP, the server would have to accept a connection and track its state for every in-flight query, which at this rate means a very large number of concurrent connections, consuming significant memory and file descriptors.
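To make the "two packets" point concrete, here is a minimal sketch that encodes a DNS query by hand using only the standard library. The function name `build_query` and the fixed transaction ID are illustrative choices, not part of any library; the wire layout follows RFC 1035.

```python
import secrets
import struct
from typing import Optional


def build_query(name: str, qtype: int = 1, txid: Optional[int] = None) -> bytes:
    """Encode a one-question DNS query in wire format (RFC 1035, section 4.1)."""
    if txid is None:
        txid = secrets.randbits(16)  # random 16-bit transaction ID
    flags = 0x0100  # standard query, RD (recursion desired) set
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, flags, 1, 0, 0, 0)
    # QNAME: length-prefixed labels, terminated by a zero-length label
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    # QTYPE (1 = A) and QCLASS (1 = IN)
    return header + qname + struct.pack("!HH", qtype, 1)


packet = build_query("example.com", txid=0x1234)
print(len(packet), "bytes")  # 12-byte header + 17-byte question
```

The whole query fits comfortably in one datagram. Sending it is a single `sendto` call on a `socket.SOCK_DGRAM` socket to port 53, and the reply comes back with a single `recvfrom`, with no connection to set up or tear down.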
Cache-Friendly Design
DNS responses are heavily cached at multiple levels (browser, OS, home router or other forwarder, recursive resolver); authoritative servers are the source of the data rather than a cache. UDP’s stateless nature aligns perfectly with this caching strategy:
Client → [Cache Hit] → Immediate Response (0 network hops)
Client → [Cache Miss] → Recursive Resolver → [Cached] → Response (1 hop)
Client → [Cache Miss] → Full Resolution Chain → Response (3-4 hops)
Since most DNS queries result in cache hits, the overhead of TCP connection establishment would be wasteful for the majority of requests.
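The layered lookup above can be sketched with a toy TTL cache. Everything here (`TtlCache`, `lookup`, `fake_upstream`) is a hypothetical stand-in for real browser, OS, and resolver caches, meant only to show that the network is touched on a full miss and not otherwise.

```python
import time


class TtlCache:
    """A tiny TTL cache standing in for a browser, OS, or resolver cache."""

    def __init__(self):
        self._entries = {}

    def get(self, name, now=None):
        now = time.monotonic() if now is None else now
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if now >= expires_at:
            del self._entries[name]  # expired: treat as a miss
            return None
        return address

    def put(self, name, address, ttl, now=None):
        now = time.monotonic() if now is None else now
        self._entries[name] = (address, now + ttl)


def lookup(name, caches, resolve_upstream, ttl=300):
    """Check each cache layer in order; go upstream only on a full miss."""
    for depth, cache in enumerate(caches):
        address = cache.get(name)
        if address is not None:
            return address, depth  # depth 0 is the closest cache
    address = resolve_upstream(name)
    for cache in caches:
        cache.put(name, address, ttl)
    return address, len(caches)


upstream_calls = []

def fake_upstream(name):
    upstream_calls.append(name)  # stands in for a real network query
    return "192.0.2.55"

layers = [TtlCache(), TtlCache()]
first = lookup("example.com", layers, fake_upstream)
second = lookup("example.com", layers, fake_upstream)
print(first, second, len(upstream_calls))
```

The second lookup is answered by the closest cache and never reaches `fake_upstream`, which is the case a per-query TCP handshake would penalize for no benefit.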
TCP, When UDP Isn’t Enough
While UDP handles the majority of DNS traffic, certain scenarios require TCP’s additional capabilities. Let’s understand these use cases…
Response Size Limitations
The most common reason for TCP fallback is response size. UDP has practical limitations:
- Original Limit: 512 bytes (RFC 1035)
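- EDNS0 Extension: Commonly up to 4096 bytes (though many networks limit this, and operators often advertise a smaller size such as 1232 bytes to avoid IP fragmentation)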
- Real-world Constraints: Many middleboxes fragment or drop large UDP packets
When a DNS response exceeds the negotiated UDP size limit, the server sets the “truncated” (TC) bit in the response header, signaling the client to retry over TCP.
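As a rough illustration, checking the TC bit only requires reading the first four bytes of the response. The helper name `is_truncated` and the hand-built sample headers are assumptions for this sketch; the bit position comes from the header layout in RFC 1035.

```python
import struct

TC_MASK = 0x0200  # TC bit within the 16-bit flags word (RFC 1035, section 4.1.1)


def is_truncated(message: bytes) -> bool:
    """Return True if the DNS message header has the TC (truncated) bit set."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes; message is too short")
    _txid, flags = struct.unpack("!HH", message[:4])
    return bool(flags & TC_MASK)


# Two hand-built response headers: QR+RD+RA, without and with TC
normal = struct.pack("!HHHHHH", 0x1234, 0x8180, 1, 1, 0, 0)
truncated = struct.pack("!HHHHHH", 0x1234, 0x8380, 1, 0, 0, 0)
print(is_truncated(normal), is_truncated(truncated))
```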
Common scenarios triggering TCP fallback:
- DNSSEC Responses: Cryptographic signatures significantly increase response size
- Large TXT Records: SPF and DKIM (both used for email authentication) and other text records can be substantial
- Many A/AAAA Records: Popular services with multiple IP addresses
- CNAME Chains: Complex and multiple redirection scenarios
Example of a CNAME chain:
www.example.com → CNAME → site.hosting.com
site.hosting.com → CNAME → cdn.provider.net
cdn.provider.net → A record → 192.0.2.55
So the resolution path is:
www.example.com → site.hosting.com → cdn.provider.net → 192.0.2.55
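Here is a small sketch of following that chain, using an in-memory dictionary in place of real DNS data. `RECORDS` and `follow_cnames` are illustrative names; the depth limit is an assumption added to guard against loops.

```python
# Toy record set mirroring the chain above
RECORDS = {
    "www.example.com": ("CNAME", "site.hosting.com"),
    "site.hosting.com": ("CNAME", "cdn.provider.net"),
    "cdn.provider.net": ("A", "192.0.2.55"),
}


def follow_cnames(name, records, max_depth=8):
    """Follow CNAME records until an A record is reached; return the full path."""
    path = [name]
    for _ in range(max_depth):
        rtype, value = records[name]
        path.append(value)
        if rtype == "A":
            return path
        name = value  # keep following the alias
    raise RuntimeError("CNAME chain too long or looping")


path = follow_cnames("www.example.com", RECORDS)
print(" → ".join(path))
```

Every hop in the path is one more record in the answer section, which is why long chains push responses toward the UDP size limit.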
Zone Transfers
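DNS zone transfer is the process of copying DNS records from one DNS server to another. It comes in two flavours: AXFR (full zone transfer) and IXFR (incremental zone transfer). AXFR runs exclusively over TCP, and IXFR is run over TCP in practice as well (the spec permits UDP for IXFR only when the whole response fits in a single datagram). This makes sense because: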
- Large Data Sets: Zone files can contain thousands of records
- Reliability Requirements: Data integrity is critical for zone synchronization
- Connection-Oriented Nature: Zone transfers are sustained operations, not quick queries
A typical zone transfer might look like:
# Simplified zone transfer flow
secondary_server.connect_tcp(primary_server, port=53)
secondary_server.send_axfr_request("example.com")
primary_server.send_soa_record()
primary_server.send_all_zone_records() # Could be megabytes
primary_server.send_soa_record() # Indicates end
secondary_server.close_connection()
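The SOA bracketing in the flow above is how the secondary knows the transfer is complete. A minimal sketch of that check, operating on an already-parsed stream of `(owner, type, data)` tuples (a simplification; a real client parses these from DNS messages on the TCP stream):

```python
def collect_axfr(record_stream):
    """Collect zone records from a transfer that starts and ends with the SOA."""
    records = []
    for owner, rtype, rdata in record_stream:
        if not records and rtype != "SOA":
            raise ValueError("a transfer must begin with the SOA record")
        if records and rtype == "SOA":
            return records  # closing SOA: the transfer is complete
        records.append((owner, rtype, rdata))
    raise ConnectionError("stream ended before the closing SOA record")


stream = [
    ("example.com", "SOA", "serial 2024010101"),
    ("example.com", "NS", "ns1.example.com"),
    ("www.example.com", "A", "192.0.2.55"),
    ("example.com", "SOA", "serial 2024010101"),
]
zone = collect_axfr(iter(stream))
print(len(zone), "records transferred")
```

If the connection drops before the second SOA arrives, the sketch raises instead of returning a partial zone, which mirrors why a reliable, ordered transport matters here.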
How Fallback Works
How clients handle UDP-to-TCP fallback is worth a closer look. Let’s dig deeper…
Client Behavior
Most DNS resolvers implement a standard fallback pattern:
- Initial UDP Query: Send query with EDNS0 indicating maximum UDP size
- Evaluate Response: Check for TC bit or timeout
- TCP Retry: If needed, establish TCP connection and resend query
- Caching Decision: Cache both the response and the knowledge that this query type requires TCP
Here’s how this might look in pseudocode:
def dns_query(domain, record_type):
    # Try UDP first (the helper functions here are placeholders)
    response = send_udp_query(domain, record_type, max_size=4096)
    # Check for a timeout (None) before reading the TC flag
    if response is None or response.truncated:
        # Fall back to TCP
        response = send_tcp_query(domain, record_type)
        # Remember this domain/type combo needs TCP
        cache_tcp_preference(domain, record_type)
    return response
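One detail the retry has to handle: TCP is a byte stream, so DNS over TCP prefixes every message with a two-byte length field (RFC 1035, section 4.2.2). A minimal sketch of that framing follows; the names `frame` and `unframe` are illustrative.

```python
import struct


def frame(message: bytes) -> bytes:
    """Prefix a DNS message with its 2-byte big-endian length for TCP."""
    if len(message) > 0xFFFF:
        raise ValueError("DNS messages are limited to 65535 bytes")
    return struct.pack("!H", len(message)) + message


def unframe(buffer: bytes):
    """Split received bytes into complete messages plus any leftover bytes."""
    messages = []
    offset = 0
    while len(buffer) - offset >= 2:
        (length,) = struct.unpack_from("!H", buffer, offset)
        if len(buffer) - offset - 2 < length:
            break  # the rest of this message has not arrived yet
        start = offset + 2
        messages.append(buffer[start:start + length])
        offset = start + length
    return messages, buffer[offset:]


# Two complete messages followed by the first 4 bytes of a third
received = frame(b"first") + frame(b"second message") + frame(b"partial")[:4]
messages, leftover = unframe(received)
print(messages, leftover)
```

The same query bytes used over UDP can be reused for the TCP retry; only the length prefix is added.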
Performance Implications
The fallback mechanism is costly but essential.
- Extra Latency: A truncated UDP attempt (1 RTT) plus the TCP retry (handshake + query, 2 RTTs) costs roughly 3x the latency of a plain UDP query for large responses
- Client Complexity: Implementations must handle both protocols correctly
Modern resolvers (on routers, at ISPs, or public DNS services) can reduce this cost by:
- Caching TCP Requirements: Remember which queries need TCP, so the next such query goes straight to TCP instead of trying UDP first.
- Parallel Queries: Send both UDP and TCP queries simultaneously for critical requests
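A sketch of the first optimization, with the transports injected as plain callables so the logic can be exercised without a network. `FallbackClient`, `fake_udp`, and `fake_tcp` are hypothetical names, and responses are modelled as simple dictionaries; a production resolver would also expire these remembered entries.

```python
class FallbackClient:
    """Try UDP first, fall back to TCP, and remember which queries needed TCP."""

    def __init__(self, send_udp, send_tcp):
        self._send_udp = send_udp
        self._send_tcp = send_tcp
        self._needs_tcp = set()

    def query(self, name, rtype):
        key = (name.lower(), rtype)
        if key not in self._needs_tcp:
            response = self._send_udp(name, rtype)
            if response is not None and not response["truncated"]:
                return response, "udp"
            self._needs_tcp.add(key)  # truncated or timed out: remember it
        return self._send_tcp(name, rtype), "tcp"


calls = []

def fake_udp(name, rtype):
    calls.append("udp")
    # Pretend TXT answers are too large for UDP
    return {"truncated": rtype == "TXT", "answer": "192.0.2.55"}

def fake_tcp(name, rtype):
    calls.append("tcp")
    return {"truncated": False, "answer": "v=spf1 -all"}

client = FallbackClient(fake_udp, fake_tcp)
client.query("example.com", "TXT")  # UDP attempt, then TCP retry
client.query("example.com", "TXT")  # goes straight to TCP
print(calls)
```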
Numbers
Latency
Here are rough numbers to provide an estimate for the time it takes to resolve a DNS query over UDP and TCP.
UDP DNS Query (cache miss):
- Connection: 0ms (connectionless)
- Query/Response: ~20ms (network RTT)
- Total: ~20ms
TCP DNS Query (cache miss):
- Connection establishment: ~20ms (1 RTT)
- Query/Response: ~20ms (network RTT)
- Connection teardown: ~0ms (async)
- Total: ~40ms
This 2x latency difference explains why UDP remains the default choice.
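The arithmetic above generalizes to any round-trip time. This toy model assumes one RTT for the TCP handshake, one RTT per query/response exchange, and zero server processing time, all of which are simplifications.

```python
def udp_latency(rtt_ms):
    """One round trip: query out, response back."""
    return rtt_ms


def tcp_latency(rtt_ms):
    """Handshake (1 RTT) plus query/response (1 RTT); teardown is asynchronous."""
    return 2 * rtt_ms


def fallback_latency(rtt_ms):
    """A truncated UDP exchange followed by a full TCP exchange."""
    return udp_latency(rtt_ms) + tcp_latency(rtt_ms)


for rtt in (5, 20, 100):
    print(rtt, udp_latency(rtt), tcp_latency(rtt), fallback_latency(rtt))
```

Under these assumptions a query that falls back from UDP to TCP pays about three round trips, compared with one for a plain UDP query.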
Throughput Analysis
For high-volume DNS servers, the throughput differences are even more pronounced (again, these are rough, order-of-magnitude figures):
UDP Performance:
- Queries/second: 100,000+
- Memory per query: ~1KB (temporary)
- File descriptors: Minimal
TCP Performance:
- Queries/second: 10,000-50,000
- Memory per connection: ~8KB minimum
- File descriptors: 1 per connection
These rough numbers illustrate why authoritative DNS servers strongly prefer UDP for routine queries.
Implementation Considerations
When building systems that interact with DNS, understanding the UDP/TCP duality has practical implications.
Fallback Example
Choose DNS libraries that handle fallback gracefully. Here’s an example using the Python package dnspython.
# pip install dnspython
# Library handles UDP/TCP automatically
import dns.resolver

result = dns.resolver.resolve('example.com', 'A')
print(result.response.to_text())

# Explicit protocol control when needed
import dns.exception
import dns.flags
import dns.message
import dns.query

query = dns.message.make_query('example.com', 'A')
try:
    response = dns.query.udp(query, '8.8.8.8', timeout=5)
    # TC flag set: the answer did not fit in the UDP response
    if response.flags & dns.flags.TC:
        response = dns.query.tcp(query, '8.8.8.8', timeout=10)
except dns.exception.Timeout:
    # No UDP answer in time: retry over TCP
    response = dns.query.tcp(query, '8.8.8.8', timeout=10)
print(response.to_text())
Monitoring and Debugging
Monitor both UDP and TCP DNS traffic in production:
- UDP Success Rate: Percentage of queries resolved without TCP fallback
- TCP Fallback Frequency: Indicates large response prevalence
- Response Size Distribution: Helps optimize EDNS0 buffer sizes
- Timeout Patterns: May indicate UDP packet loss requiring TCP retry
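As a sketch of how the first few metrics might be computed, assume each query has already been labelled with one of three outcomes. The labels `udp_ok`, `tcp_fallback`, and `timeout` are hypothetical and would come from your own resolver logs or instrumentation.

```python
from collections import Counter


def summarize(outcomes):
    """Turn a list of per-query outcome labels into rates between 0 and 1."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    if total == 0:
        return {"udp_success_rate": 0.0, "tcp_fallback_rate": 0.0, "timeout_rate": 0.0}
    return {
        "udp_success_rate": counts["udp_ok"] / total,
        "tcp_fallback_rate": counts["tcp_fallback"] / total,
        "timeout_rate": counts["timeout"] / total,
    }


sample = ["udp_ok"] * 8 + ["tcp_fallback"] + ["timeout"]
stats = summarize(sample)
print(stats)
```

A rising fallback rate suggests responses are outgrowing the advertised UDP size, while a rising timeout rate points more toward packet loss or filtering.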
Network Configuration
Ensure network infrastructure supports both protocols:
- Firewall Rules: Allow both UDP and TCP on port 53
- Load Balancer Configuration: Handle both protocols appropriately
- Monitoring Systems: Track both UDP and TCP DNS metrics
Footnotes
DNS uses both UDP and TCP, contrary to the common belief that it only relies on UDP. UDP is preferred for its speed and efficiency in handling the billions of routine queries that keep the internet running, while TCP ensures reliability for larger responses and tasks like zone transfers.
So the next time you see a DNS query timeout or notice varying response times in your applications, remember: behind the scenes, DNS is choosing the best transport for the job, whether that’s the speed and efficiency of UDP or the dependable reliability of TCP.
DNS is actually pretty wild.