Why does DNS use both UDP and TCP

Arpit Bhayani

curious, tinkerer, and explorer


DNS is one of the most critical pieces of internet infrastructure, quietly translating human-readable domain names into IP addresses billions of times per day.

Most resources online claim that DNS uses UDP for resolution, but that’s only part of the picture. DNS uses both UDP and TCP, depending on the situation. Let’s dig deeper…

Understanding DNS Query Patterns

Before exploring the transport protocols, let’s establish the context. DNS operates on a simple request-response model where clients (resolvers) query servers for resource records. A typical DNS query follows this flow:

  1. Application requests IP for example.com
  2. Stub resolver checks local cache
  3. If not cached, query sent to recursive resolver
  4. Recursive resolver may query root servers, TLD servers, and authoritative servers
  5. Response travels back through the chain
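
The flow above can be sketched with a toy in-memory model. This is only an illustration: the `SERVERS` table, the server names, and the address `192.0.2.10` are made up, and real resolvers exchange DNS messages over the network rather than reading dictionaries.

```python
# Hypothetical in-memory "servers": each maps a zone suffix to either a
# referral (the next server to ask) or a final answer.
SERVERS = {
    "root": {"com.": ("referral", "tld-com")},
    "tld-com": {"example.com.": ("referral", "ns-example")},
    "ns-example": {"example.com.": ("answer", "192.0.2.10")},
}


def recursive_resolve(name, cache):
    """Return (address, servers_asked) for a fully qualified name."""
    if name in cache:  # step 2: cache hit, nothing is sent
        return cache[name], []

    server, asked = "root", []
    while True:
        asked.append(server)
        # Naive suffix match: pick the longest zone this server knows about.
        zones = [zone for zone in SERVERS[server] if name.endswith(zone)]
        if not zones:
            raise LookupError(f"{server} has no data for {name}")
        kind, value = SERVERS[server][max(zones, key=len)]
        if kind == "answer":  # step 5: the answer travels back and is cached
            cache[name] = value
            return value, asked
        server = value  # step 4: follow the referral to the next server


cache = {}
print(recursive_resolve("example.com.", cache))  # asks root, TLD, authoritative
print(recursive_resolve("example.com.", cache))  # served from cache
```

The second call returns an empty list of servers asked, which is the cache-hit path from step 2.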

By the way, if you want to dig deeper, here’s a video of me explaining - how DNS really works.

This seemingly simple process involves multiple network hops, each with different performance and reliability requirements. The choice of transport protocol directly impacts latency, reliability, and resource utilization at each step.

Where UDP Comes In

DNS primarily uses UDP on port 53, and this choice is fundamental to its performance. UDP is connectionless, so a query avoids the three-way handshake that TCP requires. For DNS queries, this means:

  • Reduced Latency: A DNS query over UDP needs only two packets (query + response), whereas TCP needs at least seven (3 for the handshake + query + response + at least 2 for teardown), and usually more once ACKs and a full FIN exchange are counted
  • Lower Resource Consumption: No connection state to maintain on servers (each packet is independent)

Consider a busy DNS server handling 100,000 queries per second. With UDP, each query is stateless and independent. With TCP, the server would have to accept and tear down up to 100,000 connections per second (unless clients reuse connections) and hold state for every connection open at that moment, consuming significant memory and file descriptors.
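
To make the "one small datagram" point concrete, here is a minimal sketch that encodes a standard query by hand in the RFC 1035 wire format. The function name `build_query` and the fixed query ID are assumptions for illustration; real clients pick a random ID.

```python
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Encode a standard DNS query (RFC 1035 wire format) as bytes."""
    # Header: ID, flags (0x0100 sets only the RD bit), QDCOUNT=1,
    # ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: every label is length-prefixed; a zero byte ends the name
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A) followed by QCLASS (1 = IN)
    return header + qname + struct.pack("!HH", qtype, 1)


packet = build_query("example.com")
print(len(packet))  # 29
```

A 29-byte payload fits comfortably in a single UDP datagram, so the whole exchange can be one packet out and one packet back.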

Cache-Friendly Design

DNS responses are heavily cached at multiple levels (browser, OS, home router or other forwarder, recursive resolver). Authoritative servers sit at the end of the chain and hold the source data rather than a cache. UDP’s stateless nature fits this layered caching well:

Client → [Cache Hit] → Immediate Response (0 network hops)
Client → [Cache Miss] → Recursive Resolver → [Cached] → Response (1 hop)
Client → [Cache Miss] → Full Resolution Chain → Response (3-4 hops)

Since most lookups are answered from a cache and never reach the network, and the rest are usually small single round-trip exchanges, paying for TCP connection establishment on every query would be wasteful.
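
Each caching layer boils down to a lookup table whose entries expire after the record’s TTL. The sketch below is a minimal illustration; the `TTLCache` class and its method names are hypothetical and not taken from any real resolver.

```python
import time


class TTLCache:
    """Minimal DNS-style cache: entries expire once their TTL has elapsed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # (name, rtype) -> (expires_at, value)

    def put(self, name, rtype, value, ttl):
        self._entries[(name, rtype)] = (self._clock() + ttl, value)

    def get(self, name, rtype):
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None          # cache miss: caller must query upstream
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[(name, rtype)]
            return None          # expired: treat as a miss
        return value


cache = TTLCache()
cache.put("example.com", "A", "192.0.2.10", ttl=60)
print(cache.get("example.com", "A"))
```

A hit returns immediately with no packets sent, which is the "0 network hops" case above.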

TCP, When UDP Isn’t Enough

While UDP handles the majority of DNS traffic, certain scenarios require TCP’s additional capabilities. Let’s understand these use cases…

Response Size Limitations

The most common reason for TCP fallback is response size. UDP has practical limitations:

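  • Original Limit: 512 bytes (RFC 1035)
  • EDNS0 Extension (RFC 6891): The client advertises a larger UDP buffer, historically 4096 bytes in many implementations; DNS Flag Day 2020 recommended about 1232 bytes to avoid IP fragmentation
  • Real-world Constraints: Many middleboxes drop fragmented or large UDP packets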

When a DNS response exceeds the negotiated UDP size limit, the server sets the “truncated” (TC) bit in the response header, signaling the client to retry over TCP.
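
Checking for truncation only requires reading the 12-byte header. Here is a minimal sketch that tests the TC bit on a raw response; the function name `is_truncated` is an assumption, while the bit position comes from the header layout in RFC 1035.

```python
import struct

TC_MASK = 0x0200  # TC bit within the 16-bit flags field of the DNS header


def is_truncated(message: bytes) -> bool:
    """Return True if the TC bit is set in a raw DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    _query_id, flags = struct.unpack("!HH", message[:4])
    return bool(flags & TC_MASK)


# Hand-built header: ID=1, flags=0x8380 (QR, TC, RD, RA set), all counts zero
truncated_header = struct.pack("!HHHHHH", 1, 0x8380, 0, 0, 0, 0)
print(is_truncated(truncated_header))  # True
```

A client that sees `True` here discards the partial answer and repeats the same question over TCP.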

Common scenarios triggering TCP fallback:

  • DNSSEC Responses: Cryptographic signatures significantly increase response size
  • Large TXT Records: SPF and DKIM (both used for email authentication) and other text records can be substantial
  • Many A/AAAA Records: Popular services with multiple IP addresses
  • CNAME Chains: Complex and multiple redirection scenarios

Example of a CNAME chain:

www.example.com → CNAME → site.hosting.com
site.hosting.com → CNAME → cdn.provider.net
cdn.provider.net → A record → 192.0.2.55

So the resolution path is:
www.example.com → site.hosting.com → cdn.provider.net → 192.0.2.55
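
The same chain can be walked in a few lines. This is a sketch over a hand-written record table mirroring the example above; the `follow_cnames` helper and its `max_depth` guard are illustrative assumptions.

```python
RECORDS = {
    "www.example.com": ("CNAME", "site.hosting.com"),
    "site.hosting.com": ("CNAME", "cdn.provider.net"),
    "cdn.provider.net": ("A", "192.0.2.55"),
}


def follow_cnames(name, records, max_depth=8):
    """Follow CNAME records until an A record is reached; return the path."""
    path = [name]
    for _ in range(max_depth):       # guard against loops and very long chains
        rtype, target = records[name]
        path.append(target)
        if rtype == "A":
            return path
        name = target                # CNAME: continue with the alias target
    raise RuntimeError("CNAME chain too long or looping")


print(" → ".join(follow_cnames("www.example.com", RECORDS)))
```

Every link in the chain adds another record to the response, which is how long chains push answers past the UDP size limit.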

Zone Transfers

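DNS zone transfer is the process of copying DNS records from one DNS server to another. It comes in two flavours: AXFR (full zone transfer) and IXFR (incremental zone transfer). AXFR runs exclusively over TCP; IXFR is normally carried over TCP as well, although RFC 1995 permits a UDP attempt when the changes fit in a single datagram. TCP makes sense here because: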

  • Large Data Sets: Zone files can contain thousands of records
  • Reliability Requirements: Data integrity is critical for zone synchronization
  • Connection-Oriented Nature: Zone transfers are sustained operations, not quick queries

A typical zone transfer might look like:

# Simplified zone transfer flow (pseudocode; the method names are illustrative)
secondary_server.connect_tcp(primary_server, port=53)
secondary_server.send_axfr_request("example.com")
primary_server.send_soa_record()        # Transfer starts with the zone's SOA
primary_server.send_all_zone_records()  # Could be megabytes
primary_server.send_soa_record()        # Same SOA again marks the end
secondary_server.close_connection()
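
One detail behind this flow: over TCP, every DNS message is prefixed with a two-byte length (RFC 1035, section 4.2.2), so a transfer is a stream of length-prefixed messages. The sketch below shows that framing; `frame` and `read_messages` are hypothetical helper names, and the payloads are placeholders rather than real DNS messages.

```python
import io
import struct


def frame(message: bytes) -> bytes:
    """Prefix a DNS message with its length as a 16-bit big-endian integer."""
    return struct.pack("!H", len(message)) + message


def read_messages(stream):
    """Yield length-prefixed messages from a binary file-like object."""
    while True:
        prefix = stream.read(2)
        if not prefix:
            return                       # clean end of stream
        if len(prefix) < 2:
            raise EOFError("truncated length prefix")
        (length,) = struct.unpack("!H", prefix)
        body = stream.read(length)
        if len(body) < length:
            raise EOFError("truncated message")
        yield body


# Placeholder payloads standing in for SOA, zone records, SOA
stream = io.BytesIO(frame(b"soa") + frame(b"records") + frame(b"soa"))
print(list(read_messages(stream)))
```

With a real socket, a buffered file object such as `sock.makefile("rb")` can play the role of `stream`, since a raw `recv` may return fewer bytes than requested.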

How Fallback Works

Understanding how clients handle UDP-to-TCP fallback is interesting … Let’s dig deeper

Client Behavior

Most DNS resolvers implement a standard fallback pattern:

  1. Initial UDP Query: Send query with EDNS0 indicating maximum UDP size
  2. Evaluate Response: Check for TC bit or timeout
  3. TCP Retry: If needed, establish TCP connection and resend query
  4. Caching Decision: Cache the response and, where the implementation supports it, the knowledge that this query needed TCP

Here’s how this might look in pseudocode:

def dns_query(domain, record_type):
    # Try UDP first; send_udp_query is assumed to return None on timeout
    response = send_udp_query(domain, record_type, max_size=4096)

    # Check for None before reading .truncated to avoid an AttributeError
    if response is None or response.truncated:
        # Fallback to TCP
        response = send_tcp_query(domain, record_type)

        # Remember this domain/type combo needs TCP
        cache_tcp_preference(domain, record_type)

    return response

Performance Implications

The fallback mechanism is costly but essential.

  • Added Latency: The truncated UDP exchange (1 RTT) plus the TCP retry (handshake + query, 2 RTTs) costs about 3 RTTs, roughly 3x a plain UDP lookup for large responses
  • Client Complexity: Implementations must handle both protocols correctly

Resolvers (on routers, at ISPs, or public DNS services) can reduce this cost by:

  • Caching TCP Requirements: Remember which queries needed TCP, so the next lookup goes straight to TCP instead of trying UDP first.
  • Parallel Queries: Send the UDP and TCP queries at the same time for latency-critical requests (an implementation choice rather than standard behavior)

Numbers

Latency

Here are rough numbers to provide an estimate for the time it takes to resolve a DNS query over UDP and TCP.

UDP DNS Query (cache miss):
- Connection: 0ms (connectionless)
- Query/Response: ~20ms (network RTT)
- Total: ~20ms

TCP DNS Query (cache miss):
- Connection establishment: ~20ms (1 RTT)
- Query/Response: ~20ms (network RTT)  
- Connection teardown: ~0ms (async)
- Total: ~40ms

This 2x latency difference explains why UDP remains the default choice.
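
The arithmetic behind these estimates, including the UDP-then-TCP fallback path, can be written out as a small sketch. The function names are illustrative, and the model ignores server processing time and packet loss.

```python
def udp_latency_ms(rtt_ms):
    """One round trip: query out, response back."""
    return rtt_ms


def tcp_latency_ms(rtt_ms):
    """One round trip for the handshake plus one for the query and response."""
    return 2 * rtt_ms


def fallback_latency_ms(rtt_ms):
    """Truncated UDP attempt followed by a full TCP retry."""
    return udp_latency_ms(rtt_ms) + tcp_latency_ms(rtt_ms)


for label, fn in [("UDP", udp_latency_ms),
                  ("TCP", tcp_latency_ms),
                  ("UDP then TCP", fallback_latency_ms)]:
    print(f"{label}: ~{fn(20)}ms")
```

With a 20ms RTT this gives roughly 20ms, 40ms, and 60ms, so a fallback costs more than going straight to TCP would have.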

Throughput Analysis

For high-volume DNS servers, the throughput differences are even more pronounced. The figures below are rough orders of magnitude, not benchmarks:

UDP Performance:
- Queries/second: 100,000+
- Memory per query: ~1KB (temporary)
- File descriptors: Minimal

TCP Performance:
- Queries/second: 10,000-50,000
- Memory per connection: ~8KB minimum
- File descriptors: 1 per connection

These numbers demonstrate why authoritative DNS servers strongly prefer UDP for routine queries.
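
As a back-of-the-envelope check on the memory figure, the sketch below multiplies the ~8KB per-connection estimate by a number of simultaneously open connections. The function name and the 100,000 connection count are assumptions chosen for illustration.

```python
def tcp_state_bytes(open_connections, bytes_per_connection=8 * 1024):
    """Estimate memory held by TCP connection state alone."""
    return open_connections * bytes_per_connection


mib = tcp_state_bytes(100_000) / 2**20
print(f"{mib:.2f} MiB")  # 781.25 MiB
```

That is memory spent purely on connection bookkeeping before a single answer is served, while UDP needs only a short-lived buffer per query.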

Implementation Considerations

When building systems that interact with DNS, understanding the UDP/TCP duality has practical implications.

Fallback Example

Choose DNS libraries that handle fallback gracefully. Here’s an example using the Python package dnspython.

# pip install dnspython

import dns.exception
import dns.flags
import dns.message
import dns.query
import dns.resolver

# The resolver retries over TCP automatically when a UDP response is truncated
result = dns.resolver.resolve('example.com', 'A')
print(result.response.to_text())

# Explicit protocol control when needed
query = dns.message.make_query('example.com', 'A')
try:
    response = dns.query.udp(query, '8.8.8.8', timeout=5)
    if response.flags & dns.flags.TC:
        # Truncated: retry the same query over TCP
        response = dns.query.tcp(query, '8.8.8.8', timeout=10)
except dns.exception.Timeout:
    # No UDP reply in time: fall back to TCP
    response = dns.query.tcp(query, '8.8.8.8', timeout=10)

print(response.to_text())

Monitoring and Debugging

Monitor both UDP and TCP DNS traffic in production:

  • UDP Success Rate: Percentage of queries resolved without TCP fallback
  • TCP Fallback Frequency: Indicates large response prevalence
  • Response Size Distribution: Helps optimize EDNS0 buffer sizes
  • Timeout Patterns: May indicate UDP packet loss requiring TCP retry
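
These metrics can be derived from per-query logs. Below is a minimal sketch; the `QueryLog` fields and the `summarize` function are hypothetical, so adapt them to whatever your resolver or proxy actually records.

```python
from dataclasses import dataclass


@dataclass
class QueryLog:
    used_tcp_fallback: bool   # the query had to be retried over TCP
    response_bytes: int       # size of the final response
    udp_timed_out: bool       # the UDP attempt got no reply in time


def summarize(logs):
    """Compute fallback, timeout, and response-size metrics for a batch."""
    total = len(logs)
    if total == 0:
        raise ValueError("no queries logged")
    fallbacks = sum(log.used_tcp_fallback for log in logs)
    timeouts = sum(log.udp_timed_out for log in logs)
    sizes = sorted(log.response_bytes for log in logs)
    return {
        "udp_success_rate": (total - fallbacks) / total,
        "tcp_fallback_rate": fallbacks / total,
        "udp_timeout_rate": timeouts / total,
        "median_response_bytes": sizes[total // 2],  # upper median
        "max_response_bytes": sizes[-1],
    }


logs = [
    QueryLog(False, 100, False),
    QueryLog(False, 300, False),
    QueryLog(True, 2000, False),
    QueryLog(True, 90, True),
]
print(summarize(logs))
```

A rising fallback rate alongside a growing maximum response size suggests large responses; a rising timeout rate with small responses points more toward UDP packet loss.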

Network Configuration

Ensure network infrastructure supports both protocols:

  • Firewall Rules: Allow both UDP and TCP on port 53
  • Load Balancer Configuration: Handle both protocols appropriately
  • Monitoring Systems: Track both UDP and TCP DNS metrics

Footnotes

DNS uses both UDP and TCP, contrary to the common belief that it only relies on UDP. UDP is preferred for its speed and efficiency in handling the billions of routine queries that keep the internet running, while TCP ensures reliability for larger responses and tasks like zone transfers.

So the next time you see a DNS query timeout or notice varying response times in your applications, remember: behind the scenes, DNS is choosing the best transport for the job, whether that’s the speed and efficiency of UDP or the dependable reliability of TCP.

DNS is actually pretty wild.


Arpit Bhayani

Staff Engg at GCP Memorystore, Creator of DiceDB, ex-Staff Engg for Google Ads and GCP Dataproc, ex-Amazon Fast Data, ex-Director of Engg. SRE and Data Engineering at Unacademy. I spark engineering curiosity through my no-fluff engineering videos on YouTube and my courses
