DNS Load Balancing

Updated June 3, 2026
Magic Magnets Team

When you type netflix.com into your browser, your computer has no idea where Netflix lives. Before it can make any HTTP request, it first asks the Domain Name System (DNS) to translate the hostname into an IP address. Unless the answer is already cached, that DNS lookup is the first network step in a request, and for large-scale systems it is also the first opportunity to route traffic intelligently.

How DNS Becomes a Load Balancer

Normally, a DNS A record maps one domain to one IP address: example.com → 203.0.113.5. If that server goes down or gets overwhelmed, the entire service is unreachable.

DNS load balancing breaks this 1:1 constraint. Instead of returning a single IP, a DNS server can return multiple IPs, rotate through a list, or choose which IP to return based on where the request is coming from. This makes DNS the coarsest but most globally scalable layer of traffic distribution.
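
To make the "one name, many IPs" idea concrete, here is a minimal sketch that asks the local resolver for every IPv4 address behind a hostname. The helper name `resolve_all` is made up for illustration; `socket.getaddrinfo` is the standard-library call doing the work.

```python
import socket

def resolve_all(hostname: str, port: int = 443) -> list[str]:
    """Return every distinct IPv4 address the resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:  # keep resolver order, drop duplicates
            addresses.append(ip)
    return addresses

if __name__ == "__main__":
    # For a domain with several A records this prints more than one address.
    print(resolve_all("example.com"))
```

How many addresses come back, and in what order, depends on the domain and on the resolver you are using, so treat the output as illustrative.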

Round-Robin DNS

Figure: Round-robin DNS, where each query gets a different IP from the rotation.

The simplest form. The DNS server holds multiple A records for the same domain and returns them in rotation. Client A gets 10.0.0.1, client B gets 10.0.0.2, client C gets 10.0.0.3, client D loops back to 10.0.0.1.

No additional infrastructure is required. However, DNS is entirely blind to server health: if 10.0.0.2 goes down, DNS keeps handing out that dead IP until someone manually removes the record, and clients that have already cached it keep using it until the TTL expires and they refresh.

This blindness makes round-robin DNS unsuitable as a standalone solution for production traffic. It's a starting point, not a destination.
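
The rotation, and its blindness to failures, can be sketched in a few lines. This is a toy model under stated assumptions, not a real DNS server: the `RoundRobinDNS` class is hypothetical, and it returns a single address per query, whereas real servers commonly return the whole record set in a rotated order.

```python
from itertools import cycle

class RoundRobinDNS:
    """Toy name server that hands out A records in strict rotation."""

    def __init__(self, records):
        self._rotation = cycle(records)

    def resolve(self):
        return next(self._rotation)

dns = RoundRobinDNS(["10.0.0.1", "10.0.0.2", "10.0.0.3"])

# 10.0.0.2 has crashed, but nothing tells the DNS server about it.
healthy = {"10.0.0.1", "10.0.0.3"}

answers = [dns.resolve() for _ in range(6)]
failed = [ip for ip in answers if ip not in healthy]

print(answers)
print(f"{len(failed)} of {len(answers)} clients were sent to a dead server")
```

With three records and one dead server, a third of all answers point at an address that cannot serve traffic.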

Geolocation and Latency-Based Routing

Figure: Geo-DNS routing US users to us-east and EU users to eu-west.

The speed of light through fiber optic cable is a hard physical constraint. A user in Tokyo querying a server in Virginia will typically see roughly 150–200 ms of round-trip latency, most of it imposed by physical distance and the path the cables take, before any application processing happens.
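
A back-of-the-envelope calculation shows where the floor on that latency comes from. The sketch below assumes a refractive index of about 1.47 for fiber, approximate coordinates for Tokyo and northern Virginia, and a perfectly straight great-circle cable, which no real route achieves.

```python
import math

SPEED_OF_LIGHT_KM_S = 299_792     # speed of light in vacuum
FIBER_REFRACTIVE_INDEX = 1.47     # assumed typical value for silica fiber
EARTH_RADIUS_KM = 6371.0

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points on a spherical Earth."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def min_rtt_ms(distance_km):
    """Lower bound on round-trip time over an ideal, straight fiber run."""
    speed_in_fiber = SPEED_OF_LIGHT_KM_S / FIBER_REFRACTIVE_INDEX
    return 2 * distance_km / speed_in_fiber * 1000

tokyo = (35.68, 139.69)       # approximate coordinates
virginia = (39.04, -77.49)    # approximate coordinates (Ashburn area)

distance = great_circle_km(*tokyo, *virginia)
rtt = min_rtt_ms(distance)
print(f"{distance:.0f} km, at least {rtt:.0f} ms round trip")
```

The ideal figure comes out a little over 100 ms. Real paths are longer than the great circle and add router and queuing delay, which is how observed round trips end up in the 150–200 ms range.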

Geo-DNS solves this by inspecting the geographic origin of each DNS query and returning the IP of the closest data center. AWS Route 53, Cloudflare, and Google Cloud DNS all support this. Route 53, for example, offers both geolocation routing (based on country or continent) and latency-based routing (based on measured latency to each region). A user in Europe gets the eu-west-1 load balancer IP and a user in Asia gets the ap-southeast-1 IP, even though both query the exact same domain name.
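
The two policies differ only in what they key on. The following sketch is illustrative rather than any provider's API: the rule tables, the continent codes, the default region, and the IP addresses (taken from a documentation-only range) are all assumptions.

```python
# Hypothetical regional load balancer IPs (documentation address range).
REGION_IPS = {
    "us-east-1": "198.51.100.10",
    "eu-west-1": "198.51.100.20",
    "ap-southeast-1": "198.51.100.30",
}

# Geolocation routing: a static rule per continent, with a fallback.
GEO_RULES = {"NA": "us-east-1", "EU": "eu-west-1", "AS": "ap-southeast-1"}
DEFAULT_REGION = "us-east-1"

def geolocation_route(continent: str) -> str:
    """Pick a region from where the query appears to originate."""
    return REGION_IPS[GEO_RULES.get(continent, DEFAULT_REGION)]

def latency_route(measured_rtt_ms: dict) -> str:
    """Pick the region with the lowest measured round-trip time."""
    best_region = min(measured_rtt_ms, key=measured_rtt_ms.get)
    return REGION_IPS[best_region]

print(geolocation_route("EU"))
print(latency_route({"us-east-1": 180, "eu-west-1": 240, "ap-southeast-1": 35}))
```

Geolocation rules are predictable and easy to reason about, while latency-based selection can pick a region that is geographically farther but faster to reach over the network.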

The TTL on these records matters. A 60-second TTL means that if the nearest data center fails and its IPs are removed, clients holding a cached answer can keep hitting the dead IPs for up to 60 seconds until those entries expire. Setting the TTL too low (say, 5 seconds) puts constant pressure on DNS infrastructure and reduces caching benefits, and it still does not turn DNS into a per-request router.

Weighted Routing

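Weighted routing assigns each endpoint a relative weight, and the DNS server answers queries in proportion to those weights. For example, send 90% of queries to the stable cluster and 10% to the canary. If the canary holds up, gradually shift the weights. This is the DNS-layer equivalent of a canary deployment, useful for testing new regions or gradually migrating traffic without a hard cutover. Because resolvers cache answers, the split applies to DNS responses, so the resulting traffic ratio is only approximate.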
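
A weighted answer can be modeled as a weighted random choice per query. This is a minimal sketch with made-up addresses and a fixed random seed for repeatability; it says nothing about how a particular provider implements the selection.

```python
import random
from collections import Counter

def weighted_resolve(weights: dict, rng: random.Random) -> str:
    """Answer one query, choosing an IP in proportion to its weight."""
    ips = list(weights)
    return rng.choices(ips, weights=[weights[ip] for ip in ips], k=1)[0]

# Hypothetical endpoints: stable cluster at weight 90, canary at weight 10.
weights = {"10.0.1.10": 90, "10.0.2.10": 10}
rng = random.Random(42)

queries = 10_000
tally = Counter(weighted_resolve(weights, rng) for _ in range(queries))
canary_share = tally["10.0.2.10"] / queries
print(f"canary received {canary_share:.1%} of DNS answers")
```

Shifting traffic then amounts to editing the weights, for example moving from 90/10 to 50/50 once the canary looks healthy.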

Health-Check Routing (Failover)

Managed DNS providers like Route 53 and Cloudflare can actively monitor your endpoints. They periodically probe each IP address and, once an endpoint fails enough consecutive health checks, automatically stop returning that IP in responses. When your Virginia data center becomes unreachable, Route 53 detects the failure and begins returning only the Ohio failover IP to new clients, typically within tens of seconds.

This is significantly better than passive round-robin: failures are detected proactively rather than left to clients to discover. The speed of recovery is bounded by the detection time (the check interval, typically 10–30 seconds, multiplied by the number of failed checks required) plus the TTL of cached records.
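
The filtering step and the recovery bound can be sketched as follows. Everything named here is hypothetical: the `Endpoint` class, the failure threshold of 3, the fail-open behavior when every endpoint is unhealthy, and the addresses are assumptions for illustration, and the time estimate ignores probe timeouts.

```python
from dataclasses import dataclass

FAILURE_THRESHOLD = 3  # assumed: consecutive failed probes before an IP is pulled

@dataclass
class Endpoint:
    ip: str
    consecutive_failures: int = 0

def record_probe(endpoint: Endpoint, succeeded: bool) -> None:
    """Update the failure streak after one health-check probe."""
    endpoint.consecutive_failures = 0 if succeeded else endpoint.consecutive_failures + 1

def answerable(endpoints: list) -> list:
    """IPs the DNS server is willing to return right now."""
    healthy = [e.ip for e in endpoints if e.consecutive_failures < FAILURE_THRESHOLD]
    # Assumed policy: if everything looks down, return all IPs rather than nothing.
    return healthy or [e.ip for e in endpoints]

def worst_case_failover_s(check_interval_s: int, failure_threshold: int, ttl_s: int) -> int:
    """Rough upper bound: time to detect the failure plus time for caches to expire."""
    return check_interval_s * failure_threshold + ttl_s

virginia, ohio = Endpoint("192.0.2.10"), Endpoint("192.0.2.20")
for _ in range(FAILURE_THRESHOLD):
    record_probe(virginia, succeeded=False)
    record_probe(ohio, succeeded=True)

answers = answerable([virginia, ohio])
estimate = worst_case_failover_s(check_interval_s=30, failure_threshold=3, ttl_s=60)
print(answers, f"worst case about {estimate} s")
```

With a 30-second interval, three required failures, and a 60-second TTL, some clients could keep reaching the failed data center for about two and a half minutes.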

The Caching Problem

DNS load balancing has one fundamental limitation that cannot be engineered away: caching. Browsers, operating systems, and ISP resolvers all cache DNS responses for the duration of the TTL. When you update a DNS record, whether to pull a failing IP or shift routing weights, the change doesn't propagate instantly. Any client that already has the old record cached won't see the new one until its cached copy expires.

This makes DNS a coarse routing layer. It's excellent for directing users to the right continent or data center, but it's too slow and too imprecise to make per-request decisions.
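
The effect of caching on a record change can be shown with a toy resolver. This is a sketch, not a real resolver: the `CachingResolver` class is hypothetical, the zone is a plain dictionary standing in for the DNS provider, and time is passed in explicitly so the behavior is easy to follow.

```python
class CachingResolver:
    """Toy resolver that caches each answer for a fixed TTL."""

    def __init__(self, authoritative: dict, ttl_s: int):
        self.authoritative = authoritative  # name -> IP, stands in for the DNS provider
        self.ttl_s = ttl_s
        self._cache = {}                    # name -> (ip, expires_at)

    def resolve(self, name: str, now: float) -> str:
        cached = self._cache.get(name)
        if cached and now < cached[1]:
            return cached[0]                # still fresh: the provider is never asked
        ip = self.authoritative[name]
        self._cache[name] = (ip, now + self.ttl_s)
        return ip

zone = {"example.com": "10.0.0.2"}
resolver = CachingResolver(zone, ttl_s=60)

first = resolver.resolve("example.com", now=0)    # cached until t=60
zone["example.com"] = "10.0.0.3"                  # operator pulls the failing IP at t=5
stale = resolver.resolve("example.com", now=30)   # cache hit, old answer
fresh = resolver.resolve("example.com", now=61)   # cache expired, new answer
print(first, stale, fresh)
```

The operator's change at t=5 is invisible to this client until its cached entry expires at t=60, which is exactly the propagation delay described above.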

DNS as the First Layer, Not the Only Layer

Figure: DNS resolves to the load balancer IP, then the application load balancer distributes requests within the region.

In practice, DNS load balancing is almost always paired with a dedicated application load balancer (Nginx, HAProxy, AWS ALB) inside each data center. DNS gets you to the right region, and its decisions change on a timescale of seconds to minutes because they are bound by the TTL. The application LB then distributes individual requests across the backend fleet, making a fresh decision for every request within milliseconds.

These two layers solve different problems. DNS handles global, coarse routing — it decides which data center or region serves a user. The application LB handles fine-grained, per-request distribution within that region. You need both for a production system at scale.
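
The division of labor can be sketched as two functions operating at different granularities. All names and addresses here are made up for illustration, and the application load balancer is reduced to simple round-robin.

```python
from itertools import cycle

# Hypothetical layout: one load balancer IP per region, each fronting its own backends.
REGION_LB_IP = {"us-east": "198.51.100.10", "eu-west": "198.51.100.20"}
BACKENDS = {
    "198.51.100.10": ["10.1.0.1", "10.1.0.2", "10.1.0.3"],
    "198.51.100.20": ["10.2.0.1", "10.2.0.2"],
}

def dns_resolve(client_continent: str) -> str:
    """Layer 1: coarse, cached decision. Which region's load balancer?"""
    region = "eu-west" if client_continent == "EU" else "us-east"
    return REGION_LB_IP[region]

class AppLoadBalancer:
    """Layer 2: fine-grained decision made again for every request."""

    def __init__(self, backends):
        self._rotation = cycle(backends)

    def route(self, request_id: int) -> str:
        return next(self._rotation)

load_balancers = {ip: AppLoadBalancer(pool) for ip, pool in BACKENDS.items()}

lb_ip = dns_resolve("EU")  # resolved once, then reused for the TTL
served_by = [load_balancers[lb_ip].route(i) for i in range(4)]
print(lb_ip, served_by)
```

The DNS answer is looked up once and reused, while the backend choice changes on every request, which is why the two layers complement rather than replace each other.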

Summary

DNS load balancing distributes traffic by returning different IP addresses for the same domain name. Round-robin DNS is the simplest form but has no health awareness. Geo-DNS and latency-based routing (as offered by services like AWS Route 53) reduce latency by routing users to the nearest data center. Weighted routing enables gradual canary rollouts at the DNS layer. Health-check routing provides automatic failover when endpoints fail.

The key constraint is TTL-based caching: DNS changes propagate slowly, making it unsuitable for fine-grained per-request decisions. For that, you pair it with a traditional application load balancer — DNS as the global router, the app LB as the local distributor.
