Service Discovery

Updated June 8, 2026

Magic Magnets Team

In a monolith, your code calls a function. Simple. But in a microservices system, your Order Service needs to call your Inventory Service — and that means one service has to know where the other one is running.

That's the service discovery problem, and it's more subtle than it first appears.

Why You Can't Just Hardcode Addresses

The naive approach is to hardcode the IP address of each service in a config file. The Order Service needs the Inventory Service, so you write down that Inventory is at 10.0.1.42:8080. Done.

This breaks immediately in any dynamic environment:

Containers restart and get new IP addresses
Auto-scaling adds and removes instances constantly
Rolling deploys cycle out old instances and bring in new ones
Failures take instances down and replacements come up on different hosts

In a Kubernetes cluster, a pod's IP address is essentially ephemeral. In an auto-scaling group on AWS, instances can come and go every few minutes. You need a system that tracks where services are running right now — not where they were running when you wrote the config.

That system is called a service registry, and the mechanisms for using it are called service discovery.

Diagram: Inventory instances register with the service registry on startup and send heartbeats to stay listed. The Order Service queries the registry for healthy instances, then calls one directly. If an Inventory instance crashes, its heartbeats stop and the registry removes it.

Client-Side Discovery

In client-side discovery, the calling service is responsible for finding the address of the service it wants to call:

The Order Service queries the service registry: "Where are the Inventory Service instances?"
The registry returns a list of healthy instances with their addresses
The Order Service picks one using a load balancing algorithm: round robin, least connections, and so on.
The Order Service sends the request directly
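
The four steps above can be sketched in a few lines. This is a minimal illustration rather than any particular library's API: `RoundRobinClient` and the `lookup` callable are hypothetical names, and `lookup` stands in for whatever query your registry actually exposes.

```python
import itertools
from typing import Callable, Dict, Iterator, List


class RoundRobinClient:
    """Client-side discovery: ask the registry, pick an instance, call it directly."""

    def __init__(self, lookup: Callable[[str], List[str]]) -> None:
        # `lookup` is an assumed stand-in for a registry query that
        # returns the addresses of healthy instances for a service name.
        self._lookup = lookup
        self._counters: Dict[str, Iterator[int]] = {}

    def pick(self, service: str) -> str:
        instances = self._lookup(service)
        if not instances:
            raise LookupError(f"no healthy instances for {service}")
        counter = self._counters.setdefault(service, itertools.count())
        # Round robin: advance a per-service counter and wrap around the list.
        return instances[next(counter) % len(instances)]
```

The caller would then send its request straight to the address returned by `pick("inventory")`, with no proxy in between.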

Pros:

The client controls load balancing logic — it can make smart decisions (prefer same-availability-zone instances, avoid slow instances)
One fewer network hop — no proxy in the middle
The registry is queried only when the client refreshes its instance list, not on every request

Cons:

Every client has to implement the discovery and load balancing logic
Each client needs an SDK or library for this — and you have to maintain that library in every language your services use

Netflix's Eureka with the Ribbon client-side load balancer is the classic example of client-side discovery.
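
One of the "smart decisions" listed in the pros above, preferring same-availability-zone instances, might look roughly like this. The `Instance` type, its `zone` field, and the zone names are assumptions made for the example. This is not how Ribbon itself is implemented.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Instance:
    address: str
    zone: str  # hypothetical metadata the registry would need to return


def pick_zone_aware(
    instances: List[Instance],
    local_zone: str,
    rng: Optional[random.Random] = None,
) -> Instance:
    """Prefer instances in the caller's zone; fall back to any healthy instance."""
    if not instances:
        raise LookupError("no healthy instances")
    rng = rng or random.Random()
    same_zone = [i for i in instances if i.zone == local_zone]
    # An empty same-zone list is falsy, so we fall back to the full pool.
    return rng.choice(same_zone or instances)
```

This kind of policy is easy to express in the client and harder to get from a generic load balancer, which is the trade-off the pros and cons describe.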

Server-Side Discovery

In server-side discovery, the client just sends a request to a well-known address (a load balancer or proxy). The load balancer handles the registry lookup and routing:

The Order Service sends a request to http://inventory-service/
The load balancer (e.g., an AWS ALB or an Nginx instance) queries the service registry
The load balancer picks a healthy Inventory Service instance and forwards the request
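
As a rough sketch of the same flow from the load balancer's side: the client only ever sees the stable name, and the proxy rewrites it to a concrete backend. `DiscoveryProxy` and `lookup` are hypothetical names. A real proxy such as Nginx or an ALB does far more (connection pooling, timeouts, retries).

```python
from typing import Callable, Dict, List
from urllib.parse import urlsplit, urlunsplit


class DiscoveryProxy:
    """Server-side discovery: the proxy, not the client, consults the registry."""

    def __init__(self, lookup: Callable[[str], List[str]]) -> None:
        # `lookup` is an assumed registry query keyed by the stable host name.
        self._lookup = lookup
        self._next: Dict[str, int] = {}

    def route(self, host: str) -> str:
        backends = self._lookup(host)
        if not backends:
            raise LookupError(f"no healthy backend for {host}")
        index = self._next.get(host, 0) % len(backends)
        self._next[host] = index + 1
        return backends[index]

    def rewrite(self, url: str) -> str:
        """Swap the stable host in the client's URL for a chosen backend address."""
        parts = urlsplit(url)
        backend = self.route(parts.hostname or "")
        return urlunsplit((parts.scheme, backend, parts.path, parts.query, parts.fragment))
```

The Order Service keeps calling `http://inventory-service/...` unchanged. Only the proxy knows which instance served each request.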

Pros:

Clients don't need any discovery logic — they just call a stable address
Works for clients in any language with zero code changes
The discovery/routing logic is centralized and maintained in one place

Cons:

The load balancer is a potential bottleneck and single point of failure (mitigated by running multiple load balancers)
One extra network hop on every request
Less fine-grained control — the client can't make intelligent routing decisions

AWS Elastic Load Balancing, Kubernetes Services, and service mesh proxies (like Envoy in Istio) all implement server-side discovery.

The Service Registry

At the heart of both approaches is the service registry: a distributed key-value store or database that tracks which instances of each service are healthy and running. When an instance starts up, it registers itself. When it shuts down cleanly, it deregisters. A crashed instance cannot deregister itself, so the registry has to detect the failure through health checks and remove the entry.
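
Stripped of replication and health checking, the core of a registry is a small mapping from service name to instance addresses. The sketch below is a single-process, in-memory toy with hypothetical names. Real registries add consensus, persistence, and failure detection on top.

```python
from collections import defaultdict
from typing import Dict, List, Set


class ServiceRegistry:
    """Toy in-memory registry: service name -> set of instance addresses."""

    def __init__(self) -> None:
        self._instances: Dict[str, Set[str]] = defaultdict(set)

    def register(self, service: str, address: str) -> None:
        # Called by an instance on startup.
        self._instances[service].add(address)

    def deregister(self, service: str, address: str) -> None:
        # Called on clean shutdown; discard() tolerates unknown addresses.
        self._instances[service].discard(address)

    def lookup(self, service: str) -> List[str]:
        # .get() avoids creating empty entries for unknown services.
        return sorted(self._instances.get(service, ()))
```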

Consul (HashiCorp)

One of the most widely used standalone service registries. Consul supports service registration, health checking, and key-value storage. It uses the Raft consensus algorithm to keep the registry consistent across a cluster of Consul servers. It also supports DNS-based service discovery out of the box: services can be discovered at names like inventory-service.service.consul.
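
Besides DNS, Consul exposes an HTTP API. The sketch below builds a query for its health endpoint and parses the response into addresses. The agent address `127.0.0.1:8500` is the usual local default, but treat it as an assumption for your setup. The helper names are made up for the example, and the HTTP call itself is left out so the parsing can be shown in isolation.

```python
import json
from typing import List
from urllib.parse import quote


def health_url(service: str, base: str = "http://127.0.0.1:8500") -> str:
    """URL asking Consul for instances of `service` whose checks are passing."""
    return f"{base}/v1/health/service/{quote(service)}?passing=true"


def parse_instances(body: str) -> List[str]:
    """Turn a health-endpoint JSON response into 'host:port' strings."""
    addresses = []
    for entry in json.loads(body):
        svc = entry["Service"]
        # If the service did not register its own address,
        # fall back to the address of the node it runs on.
        host = svc.get("Address") or entry["Node"]["Address"]
        addresses.append(f"{host}:{svc['Port']}")
    return addresses
```

Fetching `health_url("inventory-service")` with any HTTP client and passing the body to `parse_instances` would give a client-side discovery client its instance list.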

etcd (CoreOS / CNCF)

A distributed key-value store using Raft consensus. Originally built by CoreOS for coordinating clusters of machines, it predates Kubernetes and is also widely used as a service registry. Kubernetes itself uses etcd to store all cluster state, including service endpoints.

Apache ZooKeeper

One of the earliest open-source distributed coordination services, used heavily in the Hadoop ecosystem. Kafka historically used ZooKeeper for broker coordination before moving to its own Raft-based KRaft mode. More complex to operate than Consul or etcd, but very battle-tested.

Netflix Eureka

Built by Netflix specifically for AWS deployments. Designed for resilience over consistency — Eureka prefers to show stale registry data rather than go down. Each client caches the registry locally, so it continues working even if the Eureka server is temporarily unavailable.
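
The local-cache behavior described above can be illustrated with a small sketch. This is not the Eureka client's API. `CachingRegistryClient` and `fetch` are hypothetical, and the only point is the fallback: if a refresh fails, the client keeps serving the last snapshot it has.

```python
from typing import Callable, Dict, List


class CachingRegistryClient:
    """Keeps the last successfully fetched registry snapshot."""

    def __init__(self, fetch: Callable[[], Dict[str, List[str]]]) -> None:
        # `fetch` is an assumed function that downloads the full registry
        # and raises an OSError subclass (e.g. ConnectionError) on failure.
        self._fetch = fetch
        self._cache: Dict[str, List[str]] = {}

    def refresh(self) -> bool:
        """Try to update the cache; return False if the registry was unreachable."""
        try:
            self._cache = self._fetch()
            return True
        except OSError:
            # Registry down: prefer possibly stale data over no data.
            return False

    def lookup(self, service: str) -> List[str]:
        return list(self._cache.get(service, []))
```

The cost of this design is that a client may briefly call an instance that is already gone, so callers still need timeouts and retries.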

Health Checking

A registry that just stores addresses without checking health is dangerous. You might get routed to an instance that's running but stuck in an infinite loop, out of memory, or failing to connect to its database.

Service registries implement health checks in two ways:

Active health checks: The registry pings each service instance on a regular interval (GET /health). If the instance fails to respond within a timeout, it gets removed from the registry.
Heartbeat / TTL: Each instance periodically tells the registry "I'm still alive." If the registry doesn't receive a heartbeat within a time window (TTL), it deregisters the instance.
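
A heartbeat/TTL registry can be sketched as a table of last-seen timestamps. The class name, the lazy expiry on lookup, and the injectable `clock` are choices made for this example and are not taken from any specific registry.

```python
import time
from typing import Callable, Dict, List, Tuple


class TtlRegistry:
    """Instances stay listed only while their heartbeats keep arriving."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the behavior can be tested without sleeping
        self._last_seen: Dict[Tuple[str, str], float] = {}

    def heartbeat(self, service: str, address: str) -> None:
        # The first heartbeat doubles as registration.
        self._last_seen[(service, address)] = self._clock()

    def lookup(self, service: str) -> List[str]:
        now = self._clock()
        # Expire lazily: drop every entry whose last heartbeat is older than the TTL.
        expired = [key for key, seen in self._last_seen.items() if now - seen > self._ttl]
        for key in expired:
            del self._last_seen[key]
        return sorted(addr for (name, addr) in self._last_seen if name == service)
```

Note that an instance that dies right after a heartbeat stays listed until its TTL runs out, so the TTL bounds how long a stale entry can survive.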

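Both approaches have failure modes. Active checks add load on the registry and the network, and that load grows with the number of instances. Heartbeat-based checks can leave a stale entry for up to one TTL after an instance dies, and a heartbeat only shows that the process is alive, not that it can serve requests. Production systems typically use both.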
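
For the active side, a checker might probe each instance and remove it only after several consecutive failures, so that a single slow response does not evict a healthy instance. The consecutive-failure threshold is an assumption added for this sketch, as are the names `ActiveChecker` and `http_probe`.

```python
import http.client
import urllib.request
from typing import Callable, Dict, List


def http_probe(address: str, timeout: float = 1.0) -> bool:
    """GET http://<address>/health and report whether it answered 200 in time."""
    try:
        with urllib.request.urlopen(f"http://{address}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and non-2xx responses.
        return False


class ActiveChecker:
    """Tracks consecutive probe failures per instance."""

    def __init__(self, probe: Callable[[str], bool] = http_probe, max_failures: int = 3) -> None:
        self._probe = probe
        self._max_failures = max_failures
        self._failures: Dict[str, int] = {}

    def watch(self, address: str) -> None:
        self._failures.setdefault(address, 0)

    def run_once(self) -> None:
        """One check round; a scheduler would call this on a fixed interval."""
        for address in self._failures:
            if self._probe(address):
                self._failures[address] = 0
            else:
                self._failures[address] += 1

    def healthy(self) -> List[str]:
        return sorted(a for a, n in self._failures.items() if n < self._max_failures)
```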

DNS-Based Service Discovery

The simplest form of service discovery: just use DNS. Each service name resolves to a list of IP addresses. Rotate healthy instances into the DNS record; remove unhealthy ones.

This works and requires no client library. The downsides: resolvers and client runtimes cache records for the TTL (and sometimes longer), so stale addresses can persist for minutes after an instance dies, and DNS doesn't carry health information beyond "the record exists."
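
The staleness problem is easy to reproduce with a toy caching resolver. In this sketch `CachingResolver` is hypothetical and takes the TTL as a constructor argument, because the standard `socket.getaddrinfo` call (used in `system_resolve`) returns addresses but not the record's TTL.

```python
import socket
import time
from typing import Callable, Dict, List, Tuple


def system_resolve(name: str) -> List[str]:
    """Resolve a name to its IP addresses using the OS resolver."""
    return sorted({info[4][0] for info in socket.getaddrinfo(name, None)})


class CachingResolver:
    """Caches answers for a fixed TTL, the way resolvers and runtimes often do."""

    def __init__(
        self,
        resolve: Callable[[str], List[str]] = system_resolve,
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolve = resolve
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, List[str]]] = {}

    def lookup(self, name: str) -> List[str]:
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and now < cached[0]:
            # Still within the TTL: return the cached answer even if the
            # real record has changed since it was fetched.
            return cached[1]
        addresses = self._resolve(name)
        self._cache[name] = (now + self._ttl, addresses)
        return addresses
```

If an instance is removed from the record one second after a lookup, callers of this resolver keep receiving its address until the cached entry expires.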

In practice, cloud providers make DNS-based discovery very usable: AWS Route 53 supports health-checked records and private hosted zones. AWS Cloud Map lets you register services and query them via DNS or API. It's often the right starting point before reaching for Consul or etcd.

Kubernetes Service Discovery

If you're running on Kubernetes, you get service discovery for free. When you create a Kubernetes Service object, Kubernetes:

Assigns a stable virtual IP (ClusterIP) to the service
Creates a DNS entry: inventory-service.default.svc.cluster.local
Configures kube-proxy (or eBPF rules) on every node to forward traffic to healthy pods

Your Order Service just calls http://inventory-service/ and Kubernetes handles finding pods, health checking, and load balancing. It's server-side discovery implemented by the infrastructure itself.
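
The DNS name in the list above follows a fixed pattern, which a small helper can make explicit. The function name is made up for this example, and `cluster.local` is the common default cluster domain, though clusters can be configured with a different one.

```python
def service_dns_name(
    service: str,
    namespace: str = "default",
    cluster_domain: str = "cluster.local",  # common default; configurable per cluster
) -> str:
    """Fully qualified in-cluster DNS name of a Kubernetes Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"
```

The short form `http://inventory-service/` works for callers in the same namespace because the pod's DNS search path fills in the rest. A caller in another namespace needs at least `inventory-service.<namespace>`.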

For more advanced use cases (circuit breaking, retries, mutual TLS between services), service meshes like Istio or Linkerd sit on top of Kubernetes and add those features via sidecar proxies, without any changes to your application code.

Summary

Service discovery solves the problem of services finding each other in a dynamic environment where IP addresses are ephemeral and instance counts change constantly. Client-side discovery puts routing logic in the client. Server-side discovery puts it in a proxy or load balancer. The service registry is the source of truth for what's healthy and where it's running. Consul, etcd, ZooKeeper, and Eureka are the main options. DNS-based discovery is the simplest starting point. On Kubernetes, you get solid service discovery out of the box, with service meshes available when you need more advanced traffic management.
