Push vs Pull Architecture
Updated June 3, 2026
Every time data needs to move from one place to another in a system, you face a fundamental decision: does the sender push the data to the receiver, or does the receiver come and pull the data when it needs it?
This choice shows up everywhere — notification systems, social feeds, message queues, monitoring pipelines, real-time dashboards. Getting it right has a big impact on latency, infrastructure cost, and system complexity.
Push: The Server Takes Initiative
Figure: Push model, with fan-out writes to all follower feed caches on post.
In a push model, the producer proactively sends data to consumers as soon as it's available. The server knows about clients and delivers updates to them directly.
Think of push notifications on your phone. When someone sends you a message, the server doesn't wait for your phone to check in — it reaches out and delivers the notification immediately. You didn't ask for it; it was sent to you.
Characteristics of push:
- Low latency — data arrives as soon as it's available
- Server must track all active clients
- Works well when consumers are always ready to receive
- Can overwhelm slow consumers (a backpressure problem) and gets expensive when one update goes to many consumers (the "fan-out" problem)
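The characteristics above can be sketched with a small in-process stand-in for a push server. This is only an illustration: `PushServer`, `connect`, `disconnect`, and `publish` are hypothetical names, and a real system would deliver over WebSockets, SSE, or a notification service rather than calling functions directly.

```python
from typing import Callable, Dict, List

Handler = Callable[[str], None]


class PushServer:
    """Hypothetical in-process stand-in for a push server."""

    def __init__(self) -> None:
        # The server has to remember every active client.
        self._clients: Dict[str, Handler] = {}

    def connect(self, client_id: str, handler: Handler) -> None:
        self._clients[client_id] = handler

    def disconnect(self, client_id: str) -> None:
        self._clients.pop(client_id, None)

    def publish(self, message: str) -> int:
        """Deliver the message to every connected client right away.

        Returns the number of deliveries, which grows with the client count.
        """
        for handler in list(self._clients.values()):
            handler(message)
        return len(self._clients)


inbox_a: List[str] = []
inbox_b: List[str] = []
server = PushServer()
server.connect("a", inbox_a.append)
server.connect("b", inbox_b.append)
delivered = server.publish("new message")
```

Neither client asked for the message; the server delivered it because it was tracking both of them. The cost of one `publish` call scales with the number of connected clients.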
The Fan-Out Problem
Fan-out is where push gets expensive. Imagine a celebrity on Twitter with 50 million followers posts a tweet. In a pure push model, you'd need to immediately write that tweet to 50 million users' home feeds. That's 50 million writes happening in seconds.
Twitter built its home timeline around this kind of fan-out on write, and it worked until it didn't. A single tweet from an account with tens of millions of followers triggers tens of millions of feed-cache writes, and more if each cache entry is replicated. The infrastructure cost was enormous.
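A minimal sketch of fan-out on write makes the cost visible. The names here (`followers`, `feed_cache`, `follow`, `post_with_fanout`) are assumptions for illustration, not Twitter's actual data model; the point is only that the number of writes per post equals the author's follower count.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical in-memory stores standing in for a follower graph and feed caches.
followers: Dict[str, Set[str]] = defaultdict(set)
feed_cache: Dict[str, List[str]] = defaultdict(list)


def follow(follower: str, author: str) -> None:
    followers[author].add(follower)


def post_with_fanout(author: str, tweet: str) -> int:
    """Write the tweet into every follower's feed cache.

    Returns the number of cache writes performed for this single post.
    """
    writes = 0
    for follower in followers[author]:
        feed_cache[follower].insert(0, tweet)  # newest first
        writes += 1
    return writes


for user in ("ann", "ben", "cat"):
    follow(user, "celebrity")

writes_for_one_post = post_with_fanout("celebrity", "hello world")
```

With three followers the post costs three writes. With 50 million followers, the same loop performs 50 million writes for one post.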
Pull: Clients Take the Initiative
In a pull model, consumers periodically ask for new data. The server sits passively and responds to requests.
Your email client checking for new mail every few minutes is pull. RSS readers polling feeds are pull. Most REST APIs you build are pull — clients call them when they need data.
Characteristics of pull:
- Simple to implement — standard request/response
- No need to track client state on the server
- Latency is bounded by the polling interval: an update can wait up to one full interval before a client sees it (no true real-time)
- Wasted requests when there's nothing new (polling overhead)
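The list above can be illustrated with a cursor-based polling sketch. `PullServer`, `PollingClient`, and `fetch_since` are hypothetical names; in practice the fetch would be an HTTP request made on a timer. Note that the client, not the server, remembers how far it has read.

```python
from typing import List, Tuple


class PullServer:
    """Hypothetical passive server: it only answers requests."""

    def __init__(self) -> None:
        self._log: List[str] = []

    def append(self, item: str) -> None:
        self._log.append(item)

    def fetch_since(self, cursor: int) -> Tuple[List[str], int]:
        """Return items after the cursor plus the new cursor value."""
        return self._log[cursor:], len(self._log)


class PollingClient:
    """The client keeps its own cursor, so the server stores no per-client state."""

    def __init__(self, server: PullServer) -> None:
        self.server = server
        self.cursor = 0
        self.empty_polls = 0
        self.received: List[str] = []

    def poll_once(self) -> List[str]:
        items, self.cursor = self.server.fetch_since(self.cursor)
        if not items:
            self.empty_polls += 1  # a request that found nothing new
        self.received.extend(items)
        return items


pull_server = PullServer()
client = PollingClient(pull_server)
first = client.poll_once()    # nothing has been published yet
pull_server.append("update 1")
second = client.poll_once()   # picks up the update on the next poll
third = client.poll_once()    # nothing new again
```

The update only becomes visible on the poll after it was appended, and two of the three polls returned nothing.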
The Polling Problem
Pull has a different cost: wasted work. If you poll every 30 seconds and there's new data only once an hour, you're making 119 empty requests for every useful one. At scale, this polling traffic becomes significant load on your servers.
You can reduce this with longer polling intervals, but then your latency goes up — users see data that's increasingly stale.
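The trade-off between wasted requests and staleness can be worked out directly. The helper below is a rough sketch with a hypothetical name (`polling_stats`); it assumes changes arrive evenly spaced and less often than polls.

```python
def polling_stats(poll_interval_s: float, change_interval_s: float) -> dict:
    """Estimate polling overhead for one client.

    Assumes evenly spaced changes that are less frequent than polls.
    """
    if poll_interval_s <= 0 or change_interval_s <= 0:
        raise ValueError("intervals must be positive")
    polls_per_change = change_interval_s / poll_interval_s
    return {
        "polls_per_change": polls_per_change,
        # Every poll except the one that finds the change is wasted.
        "wasted_per_useful": max(polls_per_change - 1, 0),
        # An update can sit unseen for up to one full polling interval.
        "worst_case_staleness_s": poll_interval_s,
    }


thirty_seconds = polling_stats(30, 3600)
five_minutes = polling_stats(300, 3600)
```

Polling every 30 seconds against hourly changes gives 120 polls per change, 119 of them empty. Stretching the interval to 5 minutes cuts that to 11 empty polls, but data can now be up to 300 seconds stale.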
The Hybrid: Twitter's Real Approach
Figure: Hybrid model, with push for regular users and pull-on-read for celebrity accounts.
Twitter's home feed is one of the most-cited examples of push vs pull trade-offs, and the actual solution is a hybrid.
For most users (who have moderate follower counts), Twitter uses push. When you post a tweet, it gets fanned out to your followers' pre-computed feed caches. When a follower opens Twitter, their feed loads instantly from cache — no computation needed.
For celebrity accounts with tens of millions of followers, pure push is too expensive. Instead, Twitter uses pull at read time. When you open your feed, the system fetches the regular pre-computed entries, then makes a separate query to fetch recent tweets from any celebrities you follow, and merges the results.
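The hybrid read path described above might look roughly like this. It is a sketch under stated assumptions, not Twitter's implementation: `HybridFeed`, its methods, and the `celebrity_threshold` cutoff are all hypothetical, and the threshold is tiny so the example is easy to follow.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Tweet = Tuple[int, str, str]  # (timestamp, author, text)


class HybridFeed:
    """Push for ordinary authors, pull at read time for celebrity authors."""

    def __init__(self, celebrity_threshold: int = 3) -> None:
        self.threshold = celebrity_threshold  # hypothetical cutoff
        self.followers: Dict[str, Set[str]] = defaultdict(set)
        self.following: Dict[str, Set[str]] = defaultdict(set)
        self.feed_cache: Dict[str, List[Tweet]] = defaultdict(list)
        self.author_tweets: Dict[str, List[Tweet]] = defaultdict(list)

    def follow(self, follower: str, author: str) -> None:
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def is_celebrity(self, author: str) -> bool:
        return len(self.followers[author]) >= self.threshold

    def post(self, ts: int, author: str, text: str) -> int:
        """Store the tweet; fan out only for non-celebrities. Returns fan-out writes."""
        tweet = (ts, author, text)
        self.author_tweets[author].append(tweet)
        if self.is_celebrity(author):
            return 0  # skip fan-out; followers pull this at read time
        for follower in self.followers[author]:
            self.feed_cache[follower].append(tweet)
        return len(self.followers[author])

    def read_feed(self, user: str, limit: int = 10) -> List[Tweet]:
        """Merge the pre-computed cache with tweets pulled from followed celebrities."""
        merged = set(self.feed_cache[user])  # a set avoids duplicates
        for author in self.following[user]:
            if self.is_celebrity(author):
                merged.update(self.author_tweets[author])
        return sorted(merged, reverse=True)[:limit]  # newest first


feed = HybridFeed(celebrity_threshold=3)
for fan in ("u1", "u2", "u3"):
    feed.follow(fan, "star")
feed.follow("u1", "bob")

bob_writes = feed.post(1, "bob", "hi")
star_writes = feed.post(2, "star", "hello")
u1_feed = feed.read_feed("u1")
```

The celebrity post costs zero fan-out writes; the work moves to `read_feed`, where each reader pays for one extra lookup per followed celebrity.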
The key insight: you don't have to pick one model for your entire system. Pick the right model for each segment of your data and users.
Practical Decision Framework
Ask these questions to guide your choice:
How real-time does the data need to be?
- Sub-second → push (WebSockets, SSE, or a message broker)
- Seconds to minutes → long polling or short-interval pull
- Minutes to hours → simple periodic pull is fine
What's the fan-out ratio?
- One producer, few consumers → push is straightforward
- One producer, millions of consumers → push cost explodes; consider pull or hybrid
- Many producers, one consumer → pull (consumer decides when to process)
Can consumers handle arbitrary rates of incoming data?
- Yes → push works fine
- No (slow consumers) → pull gives consumers control over their own pace; or use a queue as a buffer
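The "queue as a buffer" option can be sketched with Python's standard `queue.Queue`. The helper names `try_push` and `pull_batch` are hypothetical; the idea is that the producer pushes into a bounded buffer and the consumer pulls from it in batches sized to its own pace.

```python
import queue
from typing import List


def try_push(q: "queue.Queue[int]", item: int) -> bool:
    """Producer side: push without blocking; report whether the buffer accepted it."""
    try:
        q.put_nowait(item)
        return True
    except queue.Full:
        return False  # buffer is full; the producer must retry, drop, or slow down


def pull_batch(q: "queue.Queue[int]", max_items: int) -> List[int]:
    """Consumer side: take at most max_items, at whatever pace the consumer chooses."""
    batch: List[int] = []
    while len(batch) < max_items:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break
    return batch


buffer: "queue.Queue[int]" = queue.Queue(maxsize=3)
accepted = [try_push(buffer, i) for i in range(5)]  # producer outruns the buffer
first_batch = pull_batch(buffer, 2)                 # slow consumer takes two items
```

Once the buffer holds three items, further pushes are rejected until the consumer drains some. That gives the producer an explicit signal instead of silently overwhelming the consumer.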
How often does data change?
- Frequently → push avoids wasted polling requests
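- Rarely → periodic pull is fine if some staleness is acceptable, though most polls will come back empty; push sends nothing while nothing changes, but the server still pays to keep connections and subscriber state alive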
Real Examples Across the Industry
| System | Model | Why |
|---|---|---|
| Push notifications (APNs, FCM) | Push | Low latency delivery to devices |
| Kafka consumers | Pull | Consumers control their own pace |
| Email (IMAP IDLE) | Hybrid | Server pushes "you have mail", client then fetches |
| REST APIs | Pull | Stateless, client-driven |
| Server-Sent Events | Push | Server streams updates, one direction |
| Webhook callbacks | Push | Server notifies partner systems of events |
Summary
Push architecture has the server proactively deliver data to clients as soon as it's available — great for low latency, but requires tracking clients and can be expensive at high fan-out. Pull architecture has clients request data on their own schedule — simpler to implement but introduces polling latency and wasted requests when nothing has changed. Most real systems use a hybrid: push for latency-sensitive, low-fan-out cases, and pull (or pre-computed caches with on-demand merging) for high fan-out scenarios. Twitter's home feed is the classic example of a carefully engineered hybrid. The right choice always depends on your latency requirements, fan-out ratio, and how often the data actually changes.