Push vs Pull Architecture

Updated June 3, 2026

Magic Magnets Team

Every time data needs to move from one place to another in a system, you face a fundamental decision: does the sender push the data to the receiver, or does the receiver come and pull the data when it needs it?

This choice shows up everywhere — notification systems, social feeds, message queues, monitoring pipelines, real-time dashboards. Getting it right has a big impact on latency, infrastructure cost, and system complexity.

Push: The Server Takes Initiative

In a push model, the producer proactively delivers data to all consumers as soon as it's available. Twitter uses push for most accounts: when you post a tweet, a fan-out service immediately writes it into the cached feed of every follower. When a follower opens Twitter, their feed loads instantly from cache — no computation, no queries at read time. The latency benefit is real: data arrives at consumers the moment it's produced. The cost is the fan-out write amplification. For regular users with hundreds of followers, writing to a few hundred caches per tweet is trivial. For a celebrity with 50 million followers, a single tweet triggers 50 million cache writes in seconds — an enormous infrastructure cost that Twitter had to solve separately.

Push — fan-out writes to all follower feed caches on post

In a push model, the producer proactively sends data to consumers as soon as it's available. The server knows about clients and delivers updates to them directly.

Think of push notifications on your phone. When someone sends you a message, the server doesn't wait for your phone to check in — it reaches out and delivers the notification immediately. You didn't ask for it; it was sent to you.

Characteristics of push:

Low latency — data arrives as soon as it's available
Server must track all active clients
Works well when consumers are always ready to receive
Can overwhelm slow consumers (the "fan-out" problem)
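
The characteristics above can be made concrete with a minimal fan-out-on-write sketch. It is an illustration only, not Twitter's implementation: the class name `PushFeedService`, its methods, and the in-memory dictionaries standing in for a follower graph and feed caches are all assumptions made for this example.

```python
from collections import defaultdict, deque


class PushFeedService:
    """Fan-out-on-write: each post is copied into every follower's feed cache."""

    def __init__(self, feed_size=100):
        # The server must track who receives each author's posts.
        self.followers = defaultdict(set)  # author -> set of follower ids
        # follower -> most recent posts, newest first, capped at feed_size
        self.feeds = defaultdict(lambda: deque(maxlen=feed_size))

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def post(self, author, text):
        """Write the post into each follower's cache; return the write count."""
        writes = 0
        for follower in self.followers[author]:
            self.feeds[follower].appendleft((author, text))
            writes += 1
        return writes

    def read_feed(self, user):
        """Reads are a plain cache lookup; no merging happens at read time."""
        return list(self.feeds[user])


service = PushFeedService()
for fan in ("bob", "carol", "dave"):
    service.follow(fan, "alice")

writes = service.post("alice", "hello")
print(writes)                    # one cache write per follower
print(service.read_feed("bob"))
```

The write count returned by `post` grows linearly with the follower count, which is the cost side of the low read latency.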

The Fan-Out Problem

Fan-out is where push gets expensive. Imagine a Twitter celebrity with 50 million followers posting a tweet. In a pure push model, that tweet must immediately be written to 50 million users' home feeds. That's 50 million writes happening in seconds.

Twitter built its home timeline this way, fanning each tweet out to follower feed caches at write time, and it worked until it didn't. (This fan-out path is distinct from the "Firehose", which is Twitter's streaming API of public tweets.) A single tweet from an account with tens of millions of followers could trigger tens of millions of cache writes. The infrastructure cost was enormous.

Pull: Clients Take the Initiative

In a pull model, consumers periodically ask for new data. The server sits passively and responds to requests.

Your email client checking for new mail every few minutes is pull. RSS readers polling feeds are pull. Most REST APIs you build are pull — clients call them when they need data.

Characteristics of pull:

Simple to implement — standard request/response
No need to track client state on the server
Latency is bounded by the polling interval: an update can wait up to one full interval before the client sees it (can't do true real-time)
Wasted requests when there's nothing new (polling overhead)
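
A small sketch of the pull pattern, under assumptions: `UpdateServer`, `PollingClient`, and `fetch_since` are hypothetical names, and an in-memory list stands in for whatever store a real service would query. The point it illustrates is that the client, not the server, remembers how far it has read.

```python
class UpdateServer:
    """Keeps no per-client state: it only answers requests."""

    def __init__(self):
        self.events = []  # append-only log; an event's position is its id

    def publish(self, payload):
        self.events.append(payload)

    def fetch_since(self, cursor):
        """Return events after `cursor` and the cursor to use next time."""
        new_events = self.events[cursor:]
        return new_events, len(self.events)


class PollingClient:
    def __init__(self, server):
        self.server = server
        self.cursor = 0       # the client remembers its own progress
        self.empty_polls = 0  # requests that returned nothing new

    def poll_once(self):
        events, self.cursor = self.server.fetch_since(self.cursor)
        if not events:
            self.empty_polls += 1
        return events


server = UpdateServer()
client = PollingClient(server)

first = client.poll_once()       # nothing published yet
server.publish("price=101")
server.publish("price=102")
second = client.poll_once()      # both events arrive in one poll
third = client.poll_once()       # nothing new again
print(first, second, third, client.empty_polls)
```

A real client would call `poll_once` in a loop with a sleep between calls; that sleep is the polling interval discussed in this section.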

The Polling Problem

Pull has a different cost: wasted work. If you poll every 30 seconds and there's new data only once an hour, you're making 119 empty requests for every useful one. At scale, this polling traffic becomes significant load on your servers.

You can reduce this with longer polling intervals, but then your latency goes up — users see data that's increasingly stale.
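
The trade-off can be worked through with a few lines of arithmetic. This is a simplified model, assuming data changes at a fixed interval that is a multiple of the polling interval; the function name `polling_cost` is made up for this example.

```python
def polling_cost(poll_interval_s, change_interval_s):
    """Return (polls per change, empty polls per change, worst-case staleness)."""
    polls_per_change = change_interval_s / poll_interval_s
    empty_per_change = polls_per_change - 1  # only one poll finds the new data
    worst_case_staleness_s = poll_interval_s  # change lands just after a poll
    return polls_per_change, empty_per_change, worst_case_staleness_s


# Data changes once an hour; compare three polling intervals.
for interval in (30, 300, 1800):
    polls, empty, stale = polling_cost(interval, change_interval_s=3600)
    print(f"every {interval}s: {polls:.0f} polls, {empty:.0f} empty, up to {stale}s stale")
```

With a 30-second interval and hourly changes this gives 120 polls per change, 119 of them empty, matching the figure above. Lengthening the interval cuts the empty polls but raises the worst-case staleness by the same factor.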

The Hybrid: Twitter's Real Approach

Twitter's actual feed is a hybrid model, one of the most-cited examples of push/pull trade-off engineering. Regular accounts use push: their tweets are pre-written to each follower's feed cache when they post. Celebrity accounts (tens of millions of followers) use pull on read: their tweets are NOT fanned out in advance. Instead, when you open your feed, the system fetches recent celebrity tweets live from the database and merges them with your pre-computed cache. This adds a small amount of latency at read time but eliminates the catastrophic write amplification of pushing to 50M caches. The lesson: you don't have to pick one model for your entire system. Segment your producers: push for low-follower-count producers (low fan-out), pull for high-follower-count producers (high fan-out). Apply the cheapest model to each case.

Hybrid — push for regular users, pull-on-read for celebrity accounts

Twitter's home feed is one of the most-cited examples of push vs pull trade-offs, and the actual solution is a hybrid.

For most users (who have moderate follower counts), Twitter uses push. When you post a tweet, it gets fanned out to your followers' pre-computed feed caches. When a follower opens Twitter, their feed loads instantly from cache — no computation needed.

For celebrity accounts with tens of millions of followers, pure push is too expensive. Instead, Twitter uses pull at read time. When you open your feed, the system fetches the regular pre-computed entries, then makes a separate query to fetch recent tweets from any celebrities you follow, and merges the results.

The key insight: you don't have to pick one model for your entire system. Pick the right model for each segment of your data and users.
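
The hybrid read path can be sketched as follows. This is a toy model, not Twitter's code: the class `HybridFeedService`, the `CELEBRITY_THRESHOLD` cutoff (set to 3 only so the demo stays small), and the logical clock used as a timestamp are all assumptions for illustration.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff, tiny so the demo is readable


class HybridFeedService:
    def __init__(self):
        self.followers = defaultdict(set)        # author -> followers
        self.following = defaultdict(set)        # user -> authors they follow
        self.feed_cache = defaultdict(list)      # user -> pushed entries
        self.posts_by_author = defaultdict(list) # author -> own posts
        self.clock = 0  # logical timestamp standing in for wall-clock time

    def follow(self, follower, author):
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def post(self, author, text):
        """Store the post; fan out only for non-celebrities. Return write count."""
        self.clock += 1
        entry = (self.clock, author, text)
        self.posts_by_author[author].append(entry)
        if self.is_celebrity(author):
            return 0  # no fan-out; followers pull these posts at read time
        for follower in self.followers[author]:
            self.feed_cache[follower].append(entry)
        return len(self.followers[author])

    def read_feed(self, user, limit=10):
        """Merge the pushed cache with posts pulled from followed celebrities."""
        entries = set(self.feed_cache[user])  # set avoids duplicate entries
        for author in self.following[user]:
            if self.is_celebrity(author):
                entries.update(self.posts_by_author[author])
        return sorted(entries, reverse=True)[:limit]  # newest first


svc = HybridFeedService()
for fan in ("u1", "u2", "u3"):
    svc.follow(fan, "star")       # "star" reaches the celebrity threshold
svc.follow("u1", "friend")        # "friend" has a single follower

pushed = svc.post("friend", "lunch?")    # fanned out to u1's cache
skipped = svc.post("star", "new album")  # stored once, no fan-out
feed = svc.read_feed("u1")
print(pushed, skipped, feed)
```

The celebrity post costs zero fan-out writes and one extra lookup per reader, which is the trade described above.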

Practical Decision Framework

Ask these questions to guide your choice:

How real-time does the data need to be?

Sub-second → push (WebSockets, SSE, or a message broker)
Seconds to minutes → long polling or short-interval pull
Minutes to hours → simple periodic pull is fine

What's the fan-out ratio?

One producer, few consumers → push is straightforward
One producer, millions of consumers → push cost explodes; consider pull or hybrid
Many producers, one consumer → pull (consumer decides when to process)

Can consumers handle arbitrary rates of incoming data?

Yes → push works fine
No (slow consumers) → pull gives consumers control over their own pace; or use a queue as a buffer
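
One way to picture the queue-as-buffer option is a bounded queue between a pushing producer and a pulling consumer. The sketch below uses Python's standard `queue.Queue`; the helper names `produce` and `consume_batch` and the buffer size of 3 are assumptions chosen for the example.

```python
import queue

buffer = queue.Queue(maxsize=3)  # bounded: a full queue signals backpressure


def produce(item):
    """Push side: try to enqueue; report whether the item was accepted."""
    try:
        buffer.put_nowait(item)
        return True
    except queue.Full:
        return False  # caller can retry later, drop the item, or slow down


def consume_batch(max_items):
    """Pull side: the consumer takes at most `max_items` when it is ready."""
    batch = []
    while len(batch) < max_items:
        try:
            batch.append(buffer.get_nowait())
        except queue.Empty:
            break
    return batch


accepted = [produce(n) for n in range(5)]  # producer bursts 5 items
slow_batch = consume_batch(2)              # slow consumer takes only 2
rest = consume_batch(10)                   # later it drains what is left
print(accepted, slow_batch, rest)
```

The producer learns immediately when the consumer has fallen behind, and the consumer never receives more than it asks for.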

How often does data change?

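Frequently → most polls return new data, so pull wastes little; push is still the better fit if each change must arrive immediately
Rarely → most polls come back empty; push (or a longer polling interval, if staleness is acceptable) avoids that waste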

Real Examples Across the Industry

System	Model	Why
Push notifications (APNs, FCM)	Push	Low latency delivery to devices
Kafka consumers	Pull	Consumers control their own pace
Email (IMAP IDLE)	Hybrid	Server pushes "you have mail", client then fetches
REST APIs	Pull	Stateless, client-driven
Server-Sent Events	Push	Server streams updates, one direction
Webhook callbacks	Push	Server notifies partner systems of events

Summary

Push architecture has the server proactively deliver data to clients as soon as it's available — great for low latency, but requires tracking clients and can be expensive at high fan-out. Pull architecture has clients request data on their own schedule — simpler to implement but introduces polling latency and wasted requests when nothing has changed. Most real systems use a hybrid: push for latency-sensitive, low-fan-out cases, and pull (or pre-computed caches with on-demand merging) for high fan-out scenarios. Twitter's home feed is the classic example of a carefully engineered hybrid. The right choice always depends on your latency requirements, fan-out ratio, and how often the data actually changes.
