Scalability

Updated June 6, 2026
M
Magic Magnets Team
8 min read
algobase.dev
Full picture — Redis cache handles hot reads (<1 ms). Two read replicas absorb read traffic, leaving the primary free for writes. Each layer scales independently.
3 / 3

Vertical vs. horizontal scaling, and what a fully scaled system looks like

Scalability is one of those words that gets thrown around a lot in job descriptions, design interviews, and post-mortems. But what does it actually mean?

Here's a simple definition: a system is scalable if it can handle more load without breaking or requiring a complete rewrite.

Load can mean a lot of things, such as more users, more data, more requests per second, or more geographic regions. A scalable system grows gracefully to meet demand. A non-scalable system either falls over under pressure or needs to be replaced entirely.

The Classic Failure: Twitter's Fail Whale

If you were on Twitter around 2007-2009, you know the Fail Whale, a cartoon whale lifted by birds that appeared whenever Twitter went down, which was often.

The original Twitter architecture was a Ruby on Rails monolith backed by MySQL. It worked great at small scale. But as Twitter's user count exploded, those architectural choices became liabilities. A single celebrity tweet could trigger cascading database failures. Computing the home timeline to determine whose tweets you should see became a performance nightmare.

Twitter didn't just tune their database. They rewrote core parts of the system from scratch: pre-computing timelines, moving to Scala microservices, introducing custom caching layers. Scalability wasn't just an optimization; it was a fundamental redesign.

The lesson: scalability problems are usually architectural, not just resource-related.

Quiz Time

What is the defining characteristic of a scalable system?

Vertical vs. Horizontal Scaling

When your system starts struggling under load, you have two basic moves.

Vertical Scaling (Scale Up)

Buy a bigger machine. Add more CPU cores, more RAM, and faster SSDs. This is vertical scaling, which makes a single node more powerful.

algobase.dev
Vertical scaling (scale up): upgrade to a bigger machine with more CPU/RAM — quick fix but expensive and still a single point of failure
1 / 1

Vertical scaling (scale up): upgrading to a bigger machine with more CPU and RAM

It works, up to a point. Hardware has limits. You can't keep doubling your RAM forever, and at the top end you're spending exponentially more money for marginal gains. Vertical scaling also introduces a single point of failure: if that one big machine goes down, everything goes down.

Horizontal Scaling (Scale Out)

Add more machines. Instead of one beefy server, you run ten or a hundred commodity servers behind a load balancer. This is horizontal scaling, which is how the internet's biggest systems work.

algobase.dev
Horizontal scaling (scale out): add servers behind a load balancer — theoretically unlimited capacity with no single point of failure
1 / 1

Horizontal scaling (scale out): adding more servers behind a load balancer

Companies like AWS, Google, and Meta don't run on a handful of supercomputers. They run on hundreds of thousands of commodity machines. The key insight is that cheap commodity hardware, coordinated well, beats expensive specialized hardware every time.

The tradeoff? Distributed systems are hard. You now have to think about network partitions, data replication, and coordination between nodes.

Quiz Time

Vertical scaling eliminates the single point of failure that horizontal scaling introduces.

Instagram's Scaling Story

Instagram launched in 2010. Thirteen weeks later they had a million users. By the time Facebook acquired them in 2012, they had 30 million users with a team of only 13 engineers.

How? A few things stand out:

  • Django + PostgreSQL as the core: solid, boring technology that they understood deeply
  • Aggressive PostgreSQL sharding: splitting the database across many nodes, each responsible for a slice of data
  • Redis for caching: keeping hot data in memory to offload the database
  • Amazon EC2: horizontal scaling on demand without buying hardware

Instagram didn't build magic. They applied well-understood scaling techniques at the right time. The trick was knowing when to scale each layer, not over-engineering from day one.

Quiz Time

What pattern did Instagram use to scale PostgreSQL to handle millions of users?

Types of Scalability

Not all scalability is the same. When you're designing a system, you need to think about at least three dimensions.

Read Scalability

Many systems are read-heavy, including social feeds, product catalogs, and search results. Read scalability is often the easiest to achieve because reads are stateless. You can add read replicas, use caches, and distribute reads across many nodes without worrying about conflicts.

Patterns: Database read replicas, caching layers (Redis, Memcached), CDNs for static content.

Write Scalability

Write scalability is harder. Writes need to hit persistent storage, often require coordination to avoid conflicts, and need to be durable. A single writable primary database becomes a bottleneck quickly.

Patterns: Database sharding (distributing writes across multiple nodes), event sourcing, write-ahead logs, message queues to absorb write bursts.

Quiz Time

Read scalability is generally harder to achieve than write scalability.

Storage Scalability

As data grows, storage becomes a constraint. A single PostgreSQL instance can handle terabytes, but petabytes require a different approach entirely.

Patterns: Distributed file systems (HDFS, Amazon S3), sharded databases, and tiered storage where hot data is kept in fast storage and cold data in cheap storage.

What Scalability Is NOT

A common misconception: scalability ≠ performance. A system can be fast at small scale and fall apart at large scale. A system can also be slow but scale perfectly: adding more nodes makes it proportionally faster.

Another misconception: premature scalability is waste. Engineers love to build for scale that never comes. Your side project does not need a Kafka cluster. The goal is to design systems that can scale when needed, not to over-engineer upfront.

The best systems start simple and scale incrementally, rather than building with the infrastructure of a Fortune 500 company on day one.

Quiz Time

Which of the following best describes the relationship between performance and scalability?

Summary

Scalability means handling more load without breaking. The two fundamental approaches are vertical scaling (bigger machines) and horizontal scaling (more machines), with horizontal scaling being the long-term solution for internet-scale systems. Read scalability, write scalability, and storage scalability are three distinct problems with different solutions. Twitter learned the hard way that scalability has to be designed in from the start, as it can't always be bolted on later. Instagram showed that you don't need magic, just the right tools applied at the right time.

Availability

How helpful was this content?

Comments

0/2000

Sign in to join the discussion

Saved on this device only

Sign in to sync progress across devices