Scalability
Updated June 6, 2026Vertical vs. horizontal scaling, and what a fully scaled system looks like
Scalability is one of those words that gets thrown around a lot in job descriptions, design interviews, and post-mortems. But what does it actually mean?
Here's a simple definition: a system is scalable if it can handle more load without breaking or requiring a complete rewrite.
Load can mean a lot of things, such as more users, more data, more requests per second, or more geographic regions. A scalable system grows gracefully to meet demand. A non-scalable system either falls over under pressure or needs to be replaced entirely.
The Classic Failure: Twitter's Fail Whale
If you were on Twitter around 2007-2009, you know the Fail Whale, a cartoon whale lifted by birds that appeared whenever Twitter went down, which was often.
The original Twitter architecture was a Ruby on Rails monolith backed by MySQL. It worked great at small scale. But as Twitter's user count exploded, those architectural choices became liabilities. A single celebrity tweet could trigger cascading database failures. Computing the home timeline to determine whose tweets you should see became a performance nightmare.
Twitter didn't just tune their database. They rewrote core parts of the system from scratch: pre-computing timelines, moving to Scala microservices, introducing custom caching layers. Scalability wasn't just an optimization; it was a fundamental redesign.
The lesson: scalability problems are usually architectural, not just resource-related.
What is the defining characteristic of a scalable system?
Vertical vs. Horizontal Scaling
When your system starts struggling under load, you have two basic moves.
Vertical Scaling (Scale Up)
Buy a bigger machine. Add more CPU cores, more RAM, and faster SSDs. This is vertical scaling, which makes a single node more powerful.
Vertical scaling (scale up): upgrading to a bigger machine with more CPU and RAM
It works, up to a point. Hardware has limits. You can't keep doubling your RAM forever, and at the top end you're spending exponentially more money for marginal gains. Vertical scaling also introduces a single point of failure: if that one big machine goes down, everything goes down.
Horizontal Scaling (Scale Out)
Add more machines. Instead of one beefy server, you run ten or a hundred commodity servers behind a load balancer. This is horizontal scaling, which is how the internet's biggest systems work.
Horizontal scaling (scale out): adding more servers behind a load balancer
Companies like AWS, Google, and Meta don't run on a handful of supercomputers. They run on hundreds of thousands of commodity machines. The key insight is that cheap commodity hardware, coordinated well, beats expensive specialized hardware every time.
The tradeoff? Distributed systems are hard. You now have to think about network partitions, data replication, and coordination between nodes.
Vertical scaling eliminates the single point of failure that horizontal scaling introduces.
Instagram's Scaling Story
Instagram launched in 2010. Thirteen weeks later they had a million users. By the time Facebook acquired them in 2012, they had 30 million users with a team of only 13 engineers.
How? A few things stand out:
- Django + PostgreSQL as the core: solid, boring technology that they understood deeply
- Aggressive PostgreSQL sharding: splitting the database across many nodes, each responsible for a slice of data
- Redis for caching: keeping hot data in memory to offload the database
- Amazon EC2: horizontal scaling on demand without buying hardware
Instagram didn't build magic. They applied well-understood scaling techniques at the right time. The trick was knowing when to scale each layer, not over-engineering from day one.
What pattern did Instagram use to scale PostgreSQL to handle millions of users?
Types of Scalability
Not all scalability is the same. When you're designing a system, you need to think about at least three dimensions.
Read Scalability
Many systems are read-heavy, including social feeds, product catalogs, and search results. Read scalability is often the easiest to achieve because reads are stateless. You can add read replicas, use caches, and distribute reads across many nodes without worrying about conflicts.
Patterns: Database read replicas, caching layers (Redis, Memcached), CDNs for static content.
Write Scalability
Write scalability is harder. Writes need to hit persistent storage, often require coordination to avoid conflicts, and need to be durable. A single writable primary database becomes a bottleneck quickly.
Patterns: Database sharding (distributing writes across multiple nodes), event sourcing, write-ahead logs, message queues to absorb write bursts.
Read scalability is generally harder to achieve than write scalability.
Storage Scalability
As data grows, storage becomes a constraint. A single PostgreSQL instance can handle terabytes, but petabytes require a different approach entirely.
Patterns: Distributed file systems (HDFS, Amazon S3), sharded databases, and tiered storage where hot data is kept in fast storage and cold data in cheap storage.
What Scalability Is NOT
A common misconception: scalability ≠ performance. A system can be fast at small scale and fall apart at large scale. A system can also be slow but scale perfectly: adding more nodes makes it proportionally faster.
Another misconception: premature scalability is waste. Engineers love to build for scale that never comes. Your side project does not need a Kafka cluster. The goal is to design systems that can scale when needed, not to over-engineer upfront.
The best systems start simple and scale incrementally, rather than building with the infrastructure of a Fortune 500 company on day one.
Which of the following best describes the relationship between performance and scalability?
Summary
Scalability means handling more load without breaking. The two fundamental approaches are vertical scaling (bigger machines) and horizontal scaling (more machines), with horizontal scaling being the long-term solution for internet-scale systems. Read scalability, write scalability, and storage scalability are three distinct problems with different solutions. Twitter learned the hard way that scalability has to be designed in from the start, as it can't always be bolted on later. Instagram showed that you don't need magic, just the right tools applied at the right time.
Saved on this device only
Sign in to sync progress across devices