What is Caching?

Updated June 8, 2026
Magic Magnets Team
7 min read

Your users don't care that your database is on the other side of the world. They just know the page felt slow. Caching is the most impactful lever you can pull to fix that, and it's one of the most misunderstood ideas in system design.

Here's the core premise: reading the same data twice from a slow source is wasteful. A cache stores the result of an expensive operation so the next request gets it back instantly.

Why Caching Exists

Two reasons, fundamentally:

  1. Speed: Memory is orders of magnitude faster than disk or network. Reading from RAM takes on the order of a hundred nanoseconds. A round trip to a remote database takes milliseconds. That's a gap of roughly 10,000x to 100,000x.
  2. Cost: Every database query consumes compute, I/O, and network bandwidth. When millions of users ask for the same homepage data, serving it from a cache means your database processes that query once per cache refresh instead of once per request.

Netflix doesn't query its catalog database every time someone opens the app. It caches the catalog in a layer in front of the database. The database handles one query; the cache handles millions.

Where You Can Cache

Caching isn't a single thing. It's a family of techniques at every layer of your stack.

[Figure: Caching layers from browser to database. Static assets go to the CDN nearest to the user. API requests hit the app server, which checks Redis before touching the database. Each layer absorbs traffic so the layer below it is hit less often.]

Layer       | What It Caches                      | Example
Browser     | HTML, JS, CSS, images               | HTTP Cache-Control headers
CDN         | Static assets, rendered pages       | Cloudflare, AWS CloudFront
Application | Query results, computed values      | Redis, Memcached
Database    | Query execution plans, buffer pool  | MySQL InnoDB buffer pool

Each layer has a different scope. Browser cache is per-user. CDN cache is per-edge-location. Application cache is shared across your entire backend.

Browser Cache

The browser stores resources locally after the first download. On subsequent visits, it checks Cache-Control and ETag headers to decide whether to reuse the stored version or fetch a fresh copy. This is why your CSS file loads instantly on repeat visits.
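
To make the revalidation step concrete, here is a minimal sketch of the server side of an ETag check. The function names (make_etag, respond) and the max-age=60 value are assumptions for illustration, and it only handles an exact match on a single ETag. Real If-None-Match headers can carry several values and weak validators.

```python
import hashlib
from typing import Optional

def make_etag(body: bytes) -> str:
    """Derive an ETag from the response body (one common approach)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(body: bytes, if_none_match: Optional[str] = None):
    """Return (status, headers, payload) for a GET of `body`."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if if_none_match == etag:
        # The browser's stored copy is still current: send no body.
        return 304, headers, b""
    return 200, headers, body

css = b"body { margin: 0; }"

# First visit: full download, browser stores the body and its ETag.
status, headers, payload = respond(css)

# Repeat visit: browser sends the stored ETag back for revalidation.
status2, _, payload2 = respond(css, headers["ETag"])

# The file changed on the server: the old ETag no longer matches.
status3, _, payload3 = respond(b"body { margin: 1px; }", headers["ETag"])
```

The second call returns a 304 with an empty body, which is why a repeat visit costs a tiny round trip instead of a full download. While the max-age window is still open, the browser can skip even that round trip.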

CDN Cache

A Content Delivery Network sits in front of your origin servers, spread across dozens or hundreds of geographic locations. When a user in Singapore requests your app, they hit a CDN node in Singapore, not your US-based origin server. The CDN serves the cached version from nearby, slashing latency.

Application Cache

This is where Redis lives. Your application server checks Redis before hitting the database. If the data is there (a cache hit), return it immediately. If it's not (a cache miss), hit the database, then store the result in Redis for next time.
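
The hit and miss paths above can be sketched in a few lines. This is an illustration under stated assumptions, not a production client: a plain dict stands in for Redis, and query_database, get_user, and the user:{id} key format are hypothetical names.

```python
import json

cache = {}      # stand-in for Redis: key -> JSON string
db_calls = 0    # counts how often the slow path runs

def query_database(user_id: int) -> dict:
    """Pretend this is an expensive query against the real database."""
    global db_calls
    db_calls += 1
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)       # cache hit: skip the database
    user = query_database(user_id)      # cache miss: take the slow path
    cache[key] = json.dumps(user)       # store the result for next time
    return user

first = get_user(42)    # miss, hits the database
second = get_user(42)   # hit, served from the cache
```

Both calls return the same data, but the database only sees one query. Values are stored as JSON strings here because a remote cache stores serialized values, not live Python objects.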

Database Cache

Databases cache internally too. MySQL's InnoDB buffer pool keeps recently-read pages in memory. PostgreSQL has its own equivalent, shared buffers. You get this for free, but you can't rely on it alone for high-traffic workloads. Database-level caching is a last line of defense, not a strategy.
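
To get a feel for what "keeps recently-read pages in memory" means, here is a deliberately simplified least-recently-used (LRU) page cache. It is a sketch of the general idea only. InnoDB's real buffer pool uses a variation of LRU with more machinery, and the class and function names here are hypothetical.

```python
from collections import OrderedDict

class LRUPageCache:
    """Keeps at most `capacity` pages, evicting the least recently used."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.pages = OrderedDict()   # page_id -> page bytes, oldest first

    def read(self, page_id, load_from_disk):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)   # mark as most recently used
            return self.pages[page_id]
        page = load_from_disk(page_id)        # not in memory: go to disk
        self.pages[page_id] = page
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)    # evict the oldest entry
        return page

disk_reads = []

def load_from_disk(page_id):
    disk_reads.append(page_id)
    return f"page-{page_id}".encode()

pool = LRUPageCache(capacity=2)
for page_id in [1, 2, 1, 3, 2]:
    pool.read(page_id, load_from_disk)
```

With room for only two pages, reading page 3 pushes out page 2, so the final read of page 2 goes back to disk. That is the limit of this layer: once the working set outgrows memory, reads fall through to disk again.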

Cache Hit vs Cache Miss

This is the fundamental vocabulary of caching:

  • Cache hit — The data you need is already in the cache. Fast path. Return it.
  • Cache miss — The data isn't in the cache. Slow path. Fetch from the source, store in cache, return it.

Your cache hit rate is the percentage of requests served from cache. A hit rate of 99% means 99 out of 100 requests never touch your database. A hit rate of 50% means every second request still does.

For most read-heavy applications, a well-designed cache should achieve 90-99% hit rates. If yours is below that, look at your key design and TTL strategy.

The cache miss penalty is the extra time you pay when you miss. You still have to go to the database, and you also paid the overhead of checking the cache first. On a hot path, a cold cache (many misses) can be worse than no cache at all.
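
A quick worked calculation shows how hit rate and miss penalty combine. The latency figures (1 ms for a cache lookup, 20 ms for a database query) are assumed round numbers for illustration, not measurements.

```python
def average_latency_ms(hit_rate: float, cache_ms: float, db_ms: float) -> float:
    """Expected latency per request for a given cache hit rate."""
    hit_cost = cache_ms             # a hit only pays for the cache lookup
    miss_cost = cache_ms + db_ms    # a miss pays for the lookup AND the database
    return hit_rate * hit_cost + (1 - hit_rate) * miss_cost

CACHE_MS = 1.0   # assumed cache lookup time
DB_MS = 20.0     # assumed database query time

for rate in (0.0, 0.5, 0.9, 0.99):
    latency = average_latency_ms(rate, CACHE_MS, DB_MS)
    print(f"hit rate {rate:.0%}: {latency:.2f} ms on average")
```

With these numbers, a 0% hit rate costs 21 ms per request, which is slower than the 20 ms of going straight to the database. That extra millisecond is the miss penalty in action.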

The Fundamental Tradeoff: Freshness vs Speed

Every cache is a frozen snapshot of data that was true at some point in the past.

The tradeoff is simple:

  • Longer TTL (Time-to-Live) — better performance, staler data
  • Shorter TTL — fresher data, more cache misses, more database load

For a user's public profile on LinkedIn, stale data for 60 seconds is fine. For a bank account balance, stale data for even 1 second is dangerous.

The question to ask: what's the maximum staleness your users will tolerate? That answer drives your TTL.
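
Here is a minimal sketch of how a TTL turns that staleness budget into behavior. The TTLCache class is hypothetical, and the clock is injectable so the example can step through time without sleeping. The 60-second value mirrors the profile example above.

```python
import time

class TTLCache:
    """In-memory cache whose entries expire `ttl_seconds` after being set."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}   # key -> (value, expires_at)

    def set(self, key, value):
        self.entries[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None                 # never cached
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.entries[key]       # too old: treat as a miss
            return None
        return value

now = [0.0]   # fake clock so the example is deterministic
profiles = TTLCache(ttl_seconds=60, clock=lambda: now[0])
profiles.set("profile:alice", {"headline": "Engineer"})

now[0] = 59.0
fresh = profiles.get("profile:alice")     # still within the 60 s budget

now[0] = 60.0
expired = profiles.get("profile:alice")   # budget used up, returns None
```

Whatever value you pass as ttl_seconds is the worst-case staleness you are signing up for, so pick it from the product requirement, not from the performance graph.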

There's another failure mode: cache invalidation. When data changes, you need to evict or update the cached version. Getting invalidation wrong means users keep seeing outdated data until the TTL finally expires, or indefinitely if the entry has no TTL. Phil Karlton famously said there are only two hard things in computer science: cache invalidation and naming things.
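
One common way to handle invalidation is to delete the cached entry whenever you write to the source. The sketch below assumes that approach, with dicts standing in for both the database and the cache. The names read_user and update_user are hypothetical.

```python
database = {"user:1": {"name": "Ada"}}   # stand-in for the real database
cache = {}                                # stand-in for Redis

def read_user(key: str) -> dict:
    if key in cache:
        return cache[key]                 # cache hit
    value = dict(database[key])           # cache miss: copy from the source
    cache[key] = value
    return value

def update_user(key: str, fields: dict) -> None:
    database[key] = {**database[key], **fields}   # write to the source first
    cache.pop(key, None)                  # then evict the stale cached copy

before = read_user("user:1")              # caches {"name": "Ada"}
update_user("user:1", {"name": "Ada Lovelace"})
after = read_user("user:1")               # miss, reloads the new value
```

Remove the cache.pop line and the second read keeps returning "Ada". This simple version also ignores races between concurrent readers and writers, which is a large part of why invalidation has its reputation.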

Real Examples

Redis for Session Caching

When a user logs into your app, you create a session and store it in Redis with a key like session:{session_id}. Every authenticated request looks up that key. Redis responds in under a millisecond. The alternative, a database query on every request, would be catastrophically slow at scale.

Redis (or a Redis-compatible store) is a common choice for session management in large web applications.
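
The session flow can be sketched as follows. A dict stands in for Redis, and create_session, authenticate, and logout are hypothetical helper names. Only the session:{session_id} key format comes from the text above.

```python
import secrets

session_store = {}   # stand-in for Redis: key -> session data

def create_session(user_id: int) -> str:
    """Called at login: mint an unguessable ID and store the session."""
    session_id = secrets.token_urlsafe(32)
    session_store[f"session:{session_id}"] = {"user_id": user_id}
    return session_id

def authenticate(session_id: str):
    """Called on every request: one key lookup, no database query."""
    session = session_store.get(f"session:{session_id}")
    return session["user_id"] if session else None

def logout(session_id: str) -> None:
    session_store.pop(f"session:{session_id}", None)

sid = create_session(user_id=7)
current_user = authenticate(sid)   # resolves to the logged-in user
```

In a real deployment you would typically also set an expiry on the key so abandoned sessions clean themselves up. Redis supports per-key expiry for exactly this kind of use.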

CDN for Static Assets

Vercel, Netlify, and AWS CloudFront cache your JavaScript bundles, images, and CSS at edge locations worldwide. Your main.abc123.js file gets a content-hash in its filename, a Cache-Control: max-age=31536000, immutable header, and lives in CDN cache for a year. Zero database load. Zero origin server load. Pure edge speed.
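
The content-hash trick is easy to sketch. Build tools normally do this for you. The function below is a hypothetical illustration of the idea, and the 8-character SHA-256 prefix is an assumed choice, not what any particular bundler uses.

```python
import hashlib

# One year in seconds, matching the header described above.
IMMUTABLE_HEADERS = {"Cache-Control": "max-age=31536000, immutable"}

def hashed_filename(name: str, content: bytes, digest_chars: int = 8) -> str:
    """Insert a hash of the file's content before its extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    stem, dot, ext = name.rpartition(".")
    if not dot:
        return f"{name}.{digest}"       # no extension: append the hash
    return f"{stem}.{digest}{dot}{ext}"

v1 = hashed_filename("main.js", b"console.log('v1');")
v2 = hashed_filename("main.js", b"console.log('v2');")
```

Because the name changes whenever the content changes, a cached copy can never be stale. A new deploy simply references a new URL, which is what makes a one-year max-age safe.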

Application-Level Query Caching

Say you run an e-commerce site and the homepage shows the top 10 bestselling products. That query is expensive. It scans order history, aggregates sales counts, and joins product data. But the answer changes maybe once an hour.

  key:   "bestsellers:homepage"
  value: [product1, product2, ...]
  TTL:   3600 seconds (1 hour)

One database query per hour instead of one per page load. At 100,000 daily visitors, that's the difference between 100,000 queries and 24.
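
You can check that arithmetic with a small simulation. It assumes one homepage load per visitor, spread evenly across the day, and uses a simulated clock in milliseconds. The function names and the two-item product list are placeholders.

```python
TTL_MS = 3600 * 1000         # 1 hour, as in the cache entry above
DAY_MS = 24 * 3600 * 1000
VISITORS = 100_000

db_queries = 0
cached_value = None
expires_at_ms = 0

def top_bestsellers():
    """Stand-in for the expensive aggregation query."""
    global db_queries
    db_queries += 1
    return ["product1", "product2"]

def homepage(now_ms: int):
    """Serve the bestsellers list, refreshing it only when the TTL is up."""
    global cached_value, expires_at_ms
    if cached_value is None or now_ms >= expires_at_ms:
        cached_value = top_bestsellers()
        expires_at_ms = now_ms + TTL_MS
    return cached_value

step_ms = DAY_MS // VISITORS     # one visitor every 864 ms
for i in range(VISITORS):
    homepage(i * step_ms)

print(f"{VISITORS} page loads, {db_queries} database queries")
```

Under these assumptions the database runs the query 24 times for 100,000 page loads. Real traffic is burstier, but the ratio holds as long as every hour sees at least one visitor.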

Summary

Caching is about storing the result of expensive work so you don't have to redo it. It exists at every layer of the stack: browser, CDN, application, database. Each layer has its own scope and trade-offs.

The key mental model: every cache trades freshness for speed. Your TTL and invalidation strategy determine how stale your data can get, so set them from the amount of staleness your users will tolerate. Get that right, and a cache is often the cheapest performance win in distributed systems.

Next up: the specific patterns for how data flows into and out of your cache.

Cache-Aside Pattern
