Key-Value Stores

Updated June 3, 2026
Magic Magnets Team

If a relational database is a filing cabinet with meticulously labeled folders, a key-value store is a giant hash map. You hand it a key, it hands back the value. That's the entire API.

This simplicity is its superpower. There's no query planner, no join execution, no schema validation overhead. Just: hash the key, jump to the matching bucket, return the value. The result is sub-millisecond latency at massive scale.

The Data Model

A key-value store is exactly what it sounds like:

SET user:session:abc123 → { userId: 42, role: "admin", expiresAt: 1717200000 }
GET user:session:abc123 → { userId: 42, role: "admin", expiresAt: 1717200000 }
DEL user:session:abc123 → OK

The key is always a string (or sometimes bytes). The value can be anything — a string, a number, a serialized JSON blob, binary data. The database doesn't care. It just stores and retrieves it.

There's no schema. No types. No relationships. No queries beyond key lookup. This narrow interface is what enables the extreme performance.
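To make the narrow interface concrete, here is a minimal sketch of the data model in Python. The `KeyValueStore` class is a hypothetical toy built on a plain dict, not the API of any real store; it only illustrates that the store treats values as opaque bytes and leaves serialization to the caller.

```python
import json
from typing import Optional


class KeyValueStore:
    """Toy in-memory store: a thin wrapper around a Python dict (a hash map)."""

    def __init__(self) -> None:
        self._data: dict[str, bytes] = {}

    def set(self, key: str, value: bytes) -> None:
        # The store never inspects the value; it is just bytes.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Returns None when the key is absent (Redis returns nil).
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # True if the key existed and was removed.
        return self._data.pop(key, None) is not None


store = KeyValueStore()
session = {"userId": 42, "role": "admin", "expiresAt": 1717200000}

# Serialization is the caller's job, not the store's.
store.set("user:session:abc123", json.dumps(session).encode())
loaded = json.loads(store.get("user:session:abc123"))
deleted = store.delete("user:session:abc123")
```

Everything beyond these three operations (filtering, joining, validating) has to live in application code, which is the trade the rest of this article keeps returning to.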


What Makes Them Fast

Three things:

1. In-memory storage: The dominant key-value stores (Redis, Memcached) keep all data in RAM. A RAM access takes on the order of 100 nanoseconds, while an SSD read takes tens to hundreds of microseconds, so memory is faster by two to three orders of magnitude. When you're reading a session token on every HTTP request, that difference adds up across millions of requests per second.

2. Single key lookups: There's no query planning, no table scans, no index traversal. A hash table lookup is O(1). Find the bucket, grab the value, done.

3. Minimal protocol overhead: RESP (REdis Serialization Protocol) is a simple, length-prefixed text protocol that is binary-safe and cheap to parse. Compare this to PostgreSQL's wire protocol, which carries SQL text that the server must parse and plan, plus typed result-set metadata.
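As a rough illustration of how little framing RESP needs, the sketch below encodes a command the way a client sends it: an array header followed by length-prefixed bulk strings. The function name `encode_command` is an assumption made for this example; real client libraries do this for you.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]           # array header: number of arguments
    for arg in args:
        data = arg.encode()
        # Bulk string: "$<byte length>\r\n<bytes>\r\n"
        parts.append(f"${len(data)}\r\n".encode() + data + b"\r\n")
    return b"".join(parts)


wire = encode_command("GET", "user:42:profile")
# wire == b"*2\r\n$3\r\nGET\r\n$15\r\nuser:42:profile\r\n"
```

Because every element carries its length up front, the server can read a command without scanning for delimiters or escaping anything inside the payload.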

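The trade-off: you can only look up by key. If you need to find all sessions for a given user, you either structure your keys cleverly (sessions:user:42:*) or you keep a separate index yourself. Note that matching a key pattern means scanning the keyspace, which is not an O(1) lookup, so the explicit index is usually the better option.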
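A hand-maintained index might look like the following sketch. Plain dicts stand in for the store here, and the helper names (`create_session`, `delete_session`, `tokens_for_user`) are hypothetical; in Redis the index would typically be a set per user that you update alongside the session key.

```python
from collections import defaultdict

sessions: dict[str, dict] = {}                              # primary data: token -> session
sessions_by_user: dict[int, set[str]] = defaultdict(set)    # manual index: user -> tokens


def create_session(token: str, user_id: int) -> None:
    sessions[token] = {"userId": user_id}
    sessions_by_user[user_id].add(token)       # the index must be updated on every write


def delete_session(token: str) -> None:
    session = sessions.pop(token, None)
    if session is not None:
        sessions_by_user[session["userId"]].discard(token)   # and on every delete


def tokens_for_user(user_id: int) -> set[str]:
    # A single key lookup instead of a scan over all sessions.
    return set(sessions_by_user.get(user_id, set()))


create_session("abc123", 42)
create_session("def456", 42)
create_session("zzz999", 7)
```

The cost is that the application now owns the consistency of the index: every code path that writes or deletes a session has to touch both structures.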


Use Cases

Caching

The canonical use case. Your application fetches a user profile from PostgreSQL — a query that takes 5ms. You then cache the result in Redis with a TTL (time-to-live). The next 10,000 requests for that profile take 0.3ms instead.

SET user:42:profile <json blob> EX 3600   # expires in 1 hour
GET user:42:profile                       # returns cached value or nil

Cache invalidation is notoriously hard ("one of the two hard problems in computer science"), but for read-heavy data that changes infrequently — user profiles, product details, configuration — caching provides dramatic latency reductions.
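The read path described above is often called cache-aside. Here is a minimal sketch of it, assuming a toy `TTLCache` class in place of Redis and a hypothetical `load_profile_from_db` function in place of the PostgreSQL query; the clock is injectable only so the expiry can be demonstrated without waiting an hour.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Toy stand-in for SET ... EX / GET: values expire after a TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: dict[str, tuple[float, str]] = {}   # key -> (expires_at, value)

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._items[key] = (self._clock() + ttl_seconds, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:      # expired: behave as if the key is gone
            del self._items[key]
            return None
        return value


stats = {"db_calls": 0}


def load_profile_from_db(user_id: int) -> str:
    # Hypothetical slow query; here it only counts how often it is called.
    stats["db_calls"] += 1
    return f'{{"id": {user_id}, "name": "Alice"}}'


def get_profile(cache: TTLCache, user_id: int) -> str:
    key = f"user:{user_id}:profile"
    cached = cache.get(key)
    if cached is not None:
        return cached                              # cache hit: no database work
    value = load_profile_from_db(user_id)          # cache miss: go to the database
    cache.set(key, value, ttl_seconds=3600)
    return value


now = [0.0]                                        # fake clock for the demo
cache = TTLCache(clock=lambda: now[0])
get_profile(cache, 42)     # miss, loads from the database
get_profile(cache, 42)     # hit, served from the cache
now[0] = 3601              # one hour later the entry has expired
get_profile(cache, 42)     # miss again
```

The TTL is the simplest invalidation strategy: stale data is tolerated for at most an hour, and no write path has to remember to purge the cache.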

Session Storage

When a user logs in, you create a session token (a random string), store the session data in Redis keyed by the token, and send the token to the browser as a cookie. On every subsequent request, you look up the token in Redis.

This avoids adding a SQL query to every single request. A single Redis node can typically serve on the order of a hundred thousand or more lookups per second at sub-millisecond latency, and a cluster scales that into the millions.
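The login and lookup steps can be sketched as follows. A dict stands in for Redis, and `log_in` and `authenticate` are hypothetical helper names; the one real-world detail worth copying is generating the token with a cryptographically secure source such as Python's `secrets` module.

```python
import secrets
from typing import Optional

session_store: dict[str, dict] = {}   # stand-in for Redis


def log_in(user_id: int, role: str) -> str:
    # 32 random bytes, URL-safe encoded: unguessable, safe to put in a cookie.
    token = secrets.token_urlsafe(32)
    session_store[f"session:{token}"] = {"userId": user_id, "role": role}
    return token


def authenticate(token: str) -> Optional[dict]:
    # Runs on every request: one key lookup, no SQL.
    return session_store.get(f"session:{token}")


token = log_in(42, "admin")
current_session = authenticate(token)
```

In a real deployment the session key would also carry a TTL so that abandoned sessions expire on their own.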

Rate Limiting

Use an atomic increment + TTL to count requests per user per time window:

INCR rate:user:42:minute:1234        # increment counter for this minute
EXPIRE rate:user:42:minute:1234 60   # auto-delete after 60 seconds

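If the counter exceeds your limit (say, 100 requests/minute), reject the request. INCR is atomic, so concurrent requests never lose an increment. The INCR and EXPIRE pair is not atomic as a unit, though: if the client dies between the two commands, the key is left without a TTL and lingers in memory. Wrap both commands in MULTI/EXEC or a Lua script to close that gap.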
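The counting logic is a fixed-window rate limiter. The sketch below reproduces it in memory so it runs without a server: a lock stands in for the atomicity of INCR, and the class name `FixedWindowRateLimiter` and its parameters are assumptions for this example. Unlike Redis with EXPIRE, this toy version never deletes old windows.

```python
import threading
import time
from typing import Callable


class FixedWindowRateLimiter:
    """Count requests per user per time window; reject once over the limit."""

    def __init__(self, limit: int, window_seconds: int = 60,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock
        self._counters: dict[str, int] = {}
        self._lock = threading.Lock()

    def allow(self, user_id: int) -> bool:
        # The window number is part of the key, so each window starts at zero.
        window = int(self._clock() // self.window_seconds)
        key = f"rate:user:{user_id}:minute:{window}"
        with self._lock:                       # stands in for an atomic INCR
            count = self._counters.get(key, 0) + 1
            self._counters[key] = count
        return count <= self.limit


now = [0.0]                                    # fake clock for the demo
limiter = FixedWindowRateLimiter(limit=3, clock=lambda: now[0])
results = [limiter.allow(42) for _ in range(4)]   # fourth request is over the limit
now[0] = 60.0                                     # next window begins
after_reset = limiter.allow(42)
```

A known weakness of fixed windows is the boundary burst: a client can spend a full quota at the end of one window and another at the start of the next.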

Counters

View counts, like counts, download counts — any monotonically incrementing counter is a great fit. INCR in Redis is atomic and blazing fast.

Leaderboards

Redis's sorted sets (more on those shortly) make leaderboards trivially easy: add players with their scores, and Redis keeps them ranked, with each insert or score update costing O(log n). Fetching the top 100 players is a single command.

Pub/Sub and Message Queues

Redis supports lightweight publish/subscribe messaging and queue patterns, making it useful as a simple task queue or event bus (though dedicated tools like Kafka are better for production-grade streaming).


Real Databases

Redis is the Swiss Army knife of key-value stores and the default choice for most use cases. It's in-memory, has a rich set of data structures beyond simple strings, and supports persistence via snapshots and append-only logs. Used by Twitter, GitHub, Stack Overflow, Instagram, and countless others. If you're adding a key-value store to your stack, Redis is almost certainly the right answer.

Memcached is a simpler, leaner caching layer. It's multi-threaded and can be faster than Redis for pure caching workloads, but it has no persistence, no data structures beyond strings, and no built-in clustering (clients shard keys across servers themselves). It's still common in established caching tiers, but most new projects are better served by Redis.

DynamoDB is Amazon's fully managed key-value (and document) database. Unlike Redis, it's disk-based, and it scales throughput automatically as load grows. It's excellent for durable key-value storage when you need persistence and don't want to manage infrastructure. Used heavily in AWS-native architectures. Latency is typically single-digit milliseconds (vs. Redis's sub-millisecond), but you get very high scale with almost no operational work.

etcd is a distributed key-value store built specifically for storing small amounts of critical configuration data that must be strongly consistent. It's the backbone of Kubernetes — every piece of cluster state (pods, services, deployments) is stored in etcd. Not a general-purpose database; purpose-built for configuration and service discovery.


Redis Beyond Strings

Redis gets called a "key-value store" but that undersells it. The value in Redis isn't just a string — it can be one of several rich data structures, each optimized for specific operations:

Strings

The basic type. Can store text, serialized JSON, or binary data. Supports atomic increment/decrement for counters.

Lists

Ordered sequences with O(1) push/pop from both ends. Perfect for queues (push to tail, pop from head) or activity feeds (push to head, trim to last 100).

LPUSH feed:user:42 "post:101"   # push to front
LRANGE feed:user:42 0 9         # get first 10 items
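Both list patterns can be mimicked with Python's `collections.deque`, which also has O(1) operations at both ends. This is only an analogy for how the commands behave, not how Redis implements lists.

```python
from collections import deque

# Activity feed: push to the head, keep only the newest 100 entries.
feed: deque = deque(maxlen=100)         # maxlen plays the role of trimming
for post_id in range(1, 151):
    feed.appendleft(f"post:{post_id}")  # like LPUSH; the oldest entry falls off the tail
first_ten = list(feed)[:10]             # like LRANGE 0 9

# Queue: push to the tail, pop from the head (first in, first out).
queue: deque = deque()
queue.append("job:1")
queue.append("job:2")
next_job = queue.popleft()
```

After 150 pushes the feed holds posts 150 down to 51, newest first, which is the "last 100" behaviour described above.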

Hashes

A map within a value — perfect for storing objects where you want to update individual fields without fetching and re-serializing the entire blob.

HSET user:42 name "Alice" email "alice@example.com" credits 150
HINCRBY user:42 credits 50   # atomic field-level increment
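To see what the hash saves you, compare the two update paths in this sketch. The `hincrby` helper is a hypothetical in-memory imitation of the command of the same name; the blob version shows the fetch, decode, modify, encode, store cycle you would need if the whole object were one serialized string.

```python
import json

# Blob approach: the whole object is one serialized value.
blob = json.dumps({"name": "Alice", "email": "alice@example.com", "credits": 150})
decoded = json.loads(blob)          # fetch and decode everything
decoded["credits"] += 50            # change one field
blob = json.dumps(decoded)          # re-encode and store everything

# Hash approach: fields are addressable individually.
user_hash = {"name": "Alice", "email": "alice@example.com", "credits": 150}


def hincrby(hash_value: dict, field: str, amount: int) -> int:
    # Touches only the one field and returns its new value.
    hash_value[field] = int(hash_value.get(field, 0)) + amount
    return hash_value[field]


new_credits = hincrby(user_hash, "credits", 50)
```

With the blob, two concurrent writers can overwrite each other's changes; a field-level atomic increment on the server avoids that read-modify-write race.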

Sets

Unordered collections of unique strings. Union, intersection, and difference operations are built in — useful for "users who liked post A and post B" type queries.
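Python's built-in sets have the same three operations, so they make a convenient model of what the Redis commands SINTER, SUNION, and SDIFF compute. The member names here are made up for the example.

```python
liked_post_a = {"user:1", "user:2", "user:3"}
liked_post_b = {"user:2", "user:3", "user:4"}

liked_both = liked_post_a & liked_post_b      # intersection: liked A and B
liked_either = liked_post_a | liked_post_b    # union: liked A or B
liked_only_a = liked_post_a - liked_post_b    # difference: liked A but not B

liked_post_a.add("user:1")                    # adding a duplicate changes nothing
```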

Sorted Sets (ZSets)

Like sets, but every member has a floating-point score. Redis maintains members in sorted order by score. This is the data structure behind leaderboards, priority queues, and range queries by time.

ZADD leaderboard 1450.5 "player:alice"
ZADD leaderboard 1320.0 "player:bob"
ZREVRANGE leaderboard 0 9 WITHSCORES   # top 10 players
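The behaviour of a sorted set can be approximated in a few lines with a dict plus a sorted list. This `SortedSet` class is a simplified sketch, not Redis's implementation: Redis pairs a hash table with a skip list to get O(log n) updates, whereas inserting into a Python list is O(n).

```python
import bisect


class SortedSet:
    """Members are unique; ordering is by score, then by member name."""

    def __init__(self) -> None:
        self._scores: dict[str, float] = {}             # member -> score
        self._ordered: list[tuple[float, str]] = []     # kept sorted ascending

    def zadd(self, score: float, member: str) -> None:
        old = self._scores.get(member)
        if old is not None:
            # Existing member: remove its old position before re-inserting.
            index = bisect.bisect_left(self._ordered, (old, member))
            del self._ordered[index]
        self._scores[member] = score
        bisect.insort(self._ordered, (score, member))

    def top(self, n: int) -> list[tuple[str, float]]:
        # Highest scores first, like ZREVRANGE ... WITHSCORES.
        return [(member, score) for score, member in self._ordered[::-1][:n]]


leaderboard = SortedSet()
leaderboard.zadd(1450.5, "player:alice")
leaderboard.zadd(1320.0, "player:bob")
leaderboard.zadd(1500.0, "player:carol")
top_two = leaderboard.top(2)

leaderboard.zadd(2000.0, "player:bob")    # updating a score moves the member
new_leader = leaderboard.top(1)
```

Note that position depends only on score, never on when a member was added, which is why an update can move a player straight to the top.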

HyperLogLog

A probabilistic data structure for counting unique values with minimal memory. "How many unique IPs visited this page today?" Exact counting would require storing every IP. Redis's HyperLogLog estimates the count with a standard error of 0.81% using at most 12 KB of memory per key.
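Both numbers fall out of the register count. The sketch below is a minimal, illustrative HyperLogLog with 2^14 registers; it is not Redis's implementation (it uses SHA-1 as the hash and one byte per register for simplicity), and the class and variable names are assumptions made for this example.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog sketch for estimating distinct counts."""

    def __init__(self, precision: int = 14) -> None:
        self.p = precision
        self.m = 1 << precision                  # 2**14 = 16384 registers
        self.registers = bytearray(self.m)       # one byte each here, for simplicity

    def add(self, item: str) -> None:
        digest = hashlib.sha1(item.encode()).digest()
        h = int.from_bytes(digest[:8], "big")    # use 64 bits of the hash
        index = h >> (64 - self.p)               # top p bits choose a register
        rest_bits = 64 - self.p
        rest = h & ((1 << rest_bits) - 1)
        rank = rest_bits - rest.bit_length() + 1   # leading zeros in the rest, plus 1
        if rank > self.registers[index]:
            self.registers[index] = rank         # each register keeps its maximum rank

    def count(self) -> int:
        alpha = 0.7213 / (1 + 1.079 / self.m)    # bias-correction constant
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros > 0:
            # Small cardinalities: fall back to linear counting.
            return round(self.m * math.log(self.m / zeros))
        return round(raw)


hll = HyperLogLog()
for i in range(100_000):
    hll.add(f"10.{i >> 16}.{(i >> 8) & 255}.{i & 255}")   # 100,000 distinct addresses
estimate = hll.count()

standard_error = 1.04 / math.sqrt(hll.m)   # 1.04 / 128 = 0.008125, about 0.81%
packed_bytes = hll.m * 6 // 8              # 16384 registers at 6 bits each = 12288 bytes
```

Adding the same address again never changes the estimate, because a register only ever keeps its maximum rank. That is what lets the structure count uniques without remembering them.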

Streams

An append-only log with consumer groups — Redis's answer to Kafka for lightweight streaming use cases.


Limitations to Know

  • No rich queries: You can only look up by exact key. Range queries are only possible with sorted sets on a score, not arbitrary fields.
  • Memory-bound: If your dataset grows larger than available RAM (for Redis), you need sharding or to switch to a disk-based store.
  • No ACID transactions across multiple keys: Redis has MULTI/EXEC, which queues commands and runs them as one isolated, uninterrupted batch, but it's not equivalent to a SQL transaction. If a command fails at runtime, the remaining commands still execute and nothing is rolled back.
  • Volatile by design: Redis persistence (RDB snapshots + AOF) adds durability, but with default settings a crash can lose recent writes: everything since the last snapshot with RDB, or about the last second of writes with AOF's default fsync policy. For data you can't lose, treat Redis as a cache, not a primary store.

Summary

Key-value stores trade query power for extreme performance. By reducing the core API to GET, SET, and DELETE by key, they achieve sub-millisecond latency at massive scale. Redis is the dominant key-value store: an in-memory database with a rich set of data structures (strings, lists, hashes, sets, sorted sets, HyperLogLog, streams) that make it ideal for caching, session storage, rate limiting, counters, and leaderboards. DynamoDB fills the same role when you need durable, serverless, highly scalable key-value storage on AWS. etcd is the specialized choice for distributed configuration. The key constraint: design your access patterns around key lookup. Everything else requires either clever key structuring or a different database type.
