Database Types
Updated June 3, 2026
One of the most consequential decisions you'll make in a system design is what kind of database to reach for. Engineers get this wrong all the time, either defaulting to whatever they know or over-engineering a problem that calls for a plain old relational table.
The good news is there are really only a handful of database categories, and each one has a clear sweet spot. Once you understand the mental model, picking the right one becomes almost obvious.
The Big Picture
Databases aren't a monolith. They're a spectrum of trade-offs: structure vs flexibility, read speed vs write speed, query power vs operational simplicity. Here's the full taxonomy before we dive in:
| Type | Model | Famous Examples |
|---|---|---|
| Relational | Tables & rows | PostgreSQL, MySQL, SQLite |
| Document | JSON/BSON documents | MongoDB, Firestore, CouchDB |
| Key-Value | Key → value map | Redis, DynamoDB, etcd |
| Wide-Column | Column families | Cassandra, HBase, Bigtable |
| Graph | Nodes & edges | Neo4j, Amazon Neptune |
| Time-Series | Timestamped events | InfluxDB, TimescaleDB, Prometheus |
| Search | Inverted indexes | Elasticsearch, Algolia, Typesense |
Relational Databases
Think of a relational database as the responsible adult in the room. Data lives in tables with strict schemas. Rows relate to rows in other tables through foreign keys. You query it all with SQL — a language that's been refined over 50 years.
Why they're great: ACID guarantees, mature tooling, powerful joins, excellent for complex queries.
Real examples: PostgreSQL (the gold standard for most web apps), MySQL (ubiquitous in legacy systems), SQLite (ships with iOS and Android and is embedded in a huge number of mobile apps).
Use when: You have structured, relational data. Orders, customers, products, invoices — anything that fits naturally into tables.
If you don't know which database to pick, start with PostgreSQL. Seriously. It handles JSON, full-text search, geospatial queries, and scales surprisingly far before you need to switch.
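To make the tables, foreign keys, and joins above concrete, here's a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables and their columns are made up for illustration; the same SQL shape carries over to PostgreSQL or MySQL with minor dialect changes.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Hypothetical schema: every order must point at an existing customer.
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total_cents INTEGER NOT NULL
    );
""")

# The three inserts commit together or not at all.
with conn:
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
    conn.execute("INSERT INTO orders (customer_id, total_cents) VALUES (1, 2500)")
    conn.execute("INSERT INTO orders (customer_id, total_cents) VALUES (1, 1200)")

# The schema rejects an order for a customer who does not exist.
try:
    with conn:
        conn.execute("INSERT INTO orders (customer_id, total_cents) VALUES (99, 500)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# A join answers "how much has each customer spent?" in one query.
rows = conn.execute("""
    SELECT c.name, COUNT(o.id), SUM(o.total_cents)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
""").fetchall()
print(rows)  # [('Ada', 2, 3700)]
```

The point is that the database, not your application code, keeps the data consistent: the orphan order never lands, and the join does the cross-table work for you.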
Document Databases
Instead of rows and tables, document databases store self-contained JSON (or BSON) documents. Each document can have a completely different shape — no rigid schema required.
Think of it like filing cabinets. A relational database is a spreadsheet where every row must have the same columns. A document database is a folder where each file can look completely different.
Real examples: MongoDB (most popular), Firestore (Firebase's serverless option), CouchDB (great offline sync story).
Use when: Data is naturally hierarchical or nested. Product catalogs (each product has different attributes), user profiles (some users have addresses, some don't), CMS content (articles with varying fields).
Watch out for: Joins are limited or missing (MongoDB's $lookup exists, but it is far less flexible than a SQL join), so you either embed related data (which makes documents fat) or make multiple round-trips. Consistency guarantees are often weaker too.
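The embed-versus-round-trip trade-off is easier to see with data in hand. The sketch below uses plain Python dicts as stand-ins for documents and collections; the product and review fields are invented, and no real database client is involved.

```python
# Two documents in the same collection with completely different shapes.
products = {
    "p1": {"name": "Laptop", "specs": {"ram_gb": 16, "cpu": "8-core"}},
    "p2": {"name": "T-shirt", "sizes": ["S", "M", "L"], "material": "cotton"},
}

# Option 1: embed the reviews. One read returns everything,
# but the document grows with every new review.
embedded = {
    "name": "Laptop",
    "reviews": [{"user": "ada", "stars": 5}, {"user": "lin", "stars": 4}],
}

# Option 2: reference the reviews. The product stays small,
# but the application has to fetch and stitch the pieces itself.
reviews = {
    "r1": {"user": "ada", "stars": 5},
    "r2": {"user": "lin", "stars": 4},
}
referenced = {"name": "Laptop", "review_ids": ["r1", "r2"]}


def load_with_reviews(product, review_store):
    """Resolve references in application code.

    Each dict lookup stands in for one network round-trip.
    """
    round_trips = 1  # fetching the product itself
    resolved = []
    for review_id in product["review_ids"]:
        resolved.append(review_store[review_id])
        round_trips += 1
    return {"name": product["name"], "reviews": resolved}, round_trips


loaded, trips = load_with_reviews(referenced, reviews)
print(trips)               # 3
print(loaded == embedded)  # True
```

Real clients can usually batch the second step into one request, but the work still happens in your code rather than in the query engine.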
Key-Value Stores
This is the simplest possible data model: you give it a key, it gives you a value. That's it. No schema, no query language, no joins.
The simplicity is a feature, not a limitation. Because there's so little overhead, key-value stores are blazing fast — often operating in-memory with sub-millisecond latency.
Real examples: Redis (the Swiss Army knife of key-value stores, also used as a cache, queue, and pub/sub system), DynamoDB (AWS's fully managed key-value store, which scales horizontally to very large tables), etcd (used by Kubernetes to store cluster state).
Use when: You need fast lookups by a single key. Caching, session storage, rate limiting counters, feature flags, leaderboards.
Don't use when: You need complex queries, range scans, or relationships.
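Here's a rough sketch of the get/set/expire shape that caching and rate limiting rely on. TinyKV and allow_request are hypothetical names for a toy in-memory store, not the Redis API; a real store adds networking, persistence, and atomic operations across clients.

```python
import time


class TinyKV:
    """Toy in-memory key-value store with optional per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at or None)
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def set(self, key, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries on read
            return default
        return value

    def incr(self, key, ttl_seconds=None):
        """Increment a counter; the TTL applies only when the counter is created."""
        current = self.get(key)
        if current is None:
            self.set(key, 1, ttl_seconds)
            return 1
        _, expires_at = self._data[key]
        self._data[key] = (current + 1, expires_at)
        return current + 1


def allow_request(store, client_id, limit=3, window_seconds=60):
    """Fixed-window rate limiter: at most `limit` requests per window per client."""
    return store.incr(f"rate:{client_id}", ttl_seconds=window_seconds) <= limit


store = TinyKV()
print([allow_request(store, "client-1") for _ in range(4)])
# [True, True, True, False]
```

Notice that every operation touches exactly one key. That constraint is what keeps the model fast, and it's also why complex queries are off the table.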
Wide-Column Databases
Wide-column stores are like relational databases that got stretched sideways. Data is organized into rows and columns, but unlike SQL, columns are grouped into "column families," and different rows can have completely different columns.
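The key insight: storage is organized and tuned per column family, and rows are spread across machines by their key, so a read touches only the families it needs and writes stay cheap. This makes wide-column databases exceptional for time-series-like workloads and massive-scale analytics.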
Real examples: Apache Cassandra (stored billions of Discord messages before Discord migrated to ScyllaDB), HBase (sits on top of HDFS, used by Facebook for messages at one point), Google Bigtable (the original wide-column store, now a managed GCP product).
Use when: You have billions of rows, heavy write throughput, and access patterns you know in advance. Think IoT sensor data, activity feeds, audit logs.
Trade-offs: Cassandra, for instance, has no joins, limited ad-hoc query support, and requires you to design your data model around your queries (not the other way around).
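Designing the data model around the query can be sketched in a few lines. Below, a hypothetical MessagesByChannel table exists to answer exactly one question: "what was posted in this channel since time T?" The partition key (channel_id) picks the bucket and the clustering key (sent_at) keeps rows sorted inside it. This is a toy model of the idea, not how Cassandra is implemented.

```python
import bisect
import itertools
from collections import defaultdict


class MessagesByChannel:
    """Toy table: partition key = channel_id, clustering key = sent_at."""

    def __init__(self):
        # channel_id -> list of (sent_at, seq, columns), kept sorted
        self._partitions = defaultdict(list)
        self._seq = itertools.count()  # tie-breaker so column dicts are never compared

    def insert(self, channel_id, sent_at, **columns):
        # Rows may carry different columns; only the keys are fixed.
        row = (sent_at, next(self._seq), columns)
        bisect.insort(self._partitions[channel_id], row)

    def messages_since(self, channel_id, since, limit=50):
        """The one query this table is built for: a range scan inside one partition."""
        rows = self._partitions.get(channel_id, [])
        start = bisect.bisect_left(rows, (since,))
        return [(ts, cols) for ts, _, cols in rows[start:start + limit]]


table = MessagesByChannel()
table.insert("general", 30, text="c")
table.insert("general", 10, text="a")
table.insert("general", 20, text="b", attachment="cat.png")
table.insert("random", 15, text="z")

print(table.messages_since("general", 20))
# [(20, {'text': 'b', 'attachment': 'cat.png'}), (30, {'text': 'c'})]
```

Asking "which messages have an attachment?" would mean scanning every partition. In a wide-column store you'd typically answer that by writing the same data into a second table keyed for that query.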
Graph Databases
Graph databases model data as nodes (entities) and edges (relationships). Instead of joining tables, you traverse connections. This makes certain queries incredibly natural that would require nightmare-level SQL in a relational database.
Real examples: Neo4j (the most mature graph database), Amazon Neptune (managed, supports both Gremlin and SPARQL).
Use when: Relationships are the data. Social networks (friends of friends), fraud detection (suspicious account clusters), recommendation engines (users who bought X also bought Y), knowledge graphs.
Honest take: Graph databases have a narrower sweet spot than their marketing suggests. Most apps don't need them. But when you do need them — like building a LinkedIn-style "people you may know" feature — they're transformative.
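For a feel of what "traversing connections" means, here's a small sketch of a "people you may know" query over an in-memory adjacency map. The names and the friends_of_friends helper are invented for illustration; a graph database expresses the same two-hop walk in its own query language and runs it over far larger graphs.

```python
# Adjacency sets: each person maps to the people they share an edge with.
friends = {
    "ana": {"ben", "cho"},
    "ben": {"ana", "dev"},
    "cho": {"ana", "dev", "eli"},
    "dev": {"ben", "cho"},
    "eli": {"cho"},
}


def friends_of_friends(graph, person):
    """People exactly two hops away, ranked by number of mutual friends."""
    direct = graph.get(person, set())
    mutual_counts = {}
    for friend in direct:
        for candidate in graph.get(friend, set()):
            # Skip the person themselves and anyone they already know.
            if candidate != person and candidate not in direct:
                mutual_counts[candidate] = mutual_counts.get(candidate, 0) + 1
    # Most mutual friends first; ties broken alphabetically for stable output.
    return sorted(mutual_counts.items(), key=lambda item: (-item[1], item[0]))


print(friends_of_friends(friends, "ana"))  # [('dev', 2), ('eli', 1)]
```

In SQL the same question is a self-join on a friendships table for each hop, which gets unwieldy fast as the number of hops grows.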
Time-Series Databases
Time-series databases are purpose-built for append-heavy workloads where every record has a timestamp. They compress timestamped data extremely efficiently and provide built-in functions for aggregating over time windows.
Real examples: InfluxDB (popular for infrastructure metrics), TimescaleDB (PostgreSQL extension — great if you're already on Postgres), Prometheus (the standard for Kubernetes monitoring), VictoriaMetrics (Prometheus-compatible but more scalable).
Use when: You're storing metrics, monitoring data, financial tick data, IoT sensor readings — anything where time is the primary axis.
Why not just use PostgreSQL? You can, and TimescaleDB makes that easy. But raw Postgres will struggle with the write throughput and storage compression that native TSDB engines provide at scale.
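Two ideas from this section, aggregating over time windows and compressing timestamps, can be sketched in plain Python. The function names and sample data are made up, and the delta encoding shown is a simplified illustration of the general approach rather than any particular engine's format.

```python
from collections import defaultdict


def downsample(points, window_seconds):
    """Average (timestamp, value) points into fixed windows keyed by window start."""
    buckets = defaultdict(list)
    for ts, value in points:
        window_start = ts - (ts % window_seconds)
        buckets[window_start].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


def delta_encode(timestamps):
    """Keep the first timestamp, then store only the gap to the previous one."""
    return timestamps[:1] + [b - a for a, b in zip(timestamps, timestamps[1:])]


cpu = [(0, 10.0), (20, 30.0), (59, 20.0), (60, 50.0), (125, 70.0)]
print(downsample(cpu, 60))  # [(0, 20.0), (60, 50.0), (120, 70.0)]

# Regularly sampled data collapses into small, repetitive numbers,
# which is what makes timestamped data so compressible.
print(delta_encode([1000, 1010, 1020, 1030]))  # [1000, 10, 10, 10]
```

A TSDB runs this kind of windowed aggregation inside the engine, close to the compressed data, instead of shipping raw points to your application.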
Search Databases
Search databases are built around inverted indexes, the same core data structure that web search engines like Google rely on. An inverted index maps every word to the documents containing it, enabling full-text search across millions of records in milliseconds.
Real examples: Elasticsearch (the default choice for logs, e-commerce search, and analytics), Algolia (hosted, great developer experience, optimized for user-facing search), Typesense (open-source Algolia alternative, simpler to self-host).
Use when: You need full-text search, fuzzy matching, faceted filtering, or relevance ranking. Think product search on Amazon, title search on Netflix, or code search on GitHub.
Common mistake: Using Elasticsearch as a primary database. It's eventually consistent, doesn't support ACID transactions, and is operationally complex. Use it alongside your primary DB, not instead of it.
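The inverted index this section opens with fits in a few lines of Python. This is a bare-bones sketch with made-up documents: it lowercases and splits on non-alphanumerics, and it leaves out the stemming, fuzzy matching, and relevance scoring that real search engines layer on top.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    1: "Red running shoes",
    2: "Blue running jacket",
    3: "Red wool jacket",
}
index = build_index(docs)
print(sorted(search(index, "red jacket")))  # [3]
print(sorted(search(index, "running")))     # [1, 2]
```

A lookup never scans the documents themselves. It only intersects a few precomputed sets, which is why search stays fast as the corpus grows.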
How to Choose
Here's a decision heuristic that covers most real-world cases:
- Default to PostgreSQL for structured relational data with complex queries.
- Add Redis for caching, sessions, and rate limiting.
- Reach for MongoDB or Firestore when your data is naturally hierarchical and schema-less.
- Use Cassandra or DynamoDB only when you know you'll hit Postgres's write or scale ceiling.
- Add Elasticsearch when users need to search free text across your data.
- Use a TSDB if you're ingesting metrics, monitoring data, or sensor readings at high frequency.
- Try Neo4j only if relationships are genuinely the core of your problem.
The worst mistake is picking a database because it's trendy. Most apps don't need Cassandra or a graph database. Start boring. Optimize later.
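If it helps to see the heuristic as code, the checklist above can be written as a small lookup. The trait names are hypothetical labels invented for this sketch; the point is only that PostgreSQL is the baseline and every other store is added for a specific, named reason.

```python
# Each rule pairs a workload trait with the store the checklist suggests for it.
RULES = [
    ("needs_cache_or_sessions", "Redis"),
    ("hierarchical_schemaless_data", "MongoDB or Firestore"),
    ("beyond_postgres_write_scale", "Cassandra or DynamoDB"),
    ("free_text_search", "Elasticsearch"),
    ("high_frequency_metrics", "a time-series database"),
    ("relationships_are_the_core", "Neo4j"),
]


def suggest_stores(traits):
    """Start with PostgreSQL, then add one store per trait that is present."""
    unknown = set(traits) - {name for name, _ in RULES}
    if unknown:
        raise ValueError(f"unknown traits: {sorted(unknown)}")
    return ["PostgreSQL"] + [store for name, store in RULES if name in traits]


print(suggest_stores(set()))
# ['PostgreSQL']
print(suggest_stores({"needs_cache_or_sessions", "free_text_search"}))
# ['PostgreSQL', 'Redis', 'Elasticsearch']
```

An empty trait set returning just PostgreSQL is the "start boring" rule in action.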
Summary
There are seven major database categories, each with a distinct data model and performance profile. Relational databases (PostgreSQL) are the workhorse of most applications. Document stores (MongoDB) shine for flexible, hierarchical data. Key-value stores (Redis) are the fastest option for simple lookups. Wide-column stores (Cassandra) handle massive write throughput. Graph databases (Neo4j) are purpose-built for relationship-heavy queries. Time-series databases (InfluxDB) compress and query timestamped data efficiently. Search databases (Elasticsearch) power full-text search at scale. Match the database to your access pattern, not your familiarity.