Block vs File vs Object Storage

Updated June 3, 2026
Magic Magnets Team

When a system design interviewer asks "where do you store the images?" or "how does Dropbox persist files?", the answer depends entirely on understanding three fundamentally different storage abstractions: block, file, and object storage. They look similar on the surface — they all store bytes — but they differ in how they expose data, how they scale, and what they're actually good at.

Block Storage: Raw Volumes

[Figure: Block, file, and object storage (access patterns and use cases)]

Block storage presents itself to the operating system as a raw disk: a sequence of fixed-size blocks (typically 512 bytes or 4 KB each). The OS formats it with a filesystem (ext4, NTFS, XFS) and treats it like a local disk.

Think of it like a blank hard drive that you plug into a server. The storage system doesn't know or care about files, directories, or metadata. It just reads and writes blocks at specific offsets.
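
To make "reads and writes blocks at specific offsets" concrete, here is a minimal sketch that treats an ordinary file as a stand-in for a block volume. The names make_volume, write_block, and read_block are hypothetical helpers for illustration, and the 4 KB block size is an assumption; a real volume would be a device such as one attached by EBS, not a regular file.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size (4 KB); real devices may use 512 bytes


def make_volume(path: str, num_blocks: int) -> None:
    """Create a zero-filled file that stands in for a raw volume."""
    with open(path, "wb") as f:
        f.truncate(num_blocks * BLOCK_SIZE)


def write_block(dev, block_no: int, data: bytes) -> None:
    """Overwrite exactly one block at its byte offset."""
    if len(data) != BLOCK_SIZE:
        raise ValueError("data must be exactly one block long")
    dev.seek(block_no * BLOCK_SIZE)
    dev.write(data)


def read_block(dev, block_no: int) -> bytes:
    """Read exactly one block from its byte offset."""
    dev.seek(block_no * BLOCK_SIZE)
    return dev.read(BLOCK_SIZE)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "volume.img")
    make_volume(path, num_blocks=8)
    with open(path, "r+b") as dev:
        write_block(dev, 5, b"x" * BLOCK_SIZE)
        print(read_block(dev, 5)[:4])
```

Note that nothing in this interface knows about file names or directories; any such structure has to be layered on top by a filesystem or by the application itself.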

Examples: AWS EBS (Elastic Block Store), Google Persistent Disk, Azure Managed Disks

Characteristics:

  • Attached to a single instance (usually)
  • Very low latency: typically microseconds on locally attached NVMe and around a millisecond or less on network-attached volumes such as EBS
  • The OS/application controls everything: filesystem layout, caching, buffering
  • Supports random reads and writes efficiently

Who uses it: Databases. Relational databases like PostgreSQL and MySQL run on block storage because they need to control exactly how data reaches the disk: they implement their own page management and write-ahead logging, and they rely on fsync to know when a write is durable. A network file system can add a network round trip and looser caching and locking behavior to each of those operations, and an object store cannot update a page in place at all. EBS io2 volumes on AWS, for example, offer provisioned IOPS specifically for database workloads.
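
As a rough illustration of why databases care about fsync, the sketch below appends length-prefixed records to a write-ahead log and forces each one to stable storage before returning. This is not how PostgreSQL or MySQL implement their logs; append_wal_record, read_wal, and the 4-byte length prefix are assumptions made for the example.

```python
import os


def append_wal_record(wal_path: str, record: bytes) -> None:
    """Append one length-prefixed record and force it to stable storage."""
    with open(wal_path, "ab") as wal:
        wal.write(len(record).to_bytes(4, "big") + record)
        wal.flush()             # move data from Python's buffer to the OS
        os.fsync(wal.fileno())  # ask the OS to flush its cache to the device


def read_wal(wal_path: str) -> list[bytes]:
    """Replay the log: return every complete record in write order."""
    records = []
    with open(wal_path, "rb") as wal:
        while header := wal.read(4):
            size = int.from_bytes(header, "big")
            records.append(wal.read(size))
    return records
```

Only after os.fsync returns can the caller treat the record as durable, which is why the latency and reliability of that one call matter so much for database throughput.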

Block storage doesn't scale horizontally. You can make the volume bigger (within limits), but you can't easily split it across machines. It's fundamentally a single-machine abstraction.

File Storage: The Familiar Hierarchy

File storage organizes data into the hierarchical directory structure you've used your whole life: folders inside folders, files with names and paths. Unlike block storage, file storage speaks a filesystem protocol — NFS (Network File System) or SMB (Server Message Block) — so multiple machines can mount the same filesystem simultaneously.

Examples: AWS EFS (Elastic File System), Google Filestore, Azure Files, NFS servers, Dropbox (from the client's perspective)

Characteristics:

  • Shared access — many servers can read and write the same files concurrently
  • POSIX-style semantics: file locking, permissions, directory traversal (NFS-based services such as EFS follow POSIX closely; SMB has its own semantics)
  • Higher latency than block storage (it's a network call)
  • Scales better than block storage, but with limits

Who uses it: Shared code repositories, legacy enterprise applications that expect a filesystem, ML training jobs that need to read the same large dataset from many workers simultaneously. AWS EFS is popular for Kubernetes workloads that need shared persistent storage accessible from any pod.

The familiar filesystem abstraction is both the strength and the weakness of file storage. It's intuitive, but the POSIX semantics (locking, ordering guarantees) make it hard to build a truly distributed, massively scalable system on top of it.
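
The locking mentioned above can be sketched with a POSIX advisory lock from Python's fcntl module (Unix only). The function name increment_shared_counter and the counter-file layout are hypothetical. On a shared mount, whether locks are honored across machines depends on the client and server configuration, so treat this as an illustration of the semantics rather than a recipe.

```python
import fcntl
import os


def increment_shared_counter(path: str) -> int:
    """Read-modify-write a counter file while holding an exclusive lock."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+b") as f:
        fcntl.lockf(f, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            raw = f.read().strip()
            value = int(raw) + 1 if raw else 1
            f.seek(0)
            f.truncate()               # drop the old contents
            f.write(str(value).encode("ascii"))
            f.flush()
            os.fsync(f.fileno())
            return value
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)
```

Every writer has to wait its turn for the lock, and the storage system has to coordinate that wait across all clients. That coordination is the scaling cost the paragraph above describes.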

Object Storage: Flat Namespace for Blobs

Object storage throws out the filesystem hierarchy entirely. Data is stored as objects in a flat namespace, identified by a unique key. An object is a bundle of: the data itself (any blob, any size), metadata (key-value pairs), and a unique identifier.

There are no directories — you can simulate them with key prefixes like photos/2024/01/15/img_001.jpg, but the storage system treats this as just a key with slashes in it. There's no concept of a current directory or relative path.
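
A small sketch of how "directories" fall out of plain string matching on keys. The function list_prefix is a hypothetical helper; it loosely mirrors the prefix and delimiter options that S3-style list APIs expose, but it runs over an in-memory list of keys and makes no network calls.

```python
def list_prefix(keys, prefix="", delimiter="/"):
    """Return (objects, common_prefixes) found directly under `prefix`.

    A key counts as an object at this level if nothing after the prefix
    contains the delimiter; otherwise it is rolled up into a common
    prefix, which is what a UI would display as a folder.
    """
    objects, common = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            common.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common)


KEYS = [
    "photos/2024/01/15/img_001.jpg",
    "photos/2024/01/16/img_002.jpg",
    "photos/readme.txt",
    "logs/app.log",
]

if __name__ == "__main__":
    print(list_prefix(KEYS, "photos/"))
```

The "folders" exist only in the result of the query. Deleting every key under a prefix makes the folder vanish, because there was never a directory entry to begin with.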

Examples: AWS S3, Google Cloud Storage (GCS), Azure Blob Storage, Cloudflare R2, MinIO (self-hosted)

Characteristics:

  • Accessed via HTTP APIs (PUT, GET, DELETE — not POSIX calls)
  • Designed for massive scale — S3 stores exabytes of data
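  • Consistency varies by provider: S3 was eventually consistent for overwrites and deletes until December 2020 and has offered strong read-after-write consistency since
  • Cheap per GB: roughly 3-4x cheaper than general-purpose block storage at list prices (S3 Standard vs. EBS gp3), and cheaper still with colder storage tiers
  • Built-in features: versioning, lifecycle policies, replication, public URL serving
  • No in-place partial writes: changing any part of an object means rewriting the whole object (byte-range reads are supported)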

Who uses it: Everything that doesn't need a filesystem. Profile pictures, video files, backups, data lake storage, ML model artifacts, static website assets, log archives. Netflix stores its entire video catalog in S3. GitHub stores Git LFS objects in object storage. Every mobile app that lets you upload a photo is almost certainly putting it in object storage.

The key insight: object storage scales horizontally essentially without limit because there's no shared mutable state between objects. S3 can handle millions of requests per second across billions of objects because reads and writes to different objects are completely independent.
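
One way to picture why independent objects scale out: the key alone can decide which storage node owns an object, so no node needs to consult another. The sketch below uses simple hash-modulo placement. node_for_key and the node names are hypothetical, and this is not a description of S3's internals; production systems usually prefer consistent hashing or range partitioning so that adding a node does not reshuffle most keys.

```python
import hashlib


def node_for_key(key: str, nodes: list[str]) -> str:
    """Pick the node that owns `key`, using only the key itself."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical node names

if __name__ == "__main__":
    for key in ("photos/2024/01/15/img_001.jpg", "backups/db.tar.gz"):
        print(key, "->", node_for_key(key, NODES))
```

Requests for different keys land on different nodes and never need a shared lock, which is the property a POSIX filesystem cannot offer as easily.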

Performance, Cost, and Scalability Tradeoffs

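| | Block Storage | File Storage | Object Storage |
|---|---|---|---|
| Latency | Microseconds (local NVMe) to about 1 ms (network volumes) | Milliseconds | 10s of milliseconds |
| Throughput | Very high | Moderate | Very high (with parallelism) |
| Random access | Excellent | Good | Poor (whole-object writes; ranged reads only) |
| Concurrency | Single instance | Multiple instances | Unlimited |
| Cost (per GB-month) | ~$0.08 (EBS gp3) | ~$0.30 (EFS) | ~$0.023 (S3 Standard) |
| Max scale | TBs | TBs-PBs | Unlimited (exabytes) |
| Protocols | iSCSI, NVMe | NFS, SMB | HTTP (REST) |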

Cost is one of the starkest differences. At list prices (which vary by region and change over time), AWS EBS gp3 costs around $0.08/GB/month, EFS around $0.30/GB/month, and S3 Standard around $0.023/GB/month. For a company storing petabytes of user media, the difference between block and object storage is the difference between a manageable infrastructure bill and a catastrophic one.
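
To put numbers on that, the sketch below multiplies the per-GB prices quoted above by one petabyte, taken as 1,000,000 GB. It assumes a flat rate and ignores tiered pricing, request charges, provisioned IOPS, and data transfer, so the results are a rough comparison and not a bill estimate. At those rates, 1 PB comes to about $80,000 per month on EBS gp3, $300,000 on EFS, and $23,000 on S3 Standard.

```python
# Per-GB monthly prices as quoted in the text; real prices vary by region.
PRICE_PER_GB_MONTH = {
    "EBS gp3": 0.08,
    "EFS": 0.30,
    "S3 Standard": 0.023,
}

ONE_PB_IN_GB = 1_000_000  # decimal petabyte


def monthly_cost(gigabytes: float) -> dict[str, float]:
    """Flat-rate storage cost per month for each service, in dollars."""
    return {
        name: round(gigabytes * price, 2)
        for name, price in PRICE_PER_GB_MONTH.items()
    }


if __name__ == "__main__":
    for name, cost in monthly_cost(ONE_PB_IN_GB).items():
        print(f"{name}: ${cost:,.0f} per month")
```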

When to Use Each

Use block storage when:

  • Running a database or other disk-bound data system (PostgreSQL, MySQL, Elasticsearch, Kafka)
  • You need low-latency, random I/O
  • Your application expects a local disk

Use file storage when:

  • Multiple servers need simultaneous read/write access to the same files
  • You're running legacy software that expects POSIX semantics
  • ML training jobs need shared access to large datasets
  • You need file locking or directory-level operations

Use object storage when:

  • Storing user-uploaded media (images, videos, documents)
  • Building a data lake for analytics
  • Serving static assets via CDN
  • Long-term archival and backups
  • Any blob that's written once and read many times
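
The three checklists above can be condensed into a rough decision rule. The function choose_storage and its two flags are hypothetical simplifications for illustration; real decisions also weigh cost, durability requirements, and existing tooling.

```python
def choose_storage(*, shared_filesystem_access: bool,
                   low_latency_random_io: bool) -> str:
    """Return 'file', 'block', or 'object' for a workload.

    shared_filesystem_access: several servers must mount the same files
        with filesystem semantics (locking, directories).
    low_latency_random_io: the workload does small in-place reads and
        writes and expects a local disk (for example, a database).
    """
    if shared_filesystem_access:
        return "file"
    if low_latency_random_io:
        return "block"
    return "object"  # default for write-once, read-many blobs


if __name__ == "__main__":
    print(choose_storage(shared_filesystem_access=False,
                         low_latency_random_io=True))
```

Object storage is the fallthrough case on purpose: unless a workload specifically needs a disk or a shared filesystem, the cheapest and most scalable option is usually the right default.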

A Practical Architecture Pattern

[Figure: Practical architecture (all three storage types working together)]

Most non-trivial systems use all three:

  1. PostgreSQL runs on EBS (block storage) — the database needs low-latency random I/O and full control over disk layout.
  2. User profile pictures are stored in S3 (object storage): cheap, effectively unlimited in capacity, and served to clients through the CloudFront CDN.
  3. Shared ML training datasets live on EFS (file storage) — multiple GPU instances need concurrent read access during training.

The mistake engineers make is trying to use one storage type for everything. Putting images in a database works at small scale and becomes a catastrophe at large scale. Trying to run a database on object storage doesn't work at all — you can't do random in-place writes to an S3 object.
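
A common way to avoid putting images in the database is to store the blob in object storage and keep only its key in a database row. The sketch below shows that split using stand-ins: an in-memory sqlite3 database in place of PostgreSQL and a Python dict in place of an S3 bucket. The table layout, the key format, and the function names are all assumptions for illustration.

```python
import sqlite3
import uuid

object_store: dict[str, bytes] = {}  # stand-in for an object storage bucket


def init_db() -> sqlite3.Connection:
    """Create the metadata table; the row holds a key, never the image."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, avatar_key TEXT)"
    )
    return db


def save_avatar(db: sqlite3.Connection, user_id: int, name: str,
                image: bytes) -> str:
    """Write the blob to the object store, then record its key in the DB."""
    key = f"avatars/{user_id}/{uuid.uuid4().hex}.jpg"
    object_store[key] = image
    db.execute(
        "INSERT OR REPLACE INTO users (id, name, avatar_key) VALUES (?, ?, ?)",
        (user_id, name, key),
    )
    db.commit()
    return key


def load_avatar(db: sqlite3.Connection, user_id: int) -> bytes:
    """Look up the key in the DB, then fetch the blob by key."""
    row = db.execute(
        "SELECT avatar_key FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        raise KeyError(user_id)
    return object_store[row[0]]
```

The database row stays a few dozen bytes no matter how large the image is, so backups, replication, and query performance are unaffected by media growth.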

Summary

Block, file, and object storage solve different problems. Block storage is a raw disk abstraction: low latency, random access, tied to a single machine. Use it for databases and anything that needs OS-level filesystem control. File storage adds a shared filesystem: multiple machines, POSIX-style semantics, and the highest per-GB price of the three at the list prices quoted above. Use it when multiple servers need concurrent file access. Object storage abandons the filesystem model for a flat key-value API: effectively unlimited scale, cheap, HTTP-native, but with no in-place updates within an object. Use it for all blobs: media, backups, artifacts, static assets. In practice, mature systems use all three, matching each storage layer to the access pattern it's designed for.
