Concurrency vs Parallelism
Updated June 3, 2026
Concurrency and parallelism are two of the most conflated terms in computing. Developers use them interchangeably, but they mean fundamentally different things, and mixing them up leads to real architectural mistakes.
The clearest way to understand both is through an analogy you've definitely lived.
The Coffee Shop Analogy
Imagine a coffee shop with one barista.
A customer orders a latte. The barista starts the espresso machine (takes 25 seconds), and while it's brewing, takes the next customer's order, steams milk for another drink, and puts a pastry in the oven.
That's concurrency. One person, multiple tasks in progress, making progress on all of them by switching between them while waiting. No two tasks are worked on at exactly the same instant; they're interleaved.
Now the shop gets busier. The owner hires three more baristas. All four are now working simultaneously — each one physically doing something at the same moment.
That's parallelism. Multiple workers literally executing at the same instant.
The crucial insight: concurrency is about structure, parallelism is about execution. You can have a concurrent system running on a single core (like our solo barista), and you can have parallelism without good concurrency design (four baristas with only one espresso machine between them — a bottleneck).
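The bottleneck case can be sketched in Go. This is a minimal illustration rather than a benchmark: the function serveDrinks and its parameters are hypothetical names, and a sync.Mutex stands in for the single espresso machine. Four goroutines play the baristas, but because every drink needs the machine, at most one of them is ever brewing at a time.

```go
package main

import (
	"fmt"
	"sync"
)

// serveDrinks runs `baristas` goroutines that each make `drinksEach` drinks.
// A single mutex models the one shared espresso machine (an assumption made
// for illustration). It returns the total number of drinks made and the
// largest number of baristas that were ever using the machine at once.
func serveDrinks(baristas, drinksEach int) (total, maxAtMachine int) {
	var machine sync.Mutex // the one espresso machine
	var wg sync.WaitGroup
	atMachine := 0 // only touched while holding the machine

	for b := 0; b < baristas; b++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for d := 0; d < drinksEach; d++ {
				machine.Lock()
				atMachine++
				if atMachine > maxAtMachine {
					maxAtMachine = atMachine
				}
				total++ // "brew" the drink
				atMachine--
				machine.Unlock()
			}
		}()
	}
	wg.Wait()
	return total, maxAtMachine
}

func main() {
	total, maxAtMachine := serveDrinks(4, 250)
	fmt.Println("drinks made:", total)                                 // 1000
	fmt.Println("most baristas at the machine at once:", maxAtMachine) // 1
}
```

Adding more baristas to this sketch would not raise the second number: the shared resource, not the worker count, sets the limit.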
Concurrency: Managing Many Tasks at Once
Concurrency is the design approach of keeping many tasks in progress over the same period, even if you can only advance one of them at any given instant. It's about dealing with lots of things at once.
The key insight is that most tasks spend a lot of time waiting. Waiting for a database query to return. Waiting for a network request. Waiting for disk I/O. During that wait time, a concurrent system can start working on something else.
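The effect of overlapping waits can be sketched in Go. This is an illustration under stated assumptions: fetch is a hypothetical stand-in for a database or network call, time.Sleep simulates the wait, and runtime.GOMAXPROCS(1) restricts Go code to one running thread so that no parallelism is involved.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// fetch is a hypothetical stand-in for an I/O call: it does nothing but wait.
func fetch(waitMs int) {
	time.Sleep(time.Duration(waitMs) * time.Millisecond)
}

// runSequential performs the waits one after another and returns the elapsed time.
func runSequential(waitsMs []int) time.Duration {
	start := time.Now()
	for _, ms := range waitsMs {
		fetch(ms)
	}
	return time.Since(start)
}

// runConcurrent starts every wait in its own goroutine, waits for all of
// them to finish, and returns the elapsed time.
func runConcurrent(waitsMs []int) time.Duration {
	start := time.Now()
	var wg sync.WaitGroup
	for _, ms := range waitsMs {
		wg.Add(1)
		go func(ms int) {
			defer wg.Done()
			fetch(ms)
		}(ms)
	}
	wg.Wait()
	return time.Since(start)
}

func main() {
	runtime.GOMAXPROCS(1) // one thread runs Go code at a time: no parallelism
	waits := []int{50, 50, 50}
	fmt.Println("sequential:", runSequential(waits)) // roughly the sum of the waits
	fmt.Println("concurrent:", runConcurrent(waits)) // roughly the longest wait
}
```

The concurrent run finishes sooner even on a single core because the time saved is waiting time, not computing time.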
Event Loops: Node.js
Figure: Concurrency. Node.js event loop, one thread interleaving many I/O waits.
Node.js runs your JavaScript on a single thread: one call stack, using one core at a time. And yet it handles tens of thousands of simultaneous connections efficiently. How?
The event loop. When Node.js makes a database call, it doesn't block the thread waiting for a response. It registers a callback and moves on to the next request. When the database responds, the callback gets queued and runs when the event loop is free.
This is concurrency without parallelism. One thread, many tasks in flight.
```javascript
// This doesn't block: Node.js handles other requests while waiting
const user = await db.query('SELECT * FROM users WHERE id = ?', [userId]);
```
It works brilliantly for I/O-bound workloads. It breaks down for CPU-bound work: if you're doing heavy computation, you block the event loop and everyone waits.
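The callback-queue mechanism can be modelled in a few lines of Go. This is a toy sketch, not how Node.js is implemented: EventLoop, StartIO, Run, and demo are hypothetical names, helper goroutines stand in for the operating system doing the waiting, and every callback runs on the single goroutine that calls Run.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// EventLoop is a toy model of a single-threaded event loop.
type EventLoop struct {
	queue   chan func()    // callbacks whose I/O has completed
	pending sync.WaitGroup // operations started whose callback has not finished
}

func NewEventLoop() *EventLoop {
	return &EventLoop{queue: make(chan func(), 64)}
}

// StartIO simulates non-blocking I/O: it returns immediately, and once the
// simulated wait is over the callback is queued for the loop to run.
func (l *EventLoop) StartIO(wait time.Duration, callback func()) {
	l.pending.Add(1)
	go func() { // stands in for the OS doing the waiting
		time.Sleep(wait)
		l.queue <- callback
	}()
}

// Run executes queued callbacks one at a time, on the calling goroutine,
// until every started operation has had its callback run.
func (l *EventLoop) Run() {
	go func() {
		l.pending.Wait()
		close(l.queue)
	}()
	for callback := range l.queue {
		callback()
		l.pending.Done()
	}
}

// demo starts three simulated queries and records the order their callbacks run in.
func demo() []string {
	loop := NewEventLoop()
	var order []string // safe without locks: only callbacks on the loop touch it

	// Three "requests" start their I/O back to back; none blocks the others.
	loop.StartIO(100*time.Millisecond, func() { order = append(order, "slow query") })
	loop.StartIO(10*time.Millisecond, func() { order = append(order, "fast query") })
	loop.StartIO(50*time.Millisecond, func() { order = append(order, "medium query") })

	loop.Run()
	return order
}

func main() {
	fmt.Println(demo()) // callbacks run in completion order, not start order
}
```

Because callbacks run one at a time, a callback that computes for a long stretch would delay every callback queued behind it, which is the CPU-bound weakness described above.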
Parallelism: Actually Doing Many Things at Once
Figure: Parallelism. Multiple CPU cores, tasks truly simultaneous.
Parallelism requires multiple CPU cores. Tasks don't take turns — they genuinely run simultaneously on different cores.
This is what you want for CPU-bound work: image processing, video transcoding, machine learning inference, data crunching. When the work splits into independent chunks, adding cores speeds it up close to proportionally; any part that must run serially caps the gain (Amdahl's law).
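A minimal Go sketch of splitting CPU-bound work across cores follows. The workload (counting primes by naive trial division) and the names isPrime and countPrimes are chosen purely for illustration, and the sketch assumes workers is at least 1. Each worker gets its own independent slice of the range and its own result slot, so no locking is needed.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// isPrime is deliberately naive trial division: pure CPU work, no waiting.
func isPrime(n int) bool {
	if n < 2 {
		return false
	}
	for d := 2; d*d <= n; d++ {
		if n%d == 0 {
			return false
		}
	}
	return true
}

// countPrimes counts the primes in [0, limit) by giving each worker one
// independent chunk of the range and summing the per-chunk counts.
// It assumes workers >= 1.
func countPrimes(limit, workers int) int {
	counts := make([]int, workers)           // one result slot per worker
	chunk := (limit + workers - 1) / workers // chunk size, rounded up
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			lo, hi := w*chunk, (w+1)*chunk
			if hi > limit {
				hi = limit
			}
			local := 0
			for n := lo; n < hi; n++ {
				if isPrime(n) {
					local++
				}
			}
			counts[w] = local
		}(w)
	}
	wg.Wait()

	total := 0
	for _, c := range counts {
		total += c
	}
	return total
}

func main() {
	// One worker per logical CPU reported by the runtime.
	fmt.Println(countPrimes(1_000_000, runtime.NumCPU())) // 78498
}
```

The answer is the same for any worker count; only the wall-clock time changes, and only when spare cores are actually available.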
Goroutines: Go's Approach
Go's goroutines are one of the more elegant solutions to getting both concurrency and parallelism. A goroutine is a lightweight thread managed by the Go runtime, not the OS. You can spin up thousands of them cheaply.
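```go
// Spawn a goroutine: extremely lightweight
go func() {
	result := processImage(img)
	channel <- result
}()
```
The Go scheduler multiplexes goroutines onto a pool of OS threads, and the OS schedules those threads onto CPU cores. By default the runtime lets up to one thread per available logical CPU run Go code at once (the GOMAXPROCS setting), so you get concurrent structure and parallel execution across all available cores, automatically.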
OS Threads
Traditional threading models (Java, C++, Python's threading module) map directly to OS threads. These are heavier: each thread reserves its own stack, typically 1-8 MB by default, whereas a goroutine starts with a stack of a few kilobytes that grows as needed. You can't practically run 100,000 OS threads, but you can run 100,000 goroutines.
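The 100,000-goroutine claim is easy to try. The sketch below is an illustration with a hypothetical helper, spawnAndCount: every goroutine parks on a shared channel until all of them have been started, so they are all alive at the same moment before being released.

```go
package main

import (
	"fmt"
	"sync"
)

// spawnAndCount starts n goroutines that all block on a shared channel, so
// all n exist at the same time, then releases them and counts how many ran.
func spawnAndCount(n int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	finished := 0
	release := make(chan struct{})

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-release // park until every goroutine has been started
			mu.Lock()
			finished++
			mu.Unlock()
		}()
	}

	close(release) // wake all goroutines at once
	wg.Wait()
	return finished
}

func main() {
	fmt.Println(spawnAndCount(100000)) // 100000
}
```

Memory use still grows with the number of live goroutines, so this shows that the count is practical, not that it is free.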
The Threading Models Compared
| Model | Language/Platform | I/O-bound | CPU-bound | Overhead |
|---|---|---|---|---|
| Event loop | Node.js, Python asyncio | Excellent | Poor | Very low |
| OS threads | Java, C++ | Good | Good | High |
| Green threads / goroutines | Go, Erlang | Excellent | Excellent | Low |
| Async/await | Python, Rust, C# | Excellent | Poor* | Low |
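*Async/await on its own only interleaves tasks at await points. Python's asyncio runs them on a single thread; C# and Rust runtimes such as Tokio can schedule tasks across a thread pool, but CPU-heavy work still belongs on dedicated worker threads or processes so it doesn't starve other tasks.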
Why This Matters for System Design
Understanding the distinction directly affects architectural decisions:
Choose concurrency (event-driven / async) when:
- Your service makes lots of external calls (databases, APIs, caches)
- You need to handle many simultaneous connections
- Tasks spend most of their time waiting
Choose parallelism when:
- You're doing CPU-intensive computation
- Tasks can be split into independent chunks
- You have more cores available than you're using
Real-world example: A web API server that hits a database is I/O-bound — Node.js or async Python handles this well with minimal threads. A video transcoding service is CPU-bound — you'd want a multi-process or multi-threaded worker fleet that saturates every available core.
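The CPU-bound half of that example can be sketched as a fixed-size worker pool in Go. This is an assumption-laden illustration: transcode is a hypothetical stand-in that just burns CPU cycles, runPool is an illustrative name, and a real service would spread work across processes or machines as well. Workers pull segments from a shared channel, so a fast worker simply takes the next job instead of idling.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// transcode is a hypothetical stand-in for CPU-heavy work on one video
// segment; here it only burns cycles computing a checksum.
func transcode(segment int) int {
	sum := 0
	for i := 0; i < 1_000_000; i++ {
		sum = (sum*31 + segment + i) % 1_000_003
	}
	return sum
}

// runPool processes segments 0..segments-1 with a fixed number of workers
// and returns the results indexed by segment.
func runPool(segments, workers int) []int {
	jobs := make(chan int)
	results := make([]int, segments) // each job writes only its own slot
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for seg := range jobs { // take the next segment until the channel closes
				results[seg] = transcode(seg)
			}
		}()
	}

	for seg := 0; seg < segments; seg++ {
		jobs <- seg
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	workers := runtime.NumCPU() // one worker per logical core
	results := runPool(64, workers)
	fmt.Println("segments transcoded:", len(results), "using", workers, "workers")
}
```

Sizing the pool to the core count is the design choice here: more workers than cores adds scheduling overhead for CPU-bound jobs without adding throughput.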
The mistake to avoid: using parallelism to solve what is actually a concurrency problem. Spinning up 100 threads to handle 100 database requests just wastes memory. An event loop handling all 100 with non-blocking I/O uses a fraction of the resources.
Key insight: Most web services are I/O-bound, not CPU-bound. Good concurrency design usually matters more than raw parallelism for typical backend services.
Summary
Concurrency is about structuring a program to handle many tasks at once — tasks can overlap in time even on a single core, by interleaving during wait periods. Parallelism is about actually executing multiple tasks simultaneously on multiple CPU cores. Node.js achieves concurrency with a single-threaded event loop. Go achieves both with lightweight goroutines mapped to OS threads. For I/O-bound workloads (most web services), good concurrency design is the priority. For CPU-bound workloads, parallelism across cores is what actually speeds things up.