Concurrency vs Parallelism

Updated June 3, 2026

Magic Magnets Team

Concurrency and parallelism are two of the most conflated terms in computing. Developers use them interchangeably, but they mean fundamentally different things — and mixing them up leads to real architectural mistakes.

The clearest way to understand both is through an analogy you've definitely lived.

The Coffee Shop Analogy

Imagine a coffee shop with one barista.

A customer orders a latte. The barista starts the espresso machine (takes 25 seconds), and while it's brewing, takes the next customer's order, steams milk for another drink, and puts a pastry in the oven.

That's concurrency. One person, multiple tasks in progress, making progress on all of them by switching between them while waiting. No task runs at exactly the same instant — they're interleaved.

Now the shop gets busier. The owner hires three more baristas. All four are now working simultaneously — each one physically doing something at the same moment.

That's parallelism. Multiple workers literally executing at the same instant.

The crucial insight: concurrency is about structure, parallelism is about execution. You can have a concurrent system running on a single core (like our solo barista), and you can have parallelism without good concurrency design (four baristas with only one espresso machine between them — a bottleneck).

Concurrency: Managing Many Tasks at Once

Concurrency is a design approach: keep many tasks in progress over the same period, even if only one of them can make progress at any given instant. It's about dealing with lots of things at once, not doing lots of things at once.

The key insight is that most tasks spend a lot of time waiting. Waiting for a database query to return. Waiting for a network request. Waiting for disk I/O. During that wait time, a concurrent system can start working on something else.
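
The idea of using wait time can be sketched in a few lines of JavaScript. This is a minimal illustration, not a real workload: `sleep`, `task`, and `runInterleaved` are hypothetical names, and a timer stands in for a database or network wait.

```javascript
// Simulated I/O: resolves after `ms` milliseconds without occupying the thread.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// One task: start, wait on (simulated) I/O, finish. Records each step in `log`.
async function task(name, waitMs, log) {
  log.push(`${name} start`);
  await sleep(waitMs); // while this task waits, the thread is free for other tasks
  log.push(`${name} done`);
}

// Starts both tasks before either one finishes, all on a single thread.
async function runInterleaved() {
  const log = [];
  await Promise.all([task("latte", 30, log), task("order", 10, log)]);
  return log;
}

runInterleaved().then((log) => console.log(log.join("\n")));
// prints: latte start, order start, order done, latte done (one per line)
```

The second task starts and finishes while the first is still waiting, which is the solo barista taking an order while the espresso brews.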

Event Loops: Node.js

Concurrency is about structure: handling many tasks at once, even on a single core, by interleaving them during wait periods. Node.js runs your JavaScript on a single thread with one call stack, yet it handles tens of thousands of simultaneous connections efficiently. When Node.js makes a database call, it registers a callback and immediately moves on to the next request. When the database responds, the callback is queued and runs once the event loop is free. Tasks don't run simultaneously; they interleave. This is brilliant for I/O-bound work, where most time is spent waiting on the network or disk. It breaks down for CPU-bound work: a heavy synchronous computation blocks the event loop and stalls every other request until it completes.

Concurrency — Node.js event loop, one thread interleaving many I/O waits

Node.js runs your JavaScript on a single thread: one call stack, one core at a time. And yet it handles tens of thousands of simultaneous connections efficiently. How?

The event loop. When Node.js makes a database call, it doesn't block the thread waiting for a response. It registers a callback and moves on to the next request. When the database responds, the callback gets queued and runs when the event loop is free.

This is concurrency without parallelism. One thread, many tasks in flight.

// This doesn't block — Node.js handles other requests while waiting
const user = await db.query('SELECT * FROM users WHERE id = ?', [userId]);

It works brilliantly for I/O-bound workloads. It breaks down for CPU-bound work — if you're doing heavy computation, you block the event loop and everyone waits.
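
To see the blocking effect directly, here is a small sketch that schedules a 0 ms timer and then hogs the thread with a busy loop. The function name `measureTimerDelay` is made up for this example, and the busy loop is only a stand-in for real CPU-bound work.

```javascript
// Measures how late a 0 ms timer fires when synchronous work hogs the thread.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const start = Date.now();
    // Ask the event loop to run this callback as soon as it can.
    setTimeout(() => resolve(Date.now() - start), 0);
    // Simulated CPU-bound work: a busy loop that never yields to the event loop.
    while (Date.now() - start < blockMs) {
      // spin
    }
  });
}

measureTimerDelay(200).then((delay) => {
  console.log(`0 ms timer fired after about ${delay} ms`);
});
```

The timer callback cannot run until the loop returns control, so the reported delay is at least as long as the blocking work. In a server, every pending request would be stuck behind it in the same way.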

Parallelism: Actually Doing Many Things at Once

Parallelism is about execution: multiple tasks literally running at the same instant on multiple CPU cores. This is what you need for CPU-bound work: video transcoding, image processing, machine learning inference, sorting large datasets. When the work splits into independent chunks, adding cores can deliver close to proportional speedup. Go's goroutines are an elegant implementation: they are extremely lightweight (a few KB of stack versus 1-8 MB for OS threads), and the Go scheduler automatically maps them onto all available CPU cores. You can spawn 100,000 goroutines and they'll efficiently use all your cores. The mistake to avoid: using parallelism to solve an I/O-bound problem. Spinning up 1,000 threads to handle 1,000 database requests can reserve up to 8 GB of stack space (at 8 MB per thread), while an async event loop handles the same load on one thread with cheap task switching and a small memory footprint.

Parallelism — multiple CPU cores, tasks truly simultaneous

Parallelism requires multiple CPU cores. Tasks don't take turns — they genuinely run simultaneously on different cores.

This is what you want for CPU-bound work: image processing, video transcoding, machine learning inference, data crunching. When the work divides into independent chunks, adding cores speeds it up close to proportionally; any part that must run serially caps the gain.
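
As a rough sketch of splitting CPU-bound work across cores, the example below uses Node's built-in `worker_threads` module to count primes in independent ranges. The prime-counting task and the names `countPrimesInWorker` and `countPrimesParallel` are assumptions chosen for illustration; a real service would more likely load worker code from its own file than from an inline string.

```javascript
const { Worker } = require("worker_threads");
const os = require("os");

// Worker body kept as a string so the example fits in one file (see eval: true).
const workerSource = `
  const { parentPort, workerData } = require("worker_threads");
  const { from, to } = workerData;
  let count = 0;
  for (let n = from; n < to; n++) {
    let isPrime = n > 1;
    for (let d = 2; d * d <= n; d++) {
      if (n % d === 0) { isPrime = false; break; }
    }
    if (isPrime) count++;
  }
  parentPort.postMessage(count);
`;

// Counts primes in [from, to) on a separate thread.
function countPrimesInWorker(from, to) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { from, to } });
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}

// Splits [0, limit) into one independent chunk per worker and sums the results.
async function countPrimesParallel(limit, workers = Math.max(1, os.cpus().length)) {
  const chunk = Math.ceil(limit / workers);
  const jobs = [];
  for (let i = 0; i < workers; i++) {
    const from = i * chunk;
    jobs.push(countPrimesInWorker(from, Math.min(limit, from + chunk)));
  }
  const counts = await Promise.all(jobs);
  return counts.reduce((a, b) => a + b, 0);
}

countPrimesParallel(200000).then((n) => console.log(`primes below 200000: ${n}`));
```

Each worker runs on its own thread, so on a multi-core machine the chunks can execute at the same instant. The main thread only waits for the results, which keeps its event loop free.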

Goroutines: Go's Approach

Go's goroutines are one of the more elegant solutions to getting both concurrency and parallelism. A goroutine is a lightweight thread managed by the Go runtime, not the OS. You can spin up thousands of them cheaply.

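// Spawn a goroutine: cheap enough to start thousands of them
results := make(chan Result) // Result is a placeholder for processImage's return type
go func() {
    results <- processImage(img) // runs concurrently with the caller
}()
// Receive elsewhere with: result := <-results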

The Go scheduler maps goroutines onto OS threads, which map onto CPU cores. So you get concurrent structure and parallel execution across all available cores, automatically.

OS Threads

Traditional threading models (Java, C++, Python's threading module) map directly to OS threads. These are heavier: each thread reserves its own stack, typically 1-8 MB by default, and is scheduled by the kernel. You can't practically run 100,000 OS threads, but you can run 100,000 goroutines.
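
A back-of-envelope calculation shows why the gap matters. The sketch below assumes 1 MB and 8 MB thread stacks (the range quoted above) and a 2 KB starting stack per goroutine, which is an assumption consistent with "a few KB". The figures are address-space reservations, not necessarily resident memory, since operating systems usually commit stack pages lazily.

```javascript
const KiB = 1024;
const MiB = 1024 * KiB;
const GiB = 1024 * MiB;

// Total stack reservation if each of `count` tasks gets `bytesPerStack`.
function totalStackBytes(count, bytesPerStack) {
  return count * bytesPerStack;
}

const tasks = 100000;
const threadsLowGiB = totalStackBytes(tasks, 1 * MiB) / GiB;  // 1 MB per thread
const threadsHighGiB = totalStackBytes(tasks, 8 * MiB) / GiB; // 8 MB per thread
const goroutinesMiB = totalStackBytes(tasks, 2 * KiB) / MiB;  // assumed 2 KB per goroutine

console.log(`OS threads: ${Math.round(threadsLowGiB)} to ${Math.round(threadsHighGiB)} GiB`);
console.log(`Goroutines: ${Math.round(goroutinesMiB)} MiB`);
```

Under these assumptions, 100,000 OS threads reserve roughly 98 to 781 GiB of stack, while 100,000 goroutines start at about 195 MiB.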

The Threading Models Compared

Model	Language/Platform	I/O-bound	CPU-bound	Overhead
Event loop	Node.js, Python asyncio	Excellent	Poor	Very low
OS threads	Java, C++	Good	Good	High
Green threads / goroutines	Go, Erlang	Excellent	Excellent	Low
Async/await	Python, Rust, C#	Excellent	Poor*	Low

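*Async/await on its own does not speed up CPU-bound work. Python's asyncio runs its event loop on a single thread; C# and multi-threaded Rust runtimes spread tasks over a thread pool, but a long computation inside an async task still ties up a worker, so CPU-heavy work belongs on a dedicated thread pool or worker processes.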

Why This Matters for System Design

Understanding the distinction directly affects architectural decisions:

Choose concurrency (event-driven / async) when:

Your service makes lots of external calls (databases, APIs, caches)
You need to handle many simultaneous connections
Tasks spend most of their time waiting

Choose parallelism when:

You're doing CPU-intensive computation
Tasks can be split into independent chunks
You have more cores available than you're using

Real-world example: A web API server that hits a database is I/O-bound — Node.js or async Python handles this well with minimal threads. A video transcoding service is CPU-bound — you'd want a multi-process or multi-threaded worker fleet that saturates every available core.

The mistake to avoid: using parallelism to solve what is actually a concurrency problem. Spinning up 100 threads to handle 100 database requests just wastes memory. An event loop handling all 100 with non-blocking I/O uses a fraction of the resources.
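
Here is a minimal sketch of the event-loop side of that comparison: 100 simulated queries issued from a single thread. `fakeQuery` is a hypothetical stand-in for a non-blocking database call, with a timer in place of real network latency, and `runQueries` is a name invented for this example.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Issues `count` simulated queries from one thread and reports the peak overlap.
async function runQueries(count, latencyMs) {
  let inFlight = 0;
  let peakInFlight = 0;

  // Hypothetical stand-in for a non-blocking database call.
  async function fakeQuery(id) {
    inFlight++;
    peakInFlight = Math.max(peakInFlight, inFlight);
    await sleep(latencyMs); // waiting costs a timer entry, not a thread
    inFlight--;
    return { id };
  }

  const started = Date.now();
  const ids = Array.from({ length: count }, (_, i) => i);
  const rows = await Promise.all(ids.map((id) => fakeQuery(id)));
  return { rows: rows.length, peakInFlight, elapsedMs: Date.now() - started };
}

runQueries(100, 20).then((stats) => console.log(stats));
```

All 100 queries are in flight at once, and the total time stays close to one query's latency rather than 100 times it, without a single extra thread.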

Key insight: Most web services are I/O-bound, not CPU-bound. Good concurrency design usually matters more than raw parallelism for typical backend services.

Summary

Concurrency is about structuring a program to handle many tasks at once — tasks can overlap in time even on a single core, by interleaving during wait periods. Parallelism is about actually executing multiple tasks simultaneously on multiple CPU cores. Node.js achieves concurrency with a single-threaded event loop. Go achieves both with lightweight goroutines mapped to OS threads. For I/O-bound workloads (most web services), good concurrency design is the priority. For CPU-bound workloads, parallelism across cores is what actually speeds things up.
