Vector Clocks

Updated June 3, 2026
M
Magic Magnets Team
9 min read

Imagine three friends (Alice, Bob, and Charlie) editing a shared document while offline. Alice adds a paragraph. Bob fixes a typo. Charlie deletes a sentence.

When they all reconnect, how do we merge their changes? We need to know: did Bob fix a typo in the paragraph Alice just added? Or was he fixing an older typo, completely unaware of Alice's new paragraph?

Lamport Timestamps can't answer this. To detect who knew what when, we need Vector Clocks.

The Core Concept

While Lamport Timestamps use a single integer to track time across a system, they fail to identify concurrent events (events that happened independently without knowing about each other).

A Vector Clock solves this by keeping an array (a vector) of counters, one for every single node in the system.

Quiz Time

Node A has vector [1,0,0]. It sends a message to Node B whose current vector is [0,2,0]. What is Node B's vector after receiving the message?

How Vector Clocks Work

algobase.dev
A vector clock is an array of counters, one slot per node. On a local event, a node increments only its own slot. When sending a message, it attaches its full vector. On receiving a message, the receiver merges vectors by taking the MAX of each slot, then increments its own slot. This means Node B's clock after receiving from A is MAX([0,0,0],[1,0,0]) + 1 at index B = [1,1,0]. The merge rule propagates causal history: if B received from A, B's clock now "knows" everything A knew. Any event with a lower counter in every slot happened before.
1 / 1

Vector clock update rule — local increment, message piggybacking, MAX merge

Instead of just knowing "I am at time 5," a node with a vector clock knows "I am at time 5, Alice is at time 2, and Bob is at time 4." It keeps a ledger of everyone's progress.

Let's say we have three servers: [A, B, C]. The vector clock looks like an array: [0, 0, 0].

  1. Local Events: When Server A does something, it increments its own slot in the vector. State becomes [1, 0, 0].
  2. Sending Messages: When A sends a message to B, it attaches its entire vector [1, 0, 0].
  3. Receiving Messages: When B receives the message, it does two things:
    • First, it increments its own slot: [1, 1, 0].
    • Second, it merges the incoming vector with its local vector by taking the maximum value for every single slot.

Detecting Causality and Conflict

Because we track everyone's state, we can compare two vector clocks to see what happened.

Let's compare Vector 1 (V1) and Vector 2 (V2).

  • If every number in V1 is less than or equal to the corresponding number in V2, then V1 happened before V2. (V2 knew about everything V1 did).
  • If some numbers in V1 are higher, but some numbers in V2 are higher (e.g., [2, 0, 1] vs [1, 1, 1]), then the events are Concurrent. They happened independently, and this indicates a Conflict.
Quiz Time

Two events have vectors [3,0,1] and [1,2,1]. What is their relationship?

Real-World Examples

algobase.dev
When two replicas accept writes independently (as in DynamoDB during a partition), their vectors diverge. Replica A's vector [2,0] means A has seen 2 events, B has seen 0. Replica B's vector [0,2] means the opposite. Comparing them: A's slot 2 > B's slot 2 in position 0, but A's slot 0 < B's slot 2 in position 1. Neither vector is fully less-than-or-equal-to the other, so these are concurrent writes. No data is lost: the reconciliation step merges both carts, giving {Shoes, Socks}. This is what the Dynamo paper calls "version vectors" — the mechanism behind DynamoDB's eventual consistency model.
1 / 1

Concurrent writes detected — diverged vectors trigger merge, no data lost

  • Amazon DynamoDB: Dynamo (the academic paper that inspired DynamoDB, Cassandra, and Riak) famously uses a variation of Vector Clocks called Version Vectors. If you add "Shoes" to your shopping cart on Server A, and simultaneously add "Socks" to your cart on Server B, the database uses version vectors to detect the conflict when the network heals, ensuring both items end up in your merged cart rather than one overwriting the other.

[!NOTE] Vector Clocks are essential for decentralized, leaderless databases where multiple nodes can accept writes for the same data simultaneously.

The Downside: Bloat

If you have 3 servers, a vector clock is an array of 3 integers. Easy. If you have a massive system with 10,000 clients dynamically joining and leaving, the vector clock becomes an array of 10,000 integers. Attaching a massive array to every single network message causes huge network overhead. Systems often have to use pruning techniques to keep the vectors small.

Summary

  • Vector clocks use an array of counters (one per node) to track causality in a distributed system.
  • Nodes update their own index and merge incoming vectors by taking the max of each slot.
  • They can definitively tell if Event A caused Event B, or if they were Concurrent.
  • They are widely used in distributed databases (like Dynamo and Cassandra) to detect and resolve data conflicts.
  • The main trade-off is the size of the vector payload growing as the number of nodes increases.

How helpful was this content?

Comments

0/2000

Sign in to join the discussion

Saved on this device only

Sign in to sync progress across devices