Lambda Architecture

Updated June 8, 2026
Magic Magnets Team

Systems that process large volumes of data face a persistent tension between accuracy and latency.

Batch processing is accurate. It processes the complete dataset, so aggregations are exact. But a batch job that runs every 12 hours means your data is up to 12 hours stale.

Stream processing is fast. Events are processed within milliseconds of arriving. But handling late-arriving events, exactly-once semantics, and windowed aggregations is hard. Real-time systems often produce approximate results.

Lambda Architecture, proposed by Nathan Marz in 2011, resolves this by running both systems in parallel and merging their outputs at query time.

The Three Layers

Batch Layer

The batch layer is the source of truth. Every incoming event is appended to an immutable master dataset in HDFS or S3. Nothing is ever deleted or modified. On a schedule (every few hours or nightly), a Spark job reads the entire master dataset and recomputes all metrics from scratch. The results go into a batch view store (typically HBase or Cassandra).

Because the batch layer processes the complete historical dataset each run, its results are exact. There are no late-arriving events to worry about. If a record arrived late, it's in the dataset. The next batch run will include it.
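
The recompute-from-scratch idea can be sketched in a few lines. This is a minimal illustration, not a Spark job: the `PageView` record, the `recompute_batch_view` function, and the in-memory list standing in for HDFS or S3 are all assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class PageView(NamedTuple):
    article_id: int
    user_id: int
    timestamp: str  # ISO 8601, UTC


def recompute_batch_view(master_dataset: Iterable[PageView]) -> dict[tuple[int, str], int]:
    """Recount views per (article_id, date) from the full dataset, ignoring prior results."""
    counts: Counter = Counter()
    for event in master_dataset:
        day = event.timestamp[:10]  # "YYYY-MM-DD" prefix of the timestamp
        counts[(event.article_id, day)] += 1
    return dict(counts)


# Append-only list standing in for the immutable master dataset (hypothetical data).
master_dataset = [
    PageView(123, 456, "2026-06-07T09:15:00Z"),
    PageView(123, 789, "2026-06-07T18:40:00Z"),
]
first_run = recompute_batch_view(master_dataset)

# A record for 2026-06-07 arrives late and is appended after the first run.
master_dataset.append(PageView(123, 999, "2026-06-07T23:59:00Z"))

# The next full recomputation includes it with no special late-event handling.
second_run = recompute_batch_view(master_dataset)
```

Because each run starts from the raw events rather than from the previous view, the late record corrects the count for its day automatically on the next run.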

The cost is latency. The batch view can lag the current moment by up to a full batch cycle: the interval between runs plus the time a run takes to finish.

Speed Layer

The speed layer covers the gap between the last batch run and right now. It reads events from the same Kafka stream that feeds the batch layer's master dataset, processes them in real time using Flink or Kafka Streams, and writes results to a low-latency store like Redis.

The speed layer only needs to maintain state for the recent window, not all of history. When the batch layer catches up and recomputes accurate results for a time window, the speed layer can discard its data for that window. The batch layer's results supersede the speed layer's approximations.
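
One way to picture this bounded state is a set of per-hour counters that are dropped once the batch layer has covered them. The sketch below is plain Python rather than Flink or Redis, and the `SpeedLayer` class, its hourly buckets, and the `discard_through` method are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class SpeedLayer:
    """In-memory stand-in for a real-time view (for example, counters in Redis)."""

    def __init__(self) -> None:
        # counts[hour_bucket][article_id] -> views seen in that hour
        self.counts = defaultdict(lambda: defaultdict(int))

    def on_event(self, article_id: int, timestamp: str) -> None:
        hour_bucket = timestamp[:13]  # "YYYY-MM-DDTHH" prefix of an ISO 8601 UTC timestamp
        self.counts[hour_bucket][article_id] += 1

    def discard_through(self, batch_covered_until: str) -> None:
        """Drop every hourly bucket that ends at or before the batch layer's coverage point."""
        boundary = batch_covered_until[:13]
        # ISO 8601 UTC prefixes of equal length sort chronologically as strings.
        for bucket in [b for b in self.counts if b < boundary]:
            del self.counts[bucket]


speed = SpeedLayer()
speed.on_event(123, "2026-06-07T23:50:00Z")
speed.on_event(123, "2026-06-08T00:05:00Z")
speed.on_event(123, "2026-06-08T14:23:00Z")

# The batch layer reports that it now covers everything before midnight.
speed.discard_through("2026-06-08T00:00:00Z")
remaining_buckets = sorted(speed.counts)
```

After the discard, only the buckets from midnight onward remain, so the state held by the speed layer is bounded by the length of the batch cycle rather than by the size of history.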

Because the speed layer can't wait for late-arriving events before producing results, it may produce slightly inaccurate results for recent time windows. This is acceptable: users get a fast, approximately correct answer now, and a precise answer once the batch layer processes it.
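
A small simulation shows where the inaccuracy comes from. It assumes a deliberately simple speed layer that uses five-minute windows and drops any event older than the window it currently has open; real engines offer watermarks and allowed lateness to soften this. The event data and function names are hypothetical.

```python
from collections import Counter


def window_start(minute: str, size: int = 5) -> str:
    """Map an "HH:MM" event time to the start of its tumbling window."""
    hours, minutes = map(int, minute.split(":"))
    return f"{hours:02d}:{minutes - minutes % size:02d}"


def speed_layer_counts(events_in_arrival_order):
    """Count per window as events arrive, dropping events for windows already closed."""
    emitted = {}
    open_window, open_count = None, 0
    for event_minute in events_in_arrival_order:
        w = window_start(event_minute)
        if open_window is not None and w < open_window:
            continue  # late event: its window was already closed and emitted
        if w != open_window:
            if open_window is not None:
                emitted[open_window] = open_count
            open_window, open_count = w, 0
        open_count += 1
    if open_window is not None:
        emitted[open_window] = open_count
    return emitted


def batch_counts(all_events):
    """Count per window over the complete dataset, regardless of arrival order."""
    return dict(Counter(window_start(m) for m in all_events))


# Event times in the order they arrive; "14:03" shows up after the 14:00 window closed.
arrivals = ["14:01", "14:02", "14:07", "14:03"]
approximate = speed_layer_counts(arrivals)
exact = batch_counts(arrivals)
```

Here the speed layer reports 2 views for the 14:00 window while the batch recount finds 3, which is the kind of small, temporary discrepancy the batch layer later corrects.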

Serving Layer

The serving layer handles queries. When a client asks "How many orders did we process in the last 24 hours?", the serving layer:

  1. Reads the batch view for the time range covered by the last batch run (exact results)
  2. Reads the real-time view for the gap between the batch run and now (approximate results)
  3. Merges the two results and returns the combined answer

The serving layer needs to understand the boundary between "covered by the last batch run" and "still only in the speed layer."
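
The three steps and the boundary check might look like the following sketch. It assumes both views are keyed by hourly bucket and that the batch view records an exclusive `covered_until` timestamp; `BatchView`, `query_orders`, and the sample counts are hypothetical, not part of any particular serving system.

```python
from dataclasses import dataclass


@dataclass
class BatchView:
    covered_until: str             # exclusive upper bound (ISO 8601 UTC) of the last batch run
    hourly_counts: dict[str, int]  # hour bucket -> exact count


def query_orders(batch: BatchView, realtime_hourly: dict[str, int], start: str, end: str) -> int:
    """Count orders in [start, end), reading each hour from exactly one view."""
    boundary = batch.covered_until[:13]
    lo, hi = start[:13], end[:13]
    # Step 1: exact counts for hours the batch run covers.
    exact = sum(n for hour, n in batch.hourly_counts.items() if lo <= hour < min(hi, boundary))
    # Step 2: approximate counts for hours after the batch boundary.
    approx = sum(n for hour, n in realtime_hourly.items() if max(lo, boundary) <= hour < hi)
    # Step 3: merge.
    return exact + approx


batch = BatchView(
    covered_until="2026-06-08T00:00:00Z",
    hourly_counts={"2026-06-07T22": 40, "2026-06-07T23": 35},
)
# The real-time view still holds a stale bucket for 23:00 that has not been discarded yet.
realtime = {"2026-06-07T23": 33, "2026-06-08T00": 12, "2026-06-08T01": 9}

total = query_orders(batch, realtime, "2026-06-07T22:00:00Z", "2026-06-08T02:00:00Z")
```

Filtering both views against the same boundary is what prevents double counting: the stale 23:00 bucket in the real-time view is ignored because the batch view already covers that hour.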

Lambda architecture: every incoming event is written to both the batch layer (HDFS) and processed by the speed layer (Flink) simultaneously. The batch job runs on the full dataset periodically to produce exact aggregates. The speed layer tracks only the recent delta since the last batch run. The serving layer merges both views to answer queries.

A Concrete Example

Consider a web analytics system tracking page views per article.

Incoming event: { article_id: 123, user_id: 456, timestamp: 2026-06-08T14:23:00Z }

This event is written to Kafka.

Batch layer (running at midnight): reads every event up to midnight, counts page views per article, and stores the exact total so far: { article_id: 123, as_of: "2026-06-08T00:00:00Z", views: 42,891 }

Speed layer (real-time): counts views since the last batch run (since midnight). At 2 PM, it has: { article_id: 123, views_since_midnight: 12,304 }

Serving layer query at 2 PM: "How many views has article 123 received in total?"

It combines the midnight batch result (exact) with the speed layer count (approximate): returns 42,891 + 12,304 = 55,195.

At midnight, the next batch run processes all of today's data. The batch view is updated with the exact total. The speed layer resets. The approximate results from the speed layer are discarded.
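
The cycle in this example can be traced with the article's own numbers. In the sketch below the two dictionaries stand in for the batch and real-time stores, the function names are hypothetical, and the figure passed to `finish_batch_run` is an invented placeholder for whatever exact total the next run would produce.

```python
batch_view = {123: 42_891}     # exact count as of the midnight batch run
realtime_view = {123: 12_304}  # approximate count since that run


def total_views(article_id: int) -> int:
    """Serving-layer merge: exact batch count plus the speed layer's delta."""
    return batch_view.get(article_id, 0) + realtime_view.get(article_id, 0)


def finish_batch_run(exact_totals: dict[int, int]) -> None:
    """Publish the new batch view and reset the speed layer's counts."""
    batch_view.clear()
    batch_view.update(exact_totals)
    realtime_view.clear()


answer_at_2pm = total_views(123)  # 42,891 + 12,304

# Next midnight: the batch run publishes an exact total (hypothetical figure).
finish_batch_run({123: 73_540})
answer_after_run = total_views(123)
```

After the run, the answer comes entirely from the batch view until new events start accumulating in the speed layer again.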

The Operational Problem

Lambda Architecture works. It delivers low latency and eventual accuracy. But it has a serious operational cost: you write every aggregation twice.

The batch job that computes "views per article" is a Spark SQL query. The speed layer job that computes the same metric is a Flink stateful aggregation. These are different codebases with different deployment pipelines, monitoring setups, and debugging workflows. They must produce consistent results when merged, yet they are written against different APIs with different semantics.

When requirements change ("also count views from mobile apps"), you update both systems. When there's a bug in the aggregation logic, you fix it in both systems. The engineering overhead is significant.
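
The drift risk is easy to reproduce in miniature. The sketch below uses plain Python stand-ins for the two jobs (a one-pass count in place of Spark SQL, a per-event state update in place of Flink); the function and class names, the `platform` field, and the sample events are assumptions made for illustration.

```python
from collections import Counter

COUNTED_PLATFORMS = {"web"}  # filter used by the batch implementation


def batch_views(events):
    """Batch style: one declarative pass over the whole dataset."""
    return Counter(e["article_id"] for e in events if e["platform"] in COUNTED_PLATFORMS)


class StreamingViews:
    """Stream style: state updated one event at a time."""

    def __init__(self):
        self.state = Counter()

    def process(self, event):
        if event["platform"] == "web":  # the same rule, written a second time
            self.state[event["article_id"]] += 1


events = [
    {"article_id": 123, "platform": "web"},
    {"article_id": 123, "platform": "mobile"},
    {"article_id": 7, "platform": "web"},
]

stream = StreamingViews()
for e in events:
    stream.process(e)
in_sync = batch_views(events) == stream.state

# Requirement change: also count mobile views. Only the batch job is updated.
COUNTED_PLATFORMS.add("mobile")
drifted = batch_views(events) != stream.state
```

Both implementations agree until the rule changes in one place only, after which the merged answer would mix two different definitions of a "view". A parity check like this, run on a shared sample of events, is one way teams might catch such drift early.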

This is not a minor inconvenience. Teams that ran Lambda Architecture at scale (Twitter, Netflix) reported that maintaining two parallel codebases was a major drag on their data engineering teams.

When Lambda Architecture Made Sense

Lambda Architecture was the right answer circa 2014-2018 when:

  • Stream processing engines were immature and couldn't guarantee exactly-once semantics reliably
  • Reprocessing historical data through a stream processor was not practical at scale
  • The tools for doing both in one place didn't exist

Today, Flink provides exactly-once state semantics and can replay Kafka history for backfill. Modern engines like Flink and Spark Structured Streaming have narrowed the gap between what a stream processor and a batch job can deliver. The Kappa Architecture (covered in the next article) builds on this by eliminating the batch layer entirely, which removes the need to maintain the same logic twice.

Lambda Architecture is still relevant when:

  • Your batch layer uses specialized tools that can't be replicated in a streaming engine (e.g., graph algorithms, ML training runs)
  • You need historical recomputation to run faster than your stream processor can replay events
  • Regulatory requirements mandate a batch reconciliation pass over the complete dataset

Summary

Lambda Architecture solves the batch-vs-stream trade-off by running both simultaneously. The batch layer recomputes exact results over the complete dataset on a schedule. The speed layer covers recent events with approximate results. The serving layer merges both views for each query. The result is low latency and eventual accuracy. The cost is operational: the same logic must be maintained in two separate systems, which is complex and error-prone. Modern streaming engines have reduced the need for this pattern, but it remains valid for specialized workloads that genuinely require both.

