WebRTC

Updated June 6, 2026
Magic Magnets Team

Have you ever wondered how video calls on Discord or Google Meet are so fast? If you're building a video chat app, the obvious approach is to send video data from User A to your server, and then from your server to User B.

But sending heavy video data through a middleman adds latency. It's like passing notes through a teacher instead of whispering directly to your friend. It also costs server bandwidth for every minute of video.

Enter WebRTC (Web Real-Time Communication).

What is WebRTC?

WebRTC is an open standard that allows browsers and mobile apps to communicate directly with each other — peer-to-peer (P2P). Instead of bouncing data through a central server, WebRTC establishes a direct connection between two devices. Once connected, they stream audio, video, and arbitrary data directly with very low latency.

How WebRTC Works: Setup Requires Servers

Even though the final connection is peer-to-peer, you still need servers to set it up. There are three components:

1. Signaling

WebRTC doesn't specify how devices should find each other; you build this part yourself. A signaling server (typically WebSocket-based) relays Session Description Protocol (SDP) offers and answers between peers, along with the network address candidates each side discovers. The SDP exchange is the two sides agreeing on which media formats and codecs they both support.
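The relay role of a signaling server can be sketched in a few lines. This is a minimal in-memory illustration, not a real server: the `SignalingRelay` class, the message fields, and the short codec lists standing in for full SDP are all assumptions made for the example. A real deployment would carry these messages over WebSockets or a similar transport.

```python
from collections import defaultdict


class SignalingRelay:
    """Hypothetical in-memory stand-in for a signaling server.

    It only forwards messages between named peers; it never touches media.
    """

    def __init__(self):
        # peer id -> list of messages waiting for that peer
        self.inboxes = defaultdict(list)

    def send(self, sender, recipient, kind, payload):
        self.inboxes[recipient].append(
            {"from": sender, "type": kind, "payload": payload}
        )

    def receive(self, peer):
        # Hand over the pending messages and empty the inbox.
        messages, self.inboxes[peer] = self.inboxes[peer], []
        return messages


def pick_common_codecs(offered, supported):
    """Keep the offerer's preference order, drop codecs the answerer lacks."""
    return [codec for codec in offered if codec in supported]


relay = SignalingRelay()

# Peer A offers the codecs it supports (a simplified stand-in for an SDP offer).
relay.send("A", "B", "offer", ["VP8", "H264", "AV1"])

# Peer B reads the offer and answers with the subset it can also handle.
offer = relay.receive("B")[0]
answer_codecs = pick_common_codecs(offer["payload"], {"H264", "VP8"})
relay.send("B", "A", "answer", answer_codecs)

answer = relay.receive("A")[0]
print(answer["type"], answer["payload"])
```

Note that the relay never inspects the payloads. That is why WebRTC can leave signaling unspecified: any channel that delivers the offer and answer intact will do.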

2. STUN (Session Traversal Utilities for NAT)

Most devices sit behind routers and don't know their public IP addresses. A STUN server echoes back: "Here is your public IP and port." The device then shares this public address with its peer via the signaling server.
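To make the "echo" concrete, here is a rough sketch of the two STUN messages involved, following the wire format in RFC 5389. The function names are made up for this example, only IPv4 is handled, and nothing is sent over the network: the encoder plays the server's role so the decoder has something to parse.

```python
import ipaddress
import os
import struct

MAGIC_COOKIE = 0x2112A442        # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001         # message type the client sends
XOR_MAPPED_ADDRESS = 0x0020      # attribute carrying the public address


def build_binding_request(transaction_id=None):
    """Return a 20-byte STUN Binding Request with no attributes."""
    transaction_id = transaction_id or os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + transaction_id


def encode_xor_mapped_ipv4(ip, port):
    """Encode an XOR-MAPPED-ADDRESS attribute the way a server would."""
    x_port = port ^ (MAGIC_COOKIE >> 16)
    x_addr = int(ipaddress.IPv4Address(ip)) ^ MAGIC_COOKIE
    value = struct.pack("!BBHI", 0, 0x01, x_port, x_addr)
    return struct.pack("!HH", XOR_MAPPED_ADDRESS, len(value)) + value


def decode_xor_mapped_ipv4(attribute):
    """Recover (ip, port) from an IPv4 XOR-MAPPED-ADDRESS attribute."""
    attr_type, length = struct.unpack("!HH", attribute[:4])
    if attr_type != XOR_MAPPED_ADDRESS or length != 8:
        raise ValueError("not an IPv4 XOR-MAPPED-ADDRESS attribute")
    _, family, x_port, x_addr = struct.unpack("!BBHI", attribute[4:12])
    if family != 0x01:
        raise ValueError("only IPv4 is handled in this sketch")
    port = x_port ^ (MAGIC_COOKIE >> 16)
    ip = str(ipaddress.IPv4Address(x_addr ^ MAGIC_COOKIE))
    return ip, port


request = build_binding_request()
reply_attribute = encode_xor_mapped_ipv4("203.0.113.7", 50000)
print(len(request), decode_xor_mapped_ipv4(reply_attribute))
```

The address is XOR-ed with the magic cookie so that NAT devices that rewrite raw IP addresses inside packets do not corrupt the answer on its way back.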

3. TURN (Traversal Using Relays around NAT)

Sometimes corporate firewalls or strict NATs block direct peer-to-peer connections. When no direct path works, WebRTC falls back to a TURN server, which relays data between the peers. It's no longer true P2P, but it lets the call connect in almost every network environment.

Before two peers can talk directly, they need to find each other. The signaling server handles this introduction phase. Peer A sends an SDP offer — a description of the codecs and media formats it supports. The signaling server forwards this to Peer B. Peer B responds with an SDP answer (confirming which formats work). Both peers also ask a STUN server for their public IP address, since most devices sit behind NAT and only know their private address. Once both sides have exchanged session descriptions and discovered their public addresses, the signaling server's job is done. It is no longer involved in the call.

WebRTC signaling — peers exchange SDP offers and discover public addresses via STUN

Direct P2P and TURN Fallback

Once peers have exchanged session descriptions and public addresses through signaling, WebRTC tries to establish a direct UDP connection between them. The ICE (Interactive Connectivity Establishment) protocol tries several candidate paths and picks the best one.
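One piece of how ICE orders its candidate paths can be shown with the priority formula from RFC 8445. The sketch below uses the type preferences that RFC recommends; the candidate addresses are invented documentation addresses, and the helper name is an assumption. Real ICE ranks pairs of local and remote candidates and only uses a pair that passes connectivity checks, so treat this as an illustration of the ordering, not of the full protocol.

```python
# Type preferences recommended by RFC 8445; higher means "try this first".
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(candidate_type, local_preference=65535, component_id=1):
    """priority = 2^24 * type pref + 2^8 * local pref + (256 - component id)."""
    return (
        (TYPE_PREFERENCE[candidate_type] << 24)
        + (local_preference << 8)
        + (256 - component_id)
    )


# Hypothetical candidates gathered by one peer.
candidates = [
    {"type": "relay", "address": "198.51.100.20:3478"},   # via a TURN server
    {"type": "host", "address": "192.168.1.5:50000"},     # local interface
    {"type": "srflx", "address": "203.0.113.7:50000"},    # learned from STUN
]

ranked = sorted(
    candidates, key=lambda c: candidate_priority(c["type"]), reverse=True
)
print([c["type"] for c in ranked])
```

Direct paths sort ahead of the relayed one, which matches the behavior described above: TURN is tried last because it is the most expensive route.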

For most connections (roughly 80-85%), direct P2P succeeds. For the rest, TURN relay is the fallback.

When both peers can reach each other's public address, WebRTC establishes a direct UDP connection — the peer-to-peer path. No server is in the middle. Audio and video frames travel directly from one device to the other, achieving sub-500ms latency and consuming no server bandwidth. However, about 15-20% of connections fail to go direct because corporate firewalls or symmetric NATs block the path. In those cases, WebRTC falls back to a TURN server that relays packets between peers. TURN servers do consume bandwidth, so providers like Twilio and Agora charge for TURN usage. The ICE protocol (Interactive Connectivity Establishment) handles the negotiation between direct, STUN-assisted, and TURN paths automatically.

P2P connection or TURN fallback — WebRTC picks the best available path automatically

Real-World Examples

WebRTC isn't just for video calls. Its low-latency data channels make it versatile:

  • Google Meet and Zoom (Web): Real-time video and audio streaming. Large group calls typically use an SFU (Selective Forwarding Unit) topology instead of pure P2P, since pure P2P breaks down with more than 4-5 participants.
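  • Collaborative apps with live cursors (in the style of Figma's multiplayer): WebRTC data channels can carry high-frequency cursor updates. Configured as unordered and unreliable, a data channel avoids the head-of-line blocking of a TCP-based WebSocket, so a dropped packet doesn't stall the updates behind it.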
  • Screen sharing tools: Low-latency video frames sent directly between devices without server round-trips.
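The claim that a dropped packet doesn't stall the stream can be illustrated with a toy timing model. The numbers are invented: five cursor updates arrive 10 ms apart, the second one is lost, and a reliable transport is assumed to redeliver it at 130 ms. In the browser API, the unordered behavior corresponds to creating a data channel with `ordered: false` and `maxRetransmits: 0`.

```python
def ordered_reliable_delivery(arrival_ms):
    """Each packet reaches the app only after all earlier packets have."""
    delivered, latest = [], 0
    for arrival in arrival_ms:
        latest = max(latest, arrival)
        delivered.append(latest)
    return delivered


def unordered_unreliable_delivery(arrival_ms, lost):
    """Packets reach the app on arrival; lost ones are simply skipped."""
    return [None if i in lost else t for i, t in enumerate(arrival_ms)]


first_try = [20, 30, 40, 50, 60]          # arrival times if nothing is lost
with_retransmit = [20, 130, 40, 50, 60]   # packet 1 lost, resent, arrives late

print(ordered_reliable_delivery(with_retransmit))
print(unordered_unreliable_delivery(first_try, lost={1}))
```

In the ordered case every update after the lost one waits until 130 ms. In the unordered case one stale cursor position is missing and the rest arrive on time, which is usually the better trade for data that is overwritten a few milliseconds later anyway.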

Why Use WebRTC

  • Very low latency: By bypassing intermediary servers and using UDP, WebRTC achieves sub-500ms latency.
  • Bandwidth savings: For a 1-on-1 call, your servers don't pay for video stream bandwidth.
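  • Mandatory encryption: WebRTC encrypts all media and data by default (SRTP keyed via DTLS for audio and video, DTLS for data channels). On a direct P2P call that is end-to-end; when an SFU is in the path, the server decrypts and re-encrypts the streams it forwards.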

The Trade-offs

Complex setup: Building signaling servers and deploying STUN/TURN infrastructure is notoriously difficult to get right. Most teams use a managed WebRTC service (Twilio, Agora, Daily) rather than operating their own.

Group calls require an SFU: Pure P2P breaks down quickly with more than 4-5 participants. In a 10-person call, each device would need to upload 9 separate video streams (and download 9 more), which destroys bandwidth and battery life. With a Selective Forwarding Unit (SFU), each device uploads a single stream to the server, which forwards it to the other participants while keeping latency low.
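The stream counts behind this trade-off are easy to work out. A small sketch, with function names chosen for this example and one video stream assumed per participant:

```python
def mesh_load(participants):
    """Pure P2P: every device exchanges a stream with every other device."""
    others = participants - 1
    return {"uploads": others, "downloads": others}


def sfu_load(participants):
    """SFU: each device uploads once; the server forwards to everyone else."""
    return {"uploads": 1, "downloads": participants - 1}


for n in (2, 5, 10):
    print(n, "mesh:", mesh_load(n), "sfu:", sfu_load(n))
```

Downloads are the same in both topologies. The saving is on the upload side, which is typically the scarcer direction on home and mobile connections.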

TURN costs money: TURN servers carry actual video bandwidth, so providers charge for relayed traffic, often metered by data volume.
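A back-of-the-envelope estimate shows how the relay share turns into data volume. The 15-20% share comes from the figures above; the 1.5 Mbps average bitrate per direction and the 10,000 call-minutes are assumptions picked for illustration, so substitute your own measurements.

```python
def relayed_gigabytes(call_minutes, relay_share, mbps_per_direction=1.5):
    """Estimate data volume passing through TURN for 1-on-1 calls.

    mbps_per_direction is an assumed average bitrate, not a measured figure.
    """
    seconds = call_minutes * 60
    megabits = seconds * mbps_per_direction * 2   # both directions are relayed
    gigabytes = megabits / 8 / 1000               # megabits -> megabytes -> gigabytes
    return gigabytes * relay_share


for share in (0.15, 0.20):
    print(f"{share:.0%} relayed: {relayed_gigabytes(10_000, share):.2f} GB")
```

Under these assumptions, 10,000 minutes of 1-on-1 calls put roughly 34 to 45 GB through the relay, while the remaining calls cost no media bandwidth at all.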

Summary

WebRTC enables browsers and devices to communicate directly, peer-to-peer. It requires a signaling server to coordinate the initial connection, STUN servers to discover public addresses, and TURN servers to relay traffic when firewalls or NATs block a direct path. Once connected, it provides very low latency for video, audio, and data, which makes it the foundation of real-time communication apps like Discord and Google Meet. The main operational challenge is infrastructure complexity: signaling servers, STUN/TURN deployment, and SFU architecture for group calls.
