
Understanding WebRTC State Machines

Specific transports, aggregate states, and the Chrome/libwebrtc mental model

WebRTC is one of those technologies that appears deceptively simple on the surface. You call createOffer(), exchange some SDP, and suddenly two browsers are streaming video to each other. But beneath that simplicity lies a set of interlocking state machines that govern every aspect of the connection's lifecycle—from signaling and ICE candidate exchange to DTLS handshake completion.

If you've ever stared at a WebRTC debugging log wondering why connectionState reads "connecting" while iceConnectionState says "connected", this article is for you. The key insight, and the one the Chrome/libwebrtc codebase is built around, is that WebRTC state is organized into two distinct tiers: specific state machines that live on individual transports, and aggregate states on RTCPeerConnection that are derived—computed—from those transports.

This article walks through each layer in detail, explains how the aggregate states are calculated, and ties it all together with a narrative of a typical connection's lifetime.


The Two-Tier Architecture

The easiest mental model for reasoning about Chrome's WebRTC behavior is to think of it in two layers.

Specific state machines live on individual transports. Each ICE transport, each DTLS transport, and the SCTP/DataChannel layer maintain their own state, tracking their own progress through their respective protocols independently.

Aggregate states live on the RTCPeerConnection object and have no independent transition logic of their own. Instead, they are computed from the set of currently-relevant transports (the active transceivers plus the optional SCTP transport). Think of them as dashboards—read-only views that reduce the complexity of multiple transports into a single summary value using well-defined precedence rules.

This distinction matters enormously for debugging. When an aggregate state like connectionState reports "failed", the question is never "why did connectionState fail?"—it doesn't fail on its own. The question is "which specific transport entered the failed state, and why?"
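The specific transports are reachable from script, so that drill-down can be automated. The sketch below assumes a browser environment and an already-created RTCPeerConnection named pc; it walks the transceivers and prints each transport's state next to the aggregates derived from them.

```typescript
// A minimal sketch: list the specific transport states that the aggregate
// values on RTCPeerConnection are derived from. `pc` is assumed to be an
// existing, negotiating or connected peer connection.
function dumpTransportStates(pc: RTCPeerConnection): void {
  for (const transceiver of pc.getTransceivers()) {
    const dtls = transceiver.sender.transport;   // RTCDtlsTransport | null
    const ice = dtls?.iceTransport;              // RTCIceTransport | undefined
    console.log(
      `mid=${transceiver.mid}`,
      `ice=${ice?.state ?? "none"}`,
      `dtls=${dtls?.state ?? "none"}`,
    );
  }
  // The SCTP transport (data channels), if present, also contributes.
  if (pc.sctp) {
    console.log(`sctp=${pc.sctp.state}`, `sctp-dtls=${pc.sctp.transport.state}`);
  }
  console.log(
    `aggregates: iceConnectionState=${pc.iceConnectionState}`,
    `connectionState=${pc.connectionState}`,
  );
}
```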


The Specific State Machines

SDP/JSEP Signaling State

The signalingState property tracks the progress of the offer/answer exchange—and nothing else. It is a common misconception that signaling state reflects connectivity. It does not. A peer connection can be in signalingState: "stable" and have no working media path whatsoever; "stable" simply means that the last offer/answer round-trip has completed successfully.

The signaling states are:

  • stable: No offer/answer in progress; the default and resting state.
  • have-local-offer: A local offer has been set, awaiting a remote answer.
  • have-remote-offer: A remote offer has been received, awaiting a local answer.
  • have-local-pranswer: A local provisional answer has been set.
  • have-remote-pranswer: A remote provisional answer has been received.
  • closed: The peer connection has been shut down.

Every state can transition to closed when close() is called. The key transitions to remember are: calling setLocalDescription(offer) from stable moves to have-local-offer, and receiving the remote answer returns to stable. The same pattern holds symmetrically for the answering side.
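As a rough sketch of the offerer's side, the snippet below logs signalingstatechange around a local offer. The SignalingChannel interface is a hypothetical stand-in for whatever your application uses to move SDP between peers.

```typescript
// Hypothetical signaling channel; replace with your own transport.
interface SignalingChannel {
  send(description: RTCSessionDescriptionInit): void;
}

async function sendOffer(pc: RTCPeerConnection, signaling: SignalingChannel): Promise<void> {
  pc.addEventListener("signalingstatechange", () => {
    console.log("signalingState:", pc.signalingState);
  });

  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);   // stable → have-local-offer
  signaling.send(offer);
}

// Later, when the remote answer arrives over the signaling channel:
//   await pc.setRemoteDescription(answer);   // have-local-offer → stable
```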

Key Takeaway: signalingState is purely about offer/answer progress. It tells you nothing about whether ICE has found a path or DTLS has completed its handshake.

ICE Transport State

The RTCIceTransport.state property is where the real connectivity story lives. Each ICE transport independently manages candidate pair checks, consent verification, and failure detection.

  • new: The transport exists but has not begun checking candidates.
  • checking: Candidate pair checks are underway.
  • connected: A viable pair has been selected, though additional checks may still run.
  • completed: Gathering is done, end-of-candidates has been signaled, and a final pair is selected.
  • disconnected: A transient loss of connectivity; the transport is still trying to recover.
  • failed: All checks are exhausted and no working pair could be established.
  • closed: The transport has been shut down.

What makes the ICE state machine interesting—and occasionally surprising—are its back edges. Unlike a simple linear progression, ICE can move backwards:

  • connected → checking: When consent is revoked on the active pair, the transport drops back to re-check alternatives.
  • completed → checking: An ICE restart (triggered by renegotiation) resets a completed transport back to checking.
  • connected → disconnected: A transient network interruption moves the transport to disconnected, from which it may recover or eventually fail.
  • disconnected → checking: If new candidate pairs become available during a disconnected period, the transport re-enters checking.

These back edges are critical for understanding connection lifetime. A connection that appears "stable" in the completed state can regress to checking during an ICE restart, and a brief network blip can trigger a disconnected → checking → connected cycle without the user ever noticing.
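One way to see these back edges in practice is to subscribe to statechange on each RTCIceTransport and log the transition rather than just the new value. A minimal sketch, assuming the transports already exist (i.e., after setLocalDescription):

```typescript
// Log per-transport ICE transitions, including back edges such as
// connected → checking (consent revoked) or completed → checking (ICE restart).
function watchIceTransports(pc: RTCPeerConnection): void {
  for (const transceiver of pc.getTransceivers()) {
    const ice = transceiver.sender.transport?.iceTransport;
    if (!ice) continue;
    let previous = ice.state;
    ice.addEventListener("statechange", () => {
      console.log(`mid=${transceiver.mid} ICE: ${previous} → ${ice.state}`);
      previous = ice.state;
    });
  }
}
```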

Debugging Tip: The disconnected state is transient by design. If you see it in logs, wait before concluding there's a problem—ICE may be in the process of recovering. The failed state is the terminal one to watch for.

DTLS Transport State

The RTCDtlsTransport.state property tracks the DTLS handshake that secures the media path. It is comparatively simple—a mostly linear progression from new through connecting to connected, with failed as the error terminal.

  • new: The DTLS handshake has not started.
  • connecting: The DTLS handshake is in progress.
  • connected: The handshake completed and the fingerprint was verified.
  • failed: The handshake failed (e.g., fingerprint mismatch, DTLS alert).
  • closed: The transport has been shut down via close_notify or peer connection closure.

Failure in DTLS is almost always a security-related event: a certificate fingerprint mismatch (which can indicate a man-in-the-middle attempt or an SDP error), a DTLS alert, or a timeout during the handshake. Unlike ICE, DTLS has no recovery path from failure—once it fails, the transport is done.
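A small sketch for surfacing these failures: RTCDtlsTransport exposes a statechange event and an error event carrying an RTCError, whose errorDetail distinguishes, for example, a received DTLS alert from a fingerprint mismatch. The dtls parameter below is assumed to come from transceiver.sender.transport.

```typescript
// Surface DTLS state changes and errors for a single transport.
function watchDtls(dtls: RTCDtlsTransport): void {
  dtls.addEventListener("statechange", () => {
    console.log("DTLS state:", dtls.state);
  });
  dtls.addEventListener("error", (event) => {
    // event.error is an RTCError; errorDetail tells the failure class apart.
    const error = (event as RTCErrorEvent).error;
    console.warn("DTLS error:", error.errorDetail, error.message);
  });
}
```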


The Aggregate (Derived) States

The aggregate states on RTCPeerConnection exist as a convenience for application developers. Rather than requiring you to iterate over every transport and compute the overall status yourself, the spec defines derivation rules that collapse the individual transport states into summary values. There are three aggregate states to understand.

Aggregate ICE Connectivity: iceConnectionState

The iceConnectionState property is computed from the set of all currently-relevant ICE transports using the following precedence rules, evaluated in order:

  1. closed — if the peer connection itself is closed.
  2. failed — if any ICE transport is in the failed state.
  3. disconnected — if any ICE transport is disconnected.
  4. new — if all ICE transports are new or closed (or none exist).
  5. checking — if any ICE transport is new or checking.
  6. completed — if all ICE transports are completed or closed.
  7. connected — if all ICE transports are connected, completed, or closed.

The precedence ordering is important. Notice that failed dominates: a single failed transport poisons the entire aggregate, regardless of how many other transports are healthy. Similarly, disconnected takes precedence over positive states. This is a "worst-case-wins" model.

One subtlety worth noting: when ICE transports are created or discarded due to signaling changes—bundle policy changes, RTCP multiplexing adjustments, or adding new media lines—the aggregate state can "jump" forward without any individual transport changing state. A new transceiver added mid-session can momentarily pull the aggregate back to checking even if existing transports are all connected.
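For intuition, the precedence rules can be written down as a pure function. This is an illustrative sketch only; the browser computes the aggregate internally, and the function simply mirrors the rules listed above.

```typescript
// Worst-case-wins aggregation over the currently-relevant ICE transports.
type IceState =
  | "new" | "checking" | "connected" | "completed"
  | "disconnected" | "failed" | "closed";

function aggregateIceConnectionState(states: IceState[], pcClosed: boolean): IceState {
  if (pcClosed) return "closed";
  if (states.some((s) => s === "failed")) return "failed";
  if (states.some((s) => s === "disconnected")) return "disconnected";
  if (states.every((s) => s === "new" || s === "closed")) return "new"; // also covers "no transports"
  if (states.some((s) => s === "new" || s === "checking")) return "checking";
  if (states.every((s) => s === "completed" || s === "closed")) return "completed";
  return "connected"; // all remaining are connected, completed, or closed
}
```

For example, aggregateIceConnectionState(["connected", "failed"], false) returns "failed": one bad transport poisons the aggregate.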

Overall Session State: connectionState

The connectionState property is the highest-level health indicator on RTCPeerConnection. It combines the aggregate ICE connection state with the DTLS transport states. Its precedence rules are:

  1. closed — if the ICE aggregate is closed.
  2. failed — if the ICE aggregate is failed or any DTLS transport has failed.
  3. disconnected — if the ICE aggregate is disconnected.
  4. new — if the ICE aggregate is new and all DTLS transports are new or closed (or no transports exist).
  5. connected — if the ICE aggregate is connected or completed and all DTLS transports are connected or closed.
  6. connecting — otherwise (the catch-all bucket).

This is the state that explains the puzzle from the introduction. connectionState can read "connecting" even when iceConnectionState shows "connected" because the DTLS handshake hasn't finished yet. The ICE path is established, but the secure channel over that path is still being negotiated. Since neither the failed nor disconnected rules apply, and DTLS isn't yet connected, the catch-all connecting state kicks in.
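The same exercise works for connectionState, using the simplified model described here (ICE aggregate plus DTLS transport states). Again, this is a sketch for intuition, not how the browser is implemented.

```typescript
type IceAggregate =
  | "new" | "checking" | "connected" | "completed"
  | "disconnected" | "failed" | "closed";
type DtlsState = "new" | "connecting" | "connected" | "closed" | "failed";
type ConnectionState =
  | "new" | "connecting" | "connected" | "disconnected" | "failed" | "closed";

function aggregateConnectionState(ice: IceAggregate, dtls: DtlsState[]): ConnectionState {
  if (ice === "closed") return "closed";
  if (ice === "failed" || dtls.some((s) => s === "failed")) return "failed";
  if (ice === "disconnected") return "disconnected";
  const dtlsIdle = dtls.every((s) => s === "new" || s === "closed");
  if (ice === "new" && dtlsIdle) return "new";
  const dtlsUp = dtls.every((s) => s === "connected" || s === "closed");
  if ((ice === "connected" || ice === "completed") && dtlsUp) return "connected";
  // ICE up but DTLS still handshaking lands here.
  return "connecting";
}
```

Calling aggregateConnectionState("connected", ["connecting"]) returns "connecting", which is exactly the window described in the introduction.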

Key Takeaway: connectionState = "connected" means both ICE has found a working path and DTLS has completed its handshake. It is the definitive signal that the media channel is fully operational.

Aggregate ICE Gathering: iceGatheringState

Separate from connectivity, the iceGatheringState tracks candidate gathering progress across all transports. It follows a simple progression: new → gathering → complete. It can return to gathering if new network interfaces or STUN/TURN servers become available. In practice, with trickle ICE, the gathering state often reaches complete well before the ICE connectivity state finishes checking all pairs.
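A minimal sketch for observing gathering, assuming pc already has a local description set (which is what starts gathering); a null candidate in the icecandidate event marks end-of-candidates.

```typescript
// Log gathering progress and the end-of-candidates signal.
function watchGathering(pc: RTCPeerConnection): void {
  pc.addEventListener("icegatheringstatechange", () => {
    console.log("iceGatheringState:", pc.iceGatheringState);
  });
  pc.addEventListener("icecandidate", (event) => {
    if (event.candidate === null) {
      console.log("end of candidates"); // gathering has reached "complete"
    }
  });
}
```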


A Connection's Lifetime

Tying all these state machines together, here is the typical progression of a successful WebRTC connection from offer to media flow:

  • Signaling: stable → have-local-offer → stable (after answer)
  • ICE Gathering (agg.): new → gathering → complete
  • ICE Transport: new → checking → connected → completed
  • ICE Connection (agg.): new → checking → connected / completed
  • DTLS Transport: new → connecting → connected
  • Connection (agg.): new → connecting → connected

Notice the ordering: signaling completes first (the offer/answer exchange), then ICE gathering begins. As candidates trickle in, ICE connectivity checks start. Once ICE finds a viable pair (connected), DTLS begins its handshake over that path. Only when DTLS also reaches connected does the aggregate connectionState finally report "connected".

During this progression, there is an inevitable window where ICE is connected but DTLS is still negotiating. This is normal and expected. The aggregate connectionState will show "connecting" during this window—not because anything is wrong, but because the derivation rules correctly reflect that the full secure channel is not yet established.
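To watch this progression on a live connection, it helps to timestamp every aggregate state change from a single place. A minimal sketch:

```typescript
// Trace all four aggregate state machines with relative timestamps.
function traceLifecycle(pc: RTCPeerConnection): void {
  const t0 = performance.now();
  const log = (label: string, value: string) =>
    console.log(`${(performance.now() - t0).toFixed(0)}ms ${label}=${value}`);

  pc.addEventListener("signalingstatechange", () => log("signaling", pc.signalingState));
  pc.addEventListener("icegatheringstatechange", () => log("iceGathering", pc.iceGatheringState));
  pc.addEventListener("iceconnectionstatechange", () => log("iceConnection", pc.iceConnectionState));
  pc.addEventListener("connectionstatechange", () => log("connection", pc.connectionState));
}
```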


Practical Debugging Advice

Armed with an understanding of the two-tier architecture, here are some practical principles for debugging WebRTC connections.

Always drill down from aggregate to specific. When connectionState or iceConnectionState shows something unexpected, don't try to reason about why the aggregate changed. Instead, inspect the individual RTCIceTransport and RTCDtlsTransport objects to find which specific transport triggered the change.
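As a sketch of this principle, the helper below (names are illustrative) scans the transceivers and reports which specific transports are in a troubled state, which is usually the first question to answer when an aggregate turns failed or disconnected.

```typescript
// Return human-readable descriptions of the transports behind a bad aggregate.
function findUnhealthyTransports(pc: RTCPeerConnection): string[] {
  const problems: string[] = [];
  for (const transceiver of pc.getTransceivers()) {
    const dtls = transceiver.sender.transport;
    const ice = dtls?.iceTransport;
    if (ice && (ice.state === "failed" || ice.state === "disconnected")) {
      problems.push(`mid=${transceiver.mid}: ICE is ${ice.state}`);
    }
    if (dtls && dtls.state === "failed") {
      problems.push(`mid=${transceiver.mid}: DTLS is failed`);
    }
  }
  return problems;
}
```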

Respect the back edges. ICE is not a one-way street. Consent revocation, ICE restarts, and transient network issues can all cause state regressions. Build your connection-monitoring logic to handle backwards transitions gracefully, especially the connected → disconnected → checking → connected cycle.

Distinguish transient from terminal. The disconnected state is transient; failed is terminal. Don't show error UI on disconnected—give ICE a chance to recover. Reserve error handling for the failed state, or for disconnected states that persist beyond a reasonable timeout.
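A sketch of that policy: react to failed immediately, but give disconnected a grace period before surfacing an error. The 5-second value below is an arbitrary assumption, not anything specified by WebRTC.

```typescript
// Treat "failed" as fatal immediately; treat "disconnected" as fatal only
// if it persists past a grace period.
function monitorHealth(pc: RTCPeerConnection, onFatal: (reason: string) => void): void {
  let disconnectTimer: ReturnType<typeof setTimeout> | undefined;

  pc.addEventListener("iceconnectionstatechange", () => {
    if (pc.iceConnectionState === "failed") {
      clearTimeout(disconnectTimer);
      onFatal("ICE failed");
    } else if (pc.iceConnectionState === "disconnected") {
      disconnectTimer = setTimeout(() => onFatal("ICE disconnected for too long"), 5000);
    } else if (pc.iceConnectionState === "connected" || pc.iceConnectionState === "completed") {
      clearTimeout(disconnectTimer); // recovered; cancel the pending error
    }
  });
}
```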

Remember what connectionState = "connected" actually means. It means ICE has a working path and DTLS has completed its handshake. If you need to know that media can flow securely, this is the state to watch—not iceConnectionState alone.

Watch for aggregate jumps from signaling changes. Adding a new transceiver or changing bundle policy mid-session can cause aggregate states to shift even when no individual transport changed. If your state-monitoring code triggers unexpectedly during renegotiation, this is likely why.


Conclusion

WebRTC's state management is a layered system. Specific state machines on individual transports do the real work of establishing connectivity and security. Aggregate states on RTCPeerConnection provide a convenient summary by applying precedence rules across the set of active transports. Understanding this two-tier architecture—and especially the derivation rules that connect them—is the key to reasoning confidently about WebRTC connection lifecycle, debugging unexpected state transitions, and building robust real-time communication applications.


References

  1. W3C, WebRTC: Real-Time Communication in Browsers (W3C Recommendation). https://www.w3.org/TR/webrtc/. Sections referenced: RTCSignalingState enum, RTCIceConnectionState derivation, RTCPeerConnectionState derivation, RTCIceGatheringState, and non-normative signaling transition diagrams.
  2. MDN Web Docs, RTCIceTransport: state property. https://developer.mozilla.org/en-US/docs/Web/API/RTCIceTransport/state. Referenced for per-transport ICE state definitions and back-edge transition descriptions.
  3. MDN Web Docs, RTCDtlsTransport: state property. https://developer.mozilla.org/en-US/docs/Web/API/RTCDtlsTransport/state. Referenced for DTLS transport state definitions and failure conditions.
  4. IETF RFC 8829, JavaScript Session Establishment Protocol (JSEP). https://www.rfc-editor.org/rfc/rfc8829. Background reference for the offer/answer model and SDP handling that drives signalingState transitions.
  5. IETF RFC 8445, Interactive Connectivity Establishment (ICE). https://www.rfc-editor.org/rfc/rfc8445. Background reference for ICE candidate pair checking, consent verification, and the state model underlying RTCIceTransport.
  6. IETF RFC 6347, Datagram Transport Layer Security Version 1.2. https://www.rfc-editor.org/rfc/rfc6347. Background reference for the DTLS handshake and alert mechanisms underlying RTCDtlsTransport.

This article was written in collaboration with GPT-5.2 and Claude Opus 4.5. The technical research and notes were developed with GPT-5.2, and the article was drafted and formatted with Claude Opus 4.5.
