Thursday 27 October 2022

About ICE negotiation

Disclaimer: I wrote this article on March 2022 while working with Subspace, and the original link is here: https://subspace.com/resources/ice-negotiation . This post in my personal blog is a way to ensure it doesn't get lost. There is nothing service-specific in it, I've made only minor edits and I hope it can be a good technical reference on the topic.



WebRTC is a set of protocols that allow applications, typically running on Web browsers, to exchange media (audio, video, data) with other entities.

Before media can flow, however, the WebRTC entities need to discover what type of connection is possible, and among the possible connections, what’s the best to be used. This needs to happen as fast as possible, so that users can perceive the service as instantaneous as possible.

WebRTC includes protocols like STUN and TURN that are designed to facilitate the establishment of connections when a direct connection is not possible. The typical case is a computer inside a home or office network, with a private IP address, and able to reach the public Internet only through an address translation (NAT).

STUN helps in discovering the IP address and port from where a computer enters the Internet, and in some circumstances that IP address and port can be used by other entities to reach that computer. STUN is also used for keeping such bindings alive.

TURN provides a way for two entities to communicate when they are behind two different symmetric NATs, or when one is behind a firewall that restricts outbound traffic to only some UDP or TCP ports. TURN uses STUN as the underlying protocol, adding requests, responses and indications to accomplish media relay.

STUN and TURN play a role in the ICE negotiation process. ICE, Interactive Connectivity Establishment, is a protocol that allows the dynamic discovery of the best way to establish a connection for entities that may be behind NAT.

All WebRTC clients use ICE before media can flow.

There are three main phases: the gathering of candidates, the connectivity checks, and the nomination of the candidate pairs to be used.

The ICE candidates are simply transport addresses (IP address, port and transport type) that can potentially be used to communicate (send and receive media) and that the ICE client collects and shares to the other party through some form of signalling.

There are three main types of ICE candidates: 'host', 'server reflexive' and 'relay'.

'host' candidates refer to transport addresses that are directly visible by the client, where the client can start listening for incoming connections or packets. Computers behind NAT may only have private IP addresses as 'host' candidates, but they are potentially usable if the other party belongs to the same network, or depending on the type of NAT.

'server reflexive' candidates are the ones discovered through the interaction with a STUN server. The client sends a Binding Request to the STUN server and receives a Binding Success Response with a MAPPED-ADDRESS containing the source IP address and port from where the request was received. There may be more than one level of NAT, and that transport address represents the outmost one.

'relay' candidates refer to allocations reserved on a TURN server. The ICE client requests an Allocation of a relay, and after successful authentication the TURN server provides a RELAYED-ADDRESS containing the transport address allocated on that server for the client.

These types of ICE candidates have different priorities, 'host' being at highest priority and 'relay' at lowest priority; this is a way to privilege direct interconnection when possible (but that not necessarily represents the best solution in terms of connection quality and in general of Quality Of Experience).

This diagrams shows a typical process where ICE candidates are gathered and sent to the other party:


In this example the candidates are communicated as soon as they are retrieved. This technique is called Trickle ICE and was designed to ensure that the connectivity checks can happen as soon as the candidates are available.

Without Trickle ICE, the WebRTC client would need to wait for all the candidates to be collected before sending an offer or an answer, increasing the session set up time.

The candidates are transmitted over a signalling system established between the two parties. This is outside of the WebRTC specifications and application-specific.

The WebRTC client will then receive the ICE candidates from the other party, and it will build a list of “candidate pairs”: each local candidate will be paired with each remote candidate.

After this operation the connectivity checks can begin.

Let’s see it with an example, assuming UDP as transport for all cases. The WebRTC client has a local host candidate, IP1:port1, and has received a remote host candidate, IP2:port2. It builds a “candidate pair” with the two: {IP1:port1, IP2:port2}.

The WebRTC client will start sending STUN Binding Requests with source IP1:port1 and destination IP2:port2. These requests use STUN short-term authentication, and contain a username and password that were previously exchanged when transmitting the candidates inside the SDP offer/answer.

If the Binding Request reaches the other party on IP2:port2, then the other party will authenticate the request, and respond with a Binding Success Response, including the MAPPED-ADDRESS attribute, containing the source of the request.

If the Binding Success Response reaches the WebRTC client, then it will identify the request by looking into the transaction ID and mark the check as Successful, and so suitable for exchanging media.

If for any reason the Binding Success Response is not received, then the candidate pair will remain in a In Progress state for some time, and then move to Failed: that pair cannot be used to exchange media.



It’s important to note that this is symmetrical: the other party too can start the connectivity check from IP2:port2 towards IP1:port1. To avoid a conflict, the parties assume the role of "controlling" or "controlled" agent. The controlling agent will be the one deciding which candidate pair will be used for exchanging media.

Before media can flow through a TURN server, a client must create a Permission. This is important during ICE connectivity checks.

For ICE candidates of type 'relay', the connectivity check will be performed sending Binding Requests that traverse the TURN server and reach the other party on the relay side. The Binding Requests will be carried by a Send Indication, destined to the remote candidate as peer address. The TURN server will only accept it and relay it to the destination if a Permission has been granted for that peer.



(NAT has been omitted in this diagram for simplicity)

Of course a TURN allocation must exist for the CreatePermission request to succeed, but that’s already been created during the candidate gathering phase.

If the candidate pair selected for exchanging media will be one with a local 'relay' candidate, then typically the WebRTC client binds a TURN Channel to the other party, and starts exchanging media using ChannelData messages, instead of Send/Data Indications.

There are more details that can be discussed, like managing timeouts, role conflicts, ICE Lite, etc, which we will address in other articles. One additional aspect is important here: we mentioned three types of candidates, host, server reflexive and relayed, but there’s a fourth one, "peer reflexive".

Peer reflexive candidates are not provided directly by a party during candidate exchange, but are instead discovered dynamically during the connectivity checks. Getting back to the previous example with a candidate pair {IP1:port1, IP2:port2}, depending on NAT conditions, the response to the Bind Request from IP1:port1 can be received by another source, IP3:port3.

The WebRTC client can verify IP3:port3 is sending a valid response by checking the transaction ID, and if successful it will dynamically add a remote candidate of type peer reflexive. Sometimes the peer reflexive candidate is the only one suitable and will be used to exchange media.

Chrome’s chrome://webrtc-internals, Firefox’s about:webrtc and Safari's "WebRTC Logging" will show the list of candidates and the pair that was selected, so those tools are of great value when troubleshooting.

If you take a trace on the computer running the WebRTC client, you’ll be able to see the STUN Binding Requests and Responses, and CreatePermission, Send/Data Indications for connectivity checks if unencrypted TURN is used. Wireshark will filter those messages for you if you use the ‘stun’ filter, and will also be able to interpret the Binding Request/Response carried inside Send/Data Indications (and also the RTP streams, but that’s for another article).

If you're interested about troubleshooting TURN sessions, take a look at this other article, "Troubleshooting TURN".

Monday 16 May 2022

Troubleshooting TURN

 

WebRTC applications use the ICE negotiation to discovery the best way to communicate with a remote party. It dynamically finds a pair of candidates (IP address, port and transport, also known as “transport address”) suitable for exchanging media and data.


The most important aspect of this is “dynamically”: a local and a remote transport address are found based on the network conditions at the time of establishing a session. For example, a WebRTC client that normally uses a server reflexive transport address to communicate with an SFU. when running inside the home office, may use a relay transport address over TCP when running inside an office network which limits remote UDP targets. The same configuration (defined as “iceServers” when creating an RTCPeerConnection will work in both cases, producing different outcomes.


This means that a certain portion of WebRTC sessions happen over TURN, i.e. they are relayed through a TURN service, when the choice is left to the client. ‘host’, ‘server reflexive’ and ‘relay’ candidates are left to compete with each other, and the best will win, with the caveat that ‘host’ candidates have the highest priority, and ‘relay’ the lowest. This prioritization originates from the logical assumption that a relayed connection may be less performant than a direct one.


There are cases though when using a TURN service is not optional, but mandatory; an RTCPeerConfiguration setting, ‘iceTransportPolicy’ allows this.


In any case, when TURN is used, it’s important to be able to troubleshoot the session establishment, and this article aims to provide some important guidelines.


These are the key points:

  • Acquiring the TURN settings

  • Confirming the reachability of the TURN server

  • Creating a relay allocation on the TURN server

  • Setting permissions for using the created allocations

  • Exchanging ICE connectivity checks over TURN

  • Exchanging media and/or data over TURN


Acquiring the TURN settings

While STUN servers are typically used without the need for authentication, it’s unlikely that a TURN service can. The resources involved in a TURN service are expensive, in particular in the case of highly scalable and distributed systems, and for this reason are only allowed for authenticated customers.


The required TURN settings are:

  • A URL (in the form ‘turn:<FQDN or IP address>:port)

  • An username

  • A password (called ‘credential’)


These are provided inside the ‘iceServers’ configuration structure passed to the RTCPeerConnection at the moment of creation.

Troubleshooting points

It’s important to verify that the TURN settings are correctly configured; in Chrome, open Developer Tools and check in the JavaScript code that the ‘iceServers’ structure contains valid values.


Check also the ‘iceTransportPolicy’ (which default value is ‘all’).

Confirming the reachability of the TURN server

When the ICE candidates gathering phase begins, the ICE client verifies that the TURN URL defines a reachable service by sending a STUN Binding Request towards the IP and port resolved from the ‘iceServer’ settings.


This request originates from the IP address and port that will be used to access the TURN service, and so it will check that it’s suitable for it.


If the STUN Binding Request is received by the TURN server, then it will respond with a STUN Binding Success, carrying an attribute (XOR-MAPPED-ADDRESS) that tells what source IP and port was seen by the server.


If the STUN Binding Success response is received by the client, then there’s proof that the TURN server is reachable.


For example:



Now it’s possible to negotiate a relay allocation.

Troubleshooting points

In the host running the WebRTC client, take a network trace and verify that the STUN Binding Request is addressed to the expected destination (in particular if the TURN URL required a DNS resolution and so there are multiple IP addresses that could be used).

Verify in the trace that the STUN Binding Success Response is received.

Creating a relay allocation on the TURN server

This is the key element: the client asks the TURN server to become a relay on its behalf.


Here the TURN protocol is used, and the client issues an Allocate Request towards the TURN server. This request must be authenticated, for the reasons discussed earlier, and so it’s challenged with a 401 Unauthenticated response, carrying a realm and a nonce.


The client will use the provided credentials (username and credential), together with the given realm and nonce, to compute a MESSAGE-INTEGRITY attribute and send again the Allocate Request with this attribute.


If the credentials are correct (and also the user is allowed to access the service), then the TURN service will reserve a transport address for that allocation: this is the relay transport address. An Allocate Success Response is transmitted to the client, with a XOR-RELAYED-ADDRESS attribute.


At this point the client has gained a ‘relay’ candidate and transmits it to the remote party through the signalling system in use (this is service-specific and not standardized).


Here’s an example of a successful allocation:



Note that a client may create more than one allocation for the same session; each one will be identified by a different source port, so it will be easily identifiable. You can filter them out with something like ‘stun and udp.port==PORT`, where PORT is the client source port for a transaction you’re interested in.

Troubleshooting points

In the host running the WebRTC client, take a network trace and confirm that there’s an Allocate Success Response.


Wrong credentials

In case of wrong credentials, instead of an Allocate Success Response you’ll see another 401 Unauthenticated response. In this case you must check that the credentials are correct, and the user is authorized to access the service.



Other errors for Allocate Request

Any other error for the Allocate Request will have a detailed error code (in a similar fashion as HTTP or SIP have), so take a note on that and search for its root cause.

Setting permissions for using the created allocations


For security reasons, before media or data is exchanged through the relay, the client must set specific permissions for the remote party.


Once the client has a valid relay allocation, every time it receives an ICE candidate from the remote it must set a permission for the remote IP address.


This is accomplished with a TURN CreatePermission Request. The allocation the permission refers to is implicit from the client source IP address and port. The TURN server will respond with a CreatePermission Success if the request is accepted; note that often a client receives ICE candidates with private or reserved IP addresses: in that case the TURN server will most probably reject the request with a 403 Forbidden response.


Example:



Troubleshooting points

In the host running the WebRTC client, take a network trace and confirm that there’s a CreatePermission Success for at least one of the remote candidates.


If no CreatePermission requests are sent, or none of them is successfully accepted, then no relaying will be possible.

Exchanging ICE connectivity checks over TURN

Once the TURN server is reached, a relay allocation reserved and a permission created, there are the conditions for exchanging ICE connectivity checks over TURN.


These are performed by sending STUN Binding Requests with short term credentials; the peculiarity with TURN is that these Binding Requests are encapsulated inside a TURN Send Indication, addressed to the remote peer.


Wireshark will nicely solve this encapsulation for you, and instead of showing a Send Indication will show you its content, the Binding Request.


The TURN server will relay the Binding Request to the remote peer, performing the relay for the first time. The expected outcome is that the remote entity will respond with a Binding Success, which the TURN server will encapsulate inside a Data Indication and deliver to the client.


If that happens, then the client has learned that the remote candidate is indeed reachable via TURN and that’s a suitable candidate pair for exchanging media and data.

Troubleshooting points

In the host running the WebRTC client, take a network trace and confirm that there are Binding Requests carried over TURN that receive a Binding Success.


If Binding Success responses are not received, then something is preventing it and the best way to investigate is to take network traces on the TURN server host, if possible. Those traces will tell you whether the Binding Requests are correctly leaving the TURN server towards the remote party and whether the Binding Success responses are being received or not.


It’s possible that the remote endpoint is simply unreachable from the TURN service, and in this case the ICE candidates pair will be marked as unusable.

Exchanging media and/or data over TURN


The last fundamental step is the actual exchange of packets through the relay. The typical type of packets is RTP.


Once the connectivity checks will be successful, if the client has elected the relay candidate as the one to be used, then RTP can start flowing. You’ll be able to see the RTP packets flowing in both directions, typically with video and audio multiplexed.


There are two ways for transmitting data:

  • Indications

  • Channels


A Send Indication carries the data (RTP) and destination from the client to the TURN server. The TURN server, granted the allocation exists and the permission allows it, will extract the data and send it to the destination from the allocated relay transport address.


When the data arrives from the remote peer to the relay transport address, then the TURN server, after performing the above checks, will encapsulate the data inside a Data Indication and send it to the client.


There is a more efficient way though to exchange data: the client can define a Channel (through the ChannelBind request), which associates a channel ID to a remote party. From that moment both the client and the TURN server can exchange data via ChannelData messages carrying just the channel ID and data, omitting the remote transport address. This reduces the network and computing overhead and it is typically chosen against the use of Indications.

Troubleshooting points

In the host running the WebRTC client, take a network trace and confirm that data is being sent from the client with Send Indications or ChannelData messages, and to the client with Data Indications and ChannelData messages.


In case of monodirectional media, it’s advisable to take network traces on the TURN server host to clarify whether the media is being exchanged or not on the relay side with the remote peer.

Encrypted TURN

It’s possible to use TURN over TLS, with all the data exchanged encrypted. In this case using Wireshark as described won’t allow you to see the details of the requests and responses, and troubleshooting is harder.


One possible approach is to first of all ensure that all the operations described previously happen correctly when using unencrypted TURN (over UDP or TCP). It’s very likely that the TURN service you are using is accessible over unencrypted UDP (default behavior): before moving to TLS ensure UDP works fine.


Wireshark will show you anyway the TLS connections established with the server, so that will confirm whether the connection was successful, the TLS session established, and some application data exchanged.

Useful tools

Wireshark

Wireshark is available for a variety of platforms; it’s a fundamental tool to understand what’s happening between the local WebRTC client and the remote server.


It comes with filters that detect the type of packets. You can use `stun` to filter out STUN and TURN packets, and even select specific TURN transactions, like `stun.type.method==0x0003` to show Allocate Request and Responses.


Saving a trace into a pcap file and making it available to others helps enormously the ability to troubleshoot.


Wireshark can be used for both capturing and just displaying captures.


There are cases where the dissectors, i.e. the interpreters of the packets, don’t recognize a TURN transaction. For example this happens when they happen over a non-default port (3478 for UDP and TCP, 5349 for TLS). To “help” Wireshark, right click on a packet, select “Decode As…” and set ‘STUN’ as protocol: it will correctly interpret all the packets using that non-default port.


The same applies for RTP: when signalling is not available to Wireshark, then UDP packets containing RTP may not be correctly interpreted. Use the same “Decode As…” method.

tcpdump

On the server side, any tool for packet capture would do, with tcpdump being a common solution.


Save the trace into a pcap file with the `-w` option, e.g. `tcpdump -n -v -w trace_1.pcap`, copy it to your machine and use Wireshark to display the packets.


WebRTC samples, Trickle ICE


This open source tool allows you to verify the browser can correctly gather `relay` candidates with the given TURN server details (URL, username, credential).


Before troubleshooting a client implementation, ensure that this tool can correctly access the TURN resources you’re referring to.


turnutils_uclient

The popular open source implementation of a TURN server, coturn, comes with a tool that simulates a client. A plethora of options are available, allowing you to test specific aspects of the TURN operations, e.g. using Send Indications or using Channels, etc.


Use `turnutils_uclient` to ensure the TURN service you want to use is accessible correctly with the given TURN settings, You’ll also get information about the round trip time and jitter.


Chrome webrtc-internals

When using Chrome, the best way to understand what’s happening is to open a tab on chrome://webrtc-internals/. It will show you all the information related to each RTCPeerConnection being managed by the browser at that point, including the list of ICE candidates, the details of the TURN server being used (except the credential for obvious reasons), including the iceTransportPolicy (‘all’ or ‘relay’), the chosen ICE candidates pair, statistics on media transfer, etc.


Search for `relay` candidates and verify the client is able to retrieve them from the TURN service, and whether they are selected as the candidate pair or not.


Conclusions

This article should provide a good checklist for troubleshooting the connection to a TURN service. There is much more to say, in particular for what concerns browsers different than Chrome and server-side investigations: I plan to write about it in the future.



Thursday 11 March 2021

[off topic] Differences between running and cycling

 I'm a passionate runner, and always considered cycling as something fun, e.g. mountain-biking, but difficult to practice regularly. There's a lot of overhead in cycling, like the preparation, bike maintenance, dealing with city traffic, etc.

Anyway about eight months ago I bought a road bike and felt in love with it. Soon after that I discovered Zwift and that gave an additional dimension to the sport: practice whenever you want from home, with accurate power measurements and a way to socialise with distant people. That was a game changer.

In five months I cycled 1600 virtual Km and climbed almost 17 virtual Km. Meanwhile my running performance, instead of degrading, improved, and that surprised me.

Anyway what I wanted to write about is a great article I read, "Physiological Differences Between Cycling and Running". It's a review of articles published in that area. Some conclusions are very interesting.

In general it seems sports medicine is still inconclusive for many aspects, and coaches may still have an advantage by following empirical/heuristic approaches in comparison with research-driven indications.

But more specifically, some notes from the conclusions:

- For the same person, VO2max depends on the speciality (i.e. runners achieve higher values on treadmill than cycle ergometer)

- There seems to be more physiological transfer from running to cycling than the other way around

- Pedalling cadence impacts the metabolic response during cycling, but also during a following run (at least in the short term)

- The Lactate Threshold is lower for athletes when not practicing their speciality, i.e. the Lactate Threshold depends on the training method

- Both female and male are impacted in the same way when comparing VO2max for running and cycling

- Triathletes have similar max Heart Rate when running and cycling, again pointing to the importance of the actual speciality used in training

- The position when cycling makes it harder to breathe

and probably other important elements that I wasn't able to fully grasp.


Wednesday 10 March 2021

Notes on STUN protocol

 Since I needed to see a few details in the handling of attributes in STUN responses, I thought of going through the whole STUN protocol RFC again and take notes on the most important parts.

I put my notes in some slides, and I'm sharing then in Slideshare in case somebody else may find them useful too:


Friday 19 February 2021

Extracting RTP streams from network captures

I needed an efficient way to programmatically extract RTP streams from a network capture.

In addition I wanted to:

  • save each stream into a separate pcap file.
  • extract SRTP-negotiated keys if present and available in the trace, associating them to the related RTP (or SRTP if the negotiation succeeded) stream.


Some caveats:

  • In normal conditions the negotiation of SRTP sessions happens via a secure transport, typically SIP over TLS, so the exchanged crypto information may not be available from a simple network capture.
  • There are ways to extract RTP streams using Wireshark or tcpdump; it’s not necessary to do it programmatically.


All this said I wrote a small tool (https://github.com/giavac/pcap_tool) that parses a network capture and tries to interpret each packet as either RTP/SRTP or SIP, and does two main things:

  • save each detected RTP/SRTP stream into a dedicated pcap file, which name contains the related SSRC.
  • print a summary of the crypto information exchanged, if available.


With those two elements, it’s then possible to decrypt an SRTP stream, depending on the availability of the exchanged crypto information, and also decode it into audio, depending on the codec.


Decryption and decoding is not part of my tool, but can be achieved easily with other tools, like pjsip’s pcaputil.

I might integrate that part into pcap_tool in the future. Again not because it’s strictly necessary, but to start getting more control on the parsing and manipulation. This may reveal to be useful in the future.


pcap_tool is available here for anybody interested in using it and may perhaps wish to change or extend some parts.


You can just clone it and build it as described in the README.


An example output:


./pcap_tool -d ../../trace_20210218_1.pcap


[…]


Extracted 1092 RTP frames

Detected RTP Stream: 0x7a2179fa Source port:22248 - Destination port:4000 - Packets: 544 (./stream-0x7a2179fa.pcap)

Detected RTP Stream: 0x772dc5d7 Source port:4000 - Destination port:22248 - Packets: 548 (./stream-0x772dc5d7.pcap)



source port: 22248 - tag: 3 - suite: AES_CM_128_HMAC_SHA1_80 - key: /1TI6DJWHk7fBJY1yBp7L51uEz1JJ2n6CcQAAsJM

-----

source port: 4000 - tag: 4 - suite: AES_CM_128_HMAC_SHA1_32 - key: mPytX24bRmyNgMaqQSxP8dMMqdkkmQeHgC2Ttb3v

source port: 4000 - tag: 3 - suite: AES_CM_128_HMAC_SHA1_80 - key: J1YS1owJDKAFdq5cRF+JtektYDf6IiowCAeijeal

source port: 4000 - tag: 2 - suite: AES_256_CM_HMAC_SHA1_32 - key: 5A9R8O8MCzbuGvJ08WWNJcNHsPaEcEp1ZDp5DunknZ+bZ2JQaVpZ2qmqraTmgQ==

source port: 4000 - tag: 1 - suite: AES_256_CM_HMAC_SHA1_80 - key: ZcZn1IY++2xsSIk/U1GsHSGp+OI/BYIocv/40ldJB28bcNeMmYzs4z4ozrNQ5Q==

-----


That network capture contained 2 SRTP streams, which have been saved separately into stream-0x7a2179fa.pcap and stream-0x772dc5d7.pcap files respectively.


For the negotiation it’s visible what the sender from port 22248 (owner of the 0x7a2179fa stream) used as crypto information, and looking at the same tag (3 in this case) it’s possible to see what crypto information was used by the sender of 0x772dc5d7 stream from port 4000.


With this it’s possible to decrypt (and decode since G.711 was used) with pjsip’s pcaputil with something like:


pcaputil -c AES_CM_128_HMAC_SHA1_80 -k /1TI6DJWHk7fBJY1yBp7L51uEz1JJ2n6CcQAAsJM stream-0x7a2179fa.pcap stream-0x7a2179fa.wav


and have the audio from that stream into a WAV file.


How to build pcaputil (in fact all pjsip’s applications) is widely documented but I also described it in the appendix of https://www.giacomovacca.com/2020/11/testing-sip-platforms-and-pjsip.html 


The call in the example was generated in fact with pjsua.





About ICE negotiation

Disclaimer: I wrote this article on March 2022 while working with Subspace, and the original link is here:  https://subspace.com/resources/i...