Friday, 4 December 2020

SIP - Connection reuse vs Persistent connection

It goes without saying that SIP solutions are impacted by NAT. So much so that some scenarios required extensions to RFC 3261, e.g. RFC 3581, which defined the 'rport' parameter to be added to the Via header (complementing the 'received' parameter): with that information, responses can be routed to the source port of the related request, rather than to the port advertised in the original Via.
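
To make the difference concrete, here is a minimal sketch of how a server could pick the destination of a UDP response. The function names are my own, it only handles IPv4 addresses or hostnames in the sent-by part, and it ignores other Via parameters such as 'maddr', so take it as an illustration rather than a complete implementation.

```python
def parse_via(via_value):
    """Split a Via header value into (sent_by_host, sent_by_port, params)."""
    _, rest = via_value.split(None, 1)
    sent_by, _, raw_params = rest.partition(";")
    host, _, port = sent_by.strip().partition(":")
    params = {}
    for item in raw_params.split(";"):
        if item.strip():
            name, _, value = item.strip().partition("=")
            params[name.lower()] = value
    return host, int(port) if port else 5060, params


def response_destination(via_value, src_ip, src_port):
    """Return the (ip, port) a UDP response should be sent to."""
    _, advertised_port, params = parse_via(via_value)
    if "rport" in params:
        # RFC 3581: reply to the address and port the request came from
        return src_ip, src_port
    # Without rport: source address, but the port advertised in the Via
    return src_ip, advertised_port
```

With a Via like "SIP/2.0/UDP 10.0.0.5:5060;branch=z9hG4bK776;rport" and a request arriving from 203.0.113.7:40123, the response goes to port 40123; without 'rport' it would go to port 5060, which a NAT is unlikely to map back to the client.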

That was called Symmetric Response, and applied to connection-less transports (UDP), while, as mentioned in RFC 6314, it's not necessary when using reliable transports (TCP in most cases): SIP responses can be sent back on the same connection on which the request arrived.

Also from RFC 3261, chapter 18, Transport layer, client behaviour:

"For reliable transports, the response is normally sent on the connection on which the request was received."

But the client needs to be prepared to receive the response on a new connection:

"[...] the transport layer MUST also be prepared to receive an incoming connection on the source IP address from which the request was sent and port number in the "sent-by" field."

That obviously would require the ability to create such a connection from the server to the client.

Anyway, once a reliable connection between two SIP entities is up and a transaction has already concluded, there are two interesting opportunities:

- Use that same connection for more requests from the client
- Use that same connection for more requests from the server

where by "client" I mean the entity that created the connection and sent the initial request, and by "server" the entity that accepted the connection and delivered the response(s).

The first case is mentioned in the same chapter on Transport layer:

"If a request is destined to an IP address, port, and transport to which an existing connection is open, it is RECOMMENDED that this connection be used to send the request, but another connection MAY be opened and used."

This is what's referred to as "Persistent connection", as mentioned in RFC 5923:

"The SIP protocol includes the notion of a persistent connection
   [...], which is a mechanism to insure that
   responses to a request reuse the existing connection that is
   typically still available, as well as reusing the existing
   connections for other requests sent by the originator of the [...]"
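
A persistent connection can be pictured as a small table of open connections keyed by destination. The sketch below is only an illustration of that idea: the class name is hypothetical, it opens plain TCP sockets by default, and a TLS transport would need the socket wrapped accordingly.

```python
import socket


class ConnectionTable:
    """Keep one open connection per (host, port, transport) and hand it back."""

    def __init__(self, connect=socket.create_connection):
        # 'connect' is injectable so the table can be exercised without a network
        self._connect = connect
        self._connections = {}

    def get(self, host, port, transport="TCP"):
        key = (host, port, transport.upper())
        conn = self._connections.get(key)
        if conn is None:
            # No connection to this destination yet: open one and remember it
            conn = self._connect((host, port))
            self._connections[key] = conn
        return conn
```

Any later request from the same originator to the same address, port and transport gets the connection that is already open, which is the RECOMMENDED behaviour quoted above.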

The second case (using the same connection for future requests from the server) is instead the subject of RFC 5923, and it is defined as "Connection reuse".

Once the connection is up, it seems a good opportunistic approach to reuse it, but an important limitation is mandated:

"Unlike TCP, TLS connections can be reused to send requests in the backwards direction since each end can be authenticated when the connection is initially set up."

In other words, only TLS connections formed by exchanging certificates can be reused, because the identities have been mutually verified.

The way the client can tell the server that connection reuse is desired is with a new parameter to be added in the Via header: 'alias'.
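
As a rough sketch of what the receiving side could check, the function below looks at the topmost Via of a request and reports whether the sender asked for connection reuse. The function name is mine, and the TLS-only condition follows my reading of the limitation above rather than a complete rendition of the RFC's rules.

```python
def wants_connection_reuse(via_value):
    """True if the Via uses TLS and carries the 'alias' parameter."""
    protocol, rest = via_value.split(None, 1)
    # "SIP/2.0/TLS" -> "TLS"
    transport = protocol.rsplit("/", 1)[-1].upper()
    # Parameter names follow the sent-by part, separated by ';'
    names = [p.strip().partition("=")[0].lower() for p in rest.split(";")[1:]]
    return transport == "TLS" and "alias" in names
```

For example "SIP/2.0/TLS proxy.example.com:5061;branch=z9hG4bKa7c8;alias" qualifies, while the same Via over TCP, or without 'alias', does not.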

In general, RFC 5923 at chapter 5 clarifies:

"The act of reusing a connection needs
   the desired property that requests get delivered in the backwards
   direction only if they would have been delivered to the same
   destination had connection reuse not been employed."

Last important bit, not to be left implicit: persistent connections don't imply connection reuse, as RFC 5923 clarifies:

"[...] Persistent connections do not
      imply connection reuse."

So this post is basically sharing my own notes on this topic, which maybe somebody else (including me) can find useful in the future.

Tuesday, 24 November 2020

Dissecting traces with DTMF tones

I'm sure I belong to the large group of people who love to analyse network traces with tools like Wireshark. Being able to see the details of a packet or datagram down to the level of the bits is not only extremely useful, but also fascinating.

Some time ago I wrote a dissector for Wireshark, using the Lua interface, and that was fun (I see it's still available here). The official recommendation is to use Lua only for prototyping and testing, but when performance is not key and there is no intent to add the dissector to the official distribution, it's fast and effective.

To parse network traces with audio, extract the payload first, and then decode it into a WAV file, C is a viable solution. I wrote about a program that does that here, and since it attracted some attention and feedback I wrote an updated version later.

More recently I wanted to identify programmatically the presence (and value) of DTMF tones - as RTP Events, RFC 2833 - in network traces. This time rather than using C, I wanted to integrate it with python, and scapy seemed a good choice.

scapy is quite complete, but interestingly it doesn't have a parser for the RTP Event extension. So I thought of mapping the raw content in the RTP payload to a structure, with the help of C types.

This is the core of the program:

from scapy.all import Ether, UDP, RTP, RawPcapReader

def process_pcap(file_name, sut_ip):

  for (pkt_data, pkt_metadata,) in RawPcapReader(file_name):

    ether_pkt = Ether(pkt_data)

    # A little housekeeping to filter IPv4 UDP packets goes here

    # Skip anything that is not UDP or carries no payload
    if not ether_pkt.haslayer(UDP) or not ether_pkt.haslayer("Raw"):
      continue
    udp_pkt = ether_pkt[UDP]

    # Get the raw UDP payload into an RTP structure
    rtp_pkt = RTP(udp_pkt["Raw"].load)

    ptype = rtp_pkt.payload_type

    # Assume payload type 96 or 101 are used for RTP events
    if ptype == 96 or ptype == 101:
      rtpevent_content = rtp_pkt.payload

      # map the payload into an RTPEvent object
      rtpevent_struct = RTPEvent.from_buffer_copy(rtpevent_content.load)

The RTPEvent class looks like this:

import ctypes

class RTPEvent(ctypes.BigEndianStructure):
  _fields_ = [
    ('event_id', ctypes.c_uint8),
    ('end_of_event', ctypes.c_uint8, 1),
    ('reserved', ctypes.c_uint8, 1),
    ('volume', ctypes.c_uint8, 6),
    ('duration', ctypes.c_uint16)
  ]

so once mapped, the rtpevent_struct object will have its DTMF-specific details, in particular with the digit contained in rtpevent_struct.event_id, and the indication whether it's the marker of end of the event in the end_of_event bit.
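
To see the mapping at work without a pcap file, here is a small self-contained sketch that decodes a hand-crafted 4-byte payload. The decode_event helper and the DTMF_DIGITS table are my own additions for illustration; the digit table follows the event codes 0-15 defined for DTMF in RFC 2833.

```python
import ctypes


class RTPEvent(ctypes.BigEndianStructure):
    _fields_ = [
        ('event_id', ctypes.c_uint8),
        ('end_of_event', ctypes.c_uint8, 1),
        ('reserved', ctypes.c_uint8, 1),
        ('volume', ctypes.c_uint8, 6),
        ('duration', ctypes.c_uint16),
    ]


# Event codes 0-15 map to the DTMF digits in this order
DTMF_DIGITS = "0123456789*#ABCD"


def decode_event(payload):
    """Return (digit, end_of_event, volume, duration) for an RTP event payload."""
    event = RTPEvent.from_buffer_copy(payload[:4])
    digit = DTMF_DIGITS[event.event_id] if event.event_id < 16 else None
    return digit, bool(event.end_of_event), event.volume, event.duration


# Event 5, end bit set, volume 10, duration 800 timestamp units
sample = bytes([0x05, 0x8A, 0x03, 0x20])
decoded = decode_event(sample)
```

The second byte, 0x8A, is 1000 1010 in binary: the first bit is the end-of-event marker, the second is reserved, and the remaining six bits carry the volume.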

All the other information (source/destination IP address and port, timestamp, SSRC) is obviously available in the UDP and RTP portion, so it's easy to adapt to your needs and filter out the DTMF tones for the streams you're interested in.
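
One practical detail when filtering: a single tone is carried by several RTP event packets, which normally share the same RTP timestamp (the end packet is typically retransmitted too). A minimal sketch of collapsing them into one digit per tone, assuming you have already collected (ssrc, timestamp, event_id) tuples in capture order, could look like this. The function name and the input format are hypothetical, and very long tones split into segments are not handled.

```python
def collapse_events(packets):
    """Turn per-packet RTP events into one digit string per SSRC.

    packets: iterable of (ssrc, rtp_timestamp, event_id) in capture order.
    """
    digits = "0123456789*#ABCD"
    seen = set()
    result = {}
    for ssrc, timestamp, event_id in packets:
        # Packets of the same tone share SSRC and timestamp: count them once
        if (ssrc, timestamp) in seen or event_id >= len(digits):
            continue
        seen.add((ssrc, timestamp))
        result[ssrc] = result.get(ssrc, "") + digits[event_id]
    return result
```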

Monday, 23 November 2020

Kubernetes role-based authorisation for controller applications

There are many scenarios where an application running inside a Kubernetes environment may need to interact with its API.

For example, an application running inside a Pod may need to retrieve real time information about the availability of other applications' endpoints.

This may be a form of service discovery that integrates or extends the native Kubernetes internal service discovery. In most cases, DNS records are associated with a Service and provide the list of active Endpoints for that Service, with a proper TTL. There are situations though where those DNS records are not available, an application is not able to use them directly, or what's needed is more than the private IP addresses associated with the Endpoints.

If interacting with the Kubernetes API from inside an application is needed, then there are two main areas to consider: Authentication and Authorisation.

Every Pod has a Service Account associated to it, and applications running inside that Pod can use that Service Account. Without a specific configuration the Service Account will default to a generic namespace and generic authorisation.

It's possible instead to define a more specific Service Account, with fine grained permissions to access the API. This Service Account can then be linked to a Pod with a Role-based approach.

You can get a list of available Service Accounts with an intuitive:

# kubectl get serviceaccounts

which is likely to show you a single 'default' service account.

Service Accounts may have a namespace scope. You can check what Service Accounts are associated to a pod with a command like:

# kubectl -n NAMESPACE get pods/PODNAME -o yaml | grep serviceAccountName

Service Account authentication can use the token reachable from inside a Pod, in the /var/run/secrets/ directory, under the namespace-specific directory, e.g. /var/run/secrets/

Those tokens are also visible as Mounts in the related containers.

For internal requests, Kubernetes provides a local default HTTPS endpoint at https://kubernetes.default.svc - so a way to discover the details of a Service Account for a given namespace could be:


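# Point to the internal API server hostname
APISERVER=https://kubernetes.default.svc

# Path to ServiceAccount token
SERVICEACCOUNT=/var/run/secrets/kubernetes.io/serviceaccount

# Read this Pod's namespace
NAMESPACE=$(cat ${SERVICEACCOUNT}/namespace)

# Read the ServiceAccount bearer token
TOKEN=$(cat ${SERVICEACCOUNT}/token)

# Reference the internal certificate authority (CA)
CACERT=${SERVICEACCOUNT}/ca.crt

# Explore the API with TOKEN
curl --cacert ${CACERT} --header "Authorization: Bearer ${TOKEN}" -X GET ${APISERVER}/api/v1/namespaces/${NAMESPACE}/pods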
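
The same request can be sketched in Python with just the standard library. The function names are mine, and the values for the API server, namespace, token and CA file are assumed to be read the same way as in the shell snippet; only build_pods_request can be tried outside a cluster.

```python
import json
import ssl
import urllib.request


def build_pods_request(apiserver, namespace, token):
    """Prepare the GET request for the pods of a namespace."""
    url = "{}/api/v1/namespaces/{}/pods".format(apiserver, namespace)
    return urllib.request.Request(
        url, headers={"Authorization": "Bearer " + token})


def list_pods(apiserver, namespace, token, cacert):
    """Send the request, trusting only the cluster's internal CA."""
    context = ssl.create_default_context(cafile=cacert)
    request = build_pods_request(apiserver, namespace, token)
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)
```

If the Service Account lacks the permission to list pods, the API answers with 403 Forbidden, which urlopen raises as an HTTPError.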

Accessing the internal API programmatically

With common Python libraries such as kubernetes (available on Debian with the 'python3-kubernetes' package) it's extremely easy to automate the invocation of the internal APIs.

Most of the examples assume you're running your program as a user, and refer to the local kube config file, but when running inside a container it's possible to inherit the Service Account token associated with the hosting Pod.

For this, instead of using

config.load_kube_config()

you can use

config.load_incluster_config()

Then you can instantiate your API client object:

v1 = client.CoreV1Api()

and either do a single request, like getting a list of all pods inside any namespace:

ret = v1.list_pod_for_all_namespaces(watch=False)

or a list of pods belonging to a namespace and matching a specific application label, like:

ret = v1.list_namespaced_pod(namespace, label_selector=app_name, watch=False)

or you can "watch" some resources, which basically means subscribing to such resource updates and getting a notification at each change:

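w = watch.Watch()
# e.g. watching the same pods listed above
for event in w.stream(v1.list_namespaced_pod, namespace, label_selector=app_name, _request_timeout=60):
  print(event['type'], event['object'].metadata.name)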

Each event can be ADDED, DELETED or MODIFIED, and carries a rich set of information associated with the current status of the resource.
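
As an illustration of what to do with those events, here is a minimal sketch that keeps a map of pod names to IP addresses up to date. The apply_event function is my own; the events are assumed to be the dictionaries yielded by the watch stream, with 'type' and 'object' keys, and fake_pod is a hypothetical stand-in for the real Pod objects so that the example runs outside a cluster.

```python
from types import SimpleNamespace


def apply_event(pod_ips, event):
    """Update a {pod name: pod IP} map from a single watch event."""
    pod = event["object"]
    name = pod.metadata.name
    if event["type"] == "DELETED":
        pod_ips.pop(name, None)
    elif pod.status.pod_ip:
        # ADDED or MODIFIED: record the IP once the Pod has one
        pod_ips[name] = pod.status.pod_ip
    return pod_ips


def fake_pod(name, ip):
    """Hypothetical stand-in for a Pod object, only for this example."""
    return SimpleNamespace(metadata=SimpleNamespace(name=name),
                           status=SimpleNamespace(pod_ip=ip))


pod_ips = {}
apply_event(pod_ips, {"type": "ADDED", "object": fake_pod("app-1", None)})
apply_event(pod_ips, {"type": "MODIFIED", "object": fake_pod("app-1", "10.1.0.4")})
apply_event(pod_ips, {"type": "ADDED", "object": fake_pod("app-2", "10.1.0.5")})
apply_event(pod_ips, {"type": "DELETED", "object": fake_pod("app-1", "10.1.0.4")})
```

Note that a Pod is usually ADDED before it has an IP address, so the address tends to show up in a later MODIFIED event.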

Defining your specific ServiceAccount

Before getting to that point, though, you need to define your non-default Service Account and assign specific permissions to it. To achieve this, the role-based approach can be used.

First of all define a ServiceAccount resource:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: mycomponent-serviceaccount
  labels:
    # the label key is illustrative
    app: mycomponent

Then define a role associated to this resource:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: myrole
  namespace: mynamespace
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - endpoints
    verbs: ["get", "list", "watch"]

This example adds the permission to get, list, or watch the list of pods and endpoints in the given namespace.

Create a role binding:

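apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: myrolebinding
  namespace: mynamespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: myrole
subjects:
  - kind: ServiceAccount
    name: mycomponent-serviceaccount
    namespace: mynamespace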

Whenever a Role cannot be restricted to a namespace, for example if it needs to access cluster-wide resources like Nodes, then the ClusterRole resource is available.

References and other sources

"Access clusters using the Kubernetes API"

An interesting post about asynchronous watches with Python

"Kubernetes Patterns", an ebook by Red Hat

Friday, 20 November 2020

Testing SIP platforms and pjsip

There are various levels of testing, from unit to component, from integration to end-to-end, not to mention performance testing and fuzzing.

When developing or maintaining Real Time Communications (RTC or VoIP) systems, all these levels (with the exclusion maybe of unit testing) are made easier by applications explicitly designed for this, like sipp.

sipp has a deep focus on performance testing, or using a simpler term, load testing. Some of its features allow you to fine-tune properties like call rate and call duration, simulate packet loss, ramp up traffic, etc. In practical terms though, once you have the flexibility to generate SIP signalling to negotiate sessions and RTP streams, you can use sipp for functional testing too.
sipp can act as an entity generating a call, or receiving a call, which makes it suitable to surround the system under test and simulate its interactions with the real world.

What sipp does can be generalised: we want to be able to simulate the real world that surrounds (or will surround) our system in Production. From this point of view sipp is not the only answer, and projects often use other tools, or a combination of other tools.

One simple and effective approach is re-using RTC applications and building the testing tool around them. When a system is built around an application, it's likely that the people working on it are familiar enough with the application to re-use it to mock the external world. This is often achieved with Asterisk or FreeSWITCH. They both expose an API for originating calls, and surely can play the role of called party (or "absorbers", or "parrots" depending on their main scope and terminology).
Kamailio also can be used to generate calls, even though its core focus on signalling makes it slightly more complex to use in generic cases.

Unless behavioural changes are put in place, such solutions imply compromising on the SIP stacks in use. Asterisk or FreeSWITCH won't make it too easy to generate an INVITE with a wrongly formatted SIP header, for example, while sipp is much more flexible, as the SIP messages can be mocked down to the single character. What typically happens is that sipp is used to generate or receive calls when specific syntax requirements for the signalling are needed, while Asterisk and FreeSWITCH can be used in more permissive cases, where what's important is a generic session establishment.

When dealing with media (typically RTP streams) is necessary, sipp provides at least two methods: re-playing an RTP stream from a trace (pcap file), or encoding a WAV file into a stream. Recently sipp added the ability to play RTP Events separately (DTMF tones as in RFC 2833 - I think the first patch with this functionality was this). sipp is not able to transcode or generate non-PCM streams, but it can still play a non-PCM stream with some limitations, which covers most typical cases.

Less generic scenarios where RTC applications like Asterisk and FreeSWITCH can be useful are the ones requiring SRTP (encrypted RTP). Even though sipp can be used to negotiate SRTP, by adapting the SDP portion of the offer/answer, it doesn't provide a solution to generate SRTP streams.

In this case a very useful item to add to your toolbox is pjsip, a SIP stack library (also used by Asterisk, where chan_pjsip is the currently recommended SIP channel, as opposed to the older chan_sip) that exposes an API and also a command-line application (pjsua). pjsua can be used directly, with either command-line arguments or a configuration file, or it's possible to use the pjsip library to write programs in languages like python: this makes it very flexible and helps its integration in existing and new testing systems.

With pjsip, it's possible to generate calls that play audio and DTMF tones, in a similar way to sipp, but also encrypt RTP and establish SRTP streams.


The easiest approach is to build the pjsip project and use the pjsua binary (you can see a procedure in the Appendix).
pjsua accepts command-line arguments, but can also read them from a configuration file, which makes them easier to read. For example you could just run

#  pjsua --config-file pjsua.cfg

where pjsua.cfg contains just the caller and callee:

A more sophisticated configuration file contains instructions on codecs and encryption, e.g.
--dis-codec GSM
--dis-codec H263
--dis-codec iLBC
--dis-codec G722
--dis-codec speex
--dis-codec pcmu
--dis-codec pcma
--dis-codec opus
--add-codec pcma
--play-file /some_audio.wav

Since I mentioned SRTP as a possible key element for using pjsip, let's look into the related options:


'use-srtp' can be 0, 1 or 2, and means "disabled", "optional" and "mandatory", respectively.

With "optional" pjsua offers both plain and encrypted RTP at the same time, and the callee entity can decide. With "mandatory" it will only offer SRTP, and the callee will have to either accept or reject.
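
When checking what was actually offered or answered in a test, it helps to look at the audio section of the SDP. The sketch below is a hypothetical helper of mine: it reports the transport profile of the first audio media line and whether any 'a=crypto' attribute is present, relying only on the standard SDP conventions (RTP/SAVP as the SRTP profile, a=crypto carrying the keying material).

```python
def describe_audio(sdp):
    """Return (profile, has_crypto) for the first audio section of an SDP."""
    profile, has_crypto = None, False
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            if profile is not None:
                break  # end of the first audio section
            if line.startswith("m=audio "):
                # m=audio <port> <profile> <formats...>
                profile = line.split()[2]
        elif profile is not None and line.startswith("a=crypto:"):
            has_crypto = True
    return profile, has_crypto


# Example SDP with a made-up key, only to exercise the parser
srtp_offer = "\r\n".join([
    "v=0",
    "o=- 1 1 IN IP4 192.0.2.1",
    "s=-",
    "c=IN IP4 192.0.2.1",
    "t=0 0",
    "m=audio 4000 RTP/SAVP 8 101",
    "a=rtpmap:8 PCMA/8000",
    "a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
]) + "\r\n"
srtp_result = describe_audio(srtp_offer)
```

A "mandatory" offer is expected to show up as RTP/SAVP with crypto lines, while a plain offer is RTP/AVP with none.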

'srtp-secure' refers to the use of TLS, and can also be 0, 1 or 2, meaning "not required", use "tls", or use "sips" respectively. Needless to say, in normal scenarios you want to protect the SRTP crypto information carried in the SDP, so you want to encrypt signalling too. SIP over TLS is the typical solution. For testing purposes you may prefer making it easier to check the content of signalling, and use 'srtp-secure=0'.

'no-vad' formally should be used to disable silence detection; in practice you want this option when generating a call from a machine that doesn't have a sound card.

Similarly, 'null-audio' disables the requirement to play the audio, required when the calls are generated from a host with no sound interfaces.

'dis-codec' is used to disable a codec from the negotiation, and 'add-codec' instead selects a codec to be added to the offer. This adds flexibility, and it's also worth noting that video codecs are available too.
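
When many test scenarios differ only in codecs or encryption, the configuration file can be generated rather than written by hand. This is a minimal sketch with a hypothetical build_pjsua_config function; it uses only the options discussed in this section, written in the same "option value" form as the example above, and always includes 'null-audio' and 'no-vad' on the assumption that the test host has no sound card.

```python
def build_pjsua_config(disabled_codecs, enabled_codecs,
                       use_srtp=0, srtp_secure=0, play_file=None):
    """Return the text of a pjsua configuration file, one option per line."""
    lines = ["--null-audio", "--no-vad"]
    lines.append("--use-srtp {}".format(use_srtp))
    lines.append("--srtp-secure {}".format(srtp_secure))
    lines += ["--dis-codec {}".format(codec) for codec in disabled_codecs]
    lines += ["--add-codec {}".format(codec) for codec in enabled_codecs]
    if play_file:
        lines.append("--play-file {}".format(play_file))
    return "\n".join(lines) + "\n"


# Mandatory SRTP without TLS, PCMA only
cfg = build_pjsua_config(["GSM", "pcmu"], ["pcma"],
                         use_srtp=2, play_file="/some_audio.wav")
```

The resulting text can be written to a file and passed to pjsua with --config-file.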

Using pjsip library with python

It's possible to use the pjsip library's API with high level programming languages like python. This makes test automation quite versatile, and I remember seeing this approach as early as 2012, where the project I was working on had the client applications built on top of pjsip: it was extremely valuable to simulate programmatically the clients from linux machines.

Being designed for interactive applications, pjsip comes with a nice event-based model, so in principle you need to trigger the desired actions and register callback functions that will be called at the proper moment.

A complete reference to the python library can be found here.

In general, after you import the library:

import pjsua as pj

then the library is instantiated as an object, the configuration objects are populated, and a call is triggered, e.g.:

    lib = pj.Lib()
    media_cfg = pj.MediaConfig()
    media_cfg.no_vad = 0
    lib.init(log_cfg = pj.LogConfig(level=3, callback=log_cb), media_cfg=media_cfg)

    lib.set_codec_priority("GSM", 0)
    lib.set_codec_priority("iLBC", 0)
    lib.set_codec_priority("G722", 0)
    lib.set_codec_priority("speex", 0)
    lib.set_codec_priority("pcmu", 0)
    lib.set_codec_priority("pcma", 1)

    transport = lib.create_transport(pj.TransportType.UDP)
    acc = lib.create_account_for_transport(transport)

    call = acc.make_call(sys.argv[1], MyCallCallback(), hdr_list=custom_headers)

You can see that set_codec_priority to 0 is equivalent to the --dis-codec command line option.

MyCallCallback is the callback class: its methods are invoked at each change of call state or media state. You'll have something like:

class MyCallCallback(pj.CallCallback):
    def __init__(self, call=None):
        pj.CallCallback.__init__(self, call)

    def on_state(self):
        if self.call.info().state == pj.CallState.CONFIRMED:
            # The call has been answered
            # Here you can create a player to generate audio into an RTP stream, send DTMF, log information, etc
            # You can even invoke other APIs to interact with more complex systems
            pass

    def on_media_state(self):
        global lib
        if self.call.info().media_state == pj.MediaState.ACTIVE:
            # Media is now flowing, so you can connect it to the internal conference object
            # Connect the call to sound device
            call_slot = self.call.info().conf_slot
            lib.conf_connect(call_slot, 0)
            lib.conf_connect(0, call_slot)
            print "on_media_state - MediaState ACTIVE"

As it can be expected, exceptions can be caught and errors displayed:

except pj.Error, e:
    print "Exception: " + str(e)
    lib = None

If you happen to need DTMF tones, pjsip offers the dial_dtmf() function, as part of the Call object, e.g.:

call.dial_dtmf("0")

Just remember that these calls are asynchronous and non-blocking: you need to explicitly add a delay to separate the beginning of a tone from other actions.
pjsip will generate proper RTP Event packets of the given duration, inside the existing RTP stream (and so they will have the same SSRC and proper timestamp reference).
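
A simple way to space the tones is a small helper that sends one digit at a time and waits in between. This is only a sketch: send_digits is a hypothetical name, the 0.5 second gap is an arbitrary value to tune for your system, and the only thing assumed about the call object is its dial_dtmf() method.

```python
import time


def send_digits(call, digits, gap=0.5, sleep=time.sleep):
    """Send DTMF digits one by one, pausing 'gap' seconds after each."""
    for digit in digits:
        # dial_dtmf() returns immediately, the tone is generated in background
        call.dial_dtmf(digit)
        sleep(gap)
```

The sleep function is a parameter only to make the helper easy to try without waiting; in a real test you would call send_digits(call, "1234") from the on_state callback or right after the call is confirmed.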

I'll write about analysing pcap traces to extract information on RTP events in a separate article.

Wrap up

This article is more or less what I would have wanted to read on the topic some time ago, but had to infer from various sources and after various experiments. I hope it will be useful to some readers.

Appendix - pjsua build and install

To build pjsua on debian you can do something like:

apt install python-dev gcc make binutils build-essential libasound2-dev wget
tar -xvf 2.10.tar.gz
cd pjproject-2.10
./configure && make dep && make

The binary will be available at ./pjsip-apps/bin/pjsua-x86_64-unknown-linux-gnu, which of course you can link to something easier to use, or copy to a directory in the PATH.
