Skip to main content

VoIP calls encoded with SILK: from RTP to WAV, update

Three and a half years ago (which really sounds like a lot of time!) I was working with a VoIP infrastructure using SILK. As it often happens to server-side developers/integrators, you have to prove whether the audio provided by a client or to a client is correctly encoded :-)

Wireshark is able to decode, and play, G.711 streams, but not SILK (or Opus - more on this later). So I thought of having my own tool handy, to generate a WAV file for a PCAP with RTP carrying SILK frames.

The first part requires extracting the SILK payload and writing it down into a bistream file. Then you have to decode the audio using the SILK SDK decoder, to get a raw audio file. From there to a WAV file it is very easy.

As I tried to describe in this previous post, I had to reverse engineer the test files contained in the SDK, to see what a SILK file looked like.

Since the SILK payload is not constant, all that was needed was to insert 2 Bytes with the length of the following SILK frame. At the beginning of the file you have to add a header containing "#!SILK_V3", and voilĂ .

This is accomplished by silk_rtp_to_bistream.c (from https://github.com/giavac/silk_rtp_to_bitstream), a small program based on libpcap that extracts the SILK payload from a PCAP and writes it properly into a bistream file.

Build the binary with:

gcc silk_rtp_to_bitstream.c -lpcap -o silk_rtp_to_bitstream

(you'll need libpcap-dev installed)

Create the bistream with:

./silk_rtp_to_bitstream input.pcap silk.bit

Now you can decode, using the SILK SDK, from bitstream into raw audio with:

$SILK_SDK/decoder silk.bit silk.raw

Raw audio to WAV can be done with sox:

sox -V -t raw -b 16 -e signed-integer -r 24000 silk.raw silk.wav

This works fine with single channel SILK at 8000 Hz.


More to come: an update on how to accomplish the same but for Opus.