Skip to main content

A gentle introduction to G.729

G.729 is an 8 Kpbs audio codec, standardized by ITU, and called also CS-ACELP: Coding of Speech using Coniugate-Structure Algebraic-Code-Excited Linear Prediction, as that’s the algorithm used for audio compression.

It has two extended versions: G.729A (optimized algorithm, slightly lower quality) and G.729B (extended features, higher quality), and it's popular in the VoIP world because combines very low bitrate with good quality (but alas is not royalty-free).

G.729A takes as input frames of voice of 10 msec of duration, sampled at 8KHz and with each sample having 16 bit:

This gives a frame size of 80 samples:
8000 sample/sec * 10 * 10e-3 sec/frame = 80 samples/frame

The output is 8 Kbps, so each encoded frame is represented by 10 Bytes:
8000 bit/sec * 10 * 10e-3 sec/frame = 80 bit/frame = 10 Bytes/frame

Quality
Considering its bit rate, G.729 has an excellent perceived quality (MOS).
Under normal network conditions G.729A has MOS 4.04 (while G.711 u-law, 64 kbps, has 4.45)
Under stressed network conditions G.729A has MOS 3.51 (while G.711 u-law, 64 kbps, has 4.13)
Perfect quality has MOS 5.

G.729 doesn’t support (reliably) DTMF (RFC 2833)

Algorithm delay and complexity
The delay between input and encoded output is 15 msec: 1 frame (10 msec) + 5 msec required by the look-ahead prediction algorithm.

Not surprising, such a low bitrate associated with high quality, G.729 has relatively high complexity, 15 (while G.711 has 1, and on the other extreme side, G.723.1 has 25).

VAD and CNG
G.729B has been extended with VAD (Voice Activity Detection, which causes silence suppression), and generates CNG (Comfort Noise Generation) packets. This helps the receiving end in two key elements:
1. Recover synchronization in condition of high latency network
2. Generate Comfort Noise (which in case of silence from the transmitting end, tells the receiver that the call is still up)

Next to come, a gentle description of ACELP and G.723.

Popular posts from this blog

Troubleshooting TURN

  WebRTC applications use the ICE negotiation to discovery the best way to communicate with a remote party. I t dynamically finds a pair of candidates (IP address, port and transport, also known as “transport address”) suitable for exchanging media and data. The most important aspect of this is “dynamically”: a local and a remote transport address are found based on the network conditions at the time of establishing a session. For example, a WebRTC client that normally uses a server reflexive transport address to communicate with an SFU. when running inside the home office, may use a relay transport address over TCP when running inside an office network which limits remote UDP targets. The same configuration (defined as “iceServers” when creating an RTCPeerConnection will work in both cases, producing different outcomes.

Extracting RTP streams from network captures

I needed an efficient way to programmatically extract RTP streams from a network capture. In addition I wanted to: save each stream into a separate pcap file. extract SRTP-negotiated keys if present and available in the trace, associating them to the related RTP (or SRTP if the negotiation succeeded) stream. Some caveats: In normal conditions the negotiation of SRTP sessions happens via a secure transport, typically SIP over TLS, so the exchanged crypto information may not be available from a simple network capture. There are ways to extract RTP streams using Wireshark or tcpdump; it’s not necessary to do it programmatically. All this said I wrote a small tool ( https://github.com/giavac/pcap_tool ) that parses a network capture and tries to interpret each packet as either RTP/SRTP or SIP, and does two main things: save each detected RTP/SRTP stream into a dedicated pcap file, which name contains the related SSRC. print a summary of the crypto information exchanged, if available. With ...

Testing SIP platforms and pjsip

There are various levels of testing, from unit to component, from integration to end-to-end, not to mention performance testing and fuzzing. When developing or maintaining Real Time Communications (RTC or VoIP) systems,  all these levels (with the exclusion maybe of unit testing) are made easier by applications explicitly designed for this, like sipp . sipp has a deep focus on performance testing, or using a simpler term, load testing. Some of its features allow to fine tune properties like call rate, call duration, simulate packet loss, ramp up traffic, etc. In practical terms though once you have the flexibility to generate SIP signalling to negotiate sessions and RTP streams, you can use sipp for functional testing too. sipp can act as an entity generating a call, or receiving a call, which makes it suitable to surround the system under test and simulate its interactions with the real world. What sipp does can be generalised: we want to be able to simulate the real world tha...