G.729 versus G.711
Friday, May 09, 2008
In the last article, we discussed the G.711 codec, which is often used to give you "toll quality" calling. There are situations where you do not have the bandwidth for G.711 calling. Perhaps you have restricted upstream speed on your Internet connection, or you need more bandwidth for other applications.
Another common choice for encoding your voice into data is the G.729 codec. It transmits voice far more efficiently than G.711: about 32 kilobits per second versus 87 kilobits per second for G.711, counting packet overhead. Unlike G.711, however, G.729 synthesizes the human voice with something called a vocoder.
The vocoder uses a tone generator, a white noise generator, and a filter that shapes the sound the way the throat, mouth, tongue, lips, and nasal cavities do. By itself, the vocoder produces intelligible speech, but it sounds like a robot is speaking.
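To make the three parts of the vocoder concrete, here is a toy sketch in Python. It is not G.729 itself: the filter coefficients, pitch period, and function names are made up for illustration, and a real codec updates its settings many times per second.

```python
import random

def excitation(n_samples, voiced, pitch_period=80, seed=0):
    """Build the source signal: a pulse train (tone) or white noise."""
    rng = random.Random(seed)
    if voiced:
        # One pulse every pitch_period samples (100 Hz at 8000 samples/s).
        return [1.0 if i % pitch_period == 0 else 0.0 for i in range(n_samples)]
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

def shape(source, coeffs):
    """All-pole filter: y[n] = x[n] + sum(coeffs[k] * y[n-1-k])."""
    out = []
    for n, x in enumerate(source):
        y = x
        for k, a in enumerate(coeffs):
            if n - 1 - k >= 0:
                y += a * out[n - 1 - k]
        out.append(y)
    return out

# Hypothetical filter settings standing in for one vocal-tract shape.
COEFFS = [1.3, -0.8]

vowel_like = shape(excitation(160, voiced=True), COEFFS)   # tone through the filter
hiss_like = shape(excitation(160, voiced=False), COEFFS)   # noise through the filter
```

The pulse train plays the role of the tone generator (vowel-like sounds), the random samples play the role of the white noise generator (sounds like "s" or "f"), and the filter stands in for the mouth and throat.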
Since that's clearly unacceptable, G.729 also uses samples of the actual human speech to set the vocoder settings properly. It also compares the actual voice with the synthetic voice to come up with a "code." The code, along with the vocoder settings, is what gets sent to the remote end. The remote end takes the code and vocoder settings and plays the sound.
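The compare-and-pick-a-code idea can be sketched in a few lines of Python. This is a simplified illustration, not the real G.729 algorithm: the four-entry codebook, the filter coefficients, and the function names are all hypothetical, and real codebooks are far larger.

```python
def synthesize(excitation, coeffs):
    """Run an excitation through an all-pole filter (the vocoder's 'throat')."""
    out = []
    for n, x in enumerate(excitation):
        y = x
        for k, a in enumerate(coeffs):
            if n - 1 - k >= 0:
                y += a * out[n - 1 - k]
        out.append(y)
    return out

# Hypothetical codebook that both ends of the call already have.
CODEBOOK = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, -1.0, 0.0],
    [1.0, 0.0, 0.0, -1.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, -1.0, 0.0, 0.0, 1.0],
]

def encode(actual, coeffs):
    """Pick the code whose synthetic voice is closest to the actual voice."""
    def error(index):
        synthetic = synthesize(CODEBOOK[index], coeffs)
        return sum((a - s) ** 2 for a, s in zip(actual, synthetic))
    return min(range(len(CODEBOOK)), key=error)

def decode(code, coeffs):
    """Remote end: look up the code and play it through the same filter."""
    return synthesize(CODEBOOK[code], coeffs)

coeffs = [0.9, -0.5]  # made-up vocoder settings for this frame
# Stand-in for a frame of real speech: entry 2 plus a small disturbance.
actual = [s + 0.05 for s in synthesize(CODEBOOK[2], coeffs)]
code = encode(actual, coeffs)   # only this index and coeffs cross the wire
played = decode(code, coeffs)
```

Here the sender transmits a single small number (the code) plus the filter settings instead of the samples themselves, which is where the bandwidth savings come from.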
The result of all this work is voice quality that sounds similar to G.711 at a little more than a third of the bandwidth. All this takes a lot of processing cycles, too, which is why some voice over IP gear limits the number of G.729 streams it can process at once.
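For the curious, the bandwidth figures quoted earlier can be roughly reproduced with a little arithmetic. The sketch below assumes values the article does not state: 20 milliseconds of audio per packet, 40 bytes of IP, UDP, and RTP headers, and 18 bytes of Ethernet framing. Other packet sizes or link types will give different totals.

```python
# Assumed values (not from the article): 20 ms of audio per packet,
# 40 bytes of IP + UDP + RTP headers, 18 bytes of Ethernet framing.
PACKET_INTERVAL_MS = 20
IP_UDP_RTP_BYTES = 20 + 8 + 12
ETHERNET_BYTES = 18

def per_call_kbps(codec_bitrate_kbps):
    """Return one-way bandwidth in kbps for a single call, headers included."""
    packets_per_second = 1000 / PACKET_INTERVAL_MS
    # kbps * ms = bits of voice per packet; divide by 8 to get bytes.
    payload_bytes = codec_bitrate_kbps * PACKET_INTERVAL_MS / 8
    packet_bytes = payload_bytes + IP_UDP_RTP_BYTES + ETHERNET_BYTES
    return packet_bytes * 8 * packets_per_second / 1000

g711 = per_call_kbps(64)  # G.711 encodes voice at 64 kbps
g729 = per_call_kbps(8)   # G.729 encodes voice at 8 kbps
print(f"G.711: {g711:.1f} kbps, G.729: {g729:.1f} kbps, ratio: {g729 / g711:.2f}")
# prints "G.711: 87.2 kbps, G.729: 31.2 kbps, ratio: 0.36"
```

Notice that the headers are the same size for both codecs, so they eat up a much bigger share of each G.729 packet. That is why the savings on the wire are smaller than the eight-to-one difference in the raw codec bit rates.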
There are some challenges with G.729. While it works great for voice communication, it doesn't do very well with things like faxing, data, and touch tones. This is because G.729 is designed specifically to encode voice, not data.
So if G.729 doesn't support touch tones, how is it you can do touch tones on a VoIP line? We'll explain that next time.