CICD 210-060

Supporting Converged Voice Networks

VoIP Protocols

Let’s take a look at the gateway call control protocols. Here is where we get the definitions and an understanding of what they do. We need a protocol on our gateway that lets us set up and tear down calls and work out how to route them, and that’s all done through these protocols.

Protocol  Description
H.323     ITU standard protocol for interactive conferencing; evolved from H.320 ISDN standard; decentralized dial plan
SIP       IETF protocol for interactive and noninteractive conferencing; decentralized dial plan
MGCP      IETF standard for PSTN gateway control; centralized dial plan
SCCP      Cisco proprietary protocol used between Unified CM and IP phones or to control FXS ports

There are three primary protocols that we're going to choose from: H.323, which is an industry standard; SIP, the Session Initiation Protocol; and MGCP, the Media Gateway Control Protocol. These three differ quite a lot in how they function.

H.323 is the standard that has been out there the longest. The ITU developed it and has kept evolving it. H.323 is really a suite of protocols. It does video and it does audio, so you might run into it with instant messaging and video clients, and there are a lot of great capabilities built into it. SIP is the newest of these protocols. H.323 and SIP have one thing in common: they are peer-to-peer protocols. In other words, the routing intelligence lives in the endpoints, and in our case the endpoints are the gateways.

MGCP is a lot different: it is a client-server protocol. The Communications Manager is actually in control of the gateway, and the routing intelligence lives in the Communications Manager, which is very different from H.323 and SIP.

Finally, there is the Skinny Client Control Protocol (SCCP). This is a proprietary protocol used between the Communications Manager and the IP phones; that's how they communicate information back and forth. So it's not really a gateway protocol. The other three are the gateway call control protocols.

Digital Signal Processors

You're probably wondering how this gateway automatically converts analog voice to a digital signal and vice versa. Well, it doesn't do it automatically. We need digital signal processors, and DSPs are actually used for several different things. Transcoding is converting from one codec to another. Voice termination is going from analog to digital and vice versa: calls going out to the PSTN are converted to PSTN format, and PSTN calls coming in our direction are converted to our RTP streams. Conferencing uses DSP resources to set up conference calls. And finally there are media termination points. A media termination point is a way to terminate and re-originate a call, and you might think, "Why would I need to do that?" Some protocols send DTMF tones in band and some send them out of band, so to convert between the two, for example, we would need a media termination point to handle that operation.

So DSP resources are very important. One of the best resources is Cisco's website. You can go to the Feature Navigator, find the proper IOS and router platforms, and search out DSP resources so that you know how many you might need. There's a DSP calculator, or at least there used to be a DSP resource calculator, on Cisco's website that tells you, based on your platform, what types of DSP chips you can purchase and what you need and can add to your system.
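To make the in-band versus out-of-band difference concrete, here is a minimal Python sketch. In band, a digit is just audio: two sine tones mixed together, using the standard DTMF keypad frequencies. Out of band, the same digit travels as a message on the signaling path. The function names, the 8 kHz sample rate, the 100 ms tone length, and the message layout are all assumptions made for illustration; none of this is a Cisco API.

```python
import math

# Standard DTMF keypad frequencies in Hz (one row tone plus one column tone).
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477)
KEYPAD = ("123", "456", "789", "*0#")

def dtmf_frequencies(digit):
    """Return the (row, column) tone pair for one keypad character."""
    if len(digit) != 1:
        raise ValueError("expected a single keypad character")
    for row, keys in enumerate(KEYPAD):
        col = keys.find(digit)
        if col != -1:
            return ROW_HZ[row], COL_HZ[col]
    raise ValueError(f"not a DTMF key: {digit!r}")

def in_band_samples(digit, duration_ms=100, sample_rate=8000):
    """In band: the digit is audio carried inside the voice stream."""
    low, high = dtmf_frequencies(digit)
    count = sample_rate * duration_ms // 1000
    return [
        0.5 * math.sin(2 * math.pi * low * n / sample_rate)
        + 0.5 * math.sin(2 * math.pi * high * n / sample_rate)
        for n in range(count)
    ]

def out_of_band_message(digit):
    """Out of band: the digit is a signaling message (hypothetical layout)."""
    dtmf_frequencies(digit)  # reject anything that is not a keypad key
    return {"type": "dtmf", "digit": digit}
```

One side of the call expects a list of audio samples and the other expects a message, which is why something in the middle has to terminate the call leg and re-originate it in the other form.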

Voice Codecs

Codec stands for coder/decoder. It takes the waveform of me speaking and converts it into digital format. That’s what we need to get it across our network, and we can choose to have it do just that: the G.711 standard gives us a 64 kb/s payload. Or I could add compression using something like G.729 to get an 8 kb/s payload. We have lots of different options in between; we’ve got iLBC and we’ve got iSAC. Depending on the type of network you're running on, and whether your gear supports them, you can choose different codecs, and you can see how much bandwidth each one takes up.

Codec   Bandwidth without overhead
G.711   64 kb/s
G.722   64 kb/s
iLBC    13.3 kb/s
iSAC    10-32 kb/s
G.729   8 kb/s
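Those figures are payload only. As a rough sketch of what a call costs once it is packetized, the Python below adds per-packet headers to the codec rate. The 20 ms packetization interval, the 40 bytes of IPv4/UDP/RTP headers, and the 18 bytes of Ethernet overhead are assumptions chosen for illustration; your Layer 2 and packetization settings may differ.

```python
# Assumed per-packet overhead (illustrative values, not from the course).
IP_UDP_RTP_BYTES = 40   # 20 (IPv4) + 8 (UDP) + 12 (RTP)
ETHERNET_BYTES = 18     # Ethernet header plus frame check sequence

def call_bandwidth_kbps(codec_kbps, packet_ms=20, l2_bytes=ETHERNET_BYTES):
    """Return one-way bandwidth in kb/s for a single voice stream."""
    # Payload bytes carried in each packet at the codec's bit rate.
    payload_bytes = codec_kbps * 1000 / 8 * packet_ms / 1000
    packets_per_second = 1000 / packet_ms
    total_bytes = payload_bytes + IP_UDP_RTP_BYTES + l2_bytes
    return total_bytes * 8 * packets_per_second / 1000

for name, rate in [("G.711", 64), ("G.729", 8)]:
    print(f"{name}: {call_bandwidth_kbps(rate):.1f} kb/s with overhead")
# G.711: 87.2 kb/s with overhead
# G.729: 31.2 kb/s with overhead
```

Under these assumptions the headers cost the same 58 bytes per packet either way, so they are a much larger share of a G.729 call than of a G.711 call.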

Voice Activity Detection (VAD) is something you can turn on or off in your system so that it does not send the sound of silence. If we're talking about efficiently packaging up that voice traffic and compressing it, why not make sure we send sound and not silence? That's what Voice Activity Detection does. The one little "gotcha" you might experience in your network is this: first it detects "okay, nobody's speaking, don’t send any packets." Then, as soon as I start talking again, the router has to pick up on the fact that I started to talk, and it could clip the first syllable of the word as I begin speaking. If you have that problem, you can turn Voice Activity Detection off.

The other thing we want to make sure happens is comfort noise generation, and that’s CNG. Comfort noise is the white Gaussian noise that you hear instead of dead silence. If you get dead silence you think, "Hey, what happened? We got disconnected." There are different variants of G.729; G.729 Annex B is the one that adds Voice Activity Detection and comfort noise to that codec.
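Here is a minimal sketch of the idea and of where the clipping comes from. It uses a plain energy threshold, which is a simplification chosen for illustration and not the detector that G.729 Annex B or any router actually implements. The threshold and the sample values are made up.

```python
def frame_energy(samples):
    """Mean squared amplitude of one frame of PCM samples."""
    return sum(s * s for s in samples) / len(samples)

def vad_decisions(frames, threshold=1000.0):
    """Return 'send' or 'suppress' for each frame, in order."""
    return [
        "send" if frame_energy(frame) >= threshold else "suppress"
        for frame in frames
    ]

# Illustrative frames: silence, the quiet start of a word, then full speech.
frames = [
    [0, 0, 0, 0],
    [10, -20, 15, -5],
    [200, -300, 250, -150],
]
print(vad_decisions(frames))
# ['suppress', 'suppress', 'send']
```

The second frame is real speech, but it is still below the threshold, so it never gets sent. That dropped frame is the clipped first syllable.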


RTP is the protocol that we use to transmit our voice stream across the network. There's really more to it than that, so why do we need it? Well, we're using UDP, and remember, UDP doesn’t care whether a packet makes it there and doesn’t have any sequencing or time stamping. That’s what RTP adds to the mix. We use UDP ports in the range of 16,384 to 32,767.

Along with that there's a tattletale protocol, the Real-Time Transport Control Protocol (RTCP). It chatters about packet loss, delay, jitter, and packet count. It keeps track of all of that, reports it about every 5 seconds, and we can view this information if we need to.

We can also secure this with Secure RTP (SRTP), which encrypts the RTP data using AES, the Advanced Encryption Standard. So you can see the overhead growing. Even if we choose something like G.729 as our codec, which is only 8 kb/s of payload, the headers added at Layer 2 and Layer 3 to carry that voice packet increase the total.
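As a rough illustration of what RTP adds on top of UDP, the Python sketch below packs and parses the 12-byte fixed RTP header, which carries the sequence number and timestamp that UDP lacks. The function names are invented for this example, and the port check simply restates the range quoted above. Payload type 18 is the static RTP payload type for G.729.

```python
import struct

# Fixed RTP header: flags, marker/payload type, sequence, timestamp, SSRC.
RTP_HEADER = struct.Struct("!BBHII")  # 12 bytes, network byte order

def build_rtp_header(payload_type, sequence, timestamp, ssrc):
    """Pack a version-2 header with no padding, extension, CSRCs, or marker."""
    first_byte = 2 << 6
    second_byte = payload_type & 0x7F
    return RTP_HEADER.pack(
        first_byte,
        second_byte,
        sequence & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )

def parse_rtp_header(data):
    """Unpack the first 12 bytes of an RTP packet into a dict."""
    first_byte, second_byte, sequence, timestamp, ssrc = RTP_HEADER.unpack(data[:12])
    return {
        "version": first_byte >> 6,
        "payload_type": second_byte & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }

def in_voice_port_range(port):
    """True if the UDP port falls in the 16,384 to 32,767 range."""
    return 16384 <= port <= 32767

# Two consecutive 20 ms packets at an 8 kHz clock: timestamp advances by 160.
first = build_rtp_header(18, sequence=1, timestamp=0, ssrc=0x1234)
second = build_rtp_header(18, sequence=2, timestamp=160, ssrc=0x1234)
print(parse_rtp_header(second))
```

A receiver can use the sequence numbers to spot lost or reordered packets and the timestamps to play audio back at the right pace, which is the information RTCP then reports on.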