VoIP network planning guide
1 CONTENT
1 CONTENT
2 SCOPE
3 BANDWIDTH
  3.1 Control data
  3.2 Audio codec
  3.3 Packet size and protocol overhead
    3.3.1 UDP Protocol overhead
    3.3.2 IP Protocol overhead
    3.3.3 Network Protocol overhead
  3.4 Jitter
  3.5 Voice activity detection
  3.6 Example
4 JITTER
5 PRIORITIES
  5.1 Differentiated Services Code Point
6 NETWORK PROVIDER
7 LINKS
2 SCOPE
This document is written for all users who are going to install Artist VoIP products.
Most of the information applies to VoIP in general, while some of it is specific to Artist. Unlike other audio technologies (analog I/O, AES etc.), VoIP transmission needs accurate planning beforehand. If the network is not planned, it is very likely that the installation will fail or that the transmission will be unreliable.
The reader must have basic knowledge of IP networks, including IP addressing. The document covers the three most important topics of an installation: bandwidth, jitter and priorities.
The following sample network shows a very typical VoIP installation in which two locations are interconnected by a wide area network (WAN). While the local area networks are often fast and reliable, the WAN is the limiting factor, so this document focuses on it.
3 BANDWIDTH
VoIP transmission consists of two parts: control (signaling) data and audio data. The control data bandwidth is nearly constant and cannot be influenced by the user. The audio data bandwidth depends on several factors:
- Configured audio codec
- Configured packet size
- Network type
- Jitter
- Voice activity
The total bandwidth is the sum of five parts:
bwTotal = bwControlData + bwAudioCodec
+ bwUdpProtocolOverhead
+ bwIpProtocolOverhead
+ bwNetworkOverhead
3.1 Control data
The bandwidth for the control data is quite constant.
bwControlData = 20 kb/s
3.2 Audio codec
Each audio codec has different properties; one of them is the audio data bandwidth.
This is the bandwidth used by the raw audio, excluding any network protocol overhead.
Codec               bwAudioCodec
All G.711           64 kb/s
PCM 8K              128 kb/s
All RARe            64 kb/s
G.722 64 kps PLC    64 kb/s
G.722 48 kps PLC    48 kb/s
3.3 Packet size and protocol overhead
VoIP traffic is not sent over the network as a continuous stream; it is split into packets. For each VoIP channel the user can individually select the packet size: 20 ms, 40 ms, 80 ms or 160 ms. The default is 20 ms.
When the transmitter wants to send a packet, it has to wait until enough audio is available for sending, e.g. for a 40ms packet it has to wait 40ms. So the delay depends on the packet size.
Small packets cause less delay but more protocol overhead, because more packets must be sent for the same amount of audio. For example, 20 ms packets produce 8 times the protocol overhead of 160 ms packets.
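The relation between packet size, packet rate and overhead can be checked with a few lines of Python. This is only a sketch; the function names are chosen for illustration and are not part of the Artist software.

```python
def packets_per_second(packet_ms: float) -> float:
    """Number of packets sent per second for a given audio packet size."""
    return 1000.0 / packet_ms


def overhead_factor(small_ms: float, large_ms: float) -> float:
    """How many times more packets (and headers) the smaller size produces."""
    return packets_per_second(small_ms) / packets_per_second(large_ms)


for size_ms in (20, 40, 80, 160):
    # The transmitter must collect size_ms of audio before it can send.
    print(f"{size_ms} ms packets: {packets_per_second(size_ms)} packets/s, "
          f"packetization delay {size_ms} ms")

print(overhead_factor(20, 160))  # 8.0
```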
3.3.1 UDP Protocol overhead
The audio data is encapsulated in UDP datagrams, which adds protocol overhead.
For each packet, UDP adds 64 bits of overhead, resulting in additional bandwidth.
Audio packet size   Packets / second   bwUdpProtocolOverhead
20ms                50                 3,2 kb/s
40ms                25                 1,6 kb/s
80ms                12,5               0,8 kb/s
160ms               6,25               0,4 kb/s
3.3.2 IP Protocol overhead
The UDP packets are encapsulated in IP datagrams. For each packet, IP adds 160 bits of overhead.
Audio packet size   Packets / second   bwIpProtocolOverhead
20ms                50                 8 kb/s
40ms                25                 4 kb/s
80ms                12,5               2 kb/s
160ms               6,25               1 kb/s
3.3.3 Network Protocol overhead
The network protocol depends on the network type. For example, Ethernet uses the Ethernet protocol. Wide area networks are based on DSL, cable, E1, T1 etc. and use other protocols, which create different overhead. The same IP traffic therefore results in different amounts of network traffic on the LAN and on the WAN.
This chapter covers only Ethernet networks as an example. It shows how network protocol overhead is calculated, so that the reader can adapt the calculation to other network types.
The Ethernet protocol adds 144 bits of overhead for each Ethernet packet. One audio packet is encapsulated in one Ethernet packet.
Audio packet size   Packets / second   bwNetworkOverhead
20ms                50                 7,2 kb/s
40ms                25                 3,6 kb/s
80ms                12,5               1,8 kb/s
160ms               6,25               0,9 kb/s
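The UDP, IP and Ethernet overhead tables all follow from the same rule: header bits per packet multiplied by packets per second. The following Python sketch reproduces them from the header sizes stated in sections 3.3.1 to 3.3.3. The names are illustrative; for another network type, replace the 144 bit Ethernet value with the per-packet overhead of that network.

```python
# Header sizes in bits per packet, as stated in sections 3.3.1 to 3.3.3.
HEADER_BITS = {"udp": 64, "ip": 160, "ethernet": 144}


def overhead_kbps(header_bits: int, packet_ms: float) -> float:
    """Bandwidth in kb/s consumed by one header of header_bits per packet."""
    packets_per_second = 1000.0 / packet_ms
    return header_bits * packets_per_second / 1000.0


for packet_ms in (20, 40, 80, 160):
    row = ", ".join(
        f"{name} {overhead_kbps(bits, packet_ms):.1f} kb/s"
        for name, bits in HEADER_BITS.items()
    )
    print(f"{packet_ms} ms: {row}")
```

For 20 ms packets this gives 3.2, 8.0 and 7.2 kb/s, matching the first row of each table (the tables use a decimal comma).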
3.4 Jitter
Jitter is the time variation of the VoIP packet transmission (see chapter 4). It influences the bandwidth, because the packet flow is not constant when there is network jitter. Within a short time window, fewer packets than usual may be transmitted, resulting in a lower bandwidth. The opposite can also happen, resulting in a temporarily higher bandwidth. Theoretically, the bandwidth required for a short moment could be infinite.
It is nearly impossible to calculate the bandwidth variation caused by jitter beforehand. As a rule of thumb, plan for a 25% bandwidth reserve. The user should check the jitter and the bandwidth variation once the system is installed.
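As a small illustration of the 25% reserve, the following Python sketch adds the reserve to a calculated channel bandwidth. The function name is hypothetical, and the input value of 102.4 kb/s is the result of the example in section 3.6.

```python
def with_reserve(bandwidth_kbps: float, reserve: float = 0.25) -> float:
    """Bandwidth to plan for, including a fractional reserve for jitter."""
    return bandwidth_kbps * (1.0 + reserve)


planned = with_reserve(102.4)
print(f"{planned:.1f} kb/s")  # 128.0 kb/s
```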
3.5 Voice activity detection
Artist VoIP channels have a configurable voice activity detection (VAD) function. It is enabled by default, which means that the audio transmission stops when the audio level drops below the Vox threshold. In this case the bandwidth is reduced to bwControlData (see 3.1).
The network should always be designed to carry the bandwidth of a permanent audio signal, but VAD is a useful feature for reducing the data volume in practice.
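The effect of VAD on the data volume can be estimated with a rough Python sketch. It assumes that a channel uses its full bandwidth while audio is present and only bwControlData otherwise. The speech activity of 40% is a made-up example value, not a measured figure, and 102.4 kb/s is the result of the example in section 3.6.

```python
def average_kbps(total_kbps: float, control_kbps: float, activity: float) -> float:
    """Average bandwidth when audio is only sent for a fraction of the time.

    activity is the assumed fraction of time (0.0 to 1.0) with audio
    above the Vox threshold.
    """
    if not 0.0 <= activity <= 1.0:
        raise ValueError("activity must be between 0.0 and 1.0")
    return control_kbps + activity * (total_kbps - control_kbps)


# Hypothetical example: audio present 40% of the time.
print(f"{average_kbps(102.4, 20.0, 0.4):.2f} kb/s")
```

This average is only useful for estimating data volume. The link itself must still be sized for the full bandwidth.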
3.6 Example
This chapter provides enough information to calculate all variants of VoIP configurations. As an example, this is the calculation for the VoIP channel default settings.
G.722 (64 kps), 20ms packet size, Ethernet:
bwTotal = bwControlData + bwAudioCodec
+ bwUdpProtocolOverhead
+ bwIpProtocolOverhead
+ bwNetworkOverhead
= 20 kb/s + 64 kb/s + 3,2 kb/s + 8 kb/s + 7,2 kb/s
= 102,4 kb/s
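The same calculation can be written as a short Python function so that other codecs and packet sizes can be tried quickly. This is a sketch using the values from this chapter. The function and parameter names are illustrative, and the default network overhead of 144 bits applies to Ethernet only.

```python
UDP_HEADER_BITS = 64
IP_HEADER_BITS = 160
ETHERNET_OVERHEAD_BITS = 144
CONTROL_KBPS = 20.0


def total_bandwidth_kbps(codec_kbps: float, packet_ms: float,
                         network_bits: int = ETHERNET_OVERHEAD_BITS) -> float:
    """bwTotal in kb/s for one VoIP channel with a permanent audio signal."""
    header_bits = UDP_HEADER_BITS + IP_HEADER_BITS + network_bits
    # Bits per millisecond are numerically equal to kb/s.
    overhead_kbps = header_bits / packet_ms
    return CONTROL_KBPS + codec_kbps + overhead_kbps


# Default settings: G.722 (64 kb/s audio), 20 ms packets, Ethernet.
print(f"{total_bandwidth_kbps(64, 20):.1f} kb/s")   # 102.4 kb/s
# Same codec with 160 ms packets: less overhead, more delay.
print(f"{total_bandwidth_kbps(64, 160):.1f} kb/s")  # 86.3 kb/s
```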
4 JITTER
Jitter is the time variation of the VoIP packet transmission. It is a typical problem of connectionless or packet-switched networks. Because the information is divided into packets, each packet can take a different path from the sender to the receiver. Technically, jitter is the measure of the variability of the network latency over time.
The remedy is a jitter buffer (receive buffer) that evens out the variation. The receive buffer size can be configured for each VoIP channel in the Artist system. A larger receive buffer can handle greater jitter, but increases the delay.
Receiver buffer size   Maximum jitter
80 ms                  20 ms
160 ms                 40 ms
320 ms                 80 ms
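Because a larger buffer adds delay, a sensible approach is to pick the smallest buffer that still covers the jitter measured on the network. The following Python sketch does this lookup using the table above. It assumes that the maximum jitter values are in milliseconds, and the function name is hypothetical.

```python
# Receive buffer size (ms) -> maximum jitter it can handle (assumed ms).
BUFFER_MAX_JITTER_MS = {80: 20, 160: 40, 320: 80}


def smallest_buffer_ms(measured_jitter_ms: float):
    """Smallest buffer size that covers the measured jitter, or None."""
    for buffer_ms in sorted(BUFFER_MAX_JITTER_MS):
        if BUFFER_MAX_JITTER_MS[buffer_ms] >= measured_jitter_ms:
            return buffer_ms
    return None  # Jitter exceeds what the largest buffer can handle.


print(smallest_buffer_ms(12))   # 80
print(smallest_buffer_ms(30))   # 160
print(smallest_buffer_ms(100))  # None
```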
5 PRIORITIES
VoIP traffic is often transmitted together with other traffic in the same network. That is the main reason why VoIP traffic is delayed and subject to jitter. The bottlenecks in the network are switches, routers and wide area networks with limited bandwidth.
If the network is shared between VoIP and other data services, the administrator must plan priorities.
IP traffic can be prioritized, which is called “quality of service (QoS)”. IP packets can be marked with “type of service (ToS)” bits in the IP header. In IP version 4, routers and switches are not required to support quality of service, but support is very common in professional equipment. When IP traffic is marked with a higher quality of service, network equipment can switch or forward it faster than ordinary traffic.
Artist channels can be configured individually with a ToS field value in the VoIP property sheet. The configured value will be used in the IP header of all packets.
The interpretation of the ToS field by network equipment is not exactly specified, but a very common scheme is “Differentiated Services Code Point” (DSCP).
5.1 Differentiated Services Code Point
Differentiated Services (DiffServ) is a model in which traffic is treated by intermediate systems with relative priorities based on the type of service (ToS) field. Defined in RFC 2474 and RFC 2475, the DiffServ standard supersedes the original specification for defining packet priority described in RFC 791. DiffServ increases the number of definable priority levels by reallocating bits of an IP packet for priority marking.
The DiffServ architecture defines the DiffServ (DS) field, which supersedes the ToS field in IPv4 to make per-hop behavior (PHB) decisions about packet classification and traffic conditioning functions, such as metering, marking, shaping, and policing.
The RFCs do not dictate the way to implement PHBs; this is the responsibility of the vendor.
DS5 DS4 DS3 DS2 DS1 DS0 ECN ECN
DSCP: six bits (DS5-DS0)
ECN: two bits, currently unused
DiffServ uses the most significant bits (DS5, DS4 and DS3) for priority setting.
Precedence level   DS5 DS4 DS3   Description
7                  111           Link layer and routing protocol keep alive (lowest latency and jitter)
6                  110           Used for IP routing protocols
5                  101           Expedited Forwarding (voice and video)
4                  100           Controlled Load (streaming multimedia)
3                  011           Excellent Load (business critical)
2                  010           Standard (spare)
1                  001           Background
0                  000           Best effort
DS2 and DS1 specify the drop probability; bit DS0 is always zero.
Drop probability   DS2 DS1 DS0
Low                010
Medium             100
High               110
Example for Precedence level 5, drop probability low:
101 010 00 = 0xA8 = 168
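The ToS field value for any combination of precedence level and drop probability can be derived in the same way. The Python sketch below builds the 8 bit value from the two tables above. The function name is illustrative, and the two ECN bits are left at zero.

```python
# DS2 DS1 DS0 bit patterns from the drop probability table.
DROP_BITS = {"low": 0b010, "medium": 0b100, "high": 0b110}


def tos_value(precedence: int, drop: str) -> int:
    """8 bit ToS field: 3 precedence bits, 3 drop bits, 2 ECN bits (zero)."""
    if not 0 <= precedence <= 7:
        raise ValueError("precedence must be between 0 and 7")
    return (precedence << 5) | (DROP_BITS[drop] << 2)


value = tos_value(5, "low")
print(value, hex(value), format(value, "08b"))  # 168 0xa8 10101000
```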
6 NETWORK PROVIDER
Typically, a network provider offers data services with a specified network quality, often at different levels (and prices). For example:
Quality level                   Basic   Advanced   Primary   Voice
Packet loss (end to end)        1%      0,30%      0,20%     0,10%
Round trip delay (end to end)   -       80 ms      80 ms     60 ms
Jitter (end to end)             -       -          30 ms     12 ms
A specified network should always be used, if possible. It allows the administrator to plan the network beforehand, and the network quality does not change during operation.
If the network is unspecified, its behavior in practice cannot be predicted.
7 LINKS
IP Protocol: http://en.wikipedia.org/wiki/IP
UDP Protocol: http://en.wikipedia.org/wiki/User_Datagram_Protocol
“Understanding Jitter in Packet Voice Networks”
http://www.cisco.com/en/US/tech/tk652/tk698/technologies_tech_note09186a00800945df.shtml
“Understanding Delay in Packet Voice Networks”
http://www.cisco.com/en/US/tech/tk652/tk698/technologies_white_paper09186a00800a8993.shtml