I happened to read the Cisco article; below are some notes on it, plus a few additions from other readings.
Below are the sections to highlight
1. Converting analog to digital form
Human speech mostly falls between about 200/300 Hz and 2700/2800 Hz. The equipment supports a maximum of 4 kHz.
Below is a description of the Nyquist theorem. According to it, the sampling frequency should be at least twice the maximum frequency of the analog signal, so a 4 kHz channel is sampled 8000 times per second.
Suppose the highest frequency component, in hertz, for a given analog signal is fmax. According to the Nyquist Theorem, the sampling rate must be at least 2fmax, or twice the highest analog frequency component. The sampling in an analog-to-digital converter is actuated by a pulse generator (clock). If the sampling rate is less than 2fmax, some of the highest frequency components in the analog input signal will not be correctly represented in the digitized output. When such a digital signal is converted back to analog form by a digital-to-analog converter, false frequency components appear that were not in the original analog signal. This undesirable condition is a form of distortion called aliasing.
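As a rough illustration of the Nyquist rule and of aliasing, here is a minimal Python sketch. The function names are made up for this note, and the folding formula assumes a single pure tone sampled with no anti-aliasing filter in front of the converter.

```python
def min_sampling_rate(f_max_hz):
    """Nyquist: sample at least twice the highest frequency component."""
    return 2 * f_max_hz


def alias_frequency(f_signal_hz, f_sample_hz):
    """Apparent frequency of a pure tone after sampling at f_sample_hz.

    Tones above half the sampling rate fold back into the lower band.
    """
    f = f_signal_hz % f_sample_hz
    return min(f, f_sample_hz - f)


print(min_sampling_rate(4000))       # 8000
print(alias_frequency(3000, 8000))   # 3000, below 4 kHz so it is preserved
print(alias_frequency(5000, 8000))   # 3000, a 5 kHz tone shows up as a false 3 kHz tone
```

With 8000 samples per second, anything above 4 kHz cannot be told apart from a lower tone, which is the aliasing described above.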
Analysing a single byte, which represents one sample:
e.g. 0011 1001
1. The first bit from the left (0) tells on which side of the X axis the signal lies, in other words whether it is -ve or +ve.
2. The 2nd to 4th bits from the left, i.e. 011, select the window (segment) on the Y axis, e.g. between 1 - 2 or 2 - 3 etc.
3. The 5th to 8th bits give the actual value inside the Y-axis window selected by the 2nd to 4th bits.
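The three fields can be pulled out of the byte with shifts and masks. This is only a sketch of the layout described in the list: `split_sample_byte` is a name invented for this note, and real G.711 encodings additionally invert some bits on the wire, which is ignored here.

```python
def split_sample_byte(byte):
    """Split one 8-bit sample into (sign, segment, step) fields."""
    sign = (byte >> 7) & 0x1      # 1st bit from the left
    segment = (byte >> 4) & 0x7   # 2nd to 4th bits: window on the Y axis
    step = byte & 0xF             # 5th to 8th bits: value inside the window
    return sign, segment, step


print(split_sample_byte(0b00111001))  # (0, 3, 9)
```

For the example byte 0011 1001 this gives sign 0, window 3 (binary 011) and value 9 (binary 1001) inside that window.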
Sampling 8000 times per second at 8 bits per sample requires 8000 * 8 = 64 kbit/s. That is a lot of bandwidth on a WAN link, so most WAN applications use the G.729 codec, which requires only 8 kbit/s. This saving is basically codec compression.
Some of the terms are:
MOS (Mean Opinion Score) : This is the score used to rate the quality of the voice reproduced at the destination end compared to the source. Its value ranges from 1 (worst) to 5 (best).
E.g. the MOS of G.711 is 4.1, of G.729 is 3.92 and of iLBC is 4.1.
The payload calculation goes like this:
Codec payload size per packet = (codec bit rate in bit/s * sample size in ms) / 1000 bits
Sample size here means the length of the slice of audio carried in one packet, cut from the 8000 samples taken from 1 second of the analog wave, e.g. 20 ms or 30 ms.
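The formula above can be checked with a few lines of Python. The function name is an assumption for this note; the bit rates are the 64 kbit/s (G.711) and 8 kbit/s (G.729) figures mentioned earlier.

```python
def codec_payload_bits(codec_bit_rate_bps, sample_ms):
    """Payload carried by one packet, in bits."""
    return codec_bit_rate_bps * sample_ms // 1000


g711_bits = codec_payload_bits(64000, 20)
g729_bits = codec_payload_bits(8000, 20)

print(g711_bits, g711_bits // 8)  # 1280 160
print(g729_bits, g729_bits // 8)  # 160 20
```

So a 20 ms packet carries 160 bytes of G.711 audio, or 20 bytes of G.729 audio.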
This makes the total packet size as below:
Total packet size = Codec Payload + 12 Byte (RTP) + 8 Byte (UDP) + 20 Byte (IP) + 4 Byte (FR)
- 12 Byte (RTP) + 8 Byte (UDP) => Layer 4 header size
- 20 Byte (IP) => Layer 3 header size
- 4 Byte (FR) => Layer 2 header size
Now how many packets are required to transmit 1 second of data?
- In 1 second there are 1000 ms.
- We take 20 ms of samples at a time and packetise them to transmit over the network.
- This means that to send 1000 ms of data, we need 1000/20 = 50 packets.
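Putting the header sizes and the packet rate together gives the bandwidth of one call. The sketch below uses the header sizes listed above (12 + 8 + 20 + 4 bytes); the function names are invented for this note, and the 160-byte and 20-byte payloads are the 20 ms payloads that follow from 64 kbit/s and 8 kbit/s.

```python
RTP_BYTES, UDP_BYTES, IP_BYTES, FR_BYTES = 12, 8, 20, 4


def packets_per_second(sample_ms):
    """Number of packets needed to carry 1000 ms of audio."""
    return 1000 / sample_ms


def total_packet_bytes(payload_bytes):
    """Codec payload plus RTP, UDP, IP and Frame Relay headers."""
    return payload_bytes + RTP_BYTES + UDP_BYTES + IP_BYTES + FR_BYTES


def call_bandwidth_bps(payload_bytes, sample_ms):
    """Bits per second on the wire for one direction of one call."""
    return total_packet_bytes(payload_bytes) * 8 * packets_per_second(sample_ms)


print(packets_per_second(20))        # 50.0
print(total_packet_bytes(160))       # 204
print(call_bandwidth_bps(160, 20))   # 81600.0 (G.711)
print(call_bandwidth_bps(20, 20))    # 25600.0 (G.729)
```

Note how the 44 bytes of headers dominate the G.729 packet: an 8 kbit/s codec ends up using about 25.6 kbit/s on the link.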
Now what's the size of 1 packet?
We have one packet containing 20 ms of data by default.
We take 20 ms worth of samples and put them in one packet. How many samples are there in 20 ms? Every sample requires 8 bits, i.e. one byte.
8000 samples / second means that in each 20 ms we have 8000/50 = 160 samples, since 20 ms is 1/50 of a second. Each sample requires 8 bits, so each 20 ms carries 160*8 = 1280 bits, i.e. 160 bytes of payload.
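RTP & RTCP : The article has a good bodyguard comparison here. RTCP is the bodyguard that travels alongside the RTP stream and reports on its delivery quality (packet loss, jitter, delay). Re-arranging packets into the right order relies on the sequence numbers and timestamps carried in the RTP header itself.
Delay : Voice traffic is very sensitive to delay. Cisco recommends a maximum delay of 200 ms between source and destination, while ITU-T recommends a maximum of 150 ms.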
Compression : There are two types of compression: codec compression of the payload (as with G.729), and the cRTP protocol, which compresses the headers. cRTP is designed to reduce the IP/UDP/RTP headers to two bytes for most packets when no UDP checksums are being sent, or four bytes with checksums. It follows RFC 2508, which is mainly based on RFC 1144. cRTP specifies two formats:
- Compressed RTP (CR) => used when the IP, UDP and RTP headers remain consistent.
- Compressed UDP (CU) => used when there are large changes in the RTP timestamp or when the RTP payload type changes. The IP and UDP headers are compressed, the RTP header is not.
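To get a feel for what header compression saves, here is a small sketch comparing the IP/UDP/RTP overhead with and without cRTP. It assumes the best case where every packet compresses to 2 bytes (or 4 with UDP checksums), a 20-byte G.729 payload and 50 packets per second; the function name is invented for this note and Layer 2 overhead is left out.

```python
IP_UDP_RTP_BYTES = 20 + 8 + 12  # 40 bytes uncompressed


def l3_packet_bytes(payload_bytes, crtp=False, udp_checksum=False):
    """Payload plus IP/UDP/RTP headers, optionally cRTP-compressed."""
    if crtp:
        header = 4 if udp_checksum else 2
    else:
        header = IP_UDP_RTP_BYTES
    return payload_bytes + header


plain = l3_packet_bytes(20)                  # 60 bytes
compressed = l3_packet_bytes(20, crtp=True)  # 22 bytes

print(plain * 8 * 50)        # 24000 bit/s without cRTP
print(compressed * 8 * 50)   # 8800 bit/s with cRTP
```

In practice some packets are sent with full or partially compressed headers, so the real saving is somewhat smaller than this best case.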
PRI / BRI : PRI is typically used by medium to large enterprises with digital PBX telephone systems to provide digital access to the PSTN. The B channels may be used flexibly and re-assigned when necessary to meet special needs such as video conferences.
references:
https://learningnetwork.cisco.com/blogs/vip-perspectives/2014/11/20/first-date-with-voip
https://www.cisco.com/c/en/us/support/docs/quality-of-service-qos/qos-link-efficiency-mechanisms/22308-compression-qos.html
http://what-when-how.com/cisco-voice-over-ip-cvoice/voip-fundamentals-introducing-voice-over-ip-networks-part-2/