What are the roles of the RTP timestamp and sequence numbers?
The timestamp is used to place the incoming audio and video packets in the correct timing order (playout delay compensation). The sequence number is mainly used to detect losses. Sequence numbers increase by one for each RTP packet transmitted; timestamps increase by the time "covered" by a packet. For video formats where a video frame is split across several RTP packets, several packets may have the same timestamp. In some cases, such as carrying DTMF (touch tone) data (RFC 2833), RTP timestamps may not be monotonic.
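As a rough illustration of the two counters, the sketch below estimates loss from received sequence numbers and computes how far the timestamp advances per packet. The function names are hypothetical, and the sketch assumes the standard 16-bit RTP sequence number, which wraps around at 65536. Duplicate packets are not filtered out, so they would lower the reported loss.

```python
def seq_delta(prev_seq: int, seq: int) -> int:
    """Signed distance from prev_seq to seq, modulo 2**16."""
    d = (seq - prev_seq) & 0xFFFF
    return d - 0x10000 if d >= 0x8000 else d


def packets_lost(seqs: list[int]) -> int:
    """Expected minus received packets over a run of sequence numbers.

    Assumption: seqs holds the 16-bit sequence numbers in arrival order.
    """
    first = seqs[0]
    extended = first  # highest sequence number seen, extended past 16 bits
    received = 0
    for s in seqs:
        received += 1
        d = seq_delta(extended & 0xFFFF, s)
        if d > 0:  # newer packet; d <= 0 means duplicate or reordered
            extended += d
    expected = extended - first + 1
    return expected - received


def timestamp_increment(clock_rate_hz: int, packet_ms: int) -> int:
    """Timestamp units "covered" by one packet of packet_ms milliseconds."""
    return clock_rate_hz * packet_ms // 1000
```

For instance, 20 ms packets of 8,000 Hz audio advance the timestamp by 160 units each, while the sequence number advances by one.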
What is the RTP timestamp in the RTCP sender report used for?
The RTP timestamp and the NTP timestamp form a pair that identifies the absolute time of a particular sample in the stream. For example, if the RTCP sender report contains an RTP timestamp of 1234 and an NTP timestamp indicating February 3, 10:14:15, it means that sample 1234 in the media stream occurred at exactly 10:14:15 on February 3.
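A minimal sketch of how a receiver might use that pair to map any later RTP timestamp to wallclock time. The function name and parameters are hypothetical. It assumes the NTP timestamp from the sender report has already been converted to seconds as a float, and that the media clock rate is known from the payload format.

```python
def rtp_to_wallclock(rtp_ts: int, sr_rtp_ts: int,
                     sr_ntp_seconds: float, clock_rate_hz: int) -> float:
    """Wallclock time (seconds) of the sample carrying rtp_ts.

    sr_rtp_ts and sr_ntp_seconds are the pair from the most recent
    sender report. RTP timestamps are 32-bit, so the difference is
    taken modulo 2**32 and interpreted as a signed value.
    """
    delta = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF
    if delta >= 0x80000000:
        delta -= 0x100000000  # sample precedes the sender report
    return sr_ntp_seconds + delta / clock_rate_hz
```

With an 8,000 Hz clock, a packet whose timestamp is 8,000 units past the one in the sender report was sampled one second after the reported NTP time.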
How is the jitter computed?
Jitter is computed in timestamp units. For example, for an audio stream sampled at 8,000 Hz, the arrival time measured with the local clock is converted to timestamp units by multiplying the seconds by 8,000.
If several packets, say, within a video frame, bear the same timestamp, it is advisable to use only the first packet in a frame to compute the jitter. (This issue may be addressed in a future version of the specification.)
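As a rough illustration of computing jitter in timestamp units, the sketch below follows the interarrival jitter estimator described in RFC 3550: the change in transit time between consecutive packets, D, is smoothed as J = J + (|D| - J)/16. The class and method names are hypothetical, and the sketch ignores 32-bit timestamp wraparound for brevity.

```python
class JitterEstimator:
    """Running interarrival jitter, in RTP timestamp units."""

    def __init__(self, clock_rate_hz: int):
        self.clock_rate_hz = clock_rate_hz
        self.prev_transit = None  # transit time of the previous packet
        self.jitter = 0.0

    def update(self, arrival_seconds: float, rtp_ts: int) -> float:
        # Convert the local arrival time to timestamp units.
        arrival = arrival_seconds * self.clock_rate_hz
        # Relative transit time; the unknown clock offset cancels in D.
        transit = arrival - rtp_ts
        if self.prev_transit is not None:
            d = abs(transit - self.prev_transit)
            self.jitter += (d - self.jitter) / 16.0
        self.prev_transit = transit
        return self.jitter
```

To follow the advice about frames split across several packets, a caller could invoke `update` only for the first packet carrying each new timestamp.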
Steve Casner wrote:
For encodings such as MPEG that transmit data in a different order than it was sampled, this adds noise into the jitter calculation. I have heard handwavy arguments that this factor can be calculated out given that you know the shape of the noise, but my math isn't strong enough for that.
In many of the cases that we care about, the jitter introduced by MPEG will be small enough that when the network jitter is of the same order we don't have a problem anyway.
There is another problem for video in that all of the packets of a frame have the same timestamp because the whole frame is sampled at once. However, the dispersion in time of those packets really is all part of the network transfer process that the receiver must accommodate with its buffer.
It has been suggested that jitter be calculated only on the first packet of a video frame, or only on "I" frames for MPEG. However, that may color the results also because those packets may see transit delays different than the following packets see.
The main point to remember is that the primary function of the RTP timestamp is to represent the inherent notion of real time associated with the media. It also turns out to be useful for the jitter measure, but that is a secondary function.
The jitter value is not expected to be useful as an absolute value. It is more useful as a means of comparing the reception quality at two receivers or comparing the reception quality 5 minutes ago to now.
References:
https://www.cs.columbia.edu/~hgs/rtp/faq.html#timestamp-computed